Suppose to be in a learning to rank scenario.
We have to manage a book catalog in an e-commerce website. Each book has many different features such as publishing year, target age, genre, author, and so on.
A user can visit the website, make a query through some filters selection on the books’ features, and then inspect the obtained search result page.
In order to train our model, we collect all the interactions that users have with the website products (e.g. views, clicks, add to cart, sales..) and create a data set consisting of <query, document> pairs (e.g. the filters selected and the features of the product viewed/clicked/sold/…).
We obtain something like this, where s_feature indicates the selected feature from the website filters and book_feature the feature of the product with which the user interacted: