RRE-Enterprise and How to Manage Your Ratings

The first step in any search quality evaluation process is to define the ground truth:
a set of triplets <query, document, rating> that specifies how relevant a document is to a certain query.
The ratings can be produced explicitly or implicitly:

  • Explicit – a team of human judges (hopefully domain experts) assigns a rating to each <query, document> pair
  • Implicit – user interactions (clicks, downloads, add-to-carts, add-to-favourites…) are collected to estimate the relevance rating for each <query, document> pair

The RRE-Enterprise ecosystem supports both.
Let’s see how in this tutorial:

 

Ratings Tab

To start configuring your ratings, click the settings button and select the ‘Ratings’ tab.
This gives you access to three functionalities:

  1. Upload of a JSON file containing the ratings in the RRE Open Source format
  2. Generate ratings from implicit interactions
  3. View the ratings and edit them
 

Upload


If you have used Rated Ranking Evaluation in the past, you are familiar with the JSON format supported for the rating file:

{
  "tag": "",
  "index": "",
  "corpora_file": "",
  "id_field": "",
  "topics": [
    {
      "description": "",
      "query_groups": [
        {
          "name": "",
          "queries": [
            {
              "template": "",
              "placeholders": {
                "$key": ""
              }
            }
          ],
          "relevant_documents": [
            {
              "document_id": {
                "gain": ""
              }
            }
          ]
        }
      ]
    }
  ],
  "query_groups": [
    {
      "name": "",
      "queries": [
        {
          "template": "",
          "placeholders": {
            "$key": ""
          }
        }
      ],
      "relevant_documents": [
        {
          "document_id": {
            "gain": ""
          }
        }
      ]
    }
  ],
  "queries": [
    {
      "template": "",
      "placeholders": {
        "$key": ""
      }
    }
  ]
}

  • tag: the unique identifier associated with this rating set (you can later retrieve a set of ratings by tag)
  • index: index name in Elasticsearch/collection name in Solr
  • corpora_file: the file containing the corpus the ratings are built on (the Solr or Elasticsearch documents to index). This is only necessary if you want to spin up an embedded evaluation target search engine.
    If the target of your evaluation is an existing Elasticsearch/Solr instance (with its index already in place), you don’t need it
  • id_field: the unique identifier field in the Elasticsearch mapping/Solr schema
  • topics: optional list of topics, each grouping one or more query groups
    • description: the title of the topic that groups the inner query groups
    • query_groups: the list of query groups belonging to the topic
      • name: the query group identifier
      • queries: the list of queries to execute for the group
        • template: the name of the query template this query uses
        • placeholders: an object of key-value pairs to substitute into the template
      • relevant_documents: a list of objects mapping document ids to their gain (relevance) values
  • query_groups: optional top-level list of query groups, for related queries that don’t need a topic
  • queries: list of objects with template and placeholder substitutions for the evaluation. Required if neither topics nor query_groups is defined
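Putting the fields together, a ratings file in this format can also be generated programmatically. Below is a minimal Python sketch; the tag, index, corpora file, template, placeholder, document ids, and gain values are all made-up example values following the skeleton above, not output from a real system:

```python
import json

# A minimal ratings set in the RRE JSON format described above.
# All concrete values here are illustrative examples.
ratings = {
    "tag": "papers-v1",
    "index": "papers",
    "corpora_file": "papers.bulk",
    "id_field": "id",
    "query_groups": [
        {
            "name": "interleaving searches",
            "queries": [
                {
                    "template": "only_q.json",
                    "placeholders": {"$query": "interleaving"}
                }
            ],
            # document id -> gain, following the skeleton above
            "relevant_documents": [
                {"1": {"gain": 3}},
                {"7": {"gain": 1}}
            ]
        }
    ]
}

# Write the file ready to be uploaded through the Ratings tab.
with open("ratings.json", "w") as f:
    json.dump(ratings, f, indent=2)
```

Generating the file from a script is handy when the judgements live in a spreadsheet or database and need to be converted into the RRE format in bulk.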

Curious to know how to automate and simplify the explicit rating judging process?
Our Judgement Collector Browser Plugin has you covered!
A blog about this component is coming soon.

 

Generate (from implicit interactions)

We have seen so far how to manage explicit ratings, but what about implicit ratings?
RRE-Enterprise offers you the ability to collect user interactions and generate ratings out of configurable metrics.

A <query, document> interaction is represented in JSON and can be collected by a dedicated endpoint in RRE-Enterprise:

POST http://localhost:8080/1.0/rre-enterprise-api/input-api/interaction

{
	"collection": "papers",
	"query": "interleaving",
	"blackBoxQueryRequest": "GET /query?q=interleaving HTTP/1.1\r\nHost: localhost:5063",
	"documentId": "1",
	"click": 1,
	"timestamp": "2021-06-23T16:10:49Z",
	"queryDocumentPosition": 0
}

  • collection: index name in Elasticsearch/collection name in Solr
  • query: a human-readable representation of the query. Most of the time this is the free-text query
  • blackBoxQueryRequest: the full HTTP request associated with the query. Most of the time this is a request to the search-API layer
  • documentId: the unique identifier of the document (this must be the same one used in Elasticsearch/Solr)
  • impression: 1 – if this interaction is an impression (the document was shown to the user in response to the query)
  • click: 1 – if this interaction is a click on the document in response to the query
  • addToCart: 1 – if this interaction is an add-to-cart of the document in response to the query
  • sale: 1 – if this interaction is a sale of the document in response to the query
  • revenue: the amount of revenue associated with the sale of the document in response to the query
  • timestamp: the time when the interaction happened
  • queryDocumentPosition: the position of the document in the search result list returned for the query
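Sending such an interaction can be sketched with Python’s standard library. The endpoint URL and payload values are the example ones from this post; adjust host, port, and fields for your own deployment (the `urlopen` call is left commented out so the snippet can be read without a running instance):

```python
import json
import urllib.request

# Example interaction payload, matching the fields described above.
interaction = {
    "collection": "papers",
    "query": "interleaving",
    "blackBoxQueryRequest": "GET /query?q=interleaving HTTP/1.1\r\nHost: localhost:5063",
    "documentId": "1",
    "impression": 1,
    "click": 1,
    "timestamp": "2021-06-23T16:10:49Z",
    "queryDocumentPosition": 0
}

# Build the POST request against the interaction endpoint shown above.
req = urllib.request.Request(
    "http://localhost:8080/1.0/rre-enterprise-api/input-api/interaction",
    data=json.dumps(interaction).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)
# urllib.request.urlopen(req)  # uncomment when an RRE-Enterprise instance is running
```

In a production search application you would typically fire this call from the search-API layer or a front-end tracking component, once per impression and per click.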

Once RRE-Enterprise has been populated with interactions, it is possible to estimate <query, document, rating> triplets, using the generate functionality:

  • The Online Metric specifies what to use to estimate the relevance rating
    e.g.
    selecting Click Through Rate means that RRE-Enterprise uses the impressions and clicks collected for a certain <query, document> pair to estimate its relevance rating, scaling the value between the minimum and maximum observed across all queries in the collected interactions
  • The Filter Query lets you filter the interactions by any property (collection, date…)
  • The Tag is the unique identifier associated with the rating set being generated
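To make the Click Through Rate idea concrete, here is a Python sketch of that kind of estimation: CTR per <query, document> pair, min-max scaled to a gain value. This is only an illustration of the idea described above, not RRE-Enterprise’s actual implementation; the function name and the `max_gain` parameter are assumptions:

```python
from collections import defaultdict

def estimate_ratings(interactions, max_gain=3):
    """Illustrative CTR-based rating estimate (not RRE-Enterprise's code):
    CTR = clicks / impressions per <query, document> pair, then min-max
    scaled across all pairs to an integer gain in [0, max_gain]."""
    clicks = defaultdict(int)
    impressions = defaultdict(int)
    for it in interactions:
        key = (it["query"], it["documentId"])
        impressions[key] += it.get("impression", 0)
        clicks[key] += it.get("click", 0)
    # CTR per pair, skipping pairs that were never shown
    ctr = {k: clicks[k] / impressions[k] for k in impressions if impressions[k] > 0}
    lo, hi = min(ctr.values()), max(ctr.values())
    span = (hi - lo) or 1.0  # avoid division by zero when all CTRs are equal
    return {k: round((v - lo) / span * max_gain) for k, v in ctr.items()}
```

A pair that gets clicked on every impression ends up at the top of the gain scale, while a pair that is shown but never clicked ends up at zero.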
 

View

Once you have uploaded or generated your ratings, you can view and edit them.
First of all, you can retrieve the rating set by tag:

Each row is a single <query, document, rating> triplet.
You can edit a rating value by clicking on it; the change is saved automatically.
Clicking on the query shows the details of the triplet.

You can also filter your ratings by collection, topic, and query group to refine your navigation.


Author

Alessandro Benedetti

Alessandro Benedetti is the founder of Sease Ltd. As a Senior Search Software Engineer, his focus is on R&D in information retrieval, information extraction, natural language processing, and machine learning.
