
Previously, we described the RREE browser plugin, which makes explicit rating collection friendlier. If you haven’t read that post, we strongly suggest having a look at it, or at least at the embedded video, which illustrates a real example.

The simplification RREE offers for defining ratings and assigning them to a given set of search results comes at a price. Fortunately, that price has been paid by the developers; let’s see what it consists of.

The diagram below illustrates the explicit rating collection in RREE.

The overall process is user-oriented: the judge is already familiar with the user interface of their company’s web portal. Apart from the plugin activation (which has to be done once) and the rating creation itself, the user interface is the same as the user is used to seeing.

Let’s move to the last step: when the payload is sent to RREE, it contains, not surprisingly, a set of information related to the user’s ratings. 

We know that a rating is a triple that associates:

  • a query
  • a document
  • a rating
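
As a rough illustration, the triple could be modelled as follows. The class name, the field names (`query`, `document_id`, `gain`) and the sample values are assumptions made for this sketch; they are not the actual RREE payload format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rating:
    """One explicit judgment: a (query, document, rating) triple.

    All names here are hypothetical and only illustrate the concept.
    """
    query: str        # what the judge typed in the search box
    document_id: str  # how the rated document is identified
    gain: int         # the judgment value assigned by the judge


# Hypothetical example: the judge rates one result of the query "red shoes".
rating = Rating(query="red shoes", document_id="result-3", gain=2)
```

Note that both `query` and `document_id` here hold what the user sees, which is not necessarily what the evaluation needs.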

As you can imagine, that information is sent in the last step. However, it differs a bit from the information RREE needs to run a search quality evaluation. Specifically, the query and the document pose two challenges, which we will describe in the following sections.

First Challenge: The Query

The query the user expresses in a typical search interface consists of a set of terms entered in a text box: the so-called “simple search”.

On the RREE side, what we need to run the evaluation process is not the user query. Instead, we need to collect the request that is triggered as a consequence of that user query and sent to the intermediate Search API: in RREE, we call it the Black Box API Request. The name indicates that this layer is opaque to us: it is an application layer that sits between the client and the server.

The Black Box API Request (which includes the user query) is then used, within RREE, to discover the Search Engine Request, which we use to run the evaluation process.

We described the Query Discovery process in the previous post about Query discovery in RRE-Enterprise.

Second Challenge: The Document

There’s a logical gap between how a document is represented from the user’s perspective and from the server’s perspective.

For a user, a document is an item in the user interface: the result of a given search, to which a rating is assigned. Behind the scenes, its technical representation consists of HTML code.

For a search engine, a document is an object, an instance of a class used to denote a search result. It usually consists of a Map-like (i.e., key-value pairs) structure where keys are attribute names and values are attribute values.

The “identity” of those two entities is different. The server side requires a unique, system-scoped identifier associated with every document.

On the client side, instead, the identifier:

  • is optional
  • is usually page-scoped when present (i.e., unique only among the identifiers on the current page)
  • most probably differs from the server identifier.
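
A small sketch may help to picture the mismatch between client-side and server-side identifiers. All identifiers, titles and the helper `has_unique_ids` are invented for illustration; real pages and indexes will look different.

```python
# Client side: items taken from two result pages. The "id" is optional and,
# when present, only unique within its own page (hypothetical values).
page_1 = [{"id": "item-1", "title": "Red running shoes"},
          {"id": "item-2", "title": "Red leather boots"}]
page_2 = [{"id": "item-1", "title": "Blue sneakers"},
          {"title": "Green sandals"}]  # no client-side id at all

# Server side: every document carries a unique, system-scoped identifier.
index = [{"id": "SKU-8841", "title": "Red running shoes"},
         {"id": "SKU-1093", "title": "Red leather boots"},
         {"id": "SKU-5520", "title": "Blue sneakers"},
         {"id": "SKU-7312", "title": "Green sandals"}]


def has_unique_ids(docs):
    """True when every document has an id and no id is repeated."""
    ids = [doc.get("id") for doc in docs]
    return None not in ids and len(ids) == len(set(ids))


client_ids_are_unique = has_unique_ids(page_1 + page_2)  # False
server_ids_are_unique = has_unique_ids(index)            # True
```

Within a single page the client identifiers may well be unique, but as soon as ratings from several pages are collected, "item-1" no longer points to a single document, and none of the client identifiers resembles a server one.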

This is where the RREE ID Discovery component comes in.

Identifier discovery is the first thing that happens when a set of explicit ratings is received.

The component is able to find a correlation between the document as it is received in the incoming payload and the corresponding server-side representation. The correlation is then persisted as part of the rating definition and it is used in the subsequent evaluation process.
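
To make the idea of correlation more concrete, here is a minimal sketch. It assumes the payload carries some attribute (here, the title) that also exists as a field in the search engine document. The function `discover_id`, the `match_field` parameter and the sample data are hypothetical; this is not RREE’s actual discovery logic, which is not shown in this post.

```python
from typing import Optional


def normalize(value: str) -> str:
    """Lowercase and collapse whitespace so minor UI differences do not matter."""
    return " ".join(value.lower().split())


def discover_id(client_doc: dict, server_docs: list, match_field: str) -> Optional[str]:
    """Return the server-side id correlated with a client-side document.

    Hypothetical strategy: exact match on one shared field after normalization.
    Returns None when there is no match or when the match is ambiguous.
    """
    wanted = client_doc.get(match_field)
    if wanted is None:
        return None
    target = normalize(str(wanted))
    candidates = [doc["id"] for doc in server_docs
                  if normalize(str(doc.get(match_field, ""))) == target]
    return candidates[0] if len(candidates) == 1 else None


server_docs = [
    {"id": "SKU-8841", "title": "Red running shoes"},
    {"id": "SKU-1093", "title": "Red leather boots"},
]
# The payload item has only a page-scoped id and a title taken from the HTML.
client_doc = {"id": "item-1", "title": "  Red Running  Shoes "}

# The correlation is stored together with the rating definition.
rating = {"query": "red shoes", "gain": 2,
          "document_id": discover_id(client_doc, server_docs, "title")}
```

Returning `None` for missing or ambiguous matches, rather than guessing, is a deliberate choice in this sketch: a wrong correlation would silently attach a rating to the wrong document.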

The main challenge in the correlation phase is the potential lack of information in the incoming payload. As we already said, the identifiers there are optional, can differ from the server-side ones, and usually follow a completely different logic.

The key factor for a good ID correlation in RREE cannot be determined in advance; the process relies on an accurate configuration of both the browser plugin and the discovery engine.

Recap

ID discovery/correlation is a crucial part of the RREE search quality evaluation process. 

It fills the gap in the identity representation between client and server. The concept of identity typically takes different shapes in those two worlds, because in each it follows different rules and belongs to a different part of the system.

RREE ID Discovery finds the correlation between them. The correlation phase is completely transparent to the end user, and it allows:

  • using a powerful, end-user-oriented tool (the browser plugin) for expressing the ratings
  • reconciling the user document representation with the internal search engine document (the one actually used in the evaluation process).

Author

Andrea Gazzarini

Andrea Gazzarini is a curious software engineer, mainly focused on the Java language and Search technologies. With more than 15 years of experience in various software engineering areas, his adventure in the search world began in 2010, when he met Apache Solr and later Elasticsearch.
