
RRE-Enterprise: How to Manage Your Data Collections

Once the golden truth and the target search engine are configured, it’s time to define the data collections.
Each data collection in RRE-Enterprise is linked to an Elasticsearch index or an Apache Solr collection.
It models a group of data that shares the same data model and structure.


A document in Elasticsearch and Solr is represented as a map of field names to values and can contain tens of fields.
To simplify exploring and debugging the search result lists in RRE-Enterprise, you can define a subset of fields of interest.


Currently, you can define:
– a unique identifier field for the collection
– the field containing a descriptive title
…and you are good to go.
The name of the collection must be associated with the mapping so that it can be retrieved at evaluation time.
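
To make the idea concrete, here is a minimal Python sketch of what such a mapping does conceptually: it reduces a full document (a map of field names to values) to its identifier and title. The `CollectionMapping` class, the `project` function and the field names (`sku`, `name`) are hypothetical and used only for illustration; they are not part of the RRE-Enterprise API.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass(frozen=True)
class CollectionMapping:
    """Hypothetical mapping, for illustration only (not the RRE-Enterprise API)."""
    collection: str   # name of the Elasticsearch index or Solr collection
    id_field: str     # unique identifier field for the collection
    title_field: str  # field containing a descriptive title


def project(document: Dict[str, List[Any]], mapping: CollectionMapping) -> Dict[str, Any]:
    """Keep only the fields of interest from a field name -> values map."""
    def first(field: str) -> Any:
        # Take the first value of the field, or None if the field is missing or empty.
        values = document.get(field) or [None]
        return values[0]

    return {"id": first(mapping.id_field), "title": first(mapping.title_field)}


# Assumed example data: a product document with more fields than we want to inspect.
mapping = CollectionMapping(collection="products", id_field="sku", title_field="name")
doc = {"sku": ["A-1"], "name": ["Red shoe"], "price": [49.9], "tags": ["shoes", "red"]}
summary = project(doc, mapping)
print(summary)
```

Looking up the mapping by collection name is what lets the same projection be applied to every search result at evaluation time.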

Test Dataset

If you are working with a collection on an embedded search engine instance, you need to push some data to Elasticsearch or Solr when the instance is spun up at evaluation time.

Specifically, you need to provide RRE-Enterprise with the same data that was used when preparing the golden truth rating set.
Each rating set used for embedded evaluations is coupled with exactly one test dataset.

Uploading the test dataset is quite simple.

N.B. You don’t need to worry about the test dataset in RRE-Enterprise if you are evaluating an external search engine.
However, you need to be extra careful and make sure that the data in that search engine is aligned with the ratings when the evaluation runs.
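
One way to check that alignment before running an evaluation is to verify that every rated document identifier actually exists in the data. The sketch below assumes the ratings have been reduced to a map of query to `{document id: gain}` and that the documents are available as a list of dictionaries. Both shapes, the `missing_rated_ids` function and the `sku` field are assumptions made for this example, not the RRE-Enterprise ratings format.

```python
from typing import Dict, Iterable, Set


def missing_rated_ids(ratings: Dict[str, Dict[str, int]],
                      documents: Iterable[Dict[str, object]],
                      id_field: str) -> Set[str]:
    """Return the rated document ids that do not appear in the dataset."""
    # Ids present in the data (documents without the id field are ignored).
    indexed = {str(doc[id_field]) for doc in documents if id_field in doc}
    # Ids referenced by at least one rating, across all queries.
    rated = {doc_id for judgements in ratings.values() for doc_id in judgements}
    return rated - indexed


# Assumed example data: "B-7" is rated but missing from the dataset.
ratings = {"red shoes": {"A-1": 3, "A-2": 1}, "boots": {"B-7": 2}}
documents = [{"sku": "A-1"}, {"sku": "A-2"}]
missing = missing_rated_ids(ratings, documents, id_field="sku")
print(sorted(missing))
```

An empty result means every rated document can be found; any identifier that is returned points to a rating that cannot match a search result.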

We are now ready to run our first evaluation!

Stay tuned for our next blog post!






Alessandro Benedetti

Alessandro Benedetti is the founder of Sease Ltd. Senior Search Software Engineer, his focus is on R&D in information retrieval, information extraction, natural language processing, and machine learning.
