FOSDEM is a two-day event organised by volunteers to promote the widespread use of free and open source software.
Location: Brussels (Belgium)
Date: 2-3 February 2019
Rated Ranking Evaluator: an Open Source Approach for Search Quality Evaluation
Every team working on information retrieval software struggles with the task of evaluating how well their system performs in terms of search quality (currently and historically). Evaluating search quality is important both to understand and size the improvement or regression of your search application across development cycles, and to communicate such progress to relevant stakeholders. To satisfy these requirements, a helpful tool must be:
– flexible and highly configurable for a technical user
– immediate, visual and concise for optimal business use
In the industry, and especially in the open source community, the landscape is quite fragmented: these requirements are often met with ad hoc, partial solutions, each of which requires a considerable amount of development and customization effort.
To provide a standard, unified and approachable technology, we developed the Rated Ranking Evaluator (RRE), an open source tool for evaluating and measuring the search quality of a given search infrastructure. RRE is modular, compatible with multiple search technologies and easy to extend. It is composed of a core library and a set of modules and plugins that allow it to be integrated into automated evaluation processes and continuous integration flows.
This talk will introduce RRE, describe its functionality, and demonstrate how it can be integrated into a project and how it can help measure and assess the search quality of your search application. The focus of the presentation will be a live demo of an example project with a set of initial relevancy issues that we will solve iteration after iteration, using RRE's output as feedback to gradually drive the improvement process until we reach an optimal balance between quality evaluation measures.
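To make "quality evaluation measures" concrete, here is a minimal sketch of two common ones, precision at k and NDCG at k, computed from rated judgments for a single query and compared across two iterations of a ranking. This is not RRE's API or implementation: the function names, the document identifiers, the ratings and the two rankings are all hypothetical, and the NDCG variant shown uses the rating itself as the gain (other formulations use an exponential gain).

```python
import math

# Hypothetical relevance ratings for one query (higher = more relevant, 0 = not relevant).
ratings = {"doc1": 3, "doc2": 2, "doc3": 0, "doc4": 1}

# Hypothetical result lists returned by two iterations of the search configuration.
ranking_v1 = ["doc3", "doc1", "doc4", "doc2"]
ranking_v2 = ["doc1", "doc2", "doc4", "doc3"]


def precision_at_k(ranked_ids, ratings, k):
    """Fraction of the top-k results that have a positive rating."""
    top = ranked_ids[:k]
    relevant = sum(1 for doc_id in top if ratings.get(doc_id, 0) > 0)
    return relevant / k


def dcg(gains):
    """Discounted cumulative gain with a log2 position discount."""
    return sum(gain / math.log2(position + 2) for position, gain in enumerate(gains))


def ndcg_at_k(ranked_ids, ratings, k):
    """DCG of the top-k results, normalised by the DCG of the ideal ordering."""
    gains = [ratings.get(doc_id, 0) for doc_id in ranked_ids[:k]]
    ideal_gains = sorted(ratings.values(), reverse=True)[:k]
    ideal = dcg(ideal_gains)
    return dcg(gains) / ideal if ideal > 0 else 0.0


for name, ranking in [("v1", ranking_v1), ("v2", ranking_v2)]:
    p = precision_at_k(ranking, ratings, 2)
    n = ndcg_at_k(ranking, ratings, 4)
    print(f"{name}: P@2={p:.2f} NDCG@4={n:.3f}")
```

Tracking such numbers per query and per version is one way to see whether a change improved or regressed search quality between iterations.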