Online testing remains the best way to assess how your ranking model performs in a real-world scenario. Its advantages include results that can be interpreted directly and the ability to confirm the estimates from offline tests. It also gives a better understanding of the ranking model’s behaviour and builds a solid foundation for improving it.
The evaluation tools currently available have some limitations, so in this talk we describe an alternative, customised approach to evaluating ranking models with Kibana.
First, we give an overview of online testing, highlighting its pros and cons and describing the state of the art.
We then dive into our Kibana implementation and the reasons behind it. We explore the tools Kibana provides and their constraints in real-world applications, and show through practical examples how to create dashboards (with queries and code) that compare different models.