Search Solutions and Tutorials
Search Solutions is the BCS Information Retrieval Specialist Group’s annual event focused on practitioner issues in the arena of search and information retrieval. It is a unique opportunity to bring together academic research and practitioner experience.
Location: BCS London Headquarters, Ground Floor, 25 Copthall Avenue
Date: 25-26 November 2025
// our talk
Search Quality Evaluation in the Era of Large Language Models: Dataset Generator
26th NOVEMBER | 10:00AM (LOCAL TIME)
Building datasets manually is expensive: queries can be extracted fairly easily from production logs (when available), but relating queries to documents with a reasonable relevance rating is hard. It takes human experts a lot of time, requires agreement between them, and is extremely tedious, so after a while an expert is likely to lose interest and the ratings become more approximate. Over the years this challenge has made offline search quality evaluation a chimaera in the industry: everyone talks about it and recognises its importance, but for many teams it has remained just a utopia.
This talk aims to explore strategies that leverage LLMs to generate queries and ratings, including an overview of an open-source library we contributed to the community: Dataset Generator.
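As a flavour of what "leveraging LLMs to generate ratings" can look like in practice, here is a minimal, illustrative sketch (our own example, not the Dataset Generator API): an LLM is prompted to act as a search quality rater and return a graded relevance judgement for a (query, document) pair. The model name, prompt, and 0 to 3 grading scale are assumptions made for the example.

# Illustrative only: LLM-as-a-judge relevance rating for one (query, document) pair.
# Assumes the official `openai` Python client and an OPENAI_API_KEY in the environment.
from openai import OpenAI

client = OpenAI()

PROMPT = (
    "You are a search quality rater. Given a query and a document, "
    "reply with a single integer relevance grade from 0 (irrelevant) "
    "to 3 (perfectly relevant). Reply with the number only.\n\n"
    "Query: {query}\n\nDocument: {document}"
)

def rate(query: str, document: str, model: str = "gpt-4o-mini") -> int:
    """Ask the LLM for a graded relevance judgement."""
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": PROMPT.format(query=query, document=document)}],
        temperature=0,
    )
    return int(response.choices[0].message.content.strip())

# Example: one row of a rated dataset
# print(rate("wireless noise cancelling headphones", "Product page for the XYZ ANC headset..."))

In a real pipeline you would repeat this over many (query, document) pairs, keep the prompt and grading scale fixed, and spot-check a sample of the LLM's judgements against human raters.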
// our training
Search Quality Evaluation in the era of LLMs and vector search
Half day
In the era of Large Language Models and vector search, this tutorial gives the attendees tools and processes to answer these three questions:
1) (DATASET BUILDING) How can I build a dataset with rated queries to measure the quality of my search engine?
2) (IS THE EMBEDDING MODEL THE PROBLEM?) Does the vector embedding model I chose perform as well on my dataset as it does on public reference datasets?
3) (IS MY IMPLEMENTATION OF APPROXIMATE NEAREST NEIGHBOUR THE PROBLEM?) Is the implementation (and the parameters) I'm using (through a search engine/vector database) good enough compared to exact k-nearest neighbour?
As described above, building such datasets manually is expensive: extracting queries from production logs is relatively easy (when logs are available), but rating documents against those queries takes human experts a lot of time, requires agreement between them, and is tedious enough that the ratings drift towards the approximate. That is why offline search quality evaluation has long been a chimaera in the industry: widely discussed, recognised as important, yet out of reach for many teams. A related problem arises with vector-based search: it can fail at several different points, and it is hard to pinpoint and fix the failure.
A generated dataset and vector search evaluation tools can help with that.
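To make question 3 concrete, the sketch below (our own illustration, not part of the tutorial material) shows one common way to evaluate an ANN configuration: compute exact k-nearest neighbours by brute force as ground truth, then measure how much of it the approximate index returns. The `ann_search` callable is a hypothetical stand-in for whatever search engine or vector database you are testing.

# Illustrative only: recall of approximate nearest neighbour (ANN) results
# against exact k-nearest neighbours computed by brute force.
import numpy as np

def exact_knn(query_vec: np.ndarray, doc_vecs: np.ndarray, k: int) -> np.ndarray:
    """Brute-force top-k document indices by cosine similarity (the ground truth)."""
    q = query_vec / np.linalg.norm(query_vec)
    d = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    return np.argsort(-(d @ q))[:k]

def ann_recall_at_k(query_vecs, doc_vecs, ann_search, k: int = 10) -> float:
    """Average overlap between ANN results and exact kNN over a set of queries."""
    recalls = []
    for q in query_vecs:
        truth = set(exact_knn(q, doc_vecs, k))
        approx = set(ann_search(q, k))  # document ids returned by your ANN index
        recalls.append(len(truth & approx) / k)
    return float(np.mean(recalls))

# Example with random vectors and a deliberately lossy "ANN" that ignores 20% of the index:
# docs = np.random.rand(1000, 384)
# queries = np.random.rand(20, 384)
# lossy = lambda q, k: exact_knn(q, docs[:800], k)
# print(ann_recall_at_k(queries, docs, lossy, k=10))

A recall close to 1.0 means the ANN parameters are a faithful (if faster) substitute for exact search; a low recall points at the index configuration rather than at the embedding model or the dataset.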
This tutorial aims to explore strategies that leverage LLMs to generate queries and ratings, including an overview of an open-source library we contributed to the community: Dataset Generator.
// your speaker and trainer
Alessandro Benedetti
DIRECTOR @ SEASE
APACHE LUCENE/SOLR COMMITTER
APACHE SOLR PMC MEMBER