We are delighted to announce the 23rd London Information Retrieval & AI Meetup, a free evening event aimed at enthusiasts and professionals curious to explore and discuss the latest trends in the field.
This time the Meetup is hybrid: a live event in London, streamed online via Zoom!
// in-presence
Location: Gladwin Tower, Nine Elms Point, SW8 2FS, London [see on Google Maps]
Date: 18th February 2025 | open doors from 6:15 PM (GMT)
// online
Location: Zoom [You will receive the link after the registration]
Date: 18th February 2025 | 6:30-8:00 PM (GMT)
// LONDON INFORMATION RETRIEVAL MEETUP
PROGRAM
The event will be structured around two technical talks, followed by a Q&A session, and will end with a networking session.
> Open doors from 6:15 PM GMT (in-presence)
> 6:30 PM GMT open doors for virtual attendees
> 6:30-6:45 PM Welcome from Alessandro Benedetti (Director @ Sease)
> 6:45-7:30 PM FIRST TALK
> 7:30-8:00 PM Ask us anything
> 8:00-8:30 PM Networking session + buffet
// talk
Question Bank: Adding a Human Touch to Your Search Results
The speakers
Arsany Guirguis
Senior Machine Learning Engineer, AI Platforms
Bruno Taillé
Senior Machine Learning Engineer, AI Experiences
// second talk
Hybrid Search With Apache Solr Reciprocal Rank Fusion
Vector-based search has gained incredible popularity over the last few years: Large Language Models fine-tuned for sentence similarity have proved quite effective at encoding text into vectors and representing some of the semantics of a sentence in numerical form.
These vectors can be used to run a K-nearest neighbour search and look for documents/paragraphs close to the query in an n-dimensional vector space, effectively mimicking a similarity search in the semantic space (Apache Solr KNN Query Parser).
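For readers unfamiliar with the syntax, here is a minimal sketch of such a query using Apache Solr's KNN Query Parser; the field name my_vector, the topK value and the 4-dimensional query vector are illustrative assumptions, not taken from the talk:

q={!knn f=my_vector topK=10}[0.12, -0.03, 0.58, 0.40]

This asks Solr to return the 10 documents whose indexed vectors are closest to the query vector according to the configured similarity function.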
Although exciting, vector-based search today still presents some limitations:
– it’s very difficult to explain (e.g. why is document A returned and why at position K?)
– it doesn’t care about exact keyword matching (and users still rely on keyword searches a lot)
Hybrid search comes to the rescue, combining lexical (traditional keyword-based) search with neural (vector-based) search.
So, what does it mean to combine these two worlds?
It starts with the retrieval of two sets of candidates:
– one set of results coming from lexical matches with the query keywords
– a set of results coming from the K-Nearest Neighbours search with the query vector
The result sets are merged and a single ranked list of documents is returned to the user.
Reciprocal Rank Fusion (RRF) is one of the most popular algorithms for such a task.
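To make the idea concrete, here is a minimal Python sketch of Reciprocal Rank Fusion, independent of the Apache Solr implementation discussed in the talk; the document IDs are illustrative, and k=60 is the smoothing constant proposed in the original RRF paper:

# Minimal Reciprocal Rank Fusion sketch: every document collects
# 1 / (k + rank) from each ranked list it appears in, and the fused
# list is ordered by the summed score.
def reciprocal_rank_fusion(ranked_lists, k=60):
    scores = {}
    for ranked in ranked_lists:
        for rank, doc_id in enumerate(ranked, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Illustrative candidate sets: one from lexical search, one from KNN search.
lexical_results = ["doc3", "doc1", "doc7"]
vector_results = ["doc1", "doc9", "doc3"]
print(reciprocal_rank_fusion([lexical_results, vector_results]))
# doc1 and doc3 come out on top because both retrieval strategies agree on them.

Documents retrieved by both strategies accumulate score from both lists, which is what pushes agreed-upon results towards the top of the fused ranking.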
This talk introduces the foundational algorithms behind RRF and walks you through the work done to implement them in Apache Solr, with a focus on the difficulties of the process, the distributed support (SolrCloud), the main components affected and the limitations encountered, all brought up to date with the latest developments.
The audience can expect to learn more about this interesting approach, the challenges it involves, and how the contribution process works for an Open Source search project as complex as Apache Solr.
Alessandro Benedetti
FOUNDER @ SEASE
APACHE LUCENE/SOLR COMMITTER
APACHE SOLR PMC MEMBER
A Senior Search Software Engineer, he focuses on R&D in Information Retrieval, Information Extraction, Natural Language Processing, and Machine Learning.
He firmly believes in Open Source as a way to build a bridge between Academia and Industry and facilitate the progress of applied research.