
London Information Retrieval & AI Meetup [June 2025]

We are delighted to announce the 24th London Information Retrieval & AI Meetup, a free evening event aimed at enthusiasts and professionals curious to explore and discuss the latest trends in the field.

This time the Meetup is hybrid: a live event in London, streamed online via Zoom!

Attention!

Remember to fill out the form to confirm your registration.

IN LONDON

Location: Gladwin Tower, Nine Elms Point, SW8 2FS, London [see on Google Maps]

Date: 24th June 2025 | open doors from 6:15 PM (GMT+1)

VIRTUAL

Location: Zoom [You will receive the link after registering]

Date: 24th June 2025 | 6:30-8:00 PM (GMT+1)

LONDON INFORMATION RETRIEVAL & AI MEETUP

PROGRAM

The event will be structured around two technical talks, each followed by a Q&A session, and will end with a networking session.

> Open doors from 6:15 PM (GMT+1) for in-person attendees

> 6:30 PM (GMT+1) open doors for virtual attendees

> 6:30-6:45 PM Welcome from Alessandro Benedetti (Director @ Sease)

> 6:45-7:30 PM FIRST TALK

> 7:30-8:15 PM SECOND TALK

> 8:15-8:45 PM Networking session + buffet

FIRST TALK

Building Search Using OpenSearch: Limitations and Workarounds

The rising demand for richer and more intelligent search experiences has accelerated interest in features like hybrid search, which blends keyword (lexical) and vector (semantic) search; semantic highlighting, where the highlighter relies on semantic meaning rather than exact term matches; and query completion with autocomplete. While OpenSearch continues to evolve, earlier versions such as v2.x posed several limitations when building production-ready systems.
In this talk, we share our experience working with OpenSearch v2.x on a real-world client project, where we encountered constraints across hybrid search, semantic highlighting, autocomplete, and language analysis. We discuss key challenges, such as the lack of pagination support for hybrid queries, missing semantic highlighter capabilities, and limited fuzziness in autocomplete, and present the practical workarounds we implemented. We also highlight which of these gaps have been addressed in newer OpenSearch versions. This session offers concrete guidance for teams working with older OpenSearch versions and looking to deliver search experiences under real-world constraints.
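
For readers less familiar with the hybrid search pattern the abstract refers to, here is a rough sketch of how a lexical + vector query can be expressed against OpenSearch from Python. The index name, field names, pipeline name, and placeholder embedding are illustrative assumptions, not details from the talk.

```python
# Rough sketch of a hybrid (lexical + vector) query in OpenSearch, using the
# opensearch-py client. Assumes an index "products" with a text field "title",
# a kNN vector field "title_embedding", and a search pipeline "norm-pipeline"
# that normalises and combines the scores of the two sub-queries.
from opensearchpy import OpenSearch

client = OpenSearch(hosts=[{"host": "localhost", "port": 9200}])

query_text = "wireless noise cancelling headphones"
query_vector = [0.1] * 384  # placeholder: embed query_text with your own model

body = {
    "size": 10,
    "query": {
        "hybrid": {
            "queries": [
                # Lexical (BM25) sub-query
                {"match": {"title": {"query": query_text}}},
                # Vector (semantic) sub-query
                {"knn": {"title_embedding": {"vector": query_vector, "k": 10}}},
            ]
        }
    },
}

response = client.search(
    index="products",
    body=body,
    params={"search_pipeline": "norm-pipeline"},
)
for hit in response["hits"]["hits"]:
    print(hit["_score"], hit["_source"]["title"])
```

As the abstract notes, pagination for hybrid queries of this kind was one of the gaps in OpenSearch 2.x that the talk covers.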

The speaker

Nazerke Seidan

Information Retrieval / Machine Learning Engineer @ Sease

Nazerke Seidan (she/her) is a software engineer with experience in search. She previously worked at Salesforce, focusing on scaling search in the public cloud. An active contributor to the Apache Solr project, Nazerke has a deep interest in Information Retrieval, particularly in Distributed Search.

SECOND TALK

Exploring Multilingual Embeddings for Italian Semantic Search: A Pretrained and Fine-tuned Approach

This thesis explores the application of multilingual embedding models to semantic search for the Italian language, a critical step toward integrating these technologies into Retrieval-Augmented Generation (RAG) frameworks. The work leverages state-of-the-art pre-trained and fine-tuned neural models to address the challenges of document retrieval in both symmetric and asymmetric contexts. Using a variety of datasets, including translated corpora for validation, the study evaluates models such as LaBSE, multilingual-e5-large, and bge-m3 for their ability to generate meaningful embeddings and improve retrieval performance. Performance for the asymmetric framework is assessed using nDCG@10.

The fine-tuning phase, in which each pre-trained model is extended with an adapter on top of the query embedding, demonstrates the adaptability of two of the aforementioned models to Italian-language tasks. Statistical significance was assessed with the Wilcoxon signed-rank test, yielding a p-value < 0.001 for multilingual-e5-large and bge-m3 against their counterparts without the adapter. One of our models, multilingual-e5-large with the linear adapter, achieved superior results to proprietary solutions like OpenAI’s text-embedding-3-small; significance was assessed with the same test, resulting in a p-value < 0.05. Additionally, our solution delivered substantial improvements in document retrieval times, with our best-performing model reducing latency by one order of magnitude compared to OpenAI’s model. Furthermore, the training process is cost-effective, and the lightweight design of the model enables it to run on local hardware.
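
As a purely illustrative sketch of the kind of setup described above (a pre-trained multilingual encoder with a small linear adapter applied only to query embeddings), the snippet below uses sentence-transformers and PyTorch. The model choice, prefixes, adapter shape, and example texts are assumptions, and the adapter would of course need to be trained on relevance data before it improves retrieval.

```python
# Illustrative sketch: pre-trained multilingual encoder plus an (untrained)
# linear adapter applied only to query embeddings, with cosine similarity
# used to rank Italian passages against an Italian query.
import torch
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("intfloat/multilingual-e5-large", device="cpu")
dim = model.get_sentence_embedding_dimension()

# Linear adapter on top of the query embedding; in the setting described in
# the abstract it would be trained on (query, relevant passage) pairs while
# the underlying encoder stays frozen.
adapter = torch.nn.Linear(dim, dim, bias=False)

# E5-style models expect "query: " / "passage: " prefixes.
queries = ["query: come richiedere il passaporto in Italia"]  # "how to apply for a passport in Italy"
documents = [
    "passage: Per richiedere il passaporto occorre prenotare un appuntamento in Questura.",
    "passage: La carta d'identità elettronica si richiede al Comune di residenza.",
]

with torch.no_grad():
    q_emb = adapter(model.encode(queries, convert_to_tensor=True))
    d_emb = model.encode(documents, convert_to_tensor=True)
    scores = util.cos_sim(q_emb, d_emb)  # shape: (num_queries, num_documents)

print(scores)
```

Training only a small adapter of this kind, rather than the full encoder, is what keeps the approach cheap enough to train and run on local hardware, as the abstract points out.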

The speaker

Nicolò Rinaldi

Software Engineer/Data Scientist @ Sease

A Senior Search Software Engineer, Nicolò focuses on R&D in Information Retrieval, Information Extraction, Natural Language Processing, and Machine Learning.
He firmly believes in Open Source as a way to build a bridge between Academia and Industry and facilitate the progress of applied research.


Sign up for our Newsletter

Did you like this post? Don’t forget to subscribe to our Newsletter to stay always updated in the Information Retrieval world!
