Sease at Community Over Code EU 2024

Community Over Code EU 2024

Community Over Code is where you will learn about the latest innovations from Apache Software Foundation (ASF) projects and their communities in a collaborative, vendor-neutral environment.

Location: Bratislava, Slovakia
Date: June 3-5, 2024

// our talk

Hybrid Search with Apache Solr

Mon 11:50 am - 12:20 pm

Vector-based search has gained remarkable popularity in the last few years: Large Language Models fine-tuned for sentence similarity have proved quite effective at encoding text as vectors, representing some of the semantics of sentences in numerical form.

These vectors can be used to run a K-nearest-neighbour search that looks for documents or paragraphs close to the query in an n-dimensional vector space, effectively mimicking a similarity search in the semantic space (see the Apache Solr KNN Query Parser).
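
To make the idea concrete, here is a minimal sketch in Python. It runs an exact, brute-force nearest-neighbour search over a toy set of vectors using cosine similarity, then builds the equivalent query string for Solr's KNN Query Parser. The document ids, the toy vectors, and the field name `vector` are assumptions made for illustration. Note that Solr itself relies on an approximate index (HNSW) rather than the exhaustive scan shown here.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def knn(query_vector, doc_vectors, k):
    """Exact k-nearest-neighbour search: score every document, keep the top k."""
    scored = [
        (doc_id, cosine_similarity(query_vector, vector))
        for doc_id, vector in doc_vectors.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


def solr_knn_query(field, top_k, vector):
    """Build a query string for Solr's KNN Query Parser.

    `field` must name a dense vector field in the schema (assumed here).
    """
    values = ", ".join(str(v) for v in vector)
    return "{!knn f=%s topK=%d}[%s]" % (field, top_k, values)


# Toy 2-dimensional embeddings (hypothetical data).
docs = {"doc1": [1.0, 0.0], "doc2": [0.0, 1.0], "doc3": [0.9, 0.1]}
top = knn([1.0, 0.0], docs, k=2)
query = solr_knn_query("vector", 10, [1.0, 0.0])
```

The resulting `query` string can be sent as the `q` parameter of a Solr search request against a collection whose schema defines a matching dense vector field.
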
Although exciting, vector-based search still presents some limitations:

– it’s very difficult to explain – e.g. why is document A returned and why at position K?

– it doesn’t care about exact keyword matching (and users still rely on keyword searches a lot)

To mitigate these problems, it is possible to combine lexical (traditional keyword-based) search with neural (vector-based) search.

So, what does it mean to combine these two worlds?
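
As a rough illustration of two possible answers (not necessarily the ones covered in the talk), the Python sketch below shows how a lexical query and a vector query could be combined. The first function builds request parameters that ask Solr to union both result sets through the Boolean Query Parser. The second merges two independently ranked lists on the client side with Reciprocal Rank Fusion, a technique named here as an example of rank-based merging. The field names `text` and `vector`, the helper names, and the constant `k=60` are assumptions.

```python
def hybrid_params(lexical_text, query_vector,
                  text_field="text", vector_field="vector", top_k=10):
    """Solr request parameters combining a lexical and a vector clause.

    A document matching either clause is returned; its score is the sum
    of the scores of the clauses it matches.
    """
    values = ", ".join(str(v) for v in query_vector)
    return {
        "q": "{!bool should=$lexicalQuery should=$vectorQuery}",
        "lexicalQuery": "{!type=edismax qf=%s}%s" % (text_field, lexical_text),
        "vectorQuery": "{!knn f=%s topK=%d}[%s]" % (vector_field, top_k, values),
    }


def reciprocal_rank_fusion(rankings, k=60):
    """Merge ranked lists of document ids using Reciprocal Rank Fusion.

    Each document receives 1 / (k + rank) from every list it appears in;
    ties are broken alphabetically by document id.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


params = hybrid_params("apache solr", [0.1, 0.2])

# Hypothetical result lists from a keyword search and a vector search.
lexical_hits = ["a", "b", "c"]
vector_hits = ["c", "a", "d"]
fused = reciprocal_rank_fusion([lexical_hits, vector_hits])
```

Summing raw scores, as the boolean query does, mixes values on different scales (for example BM25 scores and vector similarities), whereas rank-based fusion ignores the scores and uses only positions. Which trade-off is preferable depends on the use case.
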

Join us as we explore various ways of running hybrid search in Apache Solr, including tricks, suggestions, pros and cons, and future work on this exciting new search approach!

// our speaker

Alessandro Benedetti

FOUNDER @ SEASE

APACHE LUCENE/SOLR COMMITTER
APACHE SOLR PMC MEMBER

Alessandro is a Senior Search Software Engineer whose focus is on R&D in Information Retrieval, Information Extraction, Natural Language Processing, and Machine Learning.
He firmly believes in Open Source as a way to build a bridge between Academia and Industry and facilitate the progress of applied research.

slides
