Neural Search Comes to Apache Solr [ApacheCon]
As one of the very first Open Source conferences, ApacheCon has brought together users with the largest collection of global Apache project communities through detailed sessions, hands-on workshops, and standalone Apache project tracks.
ApacheCon is deliberately intimate: one of its biggest draws is access to participants at all levels, from presenters to attendees to sponsors to Apache Members and Committers, Apache Project Management Committee members, ASF leadership, and more, in a collaborative, vendor-neutral environment.
APACHE LUCENE/SOLR COMMITTER
APACHE SOLR PMC MEMBER
Alessandro has been involved in designing and developing search relevance solutions since the early days of Apache Solr 1.4 and the edismax query parser in 2010. Over the years he has worked on various projects that aim to build search solutions satisfying users' information needs, often integrating them with machine learning and artificial intelligence technologies.
Neural Search Comes to Apache Solr: Approximate Nearest Neighbor, BERT and More!
Learning to Rank was the first integration of machine learning techniques with Apache Solr, allowing you to improve the ranking of your search results using training data.
One limitation is that documents must contain the keywords the user typed in the search box in order to be retrieved (and then reranked). For example, the query “jaguar” won’t retrieve documents containing only the terms “panthera onca”. This is called the vocabulary mismatch problem.
Neural search is an Artificial Intelligence technique that allows a search engine to reach documents that are semantically similar to the user’s information need without necessarily containing the query terms; it learns the similarity of terms and sentences in your collection through deep neural networks and numerical vector representations (so no manual synonyms are needed!).
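To make the contrast concrete, here is a minimal Python sketch, not taken from the talk, that compares keyword matching with vector similarity for the “jaguar” example. The three-dimensional embeddings are hypothetical values picked by hand for illustration; a real model such as BERT would produce vectors with hundreds of dimensions, and the helper names are assumptions of this sketch.

```python
import math

# Hypothetical 3-dimensional embeddings, hand-picked for illustration only.
# A trained neural model would produce these vectors in a real system.
EMBEDDINGS = {
    "jaguar": [0.90, 0.80, 0.10],
    "panthera onca": [0.85, 0.82, 0.15],
    "sports car": [0.10, 0.20, 0.95],
}


def keyword_match(query, document):
    """Return True if the document shares at least one term with the query."""
    return bool(set(query.lower().split()) & set(document.lower().split()))


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)


query = "jaguar"
for doc in ("panthera onca", "sports car"):
    similarity = cosine_similarity(EMBEDDINGS[query], EMBEDDINGS[doc])
    print(f"{doc!r}: keyword match={keyword_match(query, doc)}, "
          f"cosine similarity={similarity:.3f}")
```

With these made-up vectors, “panthera onca” shares no term with the query yet scores a much higher cosine similarity than “sports car”, which is the behavior a neural search engine relies on.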
This talk explores the first official Apache Solr contribution on this topic, available from Apache Solr 9.0.
We start with an overview of neural search (don’t worry, we keep it simple!): we describe vector representations for queries and documents, and how Approximate K-Nearest Neighbor (KNN) vector search works. We show how neural search can be used along with deep learning techniques (e.g., BERT) or directly on vector data, and how we implemented this feature in Apache Solr, with usage examples!
Join us as we explore this exciting new Apache Solr feature and learn how you can leverage it to improve your search experience!
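As a rough illustration of the idea, the Python sketch below runs an exact (brute-force) K-nearest-neighbor search, which is the baseline that approximate methods speed up at some cost in accuracy. It then formats the same request for Solr's `knn` query parser. This is not the talk's own example: the document vectors, the field name `vector`, and the helper function names are assumptions, and the Solr field is assumed to be defined with the `solr.DenseVectorField` type available from Solr 9.0.

```python
import heapq
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)


def exact_knn(query_vector, documents, k):
    """Score every document and return the k most similar (id, score) pairs.

    This is exhaustive search; approximate KNN avoids scoring every document.
    """
    scored = (
        (doc_id, cosine_similarity(query_vector, vector))
        for doc_id, vector in documents.items()
    )
    return heapq.nlargest(k, scored, key=lambda pair: pair[1])


def solr_knn_query(field, query_vector, top_k):
    """Format a query string for Solr's knn query parser."""
    vector = ", ".join(str(x) for x in query_vector)
    return f"{{!knn f={field} topK={top_k}}}[{vector}]"


# Hypothetical two-dimensional document vectors, for illustration only.
DOCUMENTS = {
    "doc1": [1.0, 0.0],
    "doc2": [0.9, 0.1],
    "doc3": [0.0, 1.0],
}

query_vector = [1.0, 0.0]
for doc_id, score in exact_knn(query_vector, DOCUMENTS, k=2):
    print(doc_id, round(score, 3))

# The equivalent request as a Solr query string ("vector" is an assumed field name).
print(solr_knn_query("vector", query_vector, top_k=2))
```

The string returned by `solr_knn_query` would be sent as the `q` parameter of a search request; the vector must have the same number of dimensions as the field's configured `vectorDimension`.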