Learning to Rank was the first integration of machine learning techniques with Apache Solr, allowing you to improve the ranking of your search results using training data.
One limitation is that documents must contain the keywords the user typed in the search box in order to be retrieved (and then reranked). For example, the query “jaguar” won’t retrieve documents containing only the terms “panthera onca”. This is called the vocabulary mismatch problem.
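To make the vocabulary mismatch problem concrete, here is a minimal sketch of pure keyword matching. It is not how Solr analyzes or matches text: the `tokenize` and `keyword_search` helpers and the two sample documents are hypothetical, invented only for illustration.

```python
def tokenize(text):
    """Lowercase and split on whitespace (a crude stand-in for a real analyzer)."""
    return set(text.lower().split())


def keyword_search(query, docs):
    """Return the ids of documents that contain every query term."""
    query_terms = tokenize(query)
    return [doc_id for doc_id, text in docs.items() if query_terms <= tokenize(text)]


# Hypothetical two-document collection about the same animal.
docs = {
    "doc1": "The jaguar is the largest cat in the Americas",
    "doc2": "Panthera onca lives in rainforests and wetlands",
}

results = keyword_search("jaguar", docs)
print(results)  # doc2 is about the same animal but shares no term with the query
```

Only `doc1` comes back. No amount of reranking can promote `doc2`, because it never enters the candidate set in the first place.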
Neural search is an Artificial Intelligence technique that allows a search engine to reach documents that are semantically similar to the user’s information need, even when they do not contain the query terms. It learns the similarity of terms and sentences in your collection through deep neural networks and numerical vector representations (so no manual synonyms are needed!).
This talk explores the first official Apache Solr contribution on this topic, available from Apache Solr 9.0.
We start with an overview of neural search (don’t worry, we keep it simple!): we describe vector representations for queries and documents, and how Approximate K-Nearest Neighbor (KNN) vector search works. We show how neural search can be used along with deep learning techniques (e.g., BERT) or directly on vector data, and how we implemented this feature in Apache Solr, with usage examples!
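As a taste of the idea, here is a minimal sketch of nearest-neighbor search over vectors, under some loud assumptions: the three-dimensional vectors are made up by hand (a real model such as BERT would produce hundreds of dimensions), the document ids are hypothetical, and the search is an exact scan of every document rather than the approximate search that makes this practical on large collections.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def knn(query_vector, doc_vectors, k):
    """Exact k-nearest neighbors: score every document, keep the k most similar."""
    scored = [
        (doc_id, cosine_similarity(query_vector, vector))
        for doc_id, vector in doc_vectors.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hand-made toy vectors (assumption): the two documents about the animal
# point in a similar direction, the unrelated one does not.
doc_vectors = {
    "jaguar_doc": [0.9, 0.1, 0.0],
    "panthera_onca_doc": [0.8, 0.2, 0.1],
    "car_repair_doc": [0.1, 0.9, 0.3],
}
query_vector = [0.85, 0.15, 0.05]  # pretend encoding of the query "jaguar"

top = knn(query_vector, doc_vectors, k=2)
top_ids = [doc_id for doc_id, _ in top]
print(top_ids)
```

Both animal documents land in the top two even though one of them never mentions “jaguar”. The exact scan costs one comparison per document, which is the cost that approximate KNN avoids by searching an index structure and accepting a small loss in accuracy.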
Join us as we explore this exciting new Apache Solr feature and learn how you can leverage it to improve your search experience!