Apache Solr Neural Highlighting Plugin
Hi readers and information retrieval enthusiasts,
are you ready to redefine the way you search and extract information in Apache Solr?
We are in the final stages of completing our internal product, an Apache Solr plugin that allows you to use Large Language Models to include highlighted text excerpts in each search result, providing users with quick answers to their search queries.
It is widely known that Google already employs advanced search algorithms and features to provide relevant search results, including featured snippets that display concise answers to specific queries directly in each search result.
Our Neural Highlighting plugin aims to bring similar capabilities to the Apache Solr search engine. By leveraging state-of-the-art large language models for question answering, it lets your search engine highlight accurate textual answers for each query right within the search results.
QA (question-answering) models take as input a context and a query and extract the answer from the given context. Using our plugin, the highlighted answer won’t need to match the user query terms exactly.
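To make the idea of extractive question answering more concrete, here is a minimal sketch of the span-selection step commonly used by BERT-style QA models: the model assigns each token a "start" and an "end" score, and the answer is the span with the highest combined score. This is not the plugin's implementation; the function name, the toy tokens, and the hand-made scores are all hypothetical and only illustrate the technique.

```python
from typing import List, Tuple


def best_answer_span(
    tokens: List[str],
    start_scores: List[float],
    end_scores: List[float],
    max_answer_len: int = 5,
) -> Tuple[int, int]:
    """Return the (start, end) token indices, inclusive, of the best-scoring span."""
    best = (0, 0)
    best_score = float("-inf")
    for start, s_score in enumerate(start_scores):
        # The end token cannot precede the start token, and the span length is capped.
        last = min(start + max_answer_len, len(tokens))
        for end in range(start, last):
            score = s_score + end_scores[end]
            if score > best_score:
                best_score = score
                best = (start, end)
    return best


# Toy context and hand-made scores (hypothetical; a real model would compute them
# from the query and the context).
context_tokens = ["operated", "between", "December", "2004", "and", "April", "2006"]
start_scores = [0.1, 0.2, 4.0, 0.5, 0.1, 2.0, 0.3]
end_scores = [0.1, 0.1, 0.5, 3.5, 0.1, 0.4, 2.5]

start, end = best_answer_span(context_tokens, start_scores, end_scores)
answer = " ".join(context_tokens[start : end + 1])
print(answer)  # December 2004
```

Because the answer is selected by score rather than by term matching, the extracted span does not have to contain any of the words in the query.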
The snippets generated by the plugin aim to provide immediate information without requiring users to click on a website or explore the entire document content. This saves users time and allows them to gather relevant information at a glance.
What does neural highlighting look like?
To give an idea of how the plugin will work, here is a very simple example showing what the neural highlighting’s output will look like.
Example query:
http://localhost:8983/solr/myCollection/select?q=BBC Japan&hl.neural=true&hl.fl=text&hl.q=When did BBC Japan start broadcasting?
You simply need to enable the neural highlighter like any other Apache Solr highlighter, by:
– setting the parameter hl.neural to true
– passing the field or fields to highlight (hl.fl)
– (optionally) passing a different query to use for highlighting (hl.q)
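The example query is shown in readable form; when sent programmatically, the spaces and the question mark need to be URL-encoded. Here is a minimal sketch using only the Python standard library. The host, collection name, and parameter values are those of the example query, and the sketch assumes nothing about the plugin beyond the parameters shown there.

```python
from urllib.parse import urlencode

# Host, port, and collection name are taken from the example query.
base_url = "http://localhost:8983/solr/myCollection/select"

params = {
    "q": "BBC Japan",                                   # main search query
    "hl.neural": "true",                                # enable the neural highlighter
    "hl.fl": "text",                                    # field to highlight
    "hl.q": "When did BBC Japan start broadcasting?",   # question used for highlighting
}

# urlencode escapes spaces as "+" and "?" as "%3F".
request_url = base_url + "?" + urlencode(params)
print(request_url)
```

The resulting URL can then be passed to any HTTP client.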
For each document returned by the query, the content of the specified field will be fed to the (extractive) question-answering model along with the query to extract and highlight the answer.
Here is the example response:
{
  "responseHeader": {
    "status": 0,
    "QTime": 568
  },
  "response": {
    "numFound": 1,
    "start": 0,
    "numFoundExact": true,
    "docs": [{
      "text": ["BBC Japan was a general entertainment channel, which operated between December 2004 and April 2006.\nIt ceased operations after its Japanese distributor folded."],
      "id": "1"
    }]
  },
  "neuralHighlighting": {
    "1": {
      "text": ["BBC Japan was a general entertainment channel, which operated between <em>December 2004</em> and April 2006..."]
    }
  }
}
The snippets are incorporated in a dedicated section of the query response (the neuralHighlighting section), and the client can use the formatting clues to determine how to present the snippets to users.
Each returned document will have a corresponding snippet; in this simple example, since we obtain only one document ("id": "1"), there is only one snippet.
Given the query "When did BBC Japan start broadcasting?", the model was able to extract the right text excerpt from the context to answer the question: December 2004.
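On the client side, the highlighted answers can be pulled out of the response with a few lines of code. The following is a minimal sketch, assuming the response has the shape shown in the example (a neuralHighlighting section keyed by document id) and that answers are wrapped in <em> tags; the helper name extract_answers is hypothetical.

```python
import json
import re

# A trimmed copy of the example response shown above.
response_body = """
{
  "response": {"numFound": 1, "docs": [{"id": "1"}]},
  "neuralHighlighting": {
    "1": {
      "text": ["BBC Japan was a general entertainment channel, which operated between <em>December 2004</em> and April 2006..."]
    }
  }
}
"""


def extract_answers(body: str, field: str) -> dict:
    """Map each document id to the answers highlighted in the given field."""
    data = json.loads(body)
    answers = {}
    for doc_id, fields in data.get("neuralHighlighting", {}).items():
        found = []
        for snippet in fields.get(field, []):
            # Assumption: answers are delimited by <em>...</em>, as in the example.
            found.extend(re.findall(r"<em>(.*?)</em>", snippet))
        answers[doc_id] = found
    return answers


answers = extract_answers(response_body, "text")
print(answers)  # {'1': ['December 2004']}
```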

Exploring the Benefits: Why Choose Our Plugin
You have the flexibility to customize the configuration according to your preferences.
Model: You can either choose from a predefined model integrated within our plugin (i.e. a BERT-style pre-trained model fine-tuned for question-answering tasks) or use your own custom model (located in a specified path), thus having the freedom to tailor the plugin to your specific needs.
Fields: The plugin allows you to specify one or more fields, so you can extract a snippet and highlight an answer from each field's content. This lets you surface relevant information from different parts of a document within the search results, enhancing the accuracy and comprehensiveness of the highlighted answers.
Integrating our Neural Highlighting plugin into your current Apache Solr search infrastructure is a breeze, requiring minimal modifications and avoiding complex setup processes.
Like existing Solr plugins, it is packaged as a Java jar file and requires similar installation steps.
The plugin’s advanced neural highlighting capabilities generate accurate snippets, thereby enhancing the user experience by quickly highlighting relevant information to users.
Conclusion
Overall, the Apache Solr Neural Highlighting plugin improves user experience and offers customization options, ultimately providing a valuable addition to your Solr search infrastructure.
We are confident that our Solr plugin will completely transform the way you search, saving time and effort in finding the answers you seek.
Subscribe to our newsletter
Sign up for our newsletter to be the first to know when our Solr plugin is officially released and get ready to experience the power of our plugin with an exclusive demo instance.