
Apache Solr Neural Highlighting Plugin

Hi readers and information retrieval enthusiasts,

are you ready to redefine the way you search and extract information in Apache Solr?

We are in the final stages of completing our internal product, an Apache Solr plugin that allows you to use Large Language Models to include highlighted text excerpts in each search result, providing users with quick answers to their search queries.

It is widely known that Google already employs advanced search algorithms and features to provide relevant search results, including featured snippets that display concise answers to specific queries directly in each search result.

Our Neural Highlighting plugin aims to bring similar capabilities to the Apache Solr search engine. Leveraging state-of-the-art large language models for question answering, it lets your search engine highlight accurate textual answers for each query right within the search results.

QA (question-answering) models take a context and a query as input and extract the answer from the given context. Because the answer is extracted by the model rather than found by term matching, the highlighted text does not need to match the user's query terms exactly.
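To make the idea of extractive question answering more concrete, here is a minimal Python sketch of the last step: turning an answer span into a highlighted snippet. It is only an illustration, not the plugin's implementation. The `wrap_answer` helper is hypothetical, and the span offsets are located with `str.find` purely as a stand-in for what a real model would predict.

```python
def wrap_answer(context, start, end, pre="<em>", post="</em>"):
    """Wrap the answer span context[start:end] in highlight tags."""
    if not (0 <= start < end <= len(context)):
        raise ValueError("answer span must lie inside the context")
    return context[:start] + pre + context[start:end] + post + context[end:]


context = (
    "BBC Japan was a general entertainment channel, "
    "which operated between December 2004 and April 2006."
)
question = "When did BBC Japan start broadcasting?"

# Stand-in for model output (assumption): a real extractive QA model would
# predict these character offsets from the (question, context) pair.
answer = "December 2004"
start = context.find(answer)
end = start + len(answer)

highlighted = wrap_answer(context, start, end)
print(highlighted)
```

Note that the highlighted span shares no terms with the question ("start", "broadcasting"), which is exactly what term-based highlighting cannot do.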

The snippets generated by the plugin aim to provide immediate information without requiring users to click on a website or explore the entire document content. This saves users time and allows them to gather relevant information at a glance.

What does neural highlighting look like?

To give an idea of how the plugin will work, here is a very simple example showing what the neural highlighting’s output will look like.

Example query:

				
					http://localhost:8983/solr/myCollection/select?q=BBC Japan&hl.neural=true&hl.fl=text&hl.q=When did BBC Japan start broadcasting?

You enable the neural highlighter like any other Apache Solr highlighter, by:
– setting the hl.neural parameter to true
– passing the field (or fields) to highlight in hl.fl
– (optionally) passing a different query to use for highlighting in hl.q
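The example URL is shown with raw spaces for readability; when sent from code, the parameter values need to be percent-encoded. As a small sketch, the request could be built in Python like this. The host, collection name, and parameters come from the example; the `build_neural_highlight_url` helper is a hypothetical name used only for illustration.

```python
from urllib.parse import quote, urlencode

# Base URL and collection name are taken from the example request.
SOLR_SELECT_URL = "http://localhost:8983/solr/myCollection/select"


def build_neural_highlight_url(query, field, highlight_query=None):
    """Return a percent-encoded select URL with neural highlighting enabled."""
    params = {
        "q": query,
        "hl.neural": "true",  # enable the neural highlighter
        "hl.fl": field,       # field whose content is passed to the model
    }
    if highlight_query is not None:
        # Optional: a separate question used only for highlighting
        params["hl.q"] = highlight_query
    # quote_via=quote encodes spaces as %20 instead of '+'
    return SOLR_SELECT_URL + "?" + urlencode(params, quote_via=quote)


url = build_neural_highlight_url(
    "BBC Japan", "text", "When did BBC Japan start broadcasting?"
)
print(url)
```

Leaving out the third argument omits hl.q from the request entirely.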

For each document returned by the query, the content of the specified field will be fed to the (extractive) question-answering model along with the query to extract and highlight the answer.
Here is the example response:

{
    "responseHeader": {
        "status": 0,
        "QTime": 568
    },
    "response": {
        "numFound": 1,
        "start": 0,
        "numFoundExact": true,
        "docs": [{
            "text": ["BBC Japan was a general entertainment channel, which operated between December 2004 and April 2006.\nIt ceased operations after its Japanese distributor folded."],
            "id": "1"
        }]
    },
    "neuralHighlighting": {
        "1": {
            "text": ["BBC Japan was a general entertainment channel, which operated between <em>December 2004</em> and April 2006..."]
        }
    }
}

The snippets are returned in a dedicated section of the query response (the neuralHighlighting section), keyed by document id, and the client can use the formatting cues (here, the <em> tags) to decide how to present the snippets to users.

Each returned document will have a corresponding snippet; in this simple example, since we obtain only one document ("id": "1"), there is only one snippet.
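On the client side, pairing each document with its snippet is a matter of looking up the document id in the neuralHighlighting section. The following Python sketch does this for a trimmed copy of the example response. It assumes the response shape shown in this post and that answers are marked with <em> tags; the `snippets_by_document` helper is a hypothetical name.

```python
import json
import re

# Trimmed copy of the example response from this post.
response_body = """
{
  "response": {
    "docs": [{
      "id": "1",
      "text": ["BBC Japan was a general entertainment channel, which operated between December 2004 and April 2006."]
    }]
  },
  "neuralHighlighting": {
    "1": {
      "text": ["BBC Japan was a general entertainment channel, which operated between <em>December 2004</em> and April 2006..."]
    }
  }
}
"""

# Assumption: answers are wrapped in <em>...</em>, as in the example response.
EM_SPAN = re.compile(r"<em>(.*?)</em>")


def snippets_by_document(body):
    """Map each document id to its snippets and the answers marked in them."""
    data = json.loads(body)
    highlighting = data.get("neuralHighlighting", {})
    result = {}
    for doc in data["response"]["docs"]:
        fields = highlighting.get(doc["id"], {})
        snippets = [s for values in fields.values() for s in values]
        answers = [a for s in snippets for a in EM_SPAN.findall(s)]
        result[doc["id"]] = {"snippets": snippets, "answers": answers}
    return result


parsed = snippets_by_document(response_body)
print(parsed["1"]["answers"])
```

A document with no entry in the neuralHighlighting section simply ends up with empty snippet and answer lists.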

Given the query "When did BBC Japan start broadcasting?", the model extracted the right text excerpt from the context to answer the question: December 2004.

Exploring the Benefits: Why Choose Our Plugin

Customizable Configuration

You have the flexibility to customize the configuration according to your preferences.

Model: You can either choose from a predefined model integrated within our plugin (i.e. a BERT-style pre-trained model fine-tuned for question-answering tasks) or use your own custom model (located in a specified path), thus having the freedom to tailor the plugin to your specific needs.

Fields: You can specify one or more fields, which lets you extract multiple snippets and highlight answers from the content of different fields. This way, relevant information from several parts of a document can appear in the search results, making the highlighted answers more accurate and complete.

Seamless Integration

Integrating our Neural Highlighting plugin into your current Apache Solr search infrastructure is a breeze, requiring minimal modifications and avoiding complex setup processes.
Like existing Solr plugins, it is packaged as a Java JAR file and is installed through similar steps.

Enhanced User Experience

The plugin’s neural highlighting generates accurate snippets that surface the relevant information directly in the search results, so users find answers faster.

Conclusion

Overall, the Apache Solr Neural Highlighting plugin improves user experience and offers customization options, ultimately providing a valuable addition to your Solr search infrastructure.
We are confident that our Solr plugin will completely transform the way you search, saving time and effort in finding the answers you seek.

Interested in this plugin?

If you wish to request a quote based on your project, feel free to complete this form!


