Sease at Berlin Buzzwords 2025

Berlin Buzzwords 2025

Berlin Buzzwords is Germany’s most exciting conference on storing, processing, streaming and searching large amounts of digital data, with a focus on open source software projects.

Location: Kulturbrauerei, Berlin
Date: 15th-17th June 2025

// our first talk

End-to-End Semantic Search with Apache Solr 9.8 LLM Module

Apache Solr 9.8 introduces the LLM module, opening the door to end-to-end natural language query support through vector-backed semantic search (K-Nearest Neighbors).
This talk explores the open-source contribution from both the indexing and query angles, and what’s coming next for Solr in terms of integrations with Large Language Models.

description

Dense vector search was introduced in Apache Solr 9.0 in 2022 and has since seen substantial adoption by the community.
Until now, however, text vectorisation had to happen outside Solr, because the search engine could not transparently encode text into vectors.
Apache Solr 9.8 changes this, introducing a module that allows interaction with well-known large language model providers such as OpenAI, Cohere, HuggingFace and Mistral AI via the open-source library LangChain4j.
Expect to learn how to configure Solr to access external text vectorisation services, how to encode and run your queries through the ‘knn_text_to_vector’ query parser, and how to vectorise your documents’ textual fields through the ‘Text To Vector Update Request Processor’.
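
To give a flavour of the query side, here is a minimal Python sketch that builds a ‘knn_text_to_vector’ query string and the matching select URL. The model name (my-embedding-model), the vector field (vector), the collection (docs) and the Solr base URL are all hypothetical placeholders, and the sketch assumes a model has already been registered in Solr under that name. It only builds strings and does not contact a Solr instance.

```python
from urllib.parse import urlencode


def knn_text_to_vector_query(text, model, field, top_k=10):
    """Build a local-params query for Solr's knn_text_to_vector parser.

    `model` and `field` are placeholders: the model is assumed to be
    registered in Solr already, and `field` to be a dense vector field.
    """
    return f"{{!knn_text_to_vector model={model} f={field} topK={top_k}}}{text}"


def select_url(base_url, collection, query, rows=10):
    """URL-encode the query into a /select request URL."""
    params = urlencode({"q": query, "rows": rows})
    return f"{base_url}/{collection}/select?{params}"


# Hypothetical names, for illustration only.
q = knn_text_to_vector_query(
    "affordable hotels near the venue",
    model="my-embedding-model",
    field="vector",
    top_k=5,
)
url = select_url("http://localhost:8983/solr", "docs", q)
print(q)
print(url)
```

The natural language text goes after the closing brace, so the application never handles the embedding itself: Solr calls the configured provider at query time.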

This is a foundational enabler that speeds up the design and development of end-to-end semantic search solutions.

The talk wraps up with future directions and how the introduction of the LLM module opens the door to exciting new integrations.

Join us as we dive into the AI future of Apache Solr!

// our speaker

Alessandro Benedetti

DIRECTOR @ SEASE

APACHE LUCENE/SOLR COMMITTER

APACHE SOLR PMC MEMBER

// our second talk

AI-Powered Search Results Navigation with LLMs & JSON Schema

Struggling to identify relevant filters among too many facets, and frustrated by navigating search results? We explore an AI Filter Assistant for statistical data (SDMX), showing how LLMs can be leveraged to suggest the best filters for your natural language query and help you refine the results in Apache Solr. We share wins, fails, and lessons learned.

description

In this talk, we explore an AI-powered Filter Assistant, designed for the Statistical Data and Metadata eXchange (SDMX) to improve the user experience of navigating search results efficiently and effectively.

We discuss how LLMs enhance filter suggestions by analyzing both user queries and indexed data.

On the architecture side, we break down:

  1. Data retrieval – how we collected and processed the input SDMX data to build taxonomies used by the model to reconcile the concepts in the natural language query.
  2. API structure – a deep dive into our endpoints, what they do, and the responses they return.
  3. Model choice – the process of identifying the best LLM for the task, including our motivations and studies.
  4. Structured output & JSON Schema – key benefits, limitations, and lessons learned from extensive testing. We showcase different test results and insights on what works best.
  5. Solr query optimization – how to integrate the assistant’s output into a search query, using different boolean strategies to handle the refinement of both too-many and zero-result scenarios.
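
As a rough illustration of points 4 and 5, the Python sketch below parses a structured LLM response and turns the suggested filters into a Solr filter query. It is not the implementation presented in the talk: the JSON shape ({"filters": [{"field": ..., "value": ...}]}), the field names (country, frequency) and the idea of using AND to narrow results and OR to relax a zero-result query are all assumptions made for this example.

```python
import json


def parse_suggestions(raw_json, allowed_fields):
    """Parse the model output, keeping only well-formed filters on known fields.

    The expected shape {"filters": [{"field": str, "value": str}]} is an
    assumption made for this sketch, not the schema used in the talk.
    """
    data = json.loads(raw_json)
    filters = []
    for item in data.get("filters", []):
        field, value = item.get("field"), item.get("value")
        if field in allowed_fields and isinstance(value, str) and value:
            filters.append((field, value))
    return filters


def to_filter_query(filters, operator="AND"):
    """Join field:"value" clauses with the given boolean operator."""
    clauses = []
    for field, value in filters:
        # Escape backslashes and double quotes inside the phrase value.
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        clauses.append(f'{field}:"{escaped}"')
    return f" {operator} ".join(clauses)


# Hypothetical model output; the "unknown" field is dropped by validation.
raw = (
    '{"filters": ['
    '{"field": "country", "value": "Italy"}, '
    '{"field": "frequency", "value": "Annual"}, '
    '{"field": "unknown", "value": "x"}]}'
)
filters = parse_suggestions(raw, allowed_fields={"country", "frequency"})
strict = to_filter_query(filters, "AND")   # narrows a too-many-results query
relaxed = to_filter_query(filters, "OR")   # fallback when AND returns nothing
print(strict)
print(relaxed)
```

Checking the suggested fields against a known list before building the query is one simple way to keep a hallucinated filter from reaching Solr.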

Expect real-world insights, practical takeaways, and a discussion on the future of AI-driven filtering!

// our speakers

Anna Ruggero

R&D SOFTWARE ENGINEER @ SEASE

Anna is an R&D Search Software Engineer whose focus is on the integration of Information Retrieval systems with advanced Machine Learning, Neural Search models and Recommender Systems.
She likes to find new solutions that integrate her work as a Search Consultant with the latest academic studies.

Ilaria Petreti

R&D SOFTWARE ENGINEER @ SEASE

Ilaria is a Data Scientist passionate about the world of Artificial Intelligence. She loves applying Data Mining and Machine Learning techniques, strongly believing in the power of Big Data and Digital Transformation.

Edward Lambe

Head of MED Data Engineering @ BIS (Bank for International Settlements)

Since joining the BIS in 2016, Edward has overseen the implementation of several key projects within the IT unit of the Monetary and Economic Department. Notably, he led the delivery of the BIS Data Portal, a core initiative of the BIS 2025 Innovation programme aimed at modernising the dissemination of BIS statistics. Prior to his tenure at the BIS, Edward held various statistical and IT roles at the Central Statistics Office, Ireland, and the Bank of Ireland. 
