Another Berlin Buzzwords is in the books, and we’re back with plenty of fresh ideas, inspiring conversations, and great memories. If you couldn’t make it (or simply want to relive some of the highlights) start by watching our talk, then dive into our recap of the conference, where we share our key takeaways, favourite sessions, and the moments that stood out the most.
Alessandro Benedetti
DIRECTOR @ SEASE
It’s that time of the year again! Berlin Buzzwords has been a thought-provoking constant at Sease for the last 5 years: we have always enjoyed in-depth search discussions, new and familiar faces, and presenting on stage, where we share the best of our experience.
This year our talk was all about Apache Solr and the new features we contributed to Solr 10 in the vector search realm.
What about my favourite talks?
Let’s see a list of talks that caught my attention and sparked interesting discussions!
Circular Dependency Fixes when Bootstrapping a Golden Set
From our friends Radu and Rafal: a nice practical overview of how to use our Dataset Generator (thanks for spreading the word!) and Christopher Ball’s Solr Navigator to untangle the intricacies of exploring your data and generating training and search quality evaluation datasets.
The talk flowed smoothly, it was pragmatic and inspiring, definitely worth it!
Text-to-Struct: Fine-tuning SLMs for Query Intent
By Hugo and Sandra. It was exactly the kind of talk that suits my taste: pragmatic, no smoke, just an interesting journey into query parsing with an honest overview of the techniques applied, what worked and what did not. You learn from these talks and see what companies are doing in their production systems, which is extremely valuable.
Zero downtime index upgrade in Apache Solr
If I had to pick only three, my last (but not least) choice would be this talk by Rahul Goswami, one of our latest committers. He presented interesting work on upgrading Lucene indexes across major versions. It was a well-handled talk, informative and thought-provoking. Well done, Rahul!
The Three-Body Problem of Inverse Hybrid Search
Special mention to this talk by Ravindra Harige, which explores the intricacies of inverse search (saved alerts) in the era of hybrid search (lexical + vector-based).
When better retrieval makes agents worse
And another special mention to this talk by Lester Solbakken, which explores how information retrieval systems that perform decently well for humans can go very wrong when used by LLM agents (unless we change those systems to optimise for different metrics).
I could spend hours talking about how interesting this conference is (and its sibling events, kudos to the Berlin Search Week). I met many interesting people and had many invaluable discussions of a kind you can rarely have. That is probably the biggest takeaway: aside from talks and panels, what makes Berlin Buzzwords (and conferences in general) so important is the presence of world-leading experts. I’m honoured to take part in these kinds of events!
See you next year!
Anna Ruggero
R&D SOFTWARE ENGINEER @ SEASE
Berlin Buzzwords Again!!
I’m so happy to have had the opportunity to attend this conference again! It’s the fourth time now, and always as a speaker!
For me, this is definitely the best information retrieval conference that anyone expert in and passionate about the topic can attend.
The organization and location were amazing as usual, with many wonderful people to talk with, share ideas with, and get insights from.
Starting with us! This year, the Sease Team presented the latest available vector search features in Solr 10.0. We were able to make most of these contributions ourselves, thanks to some sponsors. The major ones are: Scalar and Binary Quantization in dense vectors, KNN Early Termination, Seeded KNN, an ACORN-like implementation for filtered vector search, and the integration of the efSearchScaleFactor for improving recall. We also took a quick look at what’s new in version 10.x, including KNN search on nested vectors and multivalued vectors support.
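To give a rough feel for what scalar and binary quantization mean for a dense vector, here is a minimal sketch in plain Python. It is only an illustration of the general idea and not Solr's or Lucene's implementation: the min/max scaling, the 7-bit code range, and all function names are assumptions made for this example.

```python
from typing import List, Tuple


def scalar_quantize(vector: List[float]) -> Tuple[List[int], float, float]:
    """Map each float to an integer code in [0, 127] using the vector's min/max.

    Assumption: simple min/max scaling, chosen for clarity only.
    """
    lo, hi = min(vector), max(vector)
    # Avoid division by zero when all components are equal.
    scale = (hi - lo) / 127 if hi > lo else 1.0
    codes = [round((x - lo) / scale) for x in vector]
    return codes, lo, scale


def scalar_dequantize(codes: List[int], lo: float, scale: float) -> List[float]:
    """Approximate reconstruction of the original floats from the codes."""
    return [lo + c * scale for c in codes]


def binary_quantize(vector: List[float]) -> List[int]:
    """Keep one bit per component: 1 if the value is positive, else 0."""
    return [1 if x > 0 else 0 for x in vector]


def hamming_distance(a: List[int], b: List[int]) -> int:
    """Number of positions where two bit vectors differ."""
    return sum(x != y for x, y in zip(a, b))
```

The trade-off in both cases is the same: each component takes far fewer bits than a 32-bit float, at the cost of some precision when comparing vectors.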
Let’s move to the talks I liked most. I’ll cover just my four favourites, though many others deserve a mention.
Zero downtime index upgrade in Apache Solr
A very interesting talk by the Apache Solr committer Rahul Goswami, showing a new Solr feature for upgrading an index in-place with zero downtime. This is a very useful addition when dealing with Solr upgrades across major versions.
Circular Dependency Fixes when Bootstrapping a Golden Set
The second one comes from familiar people, Radu Gheorghe and Rafał Kuć, who presented a short talk full of practical gains! They showed a way to enhance the evaluation dataset through automatic LLM-based generation tools such as our dataset generator.
How to Tell If Your Agent Used the Right Stuff
The third is by Apurva Misra. She was able to highlight all the most important factors to monitor when dealing with agents. A lot of takeaways and things to be aware of for a good evaluation and monitoring of the system.
Text-to-Struct: Fine-tuning SLMs for Query Intent
Complex user intent? The search is not returning what you expected? In this talk, you can find the first answer to your questions! Hugo and Sandra illustrated a nice way to fine-tune SLMs to generate structured queries from vague inputs. My colleague Ilaria and I have also presented work on generating structured queries through LLMs over the last two years, so it was nice to see a talk related to the topic!
And that’s all folks, for this year! See you at the next one!
Ilaria Petreti
R&D SOFTWARE ENGINEER @ SEASE
After five consecutive years on the Berlin Buzzwords stage, the event has become much more than a conference for me. It’s a milestone in my professional journey, a place where I continue to learn, share experiences, and engage with an incredible community of experts. Coming back year after year is an honour.
Unfortunately, this year’s Berlin Search Week didn’t get off to the best start. My flight was cancelled, which meant I had to miss both Sunday’s Barcamp and the Speakers’ Dinner, as I didn’t arrive in Berlin until late that evening. I was very sorry to miss them.
The following morning, still a bit tired from the journey, I was nevertheless excited and ready to present this year’s talk together with my colleagues Alessandro and Anna.
Over the past year, we have contributed a lot to Apache Solr’s vector search capabilities, and our talk took the audience on a carousel tour of the new features that will be available starting with Solr 10. While many of these features were developed by us, others came from the Solr community, and we included them all to provide a complete overview of the vector search ecosystem that users can leverage.
Unfortunately, there wasn’t enough time to dive into the details of every feature. However, I hope we managed to provide a useful high-level overview of what is available, when to use each functionality, and the value they can bring to different search use cases.
A big thank you to everyone who attended our talk and contributed to such a large and engaged audience. And a special thanks to my colleagues… it is always a pleasure to share these experiences with you!
After our talk, I was able to enjoy the conference in a much more relaxed way.
Among all the sessions, I particularly appreciated these talks:
Tensor arithmetics in search and ranking for Ecommerce
The talk demonstrates how AI models based on vector embeddings and the Vespa platform can go beyond simply finding similar products or content by dynamically modifying their characteristics through mathematical operations on vectors.
For example, starting from a pair of earrings with a blue stone, the system can find identical models with a red stone, or it can take a Mickey Mouse T-shirt and search for an equivalent version without Mickey Mouse.
I will certainly need to explore the approach further to fully understand its underlying concepts, but I found it particularly interesting and potentially reusable in my own work with clients.
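The earrings and T-shirt examples above can be sketched with plain vector arithmetic. This is a toy illustration, not the approach or the Vespa configuration shown in the talk: the five hand-made dimensions, the catalogue, the attribute vectors, and all function names are hypothetical.

```python
import math
from typing import Dict, List

Vector = List[float]

# Hypothetical, hand-made embeddings: [earring, t-shirt, blue, red, mickey].
CATALOGUE: Dict[str, Vector] = {
    "earrings_blue_stone": [1.0, 0.0, 1.0, 0.0, 0.0],
    "earrings_red_stone": [1.0, 0.0, 0.0, 1.0, 0.0],
    "tshirt_mickey": [0.0, 1.0, 0.0, 0.0, 1.0],
    "tshirt_plain": [0.0, 1.0, 0.0, 0.0, 0.0],
}

# Hypothetical attribute directions in the same toy space.
ATTRIBUTES: Dict[str, Vector] = {
    "blue": [0.0, 0.0, 1.0, 0.0, 0.0],
    "red": [0.0, 0.0, 0.0, 1.0, 0.0],
    "mickey": [0.0, 0.0, 0.0, 0.0, 1.0],
}


def add(a: Vector, b: Vector) -> Vector:
    return [x + y for x, y in zip(a, b)]


def subtract(a: Vector, b: Vector) -> Vector:
    return [x - y for x, y in zip(a, b)]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity, returning 0.0 if either vector has zero length."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


def nearest(query: Vector, exclude: str) -> str:
    """Name of the catalogue item most similar to the query, skipping the source item."""
    candidates = [name for name in CATALOGUE if name != exclude]
    return max(candidates, key=lambda name: cosine(query, CATALOGUE[name]))


# "Same earrings, but red": remove the blue direction, add the red one.
red_query = add(subtract(CATALOGUE["earrings_blue_stone"], ATTRIBUTES["blue"]), ATTRIBUTES["red"])
print(nearest(red_query, exclude="earrings_blue_stone"))

# "Same T-shirt, without Mickey Mouse": remove the mickey direction.
plain_query = subtract(CATALOGUE["tshirt_mickey"], ATTRIBUTES["mickey"])
print(nearest(plain_query, exclude="tshirt_mickey"))
```

Real embeddings do not have such clean, interpretable dimensions, so in practice the attribute directions would have to be derived from the model itself.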
Let LLMs Wander: Engineering RL Environments
I enjoyed the talk because Stefano has a great ability to explain complex concepts through simple examples.
This time, he used the game of Tic-Tac-Toe to demonstrate how a small language model can be improved by combining Supervised Fine-Tuning and Reinforcement Learning. While the example is not directly applicable to our context, it provides a clear and intuitive understanding of the underlying ideas and the potential of these techniques.
He also explained to us after the talk that, while Reinforcement Learning can be more complex to implement and requires carefully designed environments and reward mechanisms, Supervised Fine-Tuning has become much more accessible today. Thanks to libraries such as TRL by Hugging Face, the technical implementation is relatively straightforward and the main challenge often lies simply in creating a high-quality training dataset.
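To illustrate what "carefully designed environments and reward mechanisms" can mean in the Tic-Tac-Toe setting, here is a minimal sketch of a reward function that scores a single move proposed by a model. It is not the environment from the talk: the board encoding, the reward values, and the function names are all assumptions made for this example.

```python
from typing import List, Optional

# All eight winning lines on a 3x3 board, with cells indexed 0..8 row by row.
WIN_LINES = [
    (0, 1, 2), (3, 4, 5), (6, 7, 8),  # rows
    (0, 3, 6), (1, 4, 7), (2, 5, 8),  # columns
    (0, 4, 8), (2, 4, 6),             # diagonals
]

# The only outputs accepted as a move: a single cell index as text.
VALID_MOVES = {str(i) for i in range(9)}


def winner(board: List[str]) -> Optional[str]:
    """Return "X" or "O" if that player has three in a line, else None."""
    for a, b, c in WIN_LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None


def reward(board: List[str], model_output: str, player: str = "X") -> float:
    """Score one model turn: -1.0 invalid move, +1.0 winning move, 0.0 otherwise."""
    text = model_output.strip()
    if text not in VALID_MOVES:
        return -1.0  # the model did not answer with a cell index
    cell = int(text)
    if board[cell] != " ":
        return -1.0  # the cell is already taken
    next_board = board.copy()
    next_board[cell] = player
    return 1.0 if winner(next_board) == player else 0.0
```

Even in this tiny case, design choices matter: penalising malformed output teaches the model the expected format, while rewarding only wins says nothing about blocking the opponent.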
AI is here – time to throw away our search engines?
Finally, I enjoyed the closing panel, where Charlie, Atita, Jo, Dimitrin, and Evgeniya exchanged their views on the future of search in the age of AI. The debate explored whether AI-powered systems are destined not only to replace conventional search engines, but ultimately to replace many of the professionals who use them. I believe that the greatest value comes from combining human expertise with AI capabilities rather than treating them as competing alternatives.
Text-to-Struct: Fine-tuning SLMs for Query Intent
I was able to follow this talk from start to finish because it covered topics that were very familiar to me and closely related to the work we presented at Berlin Buzzwords last year, i.e. transforming natural language queries into structured queries that can be executed by a search engine.
What I particularly appreciated was how the speakers highlighted the limitations of relying only on large general-purpose LLMs, many of which mirror the same issues we encountered in our own work. They then demonstrated how fine-tuning a smaller, domain-specific model can help overcome these challenges, significantly improving latency and reducing costs.
The session was especially relevant to the projects I am currently working on, and it provided several practical insights that I believe could be valuable for our future developments.
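As a small illustration of the "natural language to structured query" step, the sketch below validates a JSON intent, as a fine-tuned model might emit it, and turns it into Solr-style `q` and `fq` parameters. It is not the speakers' pipeline: the JSON shape, the allowed field names, and the function names are hypothetical.

```python
import json
from typing import Dict, List, Union

# Hypothetical schema: the only fields the model is allowed to filter on.
ALLOWED_FILTERS = {"brand", "colour", "category"}


def quote(value: str) -> str:
    """Wrap a value in double quotes, escaping backslashes and quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def to_solr_params(model_output: str) -> Dict[str, Union[str, List[str]]]:
    """Validate the model's JSON and build q/fq parameters from it."""
    intent = json.loads(model_output)  # raises ValueError on malformed JSON
    if not isinstance(intent, dict):
        raise ValueError("intent must be a JSON object")
    filters = intent.get("filters", {})
    if not isinstance(filters, dict):
        raise ValueError("filters must be a JSON object")
    unknown = set(filters) - ALLOWED_FILTERS
    if unknown:
        raise ValueError(f"unsupported filter fields: {sorted(unknown)}")
    keywords = str(intent.get("keywords", "")).strip()
    # One filter query per field, in a stable order.
    fq = [f"{field}:{quote(str(value))}" for field, value in sorted(filters.items())]
    return {"q": keywords or "*:*", "fq": fq}


example = '{"keywords": "running shoes", "filters": {"colour": "red"}}'
print(to_solr_params(example))
```

Rejecting unknown fields up front is one simple way to keep a model's mistakes from reaching the search engine as invalid queries.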
Zero downtime index upgrade in Apache Solr from Rahul Goswami
Rahul presented a new feature that he contributed to Apache Solr, enabling index upgrades without downtime and without the need to rebuild indexes from scratch. In Apache Lucene-based systems, an index can only be read by the major version that created it and by the next major version. For example, an index created with Solr 7 can be opened in Solr 8, but not directly in Solr 9.
Until now, the only solution was to perform a full reindexing operation from the original data source. This can be time-consuming for large datasets, expensive in terms of infrastructure resources, and, in some cases, impossible if the original data source is no longer available. To address this challenge, a new API called Upgrade Index has been introduced. It enables in-place index upgrades by progressively converting index segments to the format of the current version. I believe this is a very valuable enhancement for the Solr community, as it significantly simplifies major version upgrades, which can now be performed without service interruption while Solr continues to serve both queries and indexing requests. Thanks, Rahul, for this contribution!
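The compatibility rule described above (Solr 7 opens in Solr 8, but not directly in Solr 9) can be written down as a small check. This is only a sketch of the rule and does not call Solr or the new upgrade API; the function names and the example core names are hypothetical.

```python
from typing import Dict, List


def can_open_directly(index_created_major: int, solr_major: int) -> bool:
    """True if an index created on one major version opens on another as-is.

    An index is readable by the major version that created it and by the next one.
    """
    return 0 <= solr_major - index_created_major <= 1


def cores_needing_upgrade(created_majors: Dict[str, int], target_major: int) -> List[str]:
    """Names of cores whose index cannot be opened as-is on the target version."""
    return sorted(
        name
        for name, created in created_majors.items()
        if not can_open_directly(created, target_major)
    )


# Hypothetical cores, keyed by name, with the major version that created each index.
cores = {"products": 7, "orders": 8, "logs": 9}
print(cores_needing_upgrade(cores, target_major=9))
```

A check like this is the kind of inventory step that tells you which indexes would otherwise have required a full reindex before a major upgrade.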
As always, Berlin Buzzwords remains one of my favourite conferences. It is exceptionally well organised and provides a great opportunity not only to reconnect with familiar faces, but also to meet new experts.










