We are happy to announce the eleventh London Information Retrieval Meetup, a free evening meetup aimed at Information Retrieval enthusiasts and professionals who are curious to explore and discuss the latest trends in the field.
This time the meetup will be an online event, due to COVID-19.
Registration is required to receive the Zoom link: REGISTER HERE
Date: 14th December 2021 | 6:15-8:00 PM (GMT)
The event will feature one technical talk, followed by a Q&A session.
[starting at 6:15]
After a short welcome and news update from our founder, Alessandro Benedetti, we will proceed to the talk.
JO KRISTIAN BERGUM
Jo works as a distinguished engineer at Yahoo, where he spends his time working on Vespa, the open-source big data serving engine.
Taking the neural search paradigm shift to production
Search is going through a paradigm shift, sometimes referred to as the “BERT revolution.” The introduction of pre-trained transformer language models like BERT has significantly advanced the state of the art in search and document ranking.
Bringing these promising methods to production in an end-to-end search serving system is not trivial. It requires substantial middleware glue and deployment effort to connect open-source tools like Apache Lucene, vector search libraries (e.g., FAISS), and model inference servers. However, the open-source serving engine Vespa, which Yahoo has developed since 2003, offers features that enable implementing state-of-the-art retrieval and ranking methods using a single serving engine stack, significantly reducing deployment complexity, cost, and failure modes.
This talk gives an overview of the Vespa search serving architecture and the features that make it possible to express state-of-the-art retrieval and ranking methods. We dive into Vespa’s implementations of sub-linear retrieval algorithms for sparse and dense representations, which efficiently produce candidate documents for (re-)ranking. Vespa can express the end-to-end multi-stage retrieval and ranking pipeline, including inference using transformer models. We also touch on real-world application constraints, such as filtering and search result diversification.