London Information Retrieval Meetup October

After the very warm reception of the first and second editions, the third London Information Retrieval Meetup is approaching (21/10/2019) and we are excited to add more details about our speakers and talks!
The event is free and you are invited to register:

https://www.eventbrite.com/e/london-information-retrieval-meetup-october-tickets-74403100677

Our second speaker is Andrea Gazzarini, our founder and software engineer:

Andrea Gazzarini

Andrea Gazzarini is a curious software engineer, mainly focused on the Java language and Search technologies.
With more than 15 years of experience in various software engineering areas, his adventure with the search domain began in 2010, when he met Apache Solr and later Elasticsearch… and it was love at first sight. 
Since then, he has been involved in many projects across different fields (bibliographic, e-government, e-commerce, geospatial).

In 2015 he wrote “Apache Solr Essentials”, a book about Solr, published by Packt Publishing.
He’s an open source lover; he’s currently involved in several (too many!) projects, always thinking about a “big” idea that will change his (developer) life.

Music Information Retrieval Take 2: Interval Hashing Based Ranking

Retrieving musical records from a corpus of information using an audio input as a query is not an easy task. Various approaches try to solve the problem by modelling both the query and the corpus as an array of hashes calculated from the chroma features of the audio.
The scope of this talk is to introduce a novel approach to calculating such hashes, which considers the intervals between the most intense pitches of sequential chroma vectors.
Building on the theoretical introduction, a prototype will show this approach in action with Apache Solr on a sample dataset, together with the benefits of positional queries.
Challenges and future work will follow as concluding considerations.
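
To make the idea of interval-based hashes concrete, here is a minimal sketch in Python. It is not the speaker's implementation: the function names, the choice of taking only the single strongest pitch class per frame, and the n-gram token format are all assumptions made for illustration.

```python
from typing import List, Sequence

PITCH_CLASSES = 12  # one bin per semitone in a chroma vector


def dominant_pitches(chroma_frames: Sequence[Sequence[float]]) -> List[int]:
    """Return the index of the most intense pitch class in each chroma vector."""
    return [max(range(PITCH_CLASSES), key=lambda i: frame[i]) for frame in chroma_frames]


def interval_hashes(chroma_frames: Sequence[Sequence[float]], ngram: int = 3) -> List[str]:
    """Turn each run of `ngram` consecutive intervals into a token (hypothetical scheme)."""
    pitches = dominant_pitches(chroma_frames)
    # Interval between consecutive dominant pitches, folded into one octave.
    intervals = [(b - a) % PITCH_CLASSES for a, b in zip(pitches, pitches[1:])]
    return [
        "-".join(str(step) for step in intervals[i:i + ngram])
        for i in range(len(intervals) - ngram + 1)
    ]


def one_hot(pitch: int) -> List[float]:
    """Build a toy chroma vector with all the energy in a single pitch class."""
    return [1.0 if i == pitch else 0.0 for i in range(PITCH_CLASSES)]


# Toy input: C, E, G, C, E and the same tune shifted up by two semitones.
melody = [one_hot(p) for p in (0, 4, 7, 0, 4)]
transposed = [one_hot((p + 2) % PITCH_CLASSES) for p in (0, 4, 7, 0, 4)]

print(interval_hashes(melody))
print(interval_hashes(melody) == interval_hashes(transposed))
```

Because only the distance between pitches is hashed, the original and the transposed melody produce the same tokens. Tokens of this kind could be indexed in order as terms of a text field, which is what would make positional queries applicable.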


Our first speaker is Alessandro Benedetti, our founder, software engineer and director:

Alessandro Benedetti

Alessandro Benedetti is the founder of Sease.
A Senior Search Software Engineer, he focuses on R&D in information retrieval, information extraction, natural language processing, and machine learning.
He firmly believes in Open Source as a way to build a bridge between Academia and Industry and facilitate the progress of applied research.
Following his passion, he entered the Apache Lucene and Solr world in 2010, becoming an active member of the community.
When he isn’t developing a new search solution, he is presenting the applications of leading-edge techniques in real-world scenarios at conferences such as ECIR, Lucene/Solr Revolution, FOSDEM, Haystack, ApacheCon and Open Source Summit.

How to Build your Training Set for a Learning to Rank Project

Learning to rank (LTR from now on) is the application of machine learning techniques, typically supervised, in the formulation of ranking models for information retrieval systems.
With LTR becoming more and more popular (Apache Solr has supported it since January 2017), organisations struggle with the problem of how to collect and structure the relevance signals necessary to train their ranking models.
This talk is a technical guide to explore and master various techniques to generate your training set(s) correctly and efficiently.
Expect to learn how to:
– model and collect the necessary feedback from the users (implicit or explicit)
– calculate for each training sample a relevance label which is meaningful and unambiguous (click-through rate, sales rate …)
– transform the raw data collected into an effective training set (in the numerical vector format most LTR training libraries expect)
Join us as we explore real world scenarios and dos and don’ts from the e-commerce industry.
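
As a rough illustration of the three steps listed above, the sketch below aggregates a toy interaction log into click-through rates, buckets each rate into an integer relevance label, and writes one line per query-document pair in the `label qid:<id> <index>:<value>` text format accepted by tools such as SVMrank and RankLib. The log layout, the function names and the linear CTR bucketing are assumptions for illustration, not the approach presented in the talk, and the sketch ignores issues such as position bias.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Hypothetical interaction log entry: (query_id, doc_id, clicked, feature_vector)
Interaction = Tuple[int, str, bool, Tuple[float, ...]]


def relevance_label(ctr: float, max_label: int = 4) -> int:
    """Map a click-through rate in [0, 1] to an integer label in 0..max_label."""
    return min(max_label, int(ctr * (max_label + 1)))


def build_training_set(log: Iterable[Interaction]) -> List[str]:
    """Aggregate impressions per (query, doc) and emit 'label qid:<id> 1:<f1> ...' lines."""
    impressions: Dict[Tuple[int, str], int] = defaultdict(int)
    clicks: Dict[Tuple[int, str], int] = defaultdict(int)
    features: Dict[Tuple[int, str], Tuple[float, ...]] = {}
    for query_id, doc_id, clicked, vector in log:
        key = (query_id, doc_id)
        impressions[key] += 1
        clicks[key] += int(clicked)
        features[key] = vector  # assumption: features are stable per (query, doc)
    lines = []
    for key in sorted(impressions):
        query_id, doc_id = key
        label = relevance_label(clicks[key] / impressions[key])
        pairs = " ".join(f"{i}:{v}" for i, v in enumerate(features[key], start=1))
        lines.append(f"{label} qid:{query_id} {pairs} # {doc_id}")
    return lines


# Toy log: docA was clicked on both impressions, docB on one out of two.
log = [
    (1, "docA", True, (0.5, 2.0)),
    (1, "docA", True, (0.5, 2.0)),
    (1, "docB", False, (0.1, 7.0)),
    (1, "docB", True, (0.1, 7.0)),
]
training_set = build_training_set(log)
print("\n".join(training_set))
```

In practice the label would more likely come from a combination of signals (clicks, add-to-cart events, sales), and the thresholds would be tuned on the data rather than fixed.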

London Information Retrieval Meetup June

After the very warm reception of the first edition, the second London Information Retrieval Meetup is approaching (25/06/2019) and we are excited to add more details about our speakers and talks!
The event is free and you are invited to register:

https://www.eventbrite.com/e/london-information-retrieval-meetup-june-tickets-62261343354

Our first speaker is René Kriegler, freelance search consultant and search engineer:

René Kriegler

René has been working as a freelance search consultant for clients in Germany and abroad for more than ten years. Although he is interested in all aspects of search and NLP, key areas include search relevance consulting and e-commerce search. His technological focus is on Solr/Lucene. René co-organises MICES (Mix-Camp E-Commerce Search, Berlin, 19 June). He maintains the Querqy open source library.

Query Relaxation – a Rewriting Technique between Search and Recommendations

In search quality optimisation, various techniques are used to improve recall, especially in order to avoid empty search result sets. In most of the solutions, such as spelling correction and query expansion, the search query is modified while the original query intent is normally preserved.
In my talk, I shall describe my experiments with different approaches to query relaxation. Query relaxation is a query rewriting technique which removes one or more terms from multi-term queries that would otherwise lead to zero results. In many cases the removal of a query term entails a change of the query intent, making it difficult to judge the quality of the rewritten query and hence to decide which query term should be removed.
I argue that query relaxation might be best understood if it is seen as a technique on the border between search and recommendations. My focus is on a solution in the context of e-commerce search which is based on using Word2Vec embeddings.
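
The abstract does not say how the embeddings are used, so the following Python sketch shows only one possible heuristic: drop the single term whose removal leaves the averaged query vector closest to the original one. The two-dimensional vectors are made up and stand in for trained Word2Vec embeddings, and all names are hypothetical.

```python
import math
from typing import Dict, List, Sequence

# Made-up 2-dimensional vectors standing in for trained Word2Vec embeddings.
EMBEDDINGS: Dict[str, List[float]] = {
    "red": [0.1, 0.9],
    "leather": [0.7, 0.6],
    "sofa": [1.0, 0.1],
}


def mean_vector(terms: Sequence[str]) -> List[float]:
    """Average the embeddings of the given terms, dimension by dimension."""
    vectors = [EMBEDDINGS[term] for term in terms]
    return [sum(dim) / len(vectors) for dim in zip(*vectors)]


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def relax(query_terms: List[str]) -> List[str]:
    """Drop the one term whose removal keeps the query closest to the original."""
    original = mean_vector(query_terms)
    candidates = [query_terms[:i] + query_terms[i + 1:] for i in range(len(query_terms))]
    return max(candidates, key=lambda terms: cosine(mean_vector(terms), original))


# Intended to run only for a multi-term query that returned zero results.
print(relax(["red", "leather", "sofa"]))
```

With these toy vectors the sketch removes "leather" and keeps "red sofa". Whether that preserves what the user wanted is exactly the judgement problem the abstract describes.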