After the very warm reception of the first year, the fifth London Information Retrieval Meetup is approaching (23/06/2020) and we are excited to add more details about our speakers and talks!
The event is going to be fully remote (given the COVID-19 situation) and free!
You are invited to register:
Our second speaker is Martin White, Managing Director at Intranet Focus:
Martin White is an information scientist who has been working with IR systems
since 1974. Over the last twenty years at Intranet Focus he has worked on
nearly 100 search-based projects, mainly in the pharmaceutical, engineering,
legal and NGO sectors. He is the author of four books on enterprise search
and has given presentations and workshops in Europe and North America.
He has been a Visiting Professor at the Information School, University of
Sheffield, since 2002, specialising in information management and information
retrieval. In the process he has accumulated a digital library of over 1000
research papers related to enterprise search.
Enterprise Search – How Relevant Is Relevance?
Enterprise search is the outlier in search applications. It has to work effectively with very large collections of un-curated content, often in multiple languages, to meet the requirements of employees who need to make business-critical decisions.
In this talk, I will outline the challenges of searching enterprise content. Recent research is revealing a unique pattern of search behaviour in which relevance is both very important and yet also irrelevant, and where recall is just as important as precision. This behaviour has implications for the use of standard metrics for search performance (especially in the case of federated search across multiple applications) and for the adoption of AI/ML techniques.
Our first speakers are Alessandro Benedetti, our director, and Anna Ruggero, one of our R&D software engineers:
Alessandro Benedetti is the founder of Sease.
A Senior Search Software Engineer, he focuses on R&D in information retrieval, information extraction, natural language processing, and machine learning.
He firmly believes in Open Source as a way to build a bridge between Academia and Industry and facilitate the progress of applied research.
Following his passion, he entered the Apache Lucene and Solr world in 2010 and recently became an Apache Lucene/Solr committer.
When he isn’t developing a new search solution, he is presenting the applications of leading-edge techniques in real-world scenarios at conferences such as ECIR, Lucene/Solr Revolution, FOSDEM, Haystack, ApacheCon, and Open Source Summit.
Anna Ruggero is a software engineer passionate about Information Retrieval and Data Mining.
She loves to find new solutions to problems, suggesting and testing new ideas, especially those that concern the integration of machine learning techniques into information retrieval systems.
Anna came into contact with search engines during her studies and fell in love with this world, so she decided to investigate the topic further by attending the 12th European Summer School in Information Retrieval and writing her master's degree dissertation on Entity Search.
Along the way, she has expanded and improved her knowledge of the Java and Python languages, information retrieval systems, clustering, and word embeddings.
Evaluating Your Learning to Rank Model: Dos and Don’ts in Offline/Online Evaluation
Learning to rank (LTR from now on) is the application of machine learning techniques, typically supervised, in the formulation of ranking models for information retrieval systems.
With LTR becoming more and more popular (Apache Solr has supported it since January 2017 and Elasticsearch has an Open Source plugin released in 2018), organizations struggle with the problem of how to evaluate the quality of the models they train.
This talk explores all the major points in both Offline and Online evaluation.
Setting up the correct infrastructure and processes for a fair and effective evaluation of the trained models is vital for measuring the improvements and regressions of an LTR system.
The talk is intended for:
- Product Owners, Search Managers, Business Owners
- Software Engineers, Data Scientists, and Machine Learning Enthusiasts
Expect to learn:
- the importance of Offline testing from a business perspective
- how Offline testing can be done with Open Source libraries
- how to build a realistic test set from the original input data set while avoiding common mistakes in the process
- the importance of Online testing from a business perspective
- A/B testing and Interleaving approaches: details, pros, and cons
- common mistakes and how they can distort the results obtained
Join us as we explore real-world scenarios and dos and don’ts from the e-commerce industry!
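
For readers new to the offline evaluation mentioned in the abstract above, here is a minimal sketch of one metric that is commonly used to score a ranked list against relevance judgments: NDCG@k. The abstract does not name any specific metric, so the choice of NDCG, the function names, and the sample judgments below are all assumptions made for illustration, not the speakers' method.

```python
import math


def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the top-k relevance labels.

    `relevances` lists the graded relevance of each result, in ranked order.
    """
    return sum(
        (2 ** rel - 1) / math.log2(rank + 2)  # rank is 0-based, so +2
        for rank, rel in enumerate(relevances[:k])
    )


def ndcg_at_k(relevances, k):
    """DCG normalised by the DCG of the ideal (best possible) ordering."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0


# Hypothetical judgments (0 = irrelevant, 3 = perfect) for the same query,
# listed in the order two candidate models ranked the documents.
model_a = [3, 2, 0, 1]
model_b = [0, 1, 2, 3]

print(f"model A NDCG@4: {ndcg_at_k(model_a, 4):.3f}")
print(f"model B NDCG@4: {ndcg_at_k(model_b, 4):.3f}")
```

In this sketch model A scores higher because it places the most relevant documents near the top. In practice such scores are typically averaged over many queries drawn from a held-out test set.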