The London Information Retrieval Meetup is approaching (19/02/2019) and we are excited to add more details about the speakers and talks!
Learning To Rank: Explained for Dinosaurs
Internet search has long since evolved from the days when you had to string your query together in just the right way to get the results you were looking for. Search has to be smart and natural, and people expect it to “just work” and read what’s on their minds.
On the other hand, anyone who has worked behind the scenes with a search engine knows exactly how hard it is to get the right result to show up at the right time. Countless hours are spent tuning the boosts before your user can find his favorite two-legged tiny-armed dinosaur on the front page.
When your data is constantly evolving and updating, it’s only realistic that your search engine does too. Search teams are thus in constant pursuit of better ranking and relevance for their search results. But working smart is not the same as working hard. There are many techniques we can employ to improve and automate this process dynamically. One such technique is Learning to Rank.
Learning to Rank was initially proposed in academia around 20 years ago and almost all commercial web search engines utilize it in some form or other. At Bloomberg, we decided that it was time for an open-source search engine to support Learning to Rank, so we spent more than a year designing and implementing it. The result of our efforts has been accepted by the Solr community and our Learning to Rank plugin is now available in Apache Solr.
This talk will serve as an introduction to the LTR (Learning to Rank) module in Solr. No prior knowledge of Learning to Rank is needed, but attendees are expected to know the basics of Python, Solr, and machine learning techniques. We will go step by step through the process of shipping a machine-learned ranking model in Solr, including:
- how you can engineer features and build a training dataset as per your needs
- how you can train ranking models using popular Python ML (machine learning) libraries like scikit-learn
- how you can use the ranking models learned above in Solr
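The three steps listed above can be sketched end to end in a few lines. This is only an illustration under stated assumptions, not the code from the talk: the feature names, feature values, and relevance labels are made up, the linear model is fitted with plain gradient descent from the standard library as a stand-in for a scikit-learn regressor, and the exported dictionary is assumed to follow the layout of a Solr LTR linear model, so check the Solr Reference Guide before uploading anything like it.

```python
import json

# Hypothetical training data: one row per (query, document) pair.
# Both features and all relevance labels are invented for illustration.
FEATURE_NAMES = ["title_match", "recency"]
TRAINING_ROWS = [
    # (title_match, recency), relevance label
    ((1.0, 0.9), 3.0),
    ((1.0, 0.1), 2.0),
    ((0.0, 0.8), 1.0),
    ((0.0, 0.2), 0.0),
]


def train_linear_model(rows, epochs=2000, learning_rate=0.1):
    """Fit pointwise linear weights by gradient descent on squared error."""
    weights = [0.0] * len(rows[0][0])
    bias = 0.0
    for _ in range(epochs):
        for features, label in rows:
            prediction = bias + sum(w * x for w, x in zip(weights, features))
            error = prediction - label
            bias -= learning_rate * error
            weights = [w - learning_rate * error * x
                       for w, x in zip(weights, features)]
    return weights, bias


def score(weights, bias, features):
    """Score one document's feature vector with the learned model."""
    return bias + sum(w * x for w, x in zip(weights, features))


def to_solr_style_model(name, weights):
    """Build a dict shaped like a Solr LTR linear model (assumed layout).

    The bias is omitted: a constant offset does not change the ordering.
    """
    return {
        "class": "org.apache.solr.ltr.model.LinearModel",
        "name": name,
        "features": [{"name": feature} for feature in FEATURE_NAMES],
        "params": {"weights": dict(zip(FEATURE_NAMES, weights))},
    }


weights, bias = train_linear_model(TRAINING_ROWS)

# Re-rank two hypothetical candidate documents with the learned model.
candidates = {"doc_a": (0.0, 0.9), "doc_b": (1.0, 0.5)}
ranked = sorted(candidates,
                key=lambda doc: score(weights, bias, candidates[doc]),
                reverse=True)

model_json = json.dumps(to_solr_style_model("demo_model", weights), indent=2)
print(ranked)
print(model_json)
```

A pointwise linear model is the simplest possible choice here; real deployments often use pairwise or listwise objectives and tree-based models, but the workflow of features, training, and export stays the same.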
Get ready for an interactive session where we learn to rank!
Sambhav Kothari, Software Engineer @ Bloomberg
Sambhav is a software engineer at Bloomberg, working in the News Search Experience team.
Improving top-k retrieval algorithms using dynamic programming and longer skipping
Modern search engines have to keep up with the enormous growth in the number of documents and in the number of queries submitted by users. One of the problems they must deal with is finding the k most relevant documents for a given query. This operation has to be fast, which is possible only with specialised techniques.
BlockMax WAND is one of the best-known algorithms for solving this problem without any degradation in ranking effectiveness.
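To make the top-k problem concrete, here is a minimal sketch of the exhaustive baseline: score every document that contains a query term and keep the best k in a min-heap. It is not BlockMax WAND, and the tiny in-memory index with its term scores is invented for illustration. The smallest score in the heap acts as the threshold that dynamic pruning algorithms of the WAND family rely on to skip documents that cannot enter the top k.

```python
import heapq
from collections import defaultdict

# Hypothetical inverted index: term -> list of (doc_id, term_score) postings.
INDEX = {
    "dinosaur": [(1, 2.0), (2, 0.5), (4, 1.5)],
    "tiny": [(1, 0.7), (3, 1.1)],
    "arms": [(1, 0.4), (3, 0.9), (4, 0.2)],
}


def top_k(query_terms, k):
    """Return the k highest-scoring (score, doc_id) pairs, best first."""
    # A document's score is the sum of the scores of the query terms it contains.
    totals = defaultdict(float)
    for term in query_terms:
        for doc_id, term_score in INDEX.get(term, []):
            totals[doc_id] += term_score

    heap = []  # min-heap holding the best k results seen so far
    for doc_id, total in totals.items():
        if len(heap) < k:
            heapq.heappush(heap, (total, doc_id))
        elif total > heap[0][0]:  # heap[0][0] is the current threshold
            heapq.heapreplace(heap, (total, doc_id))
    return sorted(heap, reverse=True)


results = top_k(["dinosaur", "tiny", "arms"], k=2)
print([doc_id for _, doc_id in results])
```

This baseline touches every posting of every query term; pruning algorithms aim to return the same top k while skipping most of that work.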
After a brief introduction, I will present a strategy introduced in “Faster BlockMax WAND with Variable-sized Blocks” (SIGIR 2017) which, applied to the BlockMax WAND data, makes the algorithm almost 2x faster.
I will then present another optimisation of the BlockMax WAND algorithm (“Faster BlockMax WAND with Longer Skipping”, ECIR 2019), which reduces the execution time of short queries.
Elia is a Software Engineer passionate about algorithms and data structures concerning search engines and efficiency.
He is currently involved in many research projects, both at CNR (National Research Council, Italy) and out of personal interest.
Before joining Sease he worked at Intecs and List, where he experienced different fields and levels of computer science, from low-level programming problems such as embedded systems and networking up to high-level trading algorithms.
He graduated with a dissertation about data compression and query performance on search engines.
He is an active member of the information retrieval research community, attending international conferences such as SIGIR and ECIR.
Introduction to Music Information Retrieval
Music Information Retrieval is about retrieving information from music entities.
This high-level definition relates to a complex discipline with many real-world applications.
As a former bass player, Andrea will give a high-level overview of Music Information Retrieval and analyze, from a musician’s perspective, a set of challenges that the topic offers.
We will introduce the basic concepts of the music language; then, passing through different kinds of music representations, we will end up describing some useful low-level features that are used when dealing with music entities.
Andrea Gazzarini is a curious software engineer, mainly focused on the Java language and Search technologies. With more than 15 years of experience in various software engineering areas, his adventure in the search world began in 2010, when he met Apache Solr and later Elasticsearch.
Join our Group
Researchers, scientists, and other practitioners in the fields of Information Retrieval, Machine Learning, and Data Science… join us, and let’s create a group of passionate professionals!