Entity Search with graph embeddings – Part 1 – Overview

This series of blog posts describes my master's degree dissertation, completed under the supervision of Prof. Gianmaria Silvello at the University of Padova. The main focus of this project is the use of graph embeddings to create virtual documents for the Information Retrieval Entity Search task. This thesis description is …

Haystack 2019 Experience

This blog is a quick summary of my (subjective) experience at Haystack 2019: the Search Relevance Conference, hosted in Charlottesville (Virginia, USA) from 24/04/2019 to 25/04/2019. References to the slides will be updated as soon as they become available. First of all, my feedback on the Haystack Conference is extremely positive. From my perspective the conference …

Apache Solr Distributed Facets

The Apache Solr distributed faceting feature was introduced back in 2008 with the first versions of Solr (1.3 according to this Jira [1]). Until now, I always assumed it just worked, without diving too much into the details. Nowadays distributed search and faceting are extremely popular; you can find them pretty much everywhere (in the …

Synonyms and Stopwords: Vademecum

In this post we'll cover two additional synonym scenarios and we'll try to summarise all previous tips in a concise form. Following the approach of the previous posts [1] [2] [3], everything can be applied both to Apache Solr and Elasticsearch. Preconditions Synonyms and stopwords at query time: this is not just a "theoretical" constraint; …

Still Synonyms + Stopwords?? Mamma mia!

The Context Brief recap of where we arrived in the preceding article: we had the following synonyms and stopwords settings: synonyms = {"out of warranty","oow"} stopwords = {"of"} Both of those filters were configured exclusively at query-time; the synonym filter first and then the stopwords filter. Using the built-in StopFilter we had a synonym detection …

Synonyms + Stopwords?? OMG!

The Context The scenario description is quite simple: we want to use synonyms and stopwords. Following the path of our previous article, we will introduce an additional component in the analysis chain: a StopFilter, which, as the name suggests, removes a set of words from an incoming token stream. We will use the following data …

Apache Solr/Elasticsearch: How to Manage Multi-term Concepts out of the Box?

This flash blog post will address a very specific and common problem: how to manage entities/concepts composed of multiple terms in a vanilla Apache Solr/Elasticsearch instance (no plugins or extensions to install). The (deployment) context An Elasticsearch or Apache Solr infrastructure where you cannot install third-party components (e.g. plugins, filters, query parsers). This can …

Rated Ranking Evaluator: Help the poor (Search Engineer)

A Software Engineer is always required to give customers concrete evidence of the quality of the deliverables. A Search Engineer deals with a specialisation of such generic Software Quality, which is called Search Quality. What is Search Quality? And why is it so important in a search infrastructure? After all, the "Software Quality" should be omni-comprehensive, …

Apache Solr: orchestrating Known item and Full-text search

Scenario You’re working as a search engineer for XYZ Ltd, a company which sells electric components. XYZ provided you with the application logs of the last six months, and some business requirements. Two kinds of customers, two kinds of requirements, two kinds of search The log analysis shows that XYZ has mainly two kinds of customers: …

SolrCloud Leader Election Failing

At the time of writing (Solr 7.3.0), SolrCloud is a reliable and stable distributed architecture for Apache Solr. But it is not perfect and failures happen. This lightning blog post will present some practical tips to follow when a specific shard of a collection is down with no leader and the situation is …