How faceting is calculated in distributed Apache Solr architectures, with an explanation of the internals and practical examples.
Information about our speakers and our talk at Haystack, held in London (UK) in October 2018.
In this post we’ll cover two additional synonym scenarios and we’ll try to summarise all the previous tips in a concise form. Following the approach of the previous posts [1] [2] [3], everything can be applied to both Apache Solr and Elasticsearch. Preconditions Synonyms and stopwords at query time: this is not just a “theoretical” constraint; imagine if you…
The Context A brief recap of where we arrived in the preceding article: we had the following synonyms and stopwords settings: synonyms = {“out of warranty”, “oow”} stopwords = {“of”} Both filters were configured exclusively at query time: the synonym filter first and then the stopwords filter. Using the built-in StopFilter we had a synonym detection…
The Context The scenario description is quite simple: we want to use synonyms and stopwords. Following the path of our previous article, we will introduce an additional component in the analysis chain: a StopFilter, which, as the name suggests, removes a set of words from an incoming token stream. We will use the following data…
This flash blog post addresses a very specific and common problem: how to manage entities/concepts composed of multiple terms in a vanilla Apache Solr/Elasticsearch instance (no plugins or extensions to install). The (deployment) context An Elasticsearch or Apache Solr infrastructure where you cannot install third-party components (e.g. plugins, filters, query parsers). This can…
A Software Engineer is always required to give customers concrete evidence of the quality of the deliverables. A Search Engineer deals with a specialisation of this generic Software Quality, which is called Search Quality. What is Search Quality? And why is it so important in a search infrastructure? After all, “Software Quality” should be omni-comprehensive,…
At the time of writing (Solr 7.3.1), SolrCloud is a reliable and stable distributed architecture for Apache Solr. But it is not perfect and failures happen. Apache Zookeeper [1] is the system responsible for managing the communications across the SolrCloud cluster. It contains the shared collection configurations and it has the view of the cluster status. It is…
Scenario You’re working as a search engineer for XYZ Ltd, a company which sells electric components. XYZ has provided you with the application logs of the last six months, and some business requirements. Two kinds of customers, two kinds of requirements, two kinds of search The log analysis shows that XYZ has mainly two kinds of customers:…