Have you ever run fuzzy queries in Elasticsearch and not obtained the expected results? Let’s dive into fuzziness together: how it works in Elasticsearch, when to use it, and which elements to pay attention to.
This blog post is about the performance (time and memory) of our contribution to Apache Lucene to generate synonyms using Word2Vec.
This blog post shows an example of how to create an Apache Solr performance test using the Apache JMeter tool.
This blog post will analyze the impact of large stored fields on Apache Solr query performance.
This blog post explains how the QueryResultCache and FilterCache are used during basic query processing in Apache Solr 8.11.0. It does not cover how these caches are used during the execution of more advanced components like faceting. Solr caches are associated with a specific instance of an Index Searcher. By default, elements…
Here we are with a new “episode” about managing large JSON, as promised. If you have not yet read the first two blog posts, I suggest catching up on them to better understand what I’m going to discuss here: How to manage a large JSON file efficiently and quickly; How to manage…
In this blog post, we run an experimental analysis to identify the best data type to use when dealing with IDs.
How does the FeatureLogger work? When is the Feature Vector Cache used in Solr? Does the cache speed up the rerank process?
We have recently been working on contributing KNN search to Solr, leveraging the latest Lucene developments. The goal of this blog post is to share some numbers from the benchmark measurements gathered during the development process. Setup and collection To benchmark our solution, we set up our Solr instances using dockerized Solr on a t3.large AWS machine (2…
How does a learning to rank query work in Solr? How can we obtain the required feature extraction time from the Solr QTime parameter?