Main Blog
Welcome to our Main Blog, the cornerstone of our exploration into information retrieval. This is where we share our research, our findings, and other topics centered mainly on information retrieval.

Chess Information Retrieval Insights
See how to set up a collection and queries for chess information retrieval. Here we describe a search engine to improve the chess experience.

Open Source Databases Demystified: A Review of the Top Options
Discover your project’s best open source database options with our comprehensive review. Learn about their strengths and weaknesses.

Image Retrieval Using ViT + Generative Pre-trained Transformer (GPT)
Implementation of image retrieval through a textual query using a Vision Transformer and a GPT for image captioning.

OpenSearch Neural Search Tutorial: Hybrid Search
The new implementation of hybrid search (from OpenSearch 2.10) allows for the combination and normalization of query relevance scores.

OpenSearch Neural Search Tutorial: How Filtering Works
This blog post is about how filtering works in OpenSearch, why it is important, and how it is handled in vector search.

Find and Replace in Elasticsearch Fields
In this blog post we explore how to find and replace a specific value within a field of an Elasticsearch index.

Exploring Sexism in Information Retrieval Systems with NLP and ML
This blog post is about using NLP and ML techniques to detect and prevent sexism in information retrieval systems.

OpenSearch KNN Plugin Tutorial
This blog post explores the OpenSearch k-NN plugin, which offers three different approaches for retrieving the k-nearest neighbors from a vector index.

OpenSearch Neural Search Plugin Tutorial: Additional Useful Tools
The second part of the OpenSearch Neural Search Plugin Tutorial for version 2.4.0, covering additional tools that might be useful.

When and How to Use N-grams in Elasticsearch
This blog post describes the use cases where n-grams are useful, explains the risks of using them, and presents some alternative solutions that should be considered.

Hybrid Search with Apache Solr
This blog post shows how to run a hybrid search (keyword-based search + vectors) in Apache Solr with code examples and explanations!

How Do Fuzzy Queries Work in Elasticsearch?
This blog post gives you an overview of how fuzzy queries work in Elasticsearch with examples and references.

How to Use Python API to Index JSON Data in Elasticsearch
This blog post explores how to index Elasticsearch documents from a JSON file using the Python API, specifically the Bulk Helpers.

Semantic Web & Linked Open Data
An overview of what the Semantic Web is, how it uses RDF data, how to query it through SPARQL, and what linked open data is.

Time Series Databases: A Hands-On Introduction With InfluxDB
An introduction to time series data and InfluxDB as a popular time series DB. A quick tutorial on how to use it, its advantages, and tools.

Introduction to Property Graphs Using Python With Neo4j
A quick overview of how to effectively use property graphs (for modeling graph data) using Python and Neo4j.

Word2Vec Model To Generate Synonyms – Performance Testing
This blog post is about the performance (time and memory) of our contribution to Apache Lucene to generate synonyms using Word2Vec.

Elasticsearch Relevance Engine: Combining AI With Elastic’s Text Search
The goal of this blog post is to highlight the new vector search capabilities introduced in version 8.8.0, especially ESRE.

Tableau & SQL – Visual Analysis
This blog post is a result of our collaboration with the University of Padua, wherein the student Prarthana Ashokmayagappa played a significant role in selecting the

Pick the Best Database Type for Your Next Project
An overview of several database types to help readers decide which database to use for their projects.

Word2Vec Model To Generate Synonyms on the Fly in Apache Lucene – W2V Algorithm
This blog post aims to explore Word2Vec, the algorithm we used to generate synonyms in our contribution to Apache Lucene.

Online Search Quality Evaluation With Kibana – Visualization Examples
This blog post describes an alternative and customized approach for evaluating ranking models through the use of Kibana.

How to Choose the Right Large Language Model for Your Domain – Open Source Edition
Note on “Open Source” claims: This blog post was originally written before the publication of the Open Source AI Definition, which provides a much-needed clarification

Apache Solr Neural Highlighting Plugin
In this blog post, discover the Solr Neural Highlighter Plugin, which uses deep learning to highlight essential text for query answering.

Elasticsearch Neural Search Improvements in 8.6 and 8.7
This blog post showcases the vector search improvements that have been introduced in the latest versions of Elasticsearch (8.6 and 8.7).

Word2Vec Model To Generate Synonyms on the Fly in Apache Lucene – Introduction
This blog post aims to explore our contribution to Apache Lucene, which integrates a Word2Vec model to generate synonyms.

Online Search Quality Evaluation With Kibana – Introduction
This blog post describes an alternative and customized approach for evaluating ranking models through the use of Kibana.