Sease at SDMX Global Conference

October 26, 2023

SDMX Global Conference 2023

The Global Conference is a biennial event for the official statistics community worldwide to share information on recent and upcoming SDMX developments.

Location: The Art Hotel | Kingdom of Bahrain
Date: 29 October to 02 November 2023

Our Talk

When SDMX meets AI: Leveraging open source LLMs to make official statistics more accessible and discoverable

This talk draws on ongoing experiments within the OECD-led Statistical Information System Collaboration Community (SIS-CC) to enable AI applications with SDMX. One important use case is using AI to improve the accessibility and discoverability of data: whilst UX techniques, lexical search improvements, and data harmonisation can take statistical organisations to a good level of accessibility, a structural (or “cognitive”) gap remains between data user needs and data producer constraints. That is where AI, and most importantly NLP and LLM techniques, could potentially make a difference. The “StatsBot” could be the natural language, conversational engine that facilitates access to and use of the data, leveraging the semantics of any SDMX source.

The objective of the presentation is to propose a technical approach and a way forward to achieve this goal and create the StatsBot as a universal, open asset usable by all statistical organisations. As a first step, the concept being tested is to use Large Language Models together with the Apache Solr index of SDMX objects to transform natural language queries into SDMX queries. As a second step, results could be framed as a natural language statement complementing the top-k search results. For the initial PoCs, which aim to demonstrate functional features and feasibility, a commercial LLM (such as OpenAI GPT-4) will be used; at a later stage, substitution with an open source LLM will be analysed. The presentation will include the results of the first experimental work and lessons learnt, and will scope the future work that should define the path to a production-grade, fully open source, and universal StatsBot.
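To make the first step more concrete, here is a minimal sketch of the last stage of such a pipeline: turning structured filters into an SDMX REST data query. It is not the implementation presented in the talk. It assumes the LLM (helped by the Solr index) has already mapped the user's question to a dataflow and to dimension codes, and it assumes the SDMX 2.1 REST key syntax (dimensions separated by `.`, alternative values joined with `+`, an empty position as a wildcard). The function name, base URL, dataflow reference, and dimension names are all hypothetical.

```python
from urllib.parse import urlencode


def build_sdmx_data_query(base_url, flow_ref, dimension_order, filters,
                          start_period=None, end_period=None):
    """Build an SDMX 2.1 REST data query URL (hypothetical helper).

    `filters` maps a dimension id to a list of codes; it stands in for
    the structured output an LLM might produce from a user question.
    """
    # Reject dimensions the dataflow does not have, so that a
    # hallucinated dimension name cannot silently produce a bad query.
    unknown = set(filters) - set(dimension_order)
    if unknown:
        raise ValueError(f"Unknown dimensions: {sorted(unknown)}")

    # One position per dimension, in the order defined by the data
    # structure; "+" joins alternatives, an empty position is a wildcard.
    key_parts = ["+".join(filters.get(dim, [])) for dim in dimension_order]
    key = ".".join(key_parts) if any(key_parts) else "all"

    url = f"{base_url.rstrip('/')}/data/{flow_ref}/{key}"

    params = {}
    if start_period:
        params["startPeriod"] = start_period
    if end_period:
        params["endPeriod"] = end_period
    if params:
        url += "?" + urlencode(params)
    return url


# Hypothetical result of parsing the question
# "annual figures for France and Germany between 2015 and 2020".
url = build_sdmx_data_query(
    base_url="https://example.org/sdmx/rest",     # placeholder endpoint
    flow_ref="DEMO,DF_EXAMPLE,1.0",               # placeholder dataflow
    dimension_order=["FREQ", "REF_AREA", "INDICATOR"],
    filters={"FREQ": ["A"], "REF_AREA": ["FR", "DE"]},
    start_period="2015",
    end_period="2020",
)
print(url)
```

Keeping the URL construction deterministic and validating the dimension names is one possible design choice: the language model only has to produce a small structured object, and anything it invents that does not exist in the data structure is rejected before a query is sent.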

Our Speaker

Alessandro Benedetti

FOUNDER @ SEASE

APACHE LUCENE/SOLR COMMITTER
APACHE SOLR PMC MEMBER

We are Sease, an Information Retrieval Company based in London, focused on providing R&D project guidance and implementation, Search consulting services, Training, and Search solutions using open source software like Apache Lucene/Solr, Elasticsearch, OpenSearch and Vespa.
