As usual, after Berlin Buzzwords comes MICES day, a packed mini-conference on e-commerce search that never fails to surprise me.
MICES has become THE European community event, bringing together experts of different backgrounds – IT, product managers, UX designers, search managers, information retrieval specialists, data scientists, and search engine vendors – to discuss challenges, ideas, best practices, and case studies in the e-commerce search domain during this informal one-day event.
This blog post summarises my personal experience.
What I love the most about MICES is the ability to gather software engineers, product managers/owners, User eXperience folks, and anyone in between: it’s a great place to share ideas and blend different perspectives.
I followed the whole day (including the (non) beer and Korean BBQ afterwards).
All talks were fantastic, showing interesting results achieved and equally thought-provoking “failures”.
And let’s spend a minute on “failures” because, in my opinion, they are just as important as success stories. Our community is built of very smart individuals, so I find incredible value in sharing what we *thought* would work but didn’t in our use case and domain: it’s an important lesson learned, and it’s going to help many more people.
My favorite session has to be the BarCamp: during the day the organizers collected relevant topics to discuss, then groups were formed and each group ended up producing a summary of its 30-minute discussion.
I participated in:
The starting question was about the lack of explainability in semantic search and how we could improve it. My proposal was to align an explainer model with the sentence-similarity model: a sibling, fine-tuned from the main embedding encoder, that reasons about the motivation behind the similarity score produced.
Definitely something to explore further in the state-of-the-art literature: explainability is clearly a dominant factor in vector-based search.
The overall feeling during the discussion was one of uncertainty: it’s definitely too early to tell whether vector-based search is the future or just the hype of the moment.
K-NN search returns K results if available, even if N of them are completely irrelevant to the user’s information need.
The first question was: “Is it possible to use a similarity score threshold to filter out such results in Lucene/Solr?”
We ended up concluding that technically it’s not a big deal: it’s possible to add a query-time parameter in Lucene to filter out results whose score falls below a certain numeric threshold, and the same can be done even now by post-processing the results.
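As a minimal sketch of the post-processing option (the function name and the `(doc_id, score)` pair shape are my own illustrative assumptions, not a Lucene/Solr API):

```python
def filter_by_score(hits, threshold):
    """Keep only k-NN hits whose similarity score meets the threshold.

    `hits` is a list of (doc_id, score) pairs, as a typical vector-search
    client might return them; higher scores mean higher similarity.
    """
    return [(doc_id, score) for doc_id, score in hits if score >= threshold]

# Example: 5 nearest neighbours retrieved, but only 3 pass the 0.5 threshold.
hits = [("doc1", 0.91), ("doc2", 0.78), ("doc3", 0.55),
        ("doc4", 0.31), ("doc5", 0.12)]
relevant = filter_by_score(hits, threshold=0.5)
```

The catch, of course, is not the filtering itself but picking the threshold, as the discussion below shows.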
But how can a user identify this threshold?
Unfortunately, with the current approaches to vectorization, we may have queries where a score of 0.2 already indicates a relevant result and others where only 0.8 does.
Even if the score is probabilistic, it’s hard to normalize across queries and domains.
So three approaches were proposed:
- to use multiple embedding models and return the intersection of their results (if more than one model agrees, there’s a higher probability the result is really relevant)
- to leverage online users’ interactions, monitoring clicks and engagement and deriving the threshold from them
- to leverage clustering to group similar vectors, potentially separating relevant from irrelevant groups cleanly
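The first approach can be sketched in a few lines (the function name and the toy result lists are illustrative assumptions on my part):

```python
def consensus_results(results_per_model):
    """Intersect the result lists of several embedding models.

    `results_per_model` is a list of ranked result lists, one per model,
    each containing doc ids. A document survives only if every model
    retrieved it, raising the odds it is genuinely relevant.
    """
    surviving = set(results_per_model[0])
    for results in results_per_model[1:]:
        surviving &= set(results)
    return surviving

# Three hypothetical models agree only on d2 and d3.
model_a = ["d1", "d2", "d3", "d4"]
model_b = ["d2", "d3", "d5"]
model_c = ["d3", "d2", "d6"]
agreed = consensus_results([model_a, model_b, model_c])
```

The trade-off is recall: a strict intersection discards anything a single model misses, so in practice one might relax it to “retrieved by at least N of M models”.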
An additional point of discussion was a possible bump in the score distribution that might help identify where to put the threshold; however, in a quick empirical check on some available examples, this bump was hard to find.
The last discussion I joined was about how vector-based search affects, or should affect, the search user experience and user interfaces.
We came up with the answer that it’s actually the other way around: vector search is an enabler that allows UX and UI designers to experiment with wild and innovative ideas far from the classic grid/list approach.
And that was the closing of an amazing experience!
Did you attend the MICES conference?
Let us know what you thought of the conference and which talks you enjoyed the most!
We can’t wait to hear your thoughts in the comments section below.
Subscribe to our newsletter
Did you like this post about our experience at MICES? Don’t forget to subscribe to our newsletter to stay up to date with the Information Retrieval world!