At Haystack 2019, Tito Sierra and Tara Diedrichsen made the case for a Human Rated Testing (HRT) program as a way to improve search.
In this talk, we’ll share an update on adding support for multiple judgements from multiple raters to Quepid, an open-source tool for supporting HRT programs. We’ll talk about some of the analytics for measuring how well the raters agree with one another, and solicit feedback from the community on next steps.
Lastly, we’ll do some live rating with the audience to demonstrate some of the pitfalls of human judgements.