We are delighted to announce the twentieth London Information Retrieval Meetup, a free evening event aimed at enthusiasts and professionals curious to explore and discuss the latest trends in the field.
This time the meetup is hybrid: a live event in Southampton, hosted thanks to the CMI and the University of Southampton, will also be streamed online via Zoom!
// in-presence
Building 100 Room 4013 at Highfield Campus Southampton
University of Southampton SO17 1BJ, United Kingdom
[Parking at the visitor car park next to Building 100 is free from 5 PM (see map). On entering University Road from Burgess Road, turn right into Salisbury Road. The visitors' car park is the first car park on the right and has a grey attendant's cabin at the entrance. The bus stops are in front of Building 100.]
Date: 15th February 2024 | doors open from 5:45 PM (GMT)
ATTENTION: Remember to fill out the form to confirm your registration:
https://forms.gle/BYe2Cnx6cDMCB6mJ8
// online
Location: Zoom [You will receive the link after registration]
Date: 15th February 2024 | 6:00-8:00 PM (GMT)
// LONDON INFORMATION RETRIEVAL MEETUP
PROGRAM
The event will be structured around two technical talks, each followed by a Q&A session, and will end with a networking session.
> Doors open from 5:45 PM GMT (in-presence)
> Doors open at 6:00 PM GMT for virtual attendees
6:00-6:05 PM Welcome from Haiming Liu (Director of the Centre for Machine Intelligence (CMI))
6:05-6:20 PM Welcome from Alessandro Benedetti (Director @ Sease)
> 6:20-7:00 PM FIRST TALK
Large Language Models for Information Extraction and Information Retrieval – Stuart Middleton (Associate Professor at University of Southampton)
> 7:00-7:40 PM SECOND TALK
Word Embeddings Compression for Neural Language Models – Amit Kumar Jaiswal (Postdoc at University of Surrey and Honorary Research Fellow, UCL)
> 8:00-9:00 PM Networking session + buffet
// first talk
Large Language Models for Information Extraction and Information Retrieval
In this talk I will present some of my recent research applying Large Language Models (LLMs) to Information Extraction (IE) and Information Retrieval (IR). I will first discuss research using LLMs for classification and extraction of information from online conversations, in the context of supporting forum moderators and mental health professionals, with partners such as Kooth Plc. Then I will discuss research using LLMs and IR for question answering and code generation in an aerospace engineering context, with partners including Airbus.
Stuart Middleton
Associate Professor at the University of Southampton.
Dr Stuart Middleton is an Associate Professor at the University of Southampton. He has more than 60 peer-reviewed publications, many inter-disciplinary in nature, focussing on the Natural Language Processing (NLP) areas of information extraction and human-in-the-loop NLP. His research interests are focussed on socio-technical NLP approaches, including large language models, few/zero-shot learning, rationale-based learning, adversarial training and argument mining. He has worked in domains including law enforcement, defence and security, mental health, environmental science, legal and misinformation. He has won over £8M in grant income for the University of Southampton as a PI and CoI. He is a CoI of the £5.8M MINDS Centre for Doctoral Training, Defence & Security sector lead for the £11M UKRI TAS Hub, a Turing Fellow (2021–2023), a board member of the Centre for Machine Intelligence and a full member of the EPSRC peer review college. He has served on organising committees for several international conferences and workshops, including as Area Chair of ACL-2023, chair of workshops at AIUK-2024, AIUK-2023 and WebSci-2020, and short paper & poster chair of IEEE Intelligent Environments 2016. He has been an invited expert at various meetings, including the UK Cabinet Office Ministerial AI Roundtable 2019 on “Use of AI in Policing”.
// second talk
Word Embeddings Compression for Neural Language Models
Conventional neural word embeddings typically rely on a large, diverse vocabulary. As a result, word embedding parameters account for a substantial portion of a language model's overall learnable parameters, particularly in multilingual models. Established techniques for compressing general neural models include low-precision computation, quantisation, binarisation, dimensionality reduction, network pruning, SVD-based weight matrix decomposition, and knowledge distillation. In the context of neural language models, emphasis has been placed on compressing the extensive word embedding matrices. Despite the prevalent use of tokenisation (e.g. SentencePiece) in practical Transformer implementations, the effectiveness of tokenisation as a form of compression remains an open question. Most importantly, does it actually always guarantee that performance is retained across varied natural language understanding tasks? This talk will provide you with the answers and an understanding of why and how language models can be compressed in a meaningful way.
Amit Kumar Jaiswal
Postdoc at University of Surrey and Honorary Research Fellow, UCL





