Exploration of Large Language Models as the Basis for Natural Language Query Interfaces to Big Data Systems

Prof. Richard Sinnott1, Mr Luca Morandini1

1University of Melbourne, Melbourne, Australia

Biography:

Professor Richard O. Sinnott is Professor of Applied Computing Systems and Director of the Melbourne eResearch Group at the University of Melbourne. He has been lead software engineer/architect on an extensive portfolio of national and international projects, with a specific focus on research domains requiring finer-grained access control (security) and those dealing with big data challenges. He has over 450 peer-reviewed publications across a range of applied computing research areas.

Mr Luca Morandini is a cloud architect in the Melbourne eResearch Group and an Honorary Fellow in the School of Computing and Information Systems at the University of Melbourne. He heads the development of the AIReD platform.

Abstract:

One of the key challenges of big data systems is the development of flexible and intuitive user interfaces for exploring potential patterns in the data. This is especially challenging for big data, which is commonly typified by variety, velocity and volume. The Australian Internet Observatory (AIO – https://internetobservatory.org.au/) is funded by the Australian Research Data Commons (ARDC – www.ardc.edu.au). Within one key component of the AIO, the Australian Internet Observatory Research Dashboard (AIReD – https://www.aio.eresearch.unimelb.edu.au/), web-based filters offering logic-based searches have been realised for searching through large quantities of diverse social media posts. Such filters are realised by web forms that offer ways for researchers to find data of interest, e.g. within a particular time period or on a particular topic. However, such forms are often difficult to create and are not aligned with the ways in which serendipitous discovery of patterns in big data can arise.

Large language models (LLMs) offer an alternative approach to querying data. The ability to write natural language as the basis for exploring big data offers many potential advantages, especially with regard to expressiveness and usability. This talk will showcase the use of LLMs as the basis for natural language query interfaces to AIReD, allowing researchers to explore diverse and evolving social media data sets. The talk will include a demonstration of exemplar case studies highlighting the advantages and disadvantages of LLMs within AIO.

 

 
