Deep Learning for Text Analysis
Sachin Wasnik1,2, Ameeta Jain2, Raghu Tirumala3
1 Intersect, Sydney, NSW, Australia; 2 Deakin University, Burwood, VIC, Australia; 3 University of Melbourne, Parkville, VIC, Australia
Abstract
Introduction
Qualitative studies often involve interviewing participants, with the audio or video recordings then transcribed into text. When there are many participants, reading every interview and performing the required text analysis becomes cumbersome. One such study, conducted by the Department of Finance in the Faculty of Business and Law at Deakin University, seeks to understand the perceptions of employers, key workers, and other stakeholders regarding current and novel pathways to address the shortage of essential worker housing in regional areas of Australia. A total of 30 interviews were conducted for this project.
Methods
To perform text analysis on the interview transcripts, we used deep learning models. The text corpus for each stakeholder group was summarized using Bidirectional Encoder Representations from Transformers (BERT) and Generative Pre-trained Transformer (GPT) models. The researchers then read the model-generated summaries to identify common themes in the text.
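The per-group summarization workflow described above can be sketched as follows. This is a minimal illustration, not the study's implementation: the function names, the word-based chunking, the 300-word chunk size, and the sample sentences are all assumptions made for the example. The `first_sentence` function is a placeholder that stands in where a BERT-based or GPT-based summarizer would be called.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Tuple


def group_transcripts(interviews: Iterable[Tuple[str, str]]) -> Dict[str, str]:
    """Concatenate transcripts into one text corpus per stakeholder group."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for group, transcript in interviews:
        grouped[group].append(transcript.strip())
    return {group: " ".join(texts) for group, texts in grouped.items()}


def chunk_words(text: str, max_words: int) -> List[str]:
    """Split text into consecutive chunks of at most max_words words.

    Assumption: the summarizer has a limited input length, so a long
    corpus has to be fed to it piece by piece.
    """
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def summarize_corpus(text: str, summarize: Callable[[str], str], max_words: int = 300) -> str:
    """Summarize each chunk and join the partial summaries in order."""
    return " ".join(summarize(chunk) for chunk in chunk_words(text, max_words))


def first_sentence(chunk: str) -> str:
    """Placeholder summarizer: NOT BERT or GPT, it only returns the first sentence."""
    return chunk.split(". ")[0].rstrip(".") + "."


# Invented sample data for illustration only (not taken from the study).
interviews = [
    ("employer", "Housing is scarce. Staff leave."),
    ("key worker", "Rent is high. Commutes are long."),
    ("employer", "We offer subsidies. It helps a little."),
]

corpora = group_transcripts(interviews)
summaries = {group: summarize_corpus(text, first_sentence) for group, text in corpora.items()}
```

Passing the summarizer in as a function keeps the grouping and chunking logic independent of the model, so the same pipeline could be run once with a BERT-based summarizer and once with a GPT-based one and the resulting summaries compared by the researchers.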
Results
Using BERT and GPT enabled the researchers to review the large text corpus in a short time. They were able to investigate stakeholders' perceptions of housing availability and affordability in regional areas of Australia, and to identify the challenges and opportunities in creating an affordable housing program for essential workers in regional Australia.
Conclusion
BERT and GPT augmented the researchers' ability to perform text analysis and answer the research questions. The presentation will demonstrate the BERT and GPT models on sample text.
Biography
Sachin Wasnik is a Digital Research Analyst based at Deakin University. He has broad experience in the design and development of data science projects. He has worked as a Digital Research trainer, leading Python, R, and data visualisation courses. He completed his PhD at the University of Sydney, applying deep learning to EEG time series data.