Extracting and Analysing Twitter and Facebook Data: Development of a User-Friendly Platform

Dr Yoong Kuan Goh1, Dr Jerry Lai1, Dr Christopher McAvaney2

1Intersect Australia, Sydney, Australia
2Deakin University, Melbourne, Australia

Social media platforms such as Twitter and Facebook have in recent years become the primary space where influencers share their thoughts, ideas and feelings about different topics, including, over the past few months, the global COVID-19 pandemic. By examining data from these platforms, researchers can discover patterns and gain insights into real-world problems. The biggest challenge many researchers encounter when conducting Twitter and Facebook research is how to extract data from these platforms without manually scraping it page by page and post by post. In addition to cost considerations, a further problem with social media research is the difficulty of cleaning, transforming, filtering and analysing the data.

Solution: Using an open-source Twitter API, Deakin eResearch and Intersect Australia are developing a user-friendly tool to help researchers collect, clean, transform, filter and analyse online social data from Twitter and Facebook. Because the availability of public social media data also presents ethical challenges, we will take ethical considerations into account when designing the tool.

Methodology/Development: We began the development process by gathering requirements from a number of researchers specialising in journalism, social science and digital humanities. With this information, we are developing the user-friendly platform in Python, organised around three stages: data collection, data manipulation and data analysis.
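The three stages can be illustrated with a minimal Python sketch. This is not the platform's actual code: the function names, the sample records and the choice of hashtag counting as the analysis step are all assumptions made for illustration, and the collection step returns hard-coded posts where a real tool would call a platform API.

```python
import re
from collections import Counter

# Hypothetical in-memory records standing in for posts returned by a platform API.
SAMPLE_POSTS = [
    {"id": 1, "text": "Vaccines are rolling out!  https://example.org/a #COVID19"},
    {"id": 2, "text": "Lovely weather in Melbourne today"},
    {"id": 3, "text": "COVID19 case numbers are falling #covid19"},
]


def collect_posts():
    """Stage 1 (collection): placeholder that returns the sample records.

    A real tool would request posts from a platform API here.
    """
    return list(SAMPLE_POSTS)


def clean_text(text):
    """Stage 2 (manipulation): drop URLs, collapse whitespace, lowercase."""
    text = re.sub(r"https?://\S+", "", text)
    return re.sub(r"\s+", " ", text).strip().lower()


def filter_posts(posts, keyword):
    """Stage 2 (manipulation): keep posts whose cleaned text contains the keyword."""
    keyword = keyword.lower()
    return [p for p in posts if keyword in clean_text(p["text"])]


def count_hashtags(posts):
    """Stage 3 (analysis): count hashtags across posts, ignoring case."""
    counts = Counter()
    for post in posts:
        counts.update(re.findall(r"#\w+", clean_text(post["text"])))
    return counts


posts = filter_posts(collect_posts(), "covid19")
hashtag_counts = count_hashtags(posts)
```

Keeping each stage as a separate function means the placeholder collection step could later be swapped for a real API client without touching the cleaning or analysis code.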


Yoong (Andrew) has a strong background in computer science and mathematics. His work interests are natural language processing, data analysis and machine learning.

Jerry has a background in psychological science and statistics. Jerry works with multiple research teams at Deakin on survey design, statistical programming, data analytics and visualisation, and promoting statistical literacy.

Christopher McAvaney is Services Manager, eResearch at Deakin University. He is responsible for establishing an eResearch program of work at Deakin, which includes REDCap. Christopher has extensive research administration experience at Deakin and is keen to leverage as much as possible from REDCap via its extensible design.
