Dr Serryn Eagleson1, Steven McEachern2, Michael Rigby1, Josh Clough1, Amir Fila1, Xavier Goldie1, Phil Greenwood1, Rob Hutton1, Ivan Widjaja1
1 AURIN, University of Melbourne, Melbourne, Australia, firstname.lastname@example.org
2 ADA, Australian National University, Australia, email@example.com
Future decision-making processes are often limited by our ability to comprehend the impacts of decisions made in the past. Cities across Australia are currently growing rapidly, and good decisions are needed to ensure that people have access to adequate housing, services and jobs. Though planners today have access to a range of high-value datasets describing the current state of society, they also require an understanding of the impacts of historical decisions and how environments have changed. Over the last 200 years, Australia’s census products have collected a wealth of information on the growth and development of its population, society and economy, with much of this historical data now housed within the Australian Data Archive (ADA). If this vast archive were both spatially and temporally enabled, it would become available to, and usable by, researchers and decision makers, facilitating greater understanding of how regions have grown and changed in response to activities such as the provision of infrastructure. Further, this might also reveal a myriad of unknowns to learn from in areas such as health, built and natural environment, education, and energy. This presentation outlines new workflows developed to expose historical data from the ADA through to modern mapping systems via the Australian Urban Research Infrastructure Network (AURIN). It will present an example investigation into population changes within the City of Logan in Queensland over the past 30 years.
REQUIREMENTS AND METHODOLOGY
Publishing historical census products required data relationships to be established between the Australian Bureau of Statistics (ABS), the ADA and AURIN at both human and machine levels. The initial phase consisted of identifying points of contact and existing data infrastructures, formats and standards. This was followed by agreement on the project’s high-level purpose and an understanding of the data’s context and eventual context of use (Lloyd and Dykes, 2011). The resulting baseline considered the translation of data into a usable format, where usability was considered in different dimensions.
Based on these requirements, an agile workflow was designed (Figure 1) and iteratively refined with project milestones. As raw data from the ABS was originally stored on technology using legacy formats, significant effort was required by the ADA to extract, translate and load the data into a version that reflected the ABS’s data structure and AURIN’s publishing requirements. Once formalised, statistical software was used to summarise the numerical data, defining types and categories. Next, the open-source Dataverse Project (https://dataverse.org/) was used for data sharing (King, 2007), outputting both the processed data and the numerical summary data. This provides the first entry point for users to access the authoritative tabular data from the ABS. AURIN then consumed historic boundaries from the ABS, which had undergone cleaning, re-alignment and re-projection for compatibility with its data infrastructure and web solutions. Following this, the boundary geometry was generalised and optimised for web use, and joined with the processed census data from the ADA. The summary data was then combined with other human-readable material to curate each census product’s metadata according to the ISO 19115 standard (ISO, 2014), to ensure that the data is FAIR (findable, accessible, interoperable and re-usable) (FORCE11, 2014). This implementation was extended to address additional data citation principles (Data Citation Synthesis Group, 2014) and provenance requirements (World Wide Web Consortium, 2013). Once the products were prepared, essential legal matters were identified and agreed, and the final products were registered and published via AURIN’s web applications (Map, Portal and API) as the second access point, for visualisation and analysis tasks.
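The join between generalised boundary geometry and processed census tables can be illustrated with a minimal sketch. It is not the project's actual implementation: the GeoJSON-like feature layout, the `region_code` key and the function name are assumptions made for illustration only.

```python
from typing import Any, Dict, List, Tuple

Feature = Dict[str, Any]


def join_census_to_boundaries(
    boundaries: List[Feature],
    census_rows: List[Dict[str, Any]],
    key: str = "region_code",  # hypothetical shared identifier
) -> Tuple[List[Feature], List[Any]]:
    """Attach census attributes to boundary features that share a region code.

    Returns the joined features and the codes of boundaries with no census row.
    """
    # Index the tabular census data by region code for constant-time lookup.
    rows_by_code = {row[key]: row for row in census_rows}

    joined: List[Feature] = []
    unmatched: List[Any] = []
    for feature in boundaries:
        code = feature["properties"][key]
        row = rows_by_code.get(code)
        if row is None:
            # Keep track of boundaries that could not be matched to a table row.
            unmatched.append(code)
            continue
        joined.append(
            {
                "type": "Feature",
                "geometry": feature["geometry"],
                # Census attributes are merged into the boundary's properties.
                "properties": {**feature["properties"], **row},
            }
        )
    return joined, unmatched
```

Reporting unmatched codes separately, rather than silently dropping them, makes it easier to spot boundaries that changed between census years.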
Using the new census products, analyses were performed on population density over Logan in Queensland. The region is the centre of new data exploration by Griffith University (Regional Innovation Data Lab), leveraging the AURIN data infrastructure in conjunction with the Queensland Cyber Infrastructure Foundation. To encourage participation from stakeholders, an early prototype visualisation was designed to demonstrate the power of spatio-temporal data (MacEachren and Taylor, 1994). The purpose of this prototype is to examine how visual thinking and exploration may be used in the design of an interactive dashboard for the Logan community. The prototype is currently undergoing user testing, with results fed back to guide the design of subsequent releases.
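As a rough illustration of the kind of calculation behind a population density comparison, the sketch below computes density per census year and the change between consecutive years. The function names are hypothetical, and it assumes a single region whose area is constant across years, which in practice depends on the historic boundaries having been re-aligned.

```python
from typing import Dict, List, Tuple


def population_density(population: float, area_km2: float) -> float:
    """Return persons per square kilometre."""
    if area_km2 <= 0:
        raise ValueError("area_km2 must be positive")
    return population / area_km2


def density_change(
    population_by_year: Dict[int, float], area_km2: float
) -> List[Tuple[int, int, float]]:
    """Return (from_year, to_year, change in density) for consecutive census years.

    Assumes the region's area does not change between census years.
    """
    years = sorted(population_by_year)
    densities = {
        year: population_density(population_by_year[year], area_km2)
        for year in years
    }
    # Pair each census year with the next one and take the density difference.
    return [
        (start, end, densities[end] - densities[start])
        for start, end in zip(years, years[1:])
    ]
```

The inputs would come from the joined census products; no real census figures are implied here.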
Figure 1: Workflow for publishing historical census data via AURIN
This project was funded by the National Collaborative Research Infrastructure Strategy (NCRIS) and the Australian National Data Service (ANDS). It aims to assist collaboration across the following HASS platforms: eResearch SA Limited, Australian Data Archive, Alveo, Griffith University, AARNet and TROVE as part of the Humanities, Arts and Social Sciences (HASS) Data Enhanced Virtual Lab (DEVL) project (https://hasscloud.net.au/). Additionally, we acknowledge the support of Griffith University (Regional Innovation Data Lab), the Queensland Cyber Infrastructure Foundation (QCIF) and the Australian Bureau of Statistics in providing data and contextual information.
- Lloyd, D. and Dykes, J. (2011) “Human-centred approaches in geovisualisation design: Investigating multiple methods through a long-term case study”, IEEE Transactions on Visualization and Computer Graphics, vol. 17, no. 12, pp. 2498–2507.
- King, G. (2007) “An Introduction to the Dataverse Network as an Infrastructure for Data Sharing”, Sociological Methods and Research, vol. 36, pp. 173–199. Available online: http://j.mp/2owjurr
- World Wide Web Consortium (2013) “PROV Model Primer”, W3C Working Group Note 30 April 2013. Available online: https://www.w3.org/TR/2013/NOTE-prov-primer-20130430/
- FORCE11 (2014) “Guiding Principles for Findable, Accessible, Interoperable and Re-usable Data Publishing version B1.0”, FORCE11: The Future of Research Communications and e-Scholarship. Available online: https://www.force11.org/fairprinciples
- Data Citation Synthesis Group (2014) “Joint Declaration of Data Citation Principles”, Martone, M. (ed.), San Diego CA: FORCE11. Available online: https://doi.org/10.25490/a97f-egyk
- International Organization for Standardization (ISO) (2014) “ISO 19115-1:2014 Geographic information — Metadata — Part 1: Fundamentals”, ISO. Available online: https://www.iso.org/standard/53798.html
- MacEachren, A. M. and Taylor, D. (eds.) (1994) Visualization in Modern Cartography. First edition, London, Great Britain: Pergamon.
Serryn is Deputy Director at AURIN and Assistant Research Director at the CRC for Spatial Information. Serryn completed her PhD in Geographical Information Systems (GIS) and the design of administrative boundaries at the University of Melbourne in 2003. This research contributed to changes in the allocation of administrative boundaries across Australia, and in recognition of her work she was awarded a prestigious Victorian Fellowship. In addition to her academic qualifications, Serryn has over 15 years of applied experience in spatial modelling and in applying this expertise to inform urban planning for local and state governments. Serryn has also worked in consulting and provided advice as an Expert Witness for the City of Melbourne at the Victorian Planning Panel.