Dr Steven McEachern1, Ms Janet McDougall1, Ms Marina McGale1
1Australian Data Archive, Acton, Australia
The Australian Data Archive (ADA) has supported access to sensitive data since 1981, providing fine-grained access to confidentialised information on the Australian population. As expectations for open data access via the web become increasingly widespread, ADA has continued to develop systems and processes to meet these expectations while upholding the privacy and confidentiality expectations of research participants. However, meeting FAIR (Findable, Accessible, Interoperable, Reusable) data expectations is particularly challenging when the data itself is sensitive and confidential. The potential for breaches of privacy and confidentiality means that access to such data needs to be restricted. New models for managing such access, such as the Five Safes model (Ritchie, 2017), have been developed to provide a framework for enabling release of such data. The Five Safes framework proposes five areas of emphasis in developing data access models:
- Safe people: Can the researchers be trusted to use the data in an appropriate manner?
- Safe projects: Is this an appropriate use of the data?
- Safe settings: Does the access facility limit unauthorized use?
- Safe data: Is there a disclosure risk in the data itself?
- Safe outputs: Are the statistical results non-disclosive?
The elements of the Five Safes framework can be implemented in varying combinations, with different emphasis applied to each of the Five Safes, and different combinations of administrative and technical systems, in order to enable safe data access for sensitive content. The Australian Bureau of Statistics, for example, prioritises safe settings, safe data and safe outputs in enabling access to sensitive microdata for government and academic researchers (Webster, 2016).
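The idea of applying different emphasis to each of the Five Safes can be illustrated with a minimal Python sketch. It is not drawn from ADA's or the ABS's systems: the `Safe` enum, the `AccessModel` class and the `abs_style` profile are hypothetical names introduced here, and the sketch assumes each Safe can be reduced to a yes/no check, whereas in practice each dimension is a matter of graded judgement.

```python
from dataclasses import dataclass
from enum import Enum


class Safe(Enum):
    """The five dimensions of the Five Safes framework."""
    PEOPLE = "safe people"
    PROJECTS = "safe projects"
    SETTINGS = "safe settings"
    DATA = "safe data"
    OUTPUTS = "safe outputs"


@dataclass(frozen=True)
class AccessModel:
    """Hypothetical access model: a name plus the Safes it prioritises."""
    name: str
    emphasis: frozenset

    def unmet(self, satisfied):
        """Return the prioritised Safes that a request has not yet satisfied."""
        return sorted(self.emphasis - set(satisfied), key=lambda s: s.value)

    def permits(self, satisfied):
        """A request is permitted only if every prioritised Safe is satisfied."""
        return not self.unmet(satisfied)


# Assumed profile, loosely following the ABS example in the text:
# emphasis on safe settings, safe data and safe outputs.
abs_style = AccessModel(
    "abs-style",
    frozenset({Safe.SETTINGS, Safe.DATA, Safe.OUTPUTS}),
)

# A request that has met the settings and data controls but whose
# outputs have not yet been checked for disclosure risk.
request = {Safe.SETTINGS, Safe.DATA}
print(abs_style.permits(request))
print([s.value for s in abs_style.unmet(request)])
```

A different organisation could define its own profile with another `emphasis` set, which is the sense in which the framework supports varying combinations of controls.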
This presentation will describe recent developments at the ADA to support access to sensitive data through the Dataverse data repository system (http://dataverse.org). Dataverse, developed at Harvard University since 2006, is now used by more than 30 organisations internationally to support open access to research data. To improve support for sensitive data in Dataverse, the ADA has been working with the Australian Department of Social Services (DSS) to enable access to key data holdings within the DSS National Centre for Longitudinal Data (NCLD). These holdings include the widely used Household, Income and Labour Dynamics in Australia survey (HILDA – https://melbourneinstitute.unimelb.edu.au/hilda).
The presentation will begin with an overview of the Five Safes framework and the Dataverse software. It will then walk through the extensions ADA has made to the Dataverse environment to improve support for sensitive data using the Five Safes model, and the technical and administrative processes ADA has adopted to enable access to the DSS data through that environment. It will conclude with ADA's future plans and the additional requirements anticipated for research in the social sciences, humanities and population health.
Ritchie, F. (2017) The ‘Five Safes’: a framework for planning, designing and evaluating data access solutions. University of the West of England, Bristol and Administrative Data Service, UK. Available from: https://zenodo.org/record/897821#.Wxp_QVOFPOS
Webster, A. (2016) The Five Safes Framework: How the ABS is supporting use of public sector data. Available from: http://www.nss.gov.au/nss/home.NSF/533222ebfd5ac03aca25711000044c9e/b691218a6fd3e55fca257af700076681/$FILE/The%20Five%20Safes%20Framework.%20ABS.pdf
Steve is Director of the Australian Data Archive at the Australian National University. He has research interests in data management and archiving, community and social attitude surveys, new data collection methods, and reproducible research methods. Steve holds a PhD in industrial relations and a Graduate Diploma in Management Information Systems, and has been involved in the management of data archives in the social sciences, business and economics for over 15 years. He is currently a member of the executive for the International Federation of Data Organisations in Social Science (IFDO), and chair of the Executive Board of the Data Documentation Initiative (DDI), an international metadata standard for the management of social science research data used in over 80 countries.