Sensitive Data Combined BoF

Dr Steven McEachern1, Dr Frankie Stevens2, Dr Nichola Burton3, Dr Kristan Kang4

1Australian Data Archive, Acton, Australia
2AARNet, Gold Coast, Australia
3ARDC, Perth, Australia
4ARDC, Canberra, Australia

 

Combining: Recent developments in platforms and services for sensitive data access in Australia

(“unit record data”) to address major research and policy questions in Australia in recent years. This interest spans a variety of research domains – including linked administrative data for social policy analysis; integration of patient, health service and biomedical data to study health outcomes; and integration of long-term population data with longitudinal cohort data to study gender role attitudes.

This growth in demand, however, has created a need for platforms and facilities capable of supporting such data in a secure and transparent way. Such facilities must enable research with sensitive data to be completed effectively and efficiently while meeting the privacy and confidentiality expectations of both data subjects (i.e. people) and data custodians (across academia, government and elsewhere).

This need has not been well represented within the eResearch community. The Population Health Research Network provides relevant support within the health domain, but in general the problem has been neglected. Recent efforts, however, have begun to address this. The purpose of this BoF is to present an overview of a number of new projects and activities intended to support and enable increased use of sensitive data in the Australian research community. Projects and activities to be presented include CADRE (Australian Data Archive), the Sensitive Data Support Proof of Concept (AARNet) and the forthcoming Commonwealth Data Access and Transparency Bill.

The BoF will then open up discussion to consider establishment of a national framework for sensitive data support and future support requirements.


With: The Sensitive Data Access Community of Practice – recent activities and future directions

The Sensitive Data Community of Practice (CoP) was established in 2018, following on from the 2018 eResearch conference BoF “Sensitive Data – How do you do yours?” (https://conference.eresearch.edu.au/2018/08/sensitive-data-how-do-you-do-yours/). In the two years since its establishment, the CoP has built a growing community of eResearch practitioners with a common interest in sharing approaches, practices and experiences in managing sensitive data. The community includes technical, governance and domain specialists across academia, government, health and medical facilities and private sector providers.

This BoF is intended to introduce the Sensitive Data CoP to the broader eResearch community. The CoP facilitators (from AARNet, ADA and ARDC) will present an introduction to the purpose and scope of the CoP. The team will review the activities of the CoP, and the breadth and coverage of the CoP membership. They will then lead a facilitated discussion among participants on potential future directions for the CoP, future topics and speakers for CoP events, and interest in collaborative projects and investments to enable research using sensitive data.

The Sensitive Data Access Community of Practice – recent activities and future directions

Dr Steven McEachern1, Dr Frankie Stevens2, Dr Nichola Burton3, Dr Kristan Kang4

1Australian Data Archive, Acton, Australia
2AARNet, Gold Coast, Australia
3ARDC, Perth, Australia
4ARDC, Canberra, Australia

The Sensitive Data Community of Practice (CoP) was established in 2018, following on from the 2018 eResearch conference BoF “Sensitive Data – How do you do yours?” (https://conference.eresearch.edu.au/2018/08/sensitive-data-how-do-you-do-yours/). In the two years since its establishment, the CoP has built a growing community of eResearch practitioners with a common interest in sharing approaches, practices and experiences in managing sensitive data. The community includes technical, governance and domain specialists across academia, government, health and medical facilities and private sector providers.

This BoF is intended to introduce the Sensitive Data CoP to the broader eResearch community. The CoP facilitators (from AARNet, ADA and ARDC) will present an introduction to the purpose and scope of the CoP. The team will review the activities of the CoP, and the breadth and coverage of the CoP membership. They will then lead a facilitated discussion among participants on potential future directions for the CoP, future topics and speakers for CoP events, and interest in collaborative projects and investments to enable research using sensitive data.

REVIEWER NOTE: This BoF is intended to run separately from, but extend upon, the co-submitted “Recent developments in platforms and services for sensitive data access in Australia”.


Biography:

Steve McEachern is the Director of the Australian Data Archive at the Australian National University.

Recent developments in platforms and services for sensitive data access in Australia

Dr Steven McEachern1, Dr Frankie Stevens2, Dr Nichola Burton3, Dr Kristan Kang4

1Australian Data Archive, Acton, Australia
2AARNet, Gold Coast, Australia
3ARDC, Perth, Australia
4ARDC, Canberra, Australia

(“unit record data”) to address major research and policy questions in Australia in recent years. This interest spans a variety of research domains – including linked administrative data for social policy analysis; integration of patient, health service and biomedical data to study health outcomes; and integration of long-term population data with longitudinal cohort data to study gender role attitudes.

This growth in demand, however, has created a need for platforms and facilities capable of supporting such data in a secure and transparent way. Such facilities must enable research with sensitive data to be completed effectively and efficiently while meeting the privacy and confidentiality expectations of both data subjects (i.e. people) and data custodians (across academia, government and elsewhere).

This need has not been well represented within the eResearch community. The Population Health Research Network provides relevant support within the health domain, but in general the problem has been neglected. Recent efforts, however, have begun to address this. The purpose of this BoF is to present an overview of a number of new projects and activities intended to support and enable increased use of sensitive data in the Australian research community. Projects and activities to be presented include CADRE (Australian Data Archive), the Sensitive Data Support Proof of Concept (AARNet) and the forthcoming Commonwealth Data Access and Transparency Bill.

The BoF will then open up discussion to consider establishment of a national framework for sensitive data support and future support requirements.


Biography:

Steve is the Director of the Australian Data Archive at the Australian National University.

Mapping the Australian Geospatial Data/Services Landscape and Future Tasks for the Geospatial Capabilities Community of Practice

Michael Rigby1, Kheeran Dharmawardena2, Melanie Barlow3, Siddeswara Guru4

1AURIN, Melbourne, Australia
2Cytrax Consulting, Melbourne, Australia
3ARDC, Canberra, Australia
4TERN, Brisbane, Australia

Australia’s geospatial data and services landscape is changing rapidly, with new capabilities, updated programs and responsibilities. In response, researchers and practitioners have expressed a need to understand what is available, where it is held and how to connect to it, so that they can analyse data and deliver outcomes at different levels. To address this, a human- and machine-readable representation of the data and services landscape is required: one that can represent objects and link them together in a graph that allows users to traverse the landscape and identify gaps and potential opportunities. Drawing on materials captured from ‘The Case for National Coordination in Spatial information’ Birds of a Feather session at eResearch 2019, the ARDC Geospatial Capabilities Community of Practice (GeoCapCoP) was established to coordinate activities towards national coordination in geospatial information. To understand the landscape, the GeoCapCoP performed a mapping exercise to chart the flow of data and metadata between different organisations. A working group was established to wrangle the map into a formal register, fleshing out the details of objects such as data, data stores, metadata and metadata registries, delivered by services such as catalogues or API endpoints. The BoF will discuss the outcome of this exercise and progress towards the development of the registry. The interactive session will take community feedback to plan future directions towards national coordination in spatial information and linkage with other national initiatives, such as Australia’s Foundation Spatial Data Framework LINK platform maintained by ANZLIC.
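As a rough illustration of the graph-based representation described above, the following Python sketch models a handful of registry objects as nodes, links them with labelled edges, traverses the graph and reports a gap. The object types come from the abstract; every node name, relation label and function name is hypothetical and invented for illustration, not taken from the GeoCapCoP register.

```python
from collections import deque

# Hypothetical registry entries: node id -> object type.
# The types mirror those named in the abstract; the ids are made up.
NODES = {
    "dataset:landcover": "data",
    "dataset:roads": "data",
    "store:example-store": "data store",
    "meta:landcover-record": "metadata",
    "registry:example-catalogue": "metadata registry",
    "service:example-api": "service",
}

# Directed links as (source, relation, target). All illustrative.
EDGES = [
    ("dataset:landcover", "stored_in", "store:example-store"),
    ("dataset:roads", "stored_in", "store:example-store"),
    ("meta:landcover-record", "describes", "dataset:landcover"),
    ("registry:example-catalogue", "holds", "meta:landcover-record"),
    ("service:example-api", "delivers", "registry:example-catalogue"),
]


def neighbours(node):
    """Return the nodes directly linked from the given node."""
    return [target for source, _, target in EDGES if source == node]


def reachable(start):
    """Breadth-first traversal: all nodes reachable from start, excluding start."""
    seen = set()
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for nxt in neighbours(current):
            if nxt not in seen and nxt != start:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def undescribed_datasets():
    """Gap check: data objects that no metadata record describes."""
    described = {target for _, relation, target in EDGES if relation == "describes"}
    return sorted(
        node for node, kind in NODES.items()
        if kind == "data" and node not in described
    )


if __name__ == "__main__":
    print(sorted(reachable("service:example-api")))
    print(undescribed_datasets())
```

In this toy graph, starting from the API endpoint leads through the catalogue and its metadata record to one dataset and its store, while the second dataset shows up as a gap because nothing describes it. A real register would presumably need richer node attributes and a persistent graph store.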


Biography:

Bio to come

Perfecting and Propagating the Power of PIDs in Research

Mr Melroy Almeida1, Ms Natasha Simons2, Dr Adrian Burton3

1Australian Access Federation (AAF), Brisbane, Australia
2Australian Research Data Commons (ARDC), Brisbane, Australia
3Australian Research Data Commons, Brisbane, Australia

Persistent Identifiers (PIDs) such as ORCIDs and DOIs are critical to enabling FAIR research data and outputs as they provide global methods to uniquely identify and connect entities in the research system (e.g. researchers, funders, organisations, articles, datasets, software, and samples). By linking these entities and enabling research provenance and attribution, PIDs lay the foundation for improved tracking of research impact. By precisely identifying the inputs and outputs of research, PIDs contribute to research integrity. By linking scientific concepts across systems, PIDs enable research innovation and efficiency. So how can we perfect and propagate the power of PIDs in Research?
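One small, concrete aspect of how a PID can be checked before it is used to link records is the check digit built into every ORCID iD. The sketch below validates that digit using the ISO 7064 MOD 11-2 scheme that ORCID documents for its identifiers. The function names are illustrative, and the sample iD is the one listed in a presenter biography elsewhere in these abstracts. The check only confirms that the iD is well formed; it does not confirm that the iD is registered.

```python
def orcid_check_digit(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid: str) -> bool:
    """Return True if the iD (bare or in URL form) has a correct check character."""
    # Accept both "0000-0000-0000-0000" and "https://orcid.org/0000-...".
    compact = orcid.strip().rsplit("/", 1)[-1].replace("-", "")
    if len(compact) != 16 or not compact[:15].isdigit():
        return False
    return orcid_check_digit(compact[:15]) == compact[15].upper()


if __name__ == "__main__":
    print(is_valid_orcid("https://orcid.org/0000-0002-2132-6232"))
```

A repository or submission form could run a check like this on user input to catch typing errors early, before making any call to the ORCID registry itself.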

Join this BoF to participate in a lively discussion about the power of PIDs including questions such as: What PIDs are useful in research and how can you access these? What is the value of PIDs and how can the value proposition be improved? What are the biggest challenges in the adoption of PIDs? What are recent developments in PID services locally and internationally? What might the future of PIDs look like?

This BoF will be of interest to those implementing, maintaining and supporting PID services including repository managers, software developers and librarians. Participants should come along prepared to exchange knowledge, share experiences and contribute to discussions about optimising the ‘power of PIDs’.


Biography:

Melroy Almeida is Technical Support Analyst (ORCID) at the Australian Access Federation (AAF).

Research Data on Ice: frozen data storage at AARNet

Mr David Jericho1, Mr Gavin Kennedy1, Mr Denis Lujanski1

1AARNet, Brisbane, Australia

W. Edwards Deming, an American statistician and engineer, once said: “Without data, you’re just another person with an opinion”. Researchers have been calling on the wider community to support their data needs with large-scale data solutions that are both easy to use and cost-effective, encouraging wider reuse of the data.

As data generation increases exponentially, so do opportunities to reuse data sets in all fields of research, and consequently the importance of accessible archival data storage grows. Yet the expertise, let alone the financial capacity, to run the infrastructure for the required data management plans continues to remain outside the focus of a researcher’s discipline.

AARNet has been deploying a large-scale, secure frozen data archival storage service with rapid retrieval, focusing on a low annualised cost, greater standards compliance and stronger protection of data.

A key design criterion of this service is to avoid punitive charging models for accessing and retrieving data, thereby encouraging reuse and as-needed data access.

Our frozen data archival storage service is based on a model developed by CERN, storing over 300 petabytes, growing over 30% per year, and still archiving decades-old data sets. The service is also investigating alternate protocol options, permitting users to use their discipline-specific or preferred tools according to their data management plans.

AARNet is piloting this service by running it in-house across its CloudStor bulk data stores before taking it to the wider community in 2020.


Biography:

Bio to come

Development of a DMP-driven Research Data Platform: A Case Study based on NII Research Data Cloud in Japan

Dr Mao Tsunekawa1, Dr Ikki Fujiwara1, Dr Yusuke Komiyama1, Dr Kazutsuna Yamaji1

1Research Center for Open Science and Data Platform, National Institute Of Informatics, Tokyo, Japan

Managing research data consistently with a data management plan (DMP) ensures that the essential requirements of research compliance are met and makes research data FAIR. A machine-actionable DMP helps researchers keep track of their actions throughout the research data life cycle. In Japan, the NII Research Data Cloud (NII RDC) is being developed with the mission of providing a nationwide service to facilitate data-centric science and open science. To support the different phases of research, the NII RDC consists of three basic platforms: a research data management platform (GakuNin RDM), a repository platform (WEKO3), and a discovery platform (CiNii Research). The next challenge is to orchestrate these platforms to enable DMP-compliant transitions between processes. After careful comparison of existing open source DMP tools, ReDBoX2.0 was chosen because of its concept: it is intended not only for writing the DMP itself but also for deploying an appropriate research environment associated with the DMP. We extended the ReDBoX2.0 APIs to command our existing platforms in a machine-actionable manner. As an initial development, we succeeded in providing researchers with a general research environment compliant with their DMPs, through functions that deploy projects on GakuNin RDM associated with DMPs and generate a manifest for Jupyter Notebooks. In the future, we plan to develop a function supporting the publication process between GakuNin RDM and WEKO3, which will allow researchers to contribute to open science with a lower barrier.


Biography:

TSUNEKAWA Mao is Project Manager for the DMP System at the Research Center for Open Science and Data Platform (RCOS), National Institute of Informatics (NII), Japan. Mao also works on a project creating training materials for research data management as a member of the research data working group of the Japan Consortium for Open Access Repository (JPCOAR).

ORCID iD: https://orcid.org/0000-0002-2132-6232

An open, composable standards–based research eResearch platform: Arkisto

Dr Peter Sefton1, Dr Nick Thieberger2, Dr Marco La Rosa2, Mr Michael Lynch1

1University of Technology Sydney, Ultimo, Australia
2University of Melbourne, Melbourne, Australia

Research data from all disciplines has interest and value that extends beyond funding cycles and must continue to be managed and preserved for the long term. However, much of the effort in eResearch goes into building systems that provide functionality and services operating on data but that actually put data at risk: loading data into a particular tool can mean that the data is not easily retrievable if the service cannot be sustained or, at worst, that the data is lost.

The Arkisto (https://arkisto-platform.github.io/why/) approach is to work with a set of standards which make data available for long term access. Using the Oxford Common File Layout (OCFL) to organize data in a repository and Research Object Crate to describe data down to the file or even variable level, Arkisto supports the safeguarding of data for the long term. A growing set of Arkisto-compatible software tools allows data ingest into repositories, and the creation of data discovery portals that connect data to analytical, visualisation and computing tools.
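To give a feel for what describing data with Research Object Crate (RO-Crate) involves, the sketch below builds a minimal `ro-crate-metadata.json` document in Python. It assumes the RO-Crate 1.1 conventions (a metadata descriptor entity plus a root `Dataset` whose `hasPart` lists the files). The helper name, dataset name and file names are hypothetical, and this is not Arkisto's own tooling. Storage of the crate in an OCFL repository is not shown.

```python
import json


def build_crate_metadata(name, description, files):
    """Build a minimal RO-Crate 1.1 style metadata document.

    files maps a relative file path to a human-readable label.
    """
    graph = [
        {
            # Metadata descriptor: identifies this file and points at the root dataset.
            "@id": "ro-crate-metadata.json",
            "@type": "CreativeWork",
            "conformsTo": {"@id": "https://w3id.org/ro/crate/1.1"},
            "about": {"@id": "./"},
        },
        {
            # Root dataset: describes the crate as a whole and lists its parts.
            "@id": "./",
            "@type": "Dataset",
            "name": name,
            "description": description,
            "hasPart": [{"@id": path} for path in files],
        },
    ]
    # One File entity per data file, so each file carries its own description.
    for path, label in files.items():
        graph.append({"@id": path, "@type": "File", "name": label})
    return {"@context": "https://w3id.org/ro/crate/1.1/context", "@graph": graph}


# Hypothetical example content.
crate = build_crate_metadata(
    name="Example interview collection",
    description="Illustrative crate with one audio file and one transcript.",
    files={
        "audio/interview-01.wav": "Interview recording",
        "transcripts/interview-01.txt": "Interview transcript",
    },
)

if __name__ == "__main__":
    print(json.dumps(crate, indent=2))
```

Because the description travels with the data as a plain JSON file, the files remain interpretable even if the portal or tool that originally served them is no longer running, which is the risk the abstract is concerned with.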

In this presentation we will introduce the standards-based platform and show a number of examples of current Arkisto deployments from multiple disciplines, including an institutional Research Data Portal, a snapshot of the Expert Nation history project, crowd-sourced data from historical criminology, and the Pacific And Regional Archive for Digital Sources in Endangered Cultures (PARADISEC).


Biography:

Peter Sefton is the Manager, eResearch Support at the University of Technology, Sydney (UTS). Before that he was in a similar role at the University of Western Sydney (UWS). Previously he ran the Software Research and Development Laboratory at the Australian Digital Futures Institute at the University of Southern Queensland. Following a PhD in computational linguistics in the mid-nineties, he has gained extensive experience in the higher education sector, leading the development of IT and business systems to support both learning and research.

Research Data Management service: a health check

Jacky Cho, Adele Haythornthwaite, Peter Sefton, Dr Kwun Lun Cho1, Dr Nichola Burton2, Mr Keith Russell2, Ai-Lin Soo, Katy Toufexis

1UNSW, UNSW Sydney, Australia
2ARDC, Caulfield East, Australia

Presenters: Jacky Cho, Keith Russell, Nichola Burton, Adele Haythornthwaite, Helena Lynn, Peter Sefton

With the increasing complexity, cost, and risk in managing research data, many universities have sought to improve and align research data management (RDM) practices at their institution through a mix of IT infrastructure uplifts alongside policy development, engagement, and training.

At UNSW, after 2 years, this presents an opportunity for us to evaluate the current state of RDM at our institution as we deal with strategic planning, resourcing, and sustainability. With this in mind, an aligned RDM service framework facilitated by the ARDC and universities will be invaluable in allowing us to benchmark within an Australasian context.

This 60-minute BoF will involve a series of presentations by institutions giving an overview of the current state of their RDM services with reference to the RISE framework, which the ARDC currently uses to facilitate workshops on self-assessing institutional RDM capabilities.

Policy

Data Management Planning

Training

Advisory Service

Active Data Management

Appraisal/Risk Assessment

Preservation

Access and Publishing

Discovery

We will then facilitate an open discussion drawing on input from the audience about the wider experiences of other universities, with a view to discussing the following questions:

– Does the RISE framework cover the breadth of RDM services required?

– Is there benefit in an ongoing exchange between universities on defining an RDM maturity model?

– What would be a useful outcome of such an exercise?

This BoF is aimed at those interested in the design and delivery of research data policies, procedures, and services including institutional directors, administrators, support, and library staff.


Biography:

Jacky Cho is a project officer in the office of PVC-Research Infrastructure at the University of New South Wales. Prior to this role, he was a researcher in physical and surface chemistry. In his current role, he is responsible for a variety of projects to uplift research data services and support at UNSW.

How do we build a robust data sharing and analysis ecosystem to drive innovation in cardiovascular research?

Richard Cooke2, Dr Adam Hill1, Hank Levsen2, Dr Chai Ng1, Prof Jamie Vandenberg1, Soren Wohlthat2

1Victor Chang Cardiac Research Institute, Darlinghurst, Australia
2Telstra, Sydney, Australia

Introduction:

Australia is home to a number of leading medical research organisations generating large amounts of biomedical and clinical data that could better support health research if shared and utilised more effectively. However, in this increasingly data-intensive research environment, cultural, technical and policy barriers limit cross-organizational data sharing.

Methods:

Victor Chang Cardiac Research Institute and Telstra Health have partnered to create a national ecosystem for analysis and sharing of cardiovascular disease datasets using the Telstra Data Hub (TDH). The TDH, a partnership with Microsoft, leverages Azure services plus open source technologies such as the Comprehensive Knowledge Archive Network (CKAN) and Jupyter notebooks. Our proof of concept focused on genomic and phenotypic data related to Long QT syndrome, one of the most common causes of sudden cardiac death in young people.

Results:

The pilot brought together a cross-disciplinary team to test and validate approaches for analysis and sharing of cardiovascular datasets including: i) Leveraging cloud-based platforms for hosting, managing, linking and analysing data across multiple organisations; ii) Streamlining mechanisms related to the data sharing governance; and iii) Application of FAIR Principles to enhance the value of data.

Conclusions:

As part of the showcase, we will present our technological solutions, outcomes and lessons learnt from the pilot that have broad applicability to datasets related to any number of diseases or healthcare applications. Our ultimate goal is to encourage development of a national data sharing ecosystem to position Australia as a global leader in data-driven innovation in research and healthcare.


Biography:

Adam Hill: Adam is head of the Computational Cardiology Research Group at Victor Chang Cardiac Research Institute. His research is focussed on understanding mechanisms of drug-induced and inherited arrhythmias and risk stratification for sudden cardiac death.

Richard Cooke: Richard is the commercial lead for the Telstra Data Hub, which is a cloud-based platform capability developed by Telstra to reduce the friction and cost of data sharing and collaboration.

    About the conference

    eResearch Australasia provides opportunities for delegates to engage, connect, and share their ideas and exemplars concerning new information centric research capabilities, and how information and communication technologies help researchers to collaborate, collect, manage, share, process, analyse, store, find, understand and re-use information.

    © 2018 - 2020 Conference Design Pty Ltd