AARNet’s Sensitive Data Service: A Proof of Concept Journey

Dr Frankie Stevens1, Robert Pocklington1, Michael D’Silva1, Gavin Kennedy1, Mike Baker3, Dr Adele Haythornthwaite2, Dr Ilka Kolodziej2

1AARNet, Australia
2The University of Sydney, Sydney, Australia
3Children’s Medical Research Institute, Sydney, Australia

As a critical element of Australia’s research infrastructure landscape, Australia’s Academic and Research Network (AARNet) is closely involved in providing research-relevant infrastructure solutions to Australia’s research community. These services include CloudStor, the national storage, analysis and management platform used by over 90,000 researchers across Australia. AARNet has a strategic priority to support and invest in health and medical research infrastructure, and has been approached by the research community to provide sensitive data services, given the perceived gap in services that offer both the security and privacy required for sensitive data research and the ability to collaborate around such data. This presentation will describe AARNet’s journey in maturing CloudStor to make it appropriate for sensitive data, not only for the health and medical community but also for other communities around Australia that deal with sensitive data, such as ecologists and cultural and human genomics researchers. The sector analysis and subsequent Proof of Concept (POC) project will be described in detail, with demonstrations of the platform covering authorisation workflows, multi-factor authentication, audit trails and more. AARNet’s plans for the subsequent pilot and production systems will also be described. This presentation will inform current CloudStor users of the upcoming features, and give institutions a preview of functionality they have been requesting to host and manage their sensitive research data assets.


Biography:

Frankie Stevens, AARNet – Dr Frankie Stevens is AARNet’s Associate Director, eResearch, and leads AARNet’s Health and Medical Strategic Priority. Previously, Frankie has held roles with the Australian Research Data Commons, the NSW state body for eResearch, and the Research Data Storage Infrastructure (RDSI) Project, and was eResearch Programme Manager at the University of Sydney. Frankie has 20 years’ experience in the Higher Education Sector, having worked in both the Australian and overseas university sectors. She is on the Technical Advisory Board for the Global Research Data Alliance.

Adele Haythornthwaite leads a team of research data consultants at Sydney Informatics Hub (University of Sydney), and formulates research data policy and strategy. Having a background in ecology and IT, Adele has a particular interest in helping researchers work with sensitive data.

Ilka Kolodziej is the Clinical Data Systems Manager at the NHMRC Clinical Trials Centre at The University of Sydney. She leads the team that develops clinical data systems for a variety of clinical trials and health-related research projects. These projects often involve international collaboration and require careful consideration of how participant data is collected and managed.

Mike Baker is Head of IT for Children’s Medical Research Institute. Mike has a long history of working with transformational national research infrastructure, via the eScience program and the University of Edinburgh in the UK, and the University of Sydney and AARNet in Australia.

Robert Pocklington is a full-stack software developer at AARNet working on the Sensitive Data Project.

Iterative cost optimisation when deploying an open source digital health platform

Mr Chris Williams1, Mr Mitchel Haring1, Prof. Matthew Bellgard1

1Queensland University of Technology, Brisbane, Australia

A common need among researchers is to deploy a digital platform to support and enable their research. The digital solution might be developed in-house or available as open source software. In addition, the solution will quite often be used by external stakeholders. From a deployment perspective, there are various levels of sophistication in how this can be managed to comply with institutional policies, community expectations around security, FAIR data principles, privacy and so forth, in order to develop a sustainable deployment strategy. A key approach that has come to prominence is deploying the solution in a Public Cloud to leverage the wealth of infrastructure and services that would otherwise need to be developed in-house. Pragmatically, the challenge then becomes how to minimise deployment costs without adding operational overhead.

In this presentation, we will share our eResearch journey of deploying a production global digital health platform [1, 2] that has evolved from simply running a few Docker containers on an EC2 instance to a fully serverless deployment with a CI/CD pipeline. The solution uses a blue/green deployment strategy to release Docker containers into a Fargate environment. We will also demonstrate security enhancements using GuardDuty, Web Application Firewall and CloudFront to keep ahead of the “Bots and Hackers” when deploying a platform that contains sensitive identifiable health data. Finally, we will provide an overview of how we moved our non-production deployments to a model we refer to as “dynamically available”, which enabled us to uncover an unexpected cyber security threat.

References

  1. Bellgard, M.I., Snelling, T. & McGree, J.M. RD-RAP: beyond rare disease patient registries, devising a comprehensive data and analytic framework. Orphanet J Rare Dis 14, 176 (2019). https://doi.org/10.1186/s13023-019-1139-9.
  2. Foundation for Angelman Syndrome Therapeutics: Global Angelman Syndrome Registry: https://cureangelman.org/global-angelman-syndrome-registry-moves-platform-to-qut

Biography:

Chris Williams is an experienced Information Technology Professional skilled in AWS Solution Architecture, Cloud Migration and Deployment, Linux and Windows System Administration, Databases, Automation, Mentoring team members and IT Strategy.

Mitchel Haring is a DevOps Engineer with interests in Kubernetes, Automation, Cloud Computing, Python and Tool Development.

Professor Matt Bellgard is the inaugural eResearch Director at Queensland University of Technology. He has personally attracted over $45m in research funding, is co-inventor of 5 full and 20 provisional patents, co-designed and commissioned a world’s top 100 supercomputer, and has co-authored over 152 peer-reviewed articles in areas including human/animal/plant genomics, bioinformatics, health informatics, AI, biosecurity, eResearch, HASS, remote sensing and radio astronomy. He has led the design and development of digital health solutions for government, industry and academia and is Chair of the APEC Rare Disease Network.

Re-Thinking How We Share Data: A National Approach to Genomic Data Sharing in The Australian Genomics Health Alliance

Dr Marie-Jo Brion2

1QIMR Berghofer Medical Research Institute, Brisbane, Australia (AUS)
2Australian Genomics Health Alliance, Melbourne, Australia

Introduction: Data sharing is vital in genomics, where research and new discoveries typically rely on the availability of large and diverse datasets. However, there are many barriers to successful data sharing, including technical and governance challenges. The Australian Genomics Health Alliance (Australian Genomics) is developing national approaches to sharing genomic data and exploring different ways in which data sharing can be achieved.

Method: Australian Genomics comprises 80 organisations and 18 clinical flagship projects on rare disease and cancer. It has centralised, nationally co-ordinated processes around consent and ethics, with participant recruitment and clinical data collection operational at 32 clinical sites across Australia. From this, its Data Program is ingesting, storing, and supporting the re-use and sharing of genomic data for clinical and research purposes.

Results: Australian Genomics has implemented three approaches to sharing genomic data: 1) a centralised repository for access to individual genomic data files, for approved researchers with ethically-approved projects (‘Genomic Data Repository’); 2) a controlled-access, web-based, interactive data platform with real-time queries and visualisations of aggregated, summary-level genomic data (‘Variant Atlas’); and 3) a cloud-based variant-sharing platform for automated sharing of clinically-reported classifications and supporting evidence from diagnostic genomic testing (‘Shariant’).

Conclusion: Valuable forms of genomic data sharing can be attained with platforms that support sharing different categories of genomic data and information. This includes individual-level datasets, aggregated summary data, and classifications and evidence summaries, for the benefit of different user groups.


Biography:

Dr Marie-Jo Brion is the Data Program Manager for the Australian Genomics Health Alliance. Australian Genomics is a national network of over 400 members and 80 partner organisations, conducting research to inform the use of genomics in mainstream healthcare. Marie-Jo manages the delivery of its national infrastructure for collecting, storing and using clinical and genomic data. She has a PhD in Epidemiology from the University of Bristol, UK, and has conducted cross-population and genetic cohort studies in the USA, Brazil and Australia. Marie-Jo is passionate about ethical data sharing for medical research.

Benchmarking and improvement opportunities for data management practices in health research

Dr Michelle Krahe1, Julie Toohey2, Malcolm Wolski3, Professor Paul Scuffham4,5, Professor Sheena Reilly1,5

1Health Group, Griffith University, Gold Coast, Australia
2Library and Learning Services, Griffith University, Gold Coast, Australia
3eResearch Services, Griffith University, Nathan, Australia
4Centre for Applied Health Economics, Griffith University, Nathan, Australia
5Menzies Health Institute Queensland, Griffith University, Gold Coast, Australia

Introduction

Research Data Management (RDM) best practice is imperative to higher academic institutions involved in the development of training programs that support researchers. Therefore, understanding researcher RDM practices will help articulate planning strategies for services and support, and highlight areas for future investment and development. This study sought to understand the current RDM practices of health and medical researchers from an academic institution in Australia.

Methods

Participants were drawn from a research institute and invited to complete an online survey about RDM practices, data storage and retention, data sharing practices, and RDM training and development.

Results

Overall, our evaluation indicates that RDM practices, which varied greatly, are likely to be influenced by level of experience or by the RDM practices carried out within teams or by supervisors. Only 1 in 3 researchers had a data management plan, almost 70% sourced their data from surveys and 53% collected consent for specific data use. The majority (80%) collected data using personal storage devices and 65% stored their data on removable media. Willingness to share data with colleagues and the public increased significantly once the data had been published (p<0.05). Collaboration, advancing knowledge and public benefit were the top reasons for sharing data.

Conclusion

Evaluating the data management practices of health and medical researchers, contextualised by tasks associated with the research data lifecycle, is effective in informing RDM services and support. This study recognises that targeted institutional strategies will strengthen researcher capacity, instill good research practice, and overall improve health informatics and research data quality.


Biography:

Bio to come

    About the conference

    eResearch Australasia provides opportunities for delegates to engage, connect, and share their ideas and exemplars concerning new information centric research capabilities, and how information and communication technologies help researchers to collaborate, collect, manage, share, process, analyse, store, find, understand and re-use information.
