Secure eResearch Platform to support research during COVID-19

Mr Mat Ishac1, Mr Jerico Revote1, Mr Daniel Waghorn1, Ms Komathy Padmanabhan1, Professor Belinda Gabbe1, Mr Chris Mac Manus1, Ms Anitha Kannan1

1Monash University, Melbourne, Australia

Monash SeRP is a secure research platform that enables the storage, analysis and sharing of research data, in a well governed environment. During the COVID-19 pandemic researchers are working remotely due to site closures and physical distancing.

The Secure eResearch Platform (SeRP) has been developed by Swansea University and is the gold standard for managing, sharing and collaborating over sensitive data. Monash has collaborated with Swansea University to adapt SeRP for the Australian context and this is being operated as Monash SeRP.

Prof Belinda Gabbe has been an early adopter of Monash SeRP through three key research projects: the NHMRC-funded VIBES-Junior study, population linkage of state-based trauma datasets, and an international benchmarking study of trauma outcomes. Each of these projects has had critical reporting deadlines to key funders and stakeholders.

The availability of Monash SeRP and other remote tools for collaboration has ensured continuity during the COVID-19 pandemic. The platform has provided access to high-capacity computing for complex analytics by multiple analysts in Australia and internationally, with exceptional security and data custodian control. Monash SeRP has accelerated analysis of complex data, and collaboration between analysts, in a timeframe that would not otherwise be possible.

Monash SeRP has been approved for hosting de-identified data, and certification for hosting identified and re-identifiable data is in progress. The platform is envisaged to support a significant proportion of research at Monash University that requires secure collaboration environments for sensitive research data.


Biography:

Mathew Ishac is Project Support Officer at Helix, Monash University with a background in facilitating the implementation and training of new technology within hospital and health research fields.

Galaxy Australia – a 2020 update

Dr Gareth Price1, Mr Simon Gladman2, Professor Andrew Lonie3

1QCIF Facility for Advanced Bioinformatics, St Lucia, Australia
2University of Melbourne, Carlton, Australia
3Australian BioCommons, Carlton, Australia

Introduction

Galaxy Australia is in its third year of operation as a national platform, backed by years of institutional operation. The platform has shown user number growth in the thousands per year. The platform has matured from year-on-year funding to long-term secured support and is now looking to “industrialise” itself to further meet the needs of Australian researchers who are continuing to turn to Galaxy Australia as a reliable, reproducible and transparent platform for data analysis in the life sciences.

Methods

Optimisation of the Galaxy deployment by the Galaxy Australia team and infrastructure providers has increased service uptime over the last year. Additionally, policies and procedures for operation of the broadening tool set, reference data and user request types have strengthened operational efficiency.

Results

Galaxy Australia has seen growth in the last two years of 4,000+ users and 500+ tools, with 70+ associated peer-reviewed publications. The platform has deployed new Pulsars at Pawsey and University of Melbourne. The platform is now establishing high-memory nodes, more Pulsars and a new head node configuration to allow new analysis modalities (long-read sequencing, plant and mammalian genome assemblies, proteomics and metabolomics). Presented here is a schematic review of the roadmap for Galaxy for the next 3 to 5 years.

Conclusion

Galaxy Australia is planning on a tripling of user numbers in 3 years, with eight times more data being analysed. To achieve these goals the platform is building on its robust deployment to strengthen availability, utility and ongoing relevance to Australian researchers.


Biography:

Dr Gareth Price is Head of Computational Biology at QCIF Facility for Advanced Bioinformatics. In this role Gareth manages the diverse spectrum of researcher-led questions involving genomic data, provides training in genomic data analysis, and leads Galaxy Australia (https://usegalaxy.org.au) as Platform Manager.

Gareth has 20 years’ experience as a Bioinformatician and Genomics Scientist.

Gareth’s view is that research is at its best when coupled with the most accurate, highest throughput and innovative technology and analysis modalities.

LawTech and the building of eResearch capability in the interdisciplinary domain of legal natural language processing

Mr Geordie Zhang1

1The University Of Melbourne, Parkville, Australia

In this talk, we present the experiences of developing a brand new interdisciplinary research program at the University of Melbourne, LawTech, from the perspective of Melbourne Data Analytics Platform (MDAP), a new academic specialist unit at the University of Melbourne specifically designed to supercharge computationally and data intensive research at the university.  LawTech applies cutting-edge computational natural language processing (NLP) models, based on machine learning and deep learning, to the problem of classifying texts from court judgements.  As the common law legal system in Australia relies heavily on lawyers and judges reading large volumes of court judgements of previous cases, extracting the legal principles propounded in these judgements, and applying them to new cases, the development of machine-learning based technologies for computer assisted information extraction from court judgements has a profound effect on the Australian legal community, as well as legal communities from other common law countries (e.g. UK, US, India, Singapore).  This talk will cover what we have found to have been critical to the initial development of LawTech, how the different eResearch services at the University of Melbourne contributed towards the research program, the challenges along the way, the lessons learnt, and how the undertaking of such an interdisciplinary research program has changed the eResearch and researcher communities at the university who have been directly involved with LawTech.

Below is a list of research collaborators in the LawTech research program:

Timothy Baldwin1, Daniel Beck1, Hui Chia2, Emily Fitzgerald3, Rohit Gupta3, Saket Khandelwal3, Kabir Manandhar Shrestha3, Meladel Mistica1, Jeannie Patterson2, Priyanka Pillai3, Geordie Zhang3
1School of Computing and Information Systems, The University of Melbourne
2Melbourne Law School, The University of Melbourne
3Melbourne Data Analytics Platform, The University of Melbourne


Biography:

Geordie Zhang is a Research Data Specialist at the Melbourne Data Analytics Platform (MDAP), the University of Melbourne.  His research interests include digital humanities and social sciences, computational economics, network information theory, and natural language processing.  His eResearch interests include research cloud and high-performance computing, databases for humanities and social sciences datasets, and knowledge translation in data-intensive research.

New infrastructure and services for the ARDC’s Nectar Research Cloud

Dr Paul Coddington1

1Australian Research Data Commons, Adelaide, Australia

The Nectar Research Cloud has been in operation for 8 years, and has supported more than 4000 projects and 18000 users, and hosts many services used by thousands more researchers. In 2018 it transitioned to being supported by the Australian Research Data Commons (ARDC). This presentation will provide an overview of ARDC’s plans for the expansion of Nectar Research Cloud infrastructure and services in 2020/21 and beyond, in collaboration with the Research Cloud Node partners.

A major refresh of the Research Cloud infrastructure is underway that is expected to be completed by the end of 2020. Additional infrastructure investment in 2021 will prioritise the requirements of the ARDC Platforms projects and the provision of high-end infrastructure such as GPUs and large memory servers. We will also be exploring the integration of additional capacity from commercial clouds.

The ARDC is leading a national collaborative effort to develop the Australian Research Container Orchestration Service (ARCOS), to provide expertise, best practices and standards for interoperable deployment of cloud applications on multiple cloud platforms through the use of containers and container orchestration using Kubernetes.

Another new service being developed is an interactive data analytics platform that will make it easier to use the Research Cloud to run standard data analytics tools such as RStudio and Jupyter notebooks, and provide a simpler virtual desktop interface. This platform will be based on the requirements and implementations of several existing Virtual Laboratories, and will be designed so that it can be customised to develop similar domain-specific Platforms.


Biography:

Paul Coddington is the Associate Director, Research Cloud and Storage at the Australian Research Data Commons. He has been responsible for the Nectar Research Cloud since 2017 and has 30 years of experience in the eResearch sector.

1 year on – ARDC in 2020 and beyond!

Mr Ian Duncan1

1ARDC, St Lucia, Australia

The Australian Research Data Commons, or ARDC, is a result of the integration of the ANDS, NeCTAR, and RDS NCRIS projects, with that integration being completed in 2019.  The ARDC engages in activities and partnerships with institutions, communities, infrastructure providers, our NCRIS colleagues, and researchers to provide Australian researchers with an international research advantage through data.

In 2020 we have embarked on a range of programs across data collections, data retention, cloud computing, research platforms, and workforce data skills development, and we have a range of upcoming programs building on and expanding these priority areas.

This session will provide an update and review of our activities thus far, our strategy and vision, and the status of current and proposed ARDC programs. We will also describe the underlying aims and targets and provide an overview of the timetables and processes for the upcoming calls and expressions of interest, which will be available to the sector across a range of data elements.


Biography:

Ian Duncan is the Director of Outreach for the ARDC. Ian previously held the role of Director, Infrastructure and Services at ARDC and was the Director of the RDS project, one of the three projects which integrated to become the ARDC.

Modelling of Virus Spread Using Information from Biased Testing

Dr Fanel Donea1

1CSIRO – Scientific Computing, Melbourne, Australia

This work aims to contribute to the fight against viruses by understanding how they may spread in reality, considering the limited information available from biased testing.

Non-medical methods (mathematical, computational, simulational) are a necessary supplement to direct medical approaches, especially for new viruses, such as COVID-19, for which the medical research is still incipient.

A combination of practical methods is used, including Monte Carlo simulations, agent-based modelling and analytic techniques, based on the classical SIR (susceptible-infected-removed) model. An immediate goal is to establish an estimate of the real number of cases, in the situation where testing has not been randomised. The same techniques can then be applied in the future for conceptual proofs or disproofs of various claims that have been circulating in the media (effects of ignoring quarantine, the possibility of achieving herd immunity, the effectiveness of mass testing and others).
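
As a rough illustration of the kind of calculation involved, the following minimal Python sketch combines a discrete-time SIR model with a Monte Carlo draw of biased tests. It is not the author's model: the one-day time step, the single `bias` factor (how many times more likely an infected person is to be tested than a non-infected one), the perfect test accuracy and all parameter values are assumptions made purely for illustration.

```python
import random


def simulate_sir(population, initial_infected, beta, gamma, days):
    """Discrete-time SIR model with one-day steps.

    Returns a list of (susceptible, infected, removed) tuples,
    one per day, starting with day 0.
    """
    s = float(population - initial_infected)
    i = float(initial_infected)
    r = 0.0
    history = [(s, i, r)]
    for _ in range(days):
        # New infections scale with contacts between S and I,
        # capped so that S never goes negative.
        new_infections = min(s, beta * s * i / population)
        new_removals = gamma * i
        s -= new_infections
        i += new_infections - new_removals
        r += new_removals
        history.append((s, i, r))
    return history


def positive_test_probability(prevalence, bias):
    """Chance that one test is positive when infected people are
    `bias` times as likely to be tested (assumed, hypothetical weighting)."""
    return bias * prevalence / (bias * prevalence + (1.0 - prevalence))


def run_biased_tests(prevalence, bias, n_tests, rng):
    """Monte Carlo draw: number of positives among n_tests biased tests."""
    p = positive_test_probability(prevalence, bias)
    return sum(1 for _ in range(n_tests) if rng.random() < p)


def estimate_prevalence(positives, n_tests, bias):
    """Invert the bias weighting to recover prevalence from positivity."""
    q = positives / n_tests
    return q / (q + bias * (1.0 - q))


if __name__ == "__main__":
    rng = random.Random(42)
    population = 1_000_000
    history = simulate_sir(population, 100, beta=0.3, gamma=0.1, days=60)
    _, infected, _ = history[-1]
    true_prevalence = infected / population
    n_tests = 20_000
    positives = run_biased_tests(true_prevalence, 5.0, n_tests, rng)
    print("true prevalence:     ", true_prevalence)
    print("naive positivity:    ", positives / n_tests)
    print("bias-corrected value:", estimate_prevalence(positives, n_tests, 5.0))
```

In this sketch the naive positivity rate overstates prevalence because testing favours infected people, and the correction only works if the bias factor is known. In practice that factor would itself have to be estimated, which is where simulation and analytic techniques of the kind described above come in.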


Biography:

Fanel Donea is a scientific software engineer in the Modelling and Data group of CSIRO’s Scientific Computing department. His current interests are in the fields of computational modelling and quantum computing. In a previous life, he worked in astrophysics, and he obtained a PhD in Physics, studying accretion discs around black holes.

The Sensitive Data Access Community of Practice – recent activities and future directions

Dr Steven McEachern1, Dr Frankie Stevens2, Dr Nichola Burton3, Dr Kristan Kang4

1Australian Data Archive, Acton, Australia
2AARNet, Gold Coast, Australia
3ARDC, Perth, Australia
4ARDC, Canberra, Australia

The Sensitive Data Community of Practice (CoP) was established in 2018, following on from the 2018 eResearch conference BoF “Sensitive Data – How do you do yours?” (https://conference.eresearch.edu.au/2018/08/sensitive-data-how-do-you-do-yours/). In the two years since its establishment, the CoP has built a growing community of eResearch practitioners with a common interest in sharing approaches, practices and experiences in managing sensitive data. The community includes technical, governance and domain specialists across academia, government, health and medical facilities and private sector providers.

This BoF is intended to introduce the Sensitive Data CoP to the broader eResearch community. The CoP facilitators (from AARNet, ADA and ARDC) will present an introduction to the purpose and scope of the CoP. The team will review the activities of the CoP, and the breadth and coverage of the CoP membership. They will then lead a facilitated discussion among participants on potential future directions for the CoP, future topics and speakers for CoP events, and interest in collaborative projects and investments to enable research using sensitive data.

REVIEWER NOTE: This BoF is intended to run separately from, but to extend upon, the co-submitted “Recent developments in platforms and services for sensitive data access in Australia”.


Biography:

Steve McEachern is the Director of the Australian Data Archive at the Australian National University.

Recent developments in platforms and services for sensitive data access in Australia

Dr Steven McEachern1, Dr Frankie Stevens2, Dr Nichola Burton3, Dr Kristan Kang4

1Australian Data Archive, Acton, Australia
2AARNet, Gold Coast, Australia
3ARDC, Perth, Australia
4ARDC, Canberra, Australia

(“unit record data”) to address major research and policy questions in Australia in recent years. This interest spans a variety of research domains – including linked administrative data for social policy analysis; integration of patient, health service and biomedical data to study health outcomes; and integration of long-term population data with longitudinal cohort data to study gender role attitudes.

This growth in demand, however, has resulted in an identified need for platforms and facilities capable of supporting such data in a secure and transparent way. Such facilities are needed to enable research with sensitive data to be completed effectively and efficiently while meeting the privacy and confidentiality expectations of both subjects (i.e. people) and data custodians (across academia, government and elsewhere).

This need has not been well represented within the eResearch community in general. The Population Health Research Network provides relevant support within the health domain, but in general the problem has been neglected. Recent efforts, however, have begun to address this. The purpose of this BoF is to present an overview of a number of new projects and activities intended to support and enable increased use of sensitive data in the Australian research community. Projects and activities to be presented include CADRE (Australian Data Archive), Sensitive Data Support Proof of Concept (AARNet) and the forthcoming Commonwealth Data Access and Transparency Bill.

The BoF will then open up discussion to consider establishment of a national framework for sensitive data support and future support requirements.


Biography:

Steve is the Director of the Australian Data Archive at the Australian National University.

Dynamic allocation of computational resources using Airflow and Terraform

Ms Aniek Roelofs1,2, Dr James Diprose2, Dr Richard Hosking1,2, Dr Rebecca Handcock1,2, Professor Cameron Neylon2, Associate Professor Lucy Montgomery2, Dr Alkim Ozaygen2, Dr Katie Wilson2, Dr Chun-Kai (Karl) Huang2

1Curtin Institute for Computation (CIC), Perth, Australia
2Curtin Open Knowledge Initiative (COKI), Perth, Australia

Situation

The Observatory Platform, developed by the Curtin Open Knowledge Initiative (COKI), is a FAIR principles environment for fetching, processing and analysing data to understand how well universities operate as Open Knowledge Institutions. Around 15 different data sources are currently used and this number is growing. To manage these data sources, many different computational resources are required that need to be maintained in a clear and repeatable way. This is especially important as external researchers should be able to easily deploy their own Observatory Platform and collect data as desired.

Task

Both workflow- and infrastructure-management systems are desired to address this. The different datasets vary immensely in size and complexity, so the required storage space and computational power also vary. To prevent wastage of any resources, these two systems should interact with each other and adjust accordingly.

Action

Terraform is used to deploy our system on the Google Cloud Platform. The Virtual Machine (VM) created by Terraform contains Docker containers that host Airflow, the workflow management system. When a more complex workflow is scheduled, Airflow will run an additional workflow that calls the Terraform API to create another, larger, VM that runs the complex workflows and shuts down automatically when they are finished.
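
The sizing decision described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the Observatory Platform's code: the `Workflow` class, the size thresholds, the chosen machine types and the variable names `create_worker_vm` and `worker_machine_type` are all hypothetical. The sketch stops at producing Terraform input variables as JSON (the format used by `*.tfvars.json` files) and deliberately leaves out the Airflow task and the call to Terraform itself, since the abstract does not describe those details.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Workflow:
    """A scheduled workflow and the approximate size of its input data."""
    name: str
    input_gb: float


# Hypothetical tiers: (maximum input size in GB, GCP machine type).
MACHINE_TIERS = [
    (50.0, "n1-standard-8"),
    (500.0, "n1-highmem-16"),
]

# Assumed cut-off below which the always-on VM can run the workflow itself.
SMALL_WORKFLOW_GB = 5.0


def plan_worker_vm(scheduled):
    """Decide whether an extra VM is needed and, if so, how large."""
    heavy = [w for w in scheduled if w.input_gb > SMALL_WORKFLOW_GB]
    if not heavy:
        # Nothing complex is scheduled, so no extra VM should exist.
        return {"create_worker_vm": False, "worker_machine_type": None}
    largest = max(w.input_gb for w in heavy)
    for max_gb, machine_type in MACHINE_TIERS:
        if largest <= max_gb:
            return {"create_worker_vm": True, "worker_machine_type": machine_type}
    # Larger than every tier: fall back to the biggest machine defined.
    return {"create_worker_vm": True, "worker_machine_type": MACHINE_TIERS[-1][1]}


def to_tfvars_json(plan):
    """Serialise the plan as Terraform input variables in JSON form."""
    return json.dumps(plan, indent=2, sort_keys=True)


if __name__ == "__main__":
    scheduled = [Workflow("small_source", 0.5), Workflow("large_source", 120.0)]
    print(to_tfvars_json(plan_worker_vm(scheduled)))
```

Setting `create_worker_vm` back to false once the heavy workflows finish is what would let the infrastructure tool destroy the larger VM, which is where the cost saving comes from.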

Result

We created an environment that can easily be set up remotely while being managed within a team. Both Terraform and Airflow are open-source tools, ensuring that the Observatory Platform stays FAIR. Cost savings are achieved by making use of computational resources as efficiently as possible.


Biography:

Aniek Roelofs is a developer at Curtin University where she is a part of the COKI team. She obtained a Master of Science in Bioinformatics from the University of Amsterdam and has experience setting up workflows and processing big data.

Previously working with sequencing data, she’s now focused on bibliometric data to assist the Curtin Open Knowledge Initiative with their research on how well universities operate as Open Knowledge Institutions.

Supporting REDCap at scale at Monash University

Mr John Liman1, Mr Chris Mac Manus1

1Monash University, Melbourne, Australia

Research platforms that are easy to use, self-governed and available at little to no cost mean that research can be undertaken without the traditional barriers of high cost, complex workflows and a dependency on experts to operate. REDCap is a great example in this space, and we describe the niche that REDCap is carving out as it fast becomes the first option for small to medium-sized research projects.

With rapid growth in the user base and number of projects within Monash University, a sustainable method is needed to support them. By partnering with various groups within Monash University, Helix is able to support 600 projects and 3000+ users, numbers that are growing daily. The activities that Helix leads and supports include an Introduction to REDCap hands-on workshop, user and project creation, technical consultation, and project customisations, as well as building a local REDCap community within Monash that meets regularly.

A recent audit of REDCap found that whilst Medicine was a major user of the platform, other Faculties were increasingly using it, and one of the largest non-Medicine groups is Pharmacy. Helix is now looking to provide REDCap and other platforms for use by all Faculties within the University that require access to platforms for sensitive data capture, storage and processing. Establishing a strong Community of Practice to help meet the need for support at scale, and developing formal training, will help to educate a future generation of researchers who see REDCap as being on the critical path to high-quality research.


Biography:

John Liman is the Senior Software Engineer at Helix, Monash University.  John is the Monash REDCap administrator supporting researchers with REDCap design, training, project consultation and customisations.

    About the conference

    eResearch Australasia provides opportunities for delegates to engage, connect, and share their ideas and exemplars concerning new information centric research capabilities, and how information and communication technologies help researchers to collaborate, collect, manage, share, process, analyse, store, find, understand and re-use information.
