BoF Science Clouds

Mr Kheeran Dharmawardena1, Mr Hamish Holewa2, Dr Siddeswera Guru3, Mr Brendan Davey4, Mr Alan Lo5

1NeCTAR, Parkville, Australia, Kheeran.d@nectar.org.au

2QCIF, St Lucia, Australia, hamish.holewa@quadrant.edu.au

3TERN, St Lucia, Australia, s.guru@uq.edu.au

4TPAC, Hobart, Australia, brendan.davey@utas.edu.au

5University of Melbourne, Parkville, Australia, alan.lo@unimelb.edu.au

 

DESCRIPTION

The Science Clouds enable research communities to have a domain-focused ecosystem of shared data, tools, platforms and computing resources, and lower the barrier to doing research by improving access and interoperability. They provide a model for investment and collaboration for a research-community-led cloud environment. Three domains, Biosciences, Ecosciences and Marine science, led by Bioplatforms Australia (BPA), the Terrestrial Ecosystem Research Network (TERN) and the Integrated Marine Observing System (IMOS) respectively, have been involved in establishing a science cloud for their domains in collaboration with NeCTAR. This BoF will discuss what has happened so far and seek to understand what other domains might be in a position to benefit from their own science cloud.

In 2016, NeCTAR partnered with a number of NCRIS capabilities and eResearch infrastructure providers to establish three research-domain-focused cloud platforms: the Australian BioSciences Cloud, the Australian Marine Sciences Cloud and the Australian Ecosystems Science Cloud.

These community-focused cloud platforms enable research communities to have a domain-focused ecosystem of shared data, tools, platforms and computing resources, and lower the barrier to doing research by improving access and interoperability.

They provide a model for research-domain-led investment and collaboration, and a vehicle for:

  1. NCRIS domain capabilities to partner with eResearch infrastructure providers to:
  • influence the design and delivery of eResearch infrastructure to better support their strategic planning and changing infrastructure needs;
  • guide the prioritisation of resource allocations and services to better address national and research community priorities;
  2. eResearch infrastructure providers to deliver greater value by supporting higher-level services in partnership with research communities.

The purpose of this BoF is to share some of the experiences in developing the Science Clouds and to present some of the outcomes. The session is also intended to bring together those interested in Science Clouds as a model and to explore how the model can be extended or expanded into other research domains.

Format: Brief presentations followed by an open discussion.

Duration: 60 min


Biography

Mr Kheeran Dharmawardena, MBA, B.Comp., joined NeCTAR as the Coordinator of the Science Clouds initiative in October 2016. Prior to this he was responsible for the delivery of many ICT services at Monash University, including infrastructure delivery, service delivery, data management, IT and enterprise architecture, and eResearch. He has a special interest in the socio-technical aspects involved in the delivery of effective services.

(orcid.org/0000-0002-4292-7475)

AAF – LOOKING TOWARDS THE FUTURE

Mr Bradley Beddoes1, Mr John Scullen1, Mr Melroy Almeida1, Elleina Filippi1

1Australian Access Federation, Brisbane, Australia, bradleybeddoes@aaf.edu.au, john.scullen@aaf.edu.au, melroy.almeida@aaf.edu.au

 

Australian researchers have been accessing online services and research capabilities for the past eight years using a single login solution provided by the Australian Access Federation (AAF). This login process often goes unnoticed by end users (researchers) as they seamlessly access hundreds of national research services.

As the national provider of access and authentication for the Research and Higher Education sector, the AAF provides national and international access to essential research capabilities. As a core underpinning research capability, the AAF is continually evolving identity management solutions for the Australian research community.

As we look towards the future and the possibilities for access and authentication, the AAF is leading the way to ensure greater uptake of, and accessibility to, research services, data and facilities.

As thought leaders in identity management, the AAF team will discuss the developments and evolving needs of the Australian and international research community. The AAF is proposing a session consisting of three lightning talks, each covering a topic that Australian researchers have discussed with the AAF over the past 12 months. The presentations will explore:

 

INTERNATIONAL CONNECTIVITY – EDUGAIN   

eduGAIN is a global initiative to connect federations around the world. It enables researchers, educators and students in one country to collaborate with colleagues, and to access online applications, in other countries. Connecting Australian research capabilities to the rest of the world has been a significant initiative undertaken by the AAF in 2017. By participating in eduGAIN[1], the AAF is able to provide Australian researchers with global collaboration opportunities to advance their research. eduGAIN is also the mechanism that enables international collaboration partners to access Australia’s NCRIS capabilities (via participating international federations).

Linking the AAF to international federations through the global inter-federation initiative (eduGAIN) is a major step to realising the Australian government’s objectives around international research collaboration.

 

EXPLORING OPENID CONNECT

OpenID Connect (OIDC) is an emergent specification in the identity space which provides a simple identity layer on top of the OAuth 2.0 authorization framework.

OIDC is an attractive option for developers (in particular the NCRIS capabilities) as it is applicable for multiple types of applications, including mobile, and can be integrated directly into application code. Interest in OIDC is increasing, driven by the adoption of OIDC by large commercial players such as Google and Microsoft.

The AAF is exploring OIDC and how it can enhance access and authentication for the Australian Research Community.
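To illustrate what integrating OIDC "directly into application code" can look like, the following is a minimal sketch of the first step of the OIDC authorization code flow: building the authentication request that sends a user to an identity provider. The endpoint URL, client identifier and redirect URI are hypothetical placeholders, not AAF values; the parameter names (response_type, client_id, redirect_uri, scope, state, nonce) are those defined by the OpenID Connect Core specification.

```python
import secrets
from urllib.parse import urlencode


def build_authorization_url(authorization_endpoint, client_id, redirect_uri,
                            scopes=("openid", "email")):
    """Build an OIDC authentication request URL (authorization code flow).

    Returns the URL plus the state and nonce values, which the application
    must keep (for example in the user's session) and check when the
    identity provider redirects the user back.
    """
    state = secrets.token_urlsafe(16)   # protects the redirect against CSRF
    nonce = secrets.token_urlsafe(16)   # binds the returned ID token to this request
    params = {
        "response_type": "code",        # ask for an authorization code
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": " ".join(scopes),      # "openid" must be present for OIDC
        "state": state,
        "nonce": nonce,
    }
    return f"{authorization_endpoint}?{urlencode(params)}", state, nonce


# Hypothetical values, for illustration only.
url, state, nonce = build_authorization_url(
    "https://idp.example.org/authorize",
    client_id="my-research-app",
    redirect_uri="https://app.example.org/callback",
)
print(url)
```

After the user authenticates, the provider redirects back to the redirect URI with a code and the same state value; the application then exchanges the code for tokens at the provider's token endpoint, a step not shown here.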

 

CONNECTING RESEARCHERS AND RESEARCH OUTPUTS – ORCID

As a key service provider for the national research infrastructure, the AAF also operates the Australian ORCID Consortium, which supports and connects our national research outputs. ORCID (Open Researcher and Contributor ID), a 16-digit unique identifier, is free for researchers and can be integrated with organisational systems to streamline reporting for both researchers and organisations. Over time, ORCID will help to drive research recognition and impact, and align with national reporting responsibilities.
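When an organisational system stores ORCID iDs, it can catch mistyped identifiers before contacting any ORCID service: the final character of an iD is a check character computed with the ISO 7064 MOD 11-2 algorithm, and may be "X" (representing the value 10). The sketch below shows that check; the function names are illustrative, and a valid checksum only shows that an iD is well formed, not that it is registered.

```python
def orcid_check_character(base_digits):
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_well_formed_orcid(orcid):
    """Return True if the iD has 16 characters and a correct check character."""
    compact = orcid.replace("-", "")
    if len(compact) != 16:
        return False
    if any(ch not in "0123456789" for ch in compact[:15]):
        return False
    return orcid_check_character(compact[:15]) == compact[15].upper()


# 0000-0002-1825-0097 is the sample iD used in ORCID's own documentation.
print(is_well_formed_orcid("0000-0002-1825-0097"))
print(is_well_formed_orcid("0000-0002-1825-0098"))
```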

[1] edugain.org


Biographies

Mr Bradley Beddoes | Technical Lead – Innovation, Software Development and Infrastructure

Bradley has worked on delivering secure, standards-focused web applications and redundant, scalable IT architectures for more than 11 years within Australia and abroad.

Bradley is responsible for the design and implementation of AAF software and system architectures. He leads the software development and operational infrastructure within the AAF.

Mr John Scullen | Manager, Strategic Initiatives

John joined AAF in February 2016 to lead the development of new processes and tools in the Next Generation AAF project. His role has since expanded to oversee the AAF’s project portfolio.

With more than 20 years’ experience in the higher education sector, John has a strong track record of successfully delivering large, complex projects and in developing innovative approaches to delivering technology solutions. John holds qualifications in information technology (BIT), business (MBA) and project management.

Mr Melroy Almeida | Technical Support Analyst

Melroy joined the AAF at the start of 2016 as an ORCID Technical Support Analyst. Melroy works with the support team who provide a high level of technical support to AAF subscribers and members of the Australian ORCID Consortium.

Melroy is an IT professional with formal qualifications in information technology (M.InfoTech) and engineering (B.Tech), and over 10 years’ experience delivering information technology solutions within the higher education sector.

 

 

A FAIR Data Platform to Support the Next Generation of Transdisciplinary Research

Lesley Wyborn1, Benjamin Evans1, Clare Richards1, Carina Wyborn2

1National Computational Infrastructure, Canberra, Australia, Firstname.Lastname@anu.edu.au
2Luc Hoffmann Institute, School of Forestry & Conservation, University of Montana, Montana, USA, cwyborn@wwfint.org

INTRODUCTION

Exciting opportunities have emerged to undertake new scientific research across multiple domains at scales and/or resolutions never before possible. This is due to several factors, in particular the growing alignment of large data with powerful, intelligent processing capability, and the emergence of large national facilities that are flexible enough to address the challenges of several domains simultaneously. As more data becomes accessible on these new research infrastructures, innovative ways of combining data and software are being trialled that were hitherto difficult because of the previously separated, facility-based approach to research infrastructure and data provision.

The National Computational Infrastructure (NCI) has championed a transdisciplinary approach to data, data services and analysis tools in order to offer significant opportunities for integrative research platforms across multiple domains. This work has demonstrated that integrating data of any type, from any source, across different scales is becoming a reality. The vision requires research domains to better address fundamental issues such as transforming data from incompatible formats; evaluating data for both direct access and access via network protocols; better defining their vocabularies, semantics and data structures; and updating their software to take advantage of these improvements.

Many of these time-consuming activities are beyond the reasonable timeframe of an individual research project or even a single research community. Instead, they require a concerted effort by national collaborations working within and across the domains to improve both the quality of the data from individual domains and the shared capability across domain silos.

A CASE STUDY: THE NCI HIGH PERFORMANCE DATA INTEROPERABILITY PLATFORM

NCI has assembled over 10 Petabytes of reference data collections that span the Earth System Sciences, Environmental Sciences, Climate and Weather, Geosciences, Astronomy, Genomics, and Social Sciences. These disparate collections have been increasingly harmonised under NCI’s National Environmental Research Data Interoperability Platform (NERDIP) [1]. The data are sourced from major government research agencies (Bureau of Meteorology, CSIRO, Geoscience Australia), as well as from national and international research institutions, and then transformed with the aim of providing an integrated platform for research across multiple domains.

The key to enabling the next generation of data-intensive transdisciplinary research is interoperability. In developing NERDIP, a range of standards-based data management policies and procedures were developed to organise the data and to establish a research environment and software practices that enable common access via both in-situ and remotely connected computer programs, while also supporting domain-specific requirements.

VARIOUS CATEGORIES OF INTEROPERABILITY

The terms intradisciplinary, multidisciplinary, crossdisciplinary, interdisciplinary, and transdisciplinary are often used loosely and interchangeably within the physical sciences. In contrast, in the social sciences, each of these terms has a more precise definition to categorise approaches to collaborative research projects and programs, so that individual researchers can improve common understandings (e.g., [2, 3]). At NCI, we are seeing various interactions within and between research groups and domains over data sharing, and this has led us to characterise a series of behaviours that relate to how separate groups working on the same project share and interface their data. We also see these terms as part of a spectrum that defines an evolutionary pathway of increasing complexity of data integration.

We propose to adapt/extend terms as used by the social sciences (e.g., [2, 3]) for research data integration as follows:

  • Intradisciplinary: Researchers work within a single discipline or data silo with all participants using the same standard and hence no reformatting or translation of data is required;
  • Multidisciplinary: Researchers from different discipline silos work together and share knowledge and results, but are not actually integrating at the data level – outputs are combined at the research paper/report level;
  • Crossdisciplinary: Researchers participating on a project to integrate data across the groups decide to reformat their datasets to a single agreed suite of specific standards and formats;
  • Interdisciplinary: Researchers from each domain integrate their data using customized brokers that cross walk between the different domain silos: the data of each participant remains unchanged in the back-end; and
  • Transdisciplinary: Data is born connected to international standards that enable online interaction across the discipline boundaries and beyond academia: researchers participate with stakeholders who can also contribute data.

Today, most projects conducting research that requires data integration across one or more domains are either crossdisciplinary or interdisciplinary integrations. However, as the number of participating research groups increases, it becomes apparent that these approaches are limited. Transdisciplinary data integration is clearly the way forward, but will depend on the adoption of international standards that increase data interoperability (e.g., W3C, OGC, IEEE), on groups supporting integration (e.g., Research Data Alliance) and on improvements in software re-usability.

APPLYING FAIR DATA PRINCIPLES: A PATHWAY TO TRANSDISCIPLINARY RESEARCH

Major international infrastructure investments have promoted the development of FAIR data. The FAIR guiding principles for Findable, Accessible, Interoperable and Reusable data publishing [4] were developed by the FORCE 11 community in 2016 to enable optimal use of research data across multiple stakeholders. The following describes how the FAIR principles have been applied at NCI:

  • Findable: The datasets on the NCI NERDIP have catalogue entries that are accessible via human and machine harvestable interfaces. The metadata standard used is conformant with the ISO 19115 standard for discovery of geospatial information and can be cross-walked with the RIF-CS profile of ISO 2146 used by ANDS Research Data Australia, as well as the Dublin Core and Data Catalog (DCAT) metadata standards used by data.gov.au. Conforming with multiple metadata standards and profiles significantly increases the discoverability of NCI datasets, both nationally and internationally.
  • Accessible: Datasets on the NCI platform are made accessible for general research access (e.g., data download for small file sizes), as well as being suitable for advanced techniques and multiple applications, including virtual laboratories, portals, common desktop tools, and programmatic access via well-known network protocols.
  • Interoperable: Wherever possible, international data standards for interoperability are applied, including metadata standards at both the data service and data levels; controlled vocabularies and interchangeable self-describing data formats (e.g., NetCDF4/HDF5); and access via network protocols and community standard APIs.
  • Reusable: Rigorous QA/QC procedures are used to validate the data against standards so that users are assured that the data can be accessed in consistent ways. The QA/QC validation also demonstrates that the data works across different (non-domain specific) packages, tools and programming languages deployed by the various user communities thus extending the use of the data across domain silos.
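The metadata cross-walking mentioned under Findable can be pictured as a field-by-field mapping from one standard to another. The following is a deliberately simplified sketch, not NCI's implementation: real ISO 19115 records are nested XML documents, and the flat field names and the mapping table below are illustrative assumptions only.

```python
# Hypothetical, flattened field names; an assumption made for illustration.
ISO19115_TO_DUBLIN_CORE = {
    "title": "dc:title",
    "abstract": "dc:description",
    "keywords": "dc:subject",
}


def crosswalk(record, mapping=ISO19115_TO_DUBLIN_CORE):
    """Translate a flat metadata record from one vocabulary to another.

    Returns the translated record and a sorted list of source fields that
    have no counterpart in the mapping, so that gaps are reported rather
    than silently dropped.
    """
    translated = {mapping[key]: value for key, value in record.items()
                  if key in mapping}
    unmapped = sorted(key for key in record if key not in mapping)
    return translated, unmapped


# Made-up record, for illustration only.
source = {
    "title": "Example gridded rainfall collection",
    "abstract": "An example dataset description.",
    "spatialResolution": "5 km",
}
translated, unmapped = crosswalk(source)
print(translated)
print(unmapped)
```

Reporting the unmapped fields matters in practice, because no two metadata standards cover exactly the same concepts and a crosswalk is usually lossy in at least one direction.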

CONCLUSION

At NCI we are steadily building a trustworthy, transdisciplinary High Performance Data Platform. Researchers are able to share, use and reuse significant data collections that were previously difficult to both discover and access. Users can access these datasets in a consistent manner, which supports cross-domain and discipline specific access. The data collections are suitable for use within a high-end computational and data-intensive environment with programmatic access enabling new analysis techniques while supporting access for more traditional analytical techniques.

Transdisciplinary research is increasingly required for high impact research. How quickly it can progress will depend on (1) national funding to support cross-domain infrastructure development, (2) the ongoing adoption and improvement of international standards to support such research, and (3) continual software improvements to take advantage of these infrastructures. In the interim, organisation of data around the FAIR principles [4] is enabling new and innovative data-intensive research. The NCI NERDIP infrastructure is a well-used example of this, freeing researchers from time-consuming data wrangling and thus enabling them to spend more effort on groundbreaking, integrative research.

REFERENCES

  1. The NCI National Environmental Research Data Interoperability Platform https://nci.org.au/services/vdi/nerdip/ accessed 30 June 2017.
  2. Stember, M., 1990. Advancing the Social Sciences Through the Interdisciplinary Enterprise. The Social Science Journal, 28, 1-14.
  3. Stock, P., and Burton, R.J.F., 2011. Defining Terms for Integrated (Multi-Inter-Trans-Disciplinary) Sustainability Research. Sustainability, 3, 1090-1113. doi:10.3390/su3081090
  4. The Force 11 FAIR data principles. Available from https://www.force11.org/fairprinciples, accessed 30 June 2017.

 


Biography

Lesley Wyborn is a geochemist by training and worked for BMR/AGSO/GA for 42 years in a variety of geoscience and geoinformatics positions. In 2014 she joined the ANU and currently has a joint adjunct fellowship with National Computational Infrastructure and the Research School of Earth Sciences. She has been involved in many NCRIS funded eResearch projects over the years. She is Deputy Chair of the Australian Academy of Science ‘Data for Science Committee’ and is co-chair of several RDA Interest Groups as well as a member of the AGU Earth and Space Science Executive Committee.

ARDC 2025: The Australian Research Data Cloud in 2025

Mr Peter Elford1, Dr Markus Buchhorn2, Dr Rhys Francis3

1AARNet, Yarralumla, Australia, Peter.Elford@AARNet.edu.au

2AeRO, Canberra, Australia, markus@aero.edu.au

3eResearch Futures, Melbourne, Australia, rhys_francis@icloud.com

DESCRIPTION

One of the major themes addressed at the 2017 AeRO Forum [1] was the Australian Research Data Cloud (ARDC) proposed in the (then draft) 2016 Research Infrastructure Roadmap [2]. This BoF is intended to provide an opportunity to continue the dialogue started at the Forum with a specific focus on collecting together expectations, concerns and opportunities for the ARDC. Led by members of the AeRO executive and others, the discussion is expected to explore how the ARDC could/should fit into the overall Australian e-Infrastructure landscape. There will be a particular focus on how the ARDC might interact with existing eResearch providers including NCI, Pawsey, AARNet, and the AAF, and with the universities, institutions and organisations that host the research communities we are collectively trying to serve. Attendees will be encouraged to consider what the ARDC will be like in 2025. It is hoped the BoF can focus on what might be possible rather than immediate next steps and its outputs might then assist the facilitation process identified in [2], and provide some pointers to how future investment will provide the greatest national research impact.

This BoF aligns to The Connected Researcher theme, but is likely to be of wide interest to all attendees.

[1] AeRO – Australian eResearch Organisations 2017 Forum, www.aero.edu.au/forum-2017/

[2] 2016 Research Infrastructure Roadmap, https://www.education.gov.au/2016-national-research-infrastructure-roadmap


Biography

Peter Elford manages AARNet’s relationships across a broad range of Federal and state government agencies, and AARNet’s engagement with the Australian research community. He is a strong and passionate advocate for the role Information and Communications Technology (ICT) plays in enabling globally collaborative and competitive research through ultra-high speed broadband connectivity.

ARDC Establishment

Mr Rob Cook1

1ARDC Establishment Team, Bardon, Australia

DESCRIPTION

The 2016 Research Infrastructure Roadmap set, as a Digital Data and eResearch Platform priority, the integration of the ANDS, Nectar and RDS projects into a single project, called the Australian Research Data Cloud (ARDC) in the Roadmap. As a result, a team has been formed to plan the ARDC establishment.

This session will be conducted by the ARDC establishment team to communicate timeframes and work undertaken to date, and to engage the eResearch sector in a discussion about the ARDC, its mission and the way that the mission might be accomplished.

Further detail will be provided as and when it becomes available.

The ARDC Establishment team envisages a 60 minute session at this stage.

REFERENCES

  1. Finkel A, 2016 Research Infrastructure Roadmap, Department of Education and Training

Biography

Rob Cook is a consultant to the ARDC Establishment team formed to guide this aspect of the Research Infrastructure Roadmap. Most recently Rob has been CEO of QCIF Limited, the Queensland-based eResearch services provider and part of the Nectar and RDS national research cloud and data storage. Rob operates Pangalax Pty Ltd, a consultancy firm in the field that has a number of major eResearch projects. Prior to that Rob was the chief executive of Astracon Inc. in Denver, CO and of CiTR Pty Ltd in Brisbane.

About the conference

eResearch Australasia provides opportunities for delegates to engage, connect, and share their ideas and exemplars concerning new information centric research capabilities, and how information and communication technologies help researchers to collaborate, collect, manage, share, process, analyse, store, find, understand and re-use information.
