Kate Snow1, Clare Richards2, Aurel Moise3, Claire Trenham4, Paola Petrelli5, Chris Allen6, Matthew Nethery7, Sean Pringle8, Scott Wales9, Ben Evans10
1Australian National University, Canberra, Australia, firstname.lastname@example.org
2Australian National University, Canberra, Australia, email@example.com
3Bureau of Meteorology, Melbourne, Australia, firstname.lastname@example.org
4Commonwealth Scientific and Industrial Research Organisation (CSIRO), Aspendale, Australia, email@example.com
5University of Tasmania and ARC Centre of Excellence for Climate Extremes, Hobart, Australia, firstname.lastname@example.org
6Australian National University, Canberra, Australia, email@example.com
7Australian National University, Canberra, Australia, firstname.lastname@example.org
8Australian National University, Canberra, Australia, email@example.com
9University of Melbourne and ARC Centre of Excellence for Climate Extremes, Melbourne, Australia, firstname.lastname@example.org
10Australian National University, Canberra, Australia, email@example.com
A major current focus of the Australian climate research community is preparing for and contributing to the World Climate Research Programme (WCRP) Coupled Model Intercomparison Project phase 6 (CMIP6). CMIP6 is an internationally coordinated research activity that provides climate model output from a series of carefully designed and targeted experiments. The analysis of CMIP6 data will form the basis for assessments by the Intergovernmental Panel on Climate Change (IPCC) and inform policy- and decision-makers around the world.
For Australia, CMIP6 will underpin research into historical climate variability, as well as projections of the timing, extent and consequences of future climate change and extreme events. This work may be used to help Australian government, business, agriculture and industry manage the risks and opportunities related to climate variability, change and extremes.
Climate research is computationally demanding and requires data-intensive High Performance Computing (HPC). More than 20 PBytes of CMIP6 data are expected globally, making it the largest collection of climate data ever produced. A substantial portion of this data will be made available and analysed at the National Computational Infrastructure (NCI). The complexity and volume of CMIP6 mean that data management is impossible without a national infrastructure approach and a deeply collaborative effort, and NCI is an essential component in realising climate research in Australia. The Climate DEVL addresses the software- and data-management aspects of these needs, while NCI and leaders from the climate community work to secure funding for the data storage infrastructure that the CMIP6 endeavour requires.
The Climate Data Enhanced Virtual Laboratory (DEVL) has focused on key components of the infrastructure needed to manage this massive data archive and make it accessible for CMIP6-based research in Australia. It builds on previous Australian e-infrastructure programs: the Climate & Weather Science Lab, and the National Earth Systems Data Collection and Data Services programs. It also supports NCI’s leading role in international collaborations, most notably the Earth System Grid Federation (ESGF), which provides the international federated capability for CMIP data. Sustaining this work over a long period has required funding from various parties, including the NCRIS programs ANDS, RDS and NeCTAR. This infrastructure has directly supported other major government-funded research investments, including CAWCR (Collaboration for Australian Weather and Climate Research), NESP (National Environmental Science Program), the ARC CoE for Climate System Science (ARCCSS) and the ARC CoE for Climate Extremes (CLEX).
The climate data at NCI is provided following the FAIR principles: Findable, Accessible, Interoperable and Reusable. Providing a FAIR data service for such a large and complex data collection exposes significant data management challenges. NCI’s Data Quality Strategy (DQS) delivers data curation practices that support FAIR standards and interdisciplinary data availability. This service permits streamlined access to and analysis of CMIP6 data, enabling efficient, state-of-the-art climate science research.
The unique challenges of CMIP, in both size and complexity, have required new services to be developed and then made available as well-managed operational services. The Climate DEVL has defined and developed mechanisms for improving the accessibility and usability of the data. One example is the need to find what data is available at NCI for use in analysis. This need has been addressed through NCI’s Metadata Attribute Search (MAS). MAS provides consistent access to the information contained in the climate data collections by harvesting the metadata within the millions of self-describing files that constitute the CMIP data collection. MAS also underpins a Python-based API called CleF, developed by ARCCSS/CLEX, which provides command-line search tools for accessing this data. CleF also gives researchers an easy interface to the ESGF search API, so they can discover which CMIP data matching their specified requirements (experiment, variable, etc.) has been published but is not yet available at NCI. The tool will be extended to let users then submit a data download request to add that data to the NCI CMIP6 replica service.
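The "published but not yet available at NCI" check described above can be illustrated with a minimal Python sketch. This is not CleF's implementation. The endpoint URL, the function names `build_esgf_query` and `missing_locally`, and the placeholder dataset identifiers are all assumptions made for illustration. The facet names (`project`, `experiment_id`, `variable_id`) follow the publicly documented ESGF search interface. No network request is made.

```python
from urllib.parse import urlencode

# Assumed ESGF index node; any federation node exposing the search
# service could be substituted here.
ESGF_SEARCH_URL = "https://esgf-node.llnl.gov/esg-search/search"


def build_esgf_query(base_url=ESGF_SEARCH_URL, **facets):
    """Build an ESGF search URL for CMIP6 datasets matching the given facets."""
    params = {
        "project": "CMIP6",
        "type": "Dataset",
        "format": "application/solr+json",
        "limit": 100,
    }
    params.update(facets)
    # Sort the parameters so the generated URL is deterministic.
    return f"{base_url}?{urlencode(sorted(params.items()))}"


def missing_locally(published_ids, local_ids):
    """Return dataset IDs that are published but absent from the local replica."""
    return sorted(set(published_ids) - set(local_ids))


# Hypothetical identifiers standing in for real ESGF dataset IDs.
published = ["dataset-A", "dataset-B", "dataset-C"]
local = ["dataset-B"]

url = build_esgf_query(experiment_id="historical", variable_id="tas")
to_request = missing_locally(published, local)
```

In practice the published list would come from the ESGF response and the local list from a MAS query. The set difference is the candidate list for a download request.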
Another aspect of the Climate DEVL has been to foster a community approach to defining the highest-priority CMIP6 data to be replicated in Australia for local analysis. This permits the timely development and publication of scientific research papers that analyse the CMIP6 data as it becomes available. The DEVL also supports the evaluation of various model analysis tools, which gives the community an opportunity to develop standardised data analysis workflows that contribute to these research papers.
The Climate DEVL also provides a home for coordinating the ongoing development and availability of the training materials necessary for a streamlined user experience. The extensive knowledge and interdisciplinary topics that CMIP spans mean that effective training is needed, including face-to-face tutorials, online self-paced learning materials, and trainer training. The combined effort of NCI, CLEX, CSIRO and the Bureau of Meteorology (BoM) permits such collaborative training to benefit the entire Australian climate science community.
Dr Kate Snow: I began at the National Computational Infrastructure (NCI) at the Australian National University in November 2017 as a Research Data Management Specialist. Prior to my position at NCI, I completed a PhD in physical oceanography at the ANU and a two-year post-doc position researching Antarctic ice-sheet dynamics at Edinburgh University, Scotland. At NCI, I apply my research skills from the climate sciences to help inform data management practices that benefit climate research in Australia. My current role focuses on providing the support, tools and infrastructure to manage the Coupled Model Intercomparison Project phase 6 (CMIP6) data, giving Australian climate scientists the capabilities to undertake state-of-the-art climate science.