A Cloud Native Framework for Efficient Data Management in the Great Barrier Reef Conservation


Eduardo Klein Salas1, Nick Forbes-Smith, Leo Ghignone, Julia Yun, Michaela Miller

1Integrated Marine Observing System (IMOS), Hobart, Tasmania, Australia

Abstract

A fit-for-purpose data management system is critical to the success of the Reef 2050 Integrated Monitoring and Reporting Program (RIMReP) and to Great Barrier Reef (GBR) management. The program draws on more than 160 datasets from more than 60 providers, and their varying complexity and maturity make it challenging to organise and deliver usable data and metadata to the user.

We developed a cloud-native infrastructure that handles Analysis-Ready, Cloud-Optimised data using open standards and open-source technologies. A set of data-agnostic harvesting pipelines collects data from the providers and transforms it into Zarr (for gridded data) or GeoParquet (for tabular data). The collections are exposed through a data API (pygeoapi) and a metadata API (stac-fastapi). Users can discover and explore the data with keyword, geographical and temporal filters, either in the metadata catalogue browser or through the API. For each collection, the catalogue provides a link to the data through the data API, the collection's URI in an S3 bucket for direct access, and a link to the metadata point of truth.
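As an illustration of the spatial and temporal filters mentioned above, the following minimal sketch builds a search request for a STAC API using the standard `collections`, `bbox`, `datetime` and `limit` query parameters. The endpoint URL, the collection identifier and the bounding box (a rough envelope of the GBR) are assumptions made for the example and are not taken from the system described here.

```python
from urllib.parse import urlencode


def build_stac_search_url(base_url, collections=None, bbox=None,
                          start=None, end=None, limit=10):
    """Build a GET /search URL for a STAC API.

    bbox is (west, south, east, north) in decimal degrees.
    start and end are RFC 3339 timestamps; a missing bound is sent
    as "..", which the STAC API reads as an open-ended interval.
    """
    params = {"limit": str(limit)}
    if collections:
        params["collections"] = ",".join(collections)
    if bbox is not None:
        params["bbox"] = ",".join(str(v) for v in bbox)
    if start or end:
        params["datetime"] = f"{start or '..'}/{end or '..'}"
    # Keep commas, slashes and colons unescaped so the URL stays readable.
    query = urlencode(params, safe=",/:")
    return f"{base_url.rstrip('/')}/search?{query}"


# Hypothetical endpoint and collection id, for illustration only.
url = build_stac_search_url(
    "https://stac.example.org",
    collections=["sea-surface-temperature"],
    bbox=(142.0, -25.0, 154.0, -10.0),
    start="2020-01-01T00:00:00Z",
    end="2020-12-31T23:59:59Z",
)
print(url)
```

The response to such a request is a GeoJSON feature collection of matching items, from which a client would follow the asset links to reach the data itself.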

Computing water temperature averages on a 30-year dataset of more than 150 million rows, with records for more than 300 sites in the GBR, takes less than 20 s. Similarly, extracting a time series for a single pixel across more than 18,000 layers of satellite-derived sea surface temperature takes less than 12 s when accessing the Zarr collection from an AWS S3 bucket.
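One reason a single-pixel extraction from a chunked store can be fast is that only the chunks intersecting the pixel need to be fetched. The sketch below counts those chunks for two chunk layouts. The grid size and both chunk shapes are hypothetical, chosen only to show the effect; the chunking actually used for the collection is not stated here.

```python
def chunks_touched(selection, chunk_shape):
    """Count the chunks of a regularly chunked array that intersect a selection.

    selection: one half-open (start, stop) index range per dimension.
    chunk_shape: chunk length along each dimension.
    """
    total = 1
    for (start, stop), size in zip(selection, chunk_shape):
        first = start // size        # index of the first chunk touched
        last = (stop - 1) // size    # index of the last chunk touched
        total *= last - first + 1
    return total


# Hypothetical layout: 18,000 time layers on a 1,000 x 1,000 grid,
# with dimensions ordered (time, y, x).
n_time = 18_000
pixel = [(0, n_time), (500, 501), (250, 251)]  # full time axis, one pixel

# Chunks that are long in time and small in space suit time-series extraction.
time_optimised = chunks_touched(pixel, chunk_shape=(1_000, 50, 50))

# One chunk per layer suits map display, but every layer must be read.
map_optimised = chunks_touched(pixel, chunk_shape=(1, 1_000, 1_000))

print(time_optimised, map_optimised)  # 18 18000
```

Under these assumed layouts the same query reads 18 objects in one case and 18,000 in the other, which is why chunk shape is usually chosen with the dominant access pattern in mind.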

Providing a single access point that delivers the most relevant data for the GBR in standard formats eliminates "data fishing" and significantly reduces data wrangling. The proposed system is expected to substantially ease the creation and management of decision support systems.

Biography

With over 30 years of experience in tropical marine ecology and conservation, Eduardo Klein has worked extensively in the Caribbean region and, more recently, in the GBR. He has established strong connections with multilateral organisations that specialise in biodiversity conservation and data, such as OBIS and the marine program of the Convention on Biological Diversity. He currently leads the Reef Data Services & Infrastructure program at IMOS.
