Simon Cox1, Erin Robinson2, Adrian Burton3, Ben Evans4, Lesley Wyborn5, Tim Rawling6
1CSIRO Land and Water, Melbourne, Australia, simon.cox@CSIRO.au
2Earth Science Information Partners, Boulder, USA, email@example.com
3Australian Research Data Commons, Canberra, Australia, firstname.lastname@example.org
4National Computational Infrastructure, Canberra, Australia, email@example.com
5National Computational Infrastructure, Canberra, Australia, firstname.lastname@example.org
6AuScope Ltd, Melbourne, Australia, email@example.com
Governments in Australia, the USA and Europe are investing significantly in solid Earth and environmental science eResearch infrastructures to facilitate the next generation of transdisciplinary research and to address pressing geoscience and environmental science issues within the constraints of social impacts and sustainable development. These eResearch investments are developing best practices for both cyberinfrastructure development and data issues such as data management/stewardship, vocabularies and common data services. In recent years there has also been an additional emphasis on publicly funded data collection and software development projects being in line with the FAIR principles (Findable, Accessible, Interoperable and Reusable) (Wilkinson et al., 2016). The FAIR principles are not easy to implement, particularly those for interoperability.
Fortunately, the rules for almost all research funding schemes are that developments, wherever possible, will be open source and that access will primarily be on merit. These rules create an ideal environment to share any developments in software, tools, data services, vocabularies, etc., particularly as general solid Earth and environmental science data and supporting infrastructures have many common and exploitable patterns that cross institutional, community, national and continental boundaries.
Some of the existing major solid Earth and Environmental eResearch Infrastructure initiatives in Australia, USA and Europe include:
- AuScope, an Australian National Collaborative Research Infrastructure Strategy (NCRIS) funded capability which provides research infrastructure to the Australian solid Earth science communities;
- The Australian Integrated Marine Observing System (IMOS) NCRIS capability has a portfolio of ten facilities that undertake systematic and sustained observing of Australia’s vast marine areas. Any data collected by these facilities can be discovered, accessed, and downloaded via the Australian Ocean Data Network;
- The Australian Terrestrial Ecosystems Research Network (TERN) NCRIS capability provides open access for researchers to Australia’s land-based ecosystem monitoring infrastructure, data and research tools, thereby contributing to a broader understanding and long-term sustainable management of Australia’s ecosystems across three key themes of measurement: biodiversity, carbon & water, and land & terrain;
- The National Computational Infrastructure (NCI), partly funded by NCRIS, with operational funding provided through a formal collaboration with CSIRO, the Bureau of Meteorology, The Australian National University, Geoscience Australia, the Australian Research Council and others. NCI has built a major integrated research data platform (10+ PBytes) of national reference data collections spanning climate, coasts, oceans, and geophysics through to astronomy, bioinformatics, and the social sciences domains. This platform is connected to a 1.7 PFlop HPC and services managed on tightly-integrated high-performance cloud infrastructure to support the next generation of data-intensive science;
- The Australian Research Data Commons (ARDC), established in 2018, is building on and strengthening the work of ANDS, Nectar, and RDS that supported Australian researchers across multiple domains with the Data Enhanced Virtual Laboratory and Research Data Cloud programs. ARDC will focus on partnering with other capabilities to improve modern data-intensive, cross-disciplinary research in Australia within the context of global collaborative research. ARDC has a particular focus on sector-wide collaborative action, information sharing and community building;
- Earth Science Information Partners (ESIP), a US-based independent forum for the Earth science data and technology communities which has built a community of practice in the USA, supported by NASA, NOAA and the USGS, to address topics such as data stewardship, data citation and documentation. It has become a brains trust and professional home for the Earth science data and informatics community, where both peer-led education and training and the co-development of conventions, practices and guidelines have helped make Earth science data more interoperable;
- EarthCube was initiated by the US National Science Foundation (NSF) in 2011 to transform geoscience research by developing cyberinfrastructure to improve access, sharing, visualization, and analysis of all forms of geosciences data and related resources. As a community-governed effort, EarthCube’s goal is to enable geoscientists to tackle the challenges of understanding and predicting complex and evolving solid Earth, hydrosphere, atmosphere, and environment systems. An important project is the EarthCube Council for Data Facilities, which seeks increased coordination, collaboration, and innovation in the acquisition, curation, preservation, and dissemination of geoscience data, tools, models, and services across existing and emerging geoscience data facilities;
- The European Plate Observing System (EPOS) is a European Union (EU) Horizon 2020 research and innovation program which supports integrated use of data products and facilities from distributed research infrastructures for European solid Earth science. EPOS brings together Earth scientists, national research infrastructures, ICT experts, decision makers, and the public to develop new concepts and tools for addressing questions concerning geo-hazards and those geodynamic phenomena relevant to the environment and human welfare; and
- ENVRIplus, also an EU Horizon 2020 project, which brings together environmental and Earth system research infrastructures, projects and networks together with technical specialist partners to create a more coherent, interdisciplinary and interoperable cluster of environmental research infrastructures across Europe.
There are many parallels across these nine existing eResearch infrastructure initiatives, but currently they are somewhat disconnected. Each is focused more on national/continental scale issues, in part because most funding initiatives are nationally generated. Clearly there are common technological and science challenges that each is trying to solve in isolation, and although standards, vocabularies, formats, etc., are cohesive within each community, there are sufficient differences between communities to make it hard to integrate data across them.
The time is ripe to synchronise efforts to create globally connected networks of solid Earth and environmental science data, information infrastructures, software and researchers. The goal is to make scarce eResearch funding more effective by reducing duplication, increasing efficiency, and promoting partnerships and adoption across communities, initially within the solid Earth and environmental sciences and then potentially in other domains. Already there are embryonic proposals to create integrated international networks to coordinate and harmonize these efforts. If Australia is to be part of the trend towards globalisation of high-quality solid Earth and environmental research projects, then it needs to be a key collaborator in their development to ensure Australian perspectives are included.
The recently formed ESIP/RDA Earth Space and Environmental Sciences Interest Group is already starting to coordinate and harmonize efforts across the international solid Earth and environmental research community. In Australia, an associated satellite activity, the Australian ESIP downunder (E2SIP), has formed an ESIP cluster in collaboration with the National Earth and Environment Sciences Facilities Forum.
An additional consideration is how to extend efforts from the research sector into government and industry initiatives and create a truly global network of solid Earth and environmental science data infrastructures to underpin fundamental research into global geoscience processes within the context of societal impacts and sustainable development. Currently many government/industry initiatives are poorly connected to equivalent activities in the research sector.
- Wilkinson, M.D., Dumontier, M., Aalbersberg, IJ.J., Appleton, G., Axton, M., Baak, A., Blomberg, N., Boiten, J-W., Silva Santos, L.B. da, Bourne, P.E., Bouwman, J., Brookes, A.J., Clark, T., Crosas, M., Dillo, I., Dumon, O., Edmunds, S., Evelo, C.T., Finkers, R., Gonzalez-Beltran, A., Gray, A.J.G., Groth, P., Goble, C., Grethe, J.S., Heringa, J., Hoen, P.A.C. ’t, Hooft, R., Kuhn, T., Kok, R., Kok, J.N., Lusher, S.J., Martone, M.E., Mons, A., Packer, A.L., Persson, B., Rocca-Serra, P., Roos, M., Schaik, R. van, Sansone, S-A., Schultes, E., Sengstag, T., Slater, T., Strawn, G., Swertz, M.A., Thompson, M., Lei, J. van der, Mulligen, E. van, Velterop, J., Waagmeester, A., Wittenburg, P., Wolstencroft, K.J., Zhao, J., Mons, B. (2016). The FAIR Guiding Principles for scientific data management and stewardship. Scientific Data 3. https://doi.org/10.1038/sdata.2016.18. Accessed 18 August 2018.
Simon has been researching standards for publication and transfer of earth and environmental science data since the emergence of the world wide web. Starting in geophysics and mineral exploration, he has engaged with most areas of environmental science, including water resources, marine data, meteorology, soil, ecology and biodiversity. He is principal- or co-author of a number of international standards, including Geography Markup Language, and Observations & Measurements. The value of these is in enabling data from multiple origins and disciplines to be combined more effectively, which is essential in tackling most contemporary problems in science and society. His current work focuses on aligning science information with the semantic web technologies and linked open data principles, and the formalization, publication and maintenance of controlled vocabularies and similar reference data.
Dr Cox is the author of over 40 journal articles, 30 technical specifications and international standards, and 150 conference papers.