How and where are you preserving your raw machine observable data for future science?

Dr Angus Nixon1,2, Dr Bryant Ware2,3, Prof Lesley Wyborn4,5

1Department of Earth Science, The University of Adelaide, Adelaide, Australia, 2AuScope Geochemistry Network, Australia, 3John de Laeter Centre, Curtin University, Bentley, Australia, 4Australian National University, Canberra, Australia, 5Australian Research Data Commons, Australia

Biography:

Angus completed a PhD in Geology examining the low-temperature evolution of northern Australia and the relationship between plate boundary tectonics and structural and chemical signatures in the continental interior. He subsequently joined the AuScope Geochemistry Network (AGN) and is currently an EarthBank Fellow at the University of Adelaide and co-chair of the OneGeochemistry initiative. His work includes establishing standards in areas such as fission track thermochronology, inorganic geochemistry, and beta-decay geochronology systems, as well as integrating data infrastructure with research community needs and workflows.

Abstract:

Across a wide range of scientific disciplines, ‘next generation’ instrumentation now routinely collects data at up to 16 times current data volumes. These data span a wide array of data types (including images), acquisition methods, analytical scales, sample or observed materials, and research applications. While many disciplines are developing best-practice reporting schemas and vocabularies for data outputs, these are typically restricted to processed data most suitable for end-user interaction, application and publication. Primary Observable Datasets (PODs), the fundamental observations and analyses underpinning the published data, generally remain inaccessible and unlinked to the published outcome, limiting opportunities for future repurposing or recalculation with evolving techniques and knowledge. This session will explore how frequently PODs remain uncaptured across disciplines, what forms of PODs are generated, and whether existing systems for capturing PODs may inform or be translated to other domains. We invite a diverse audience from across the research ecosystem to attend and contribute to the discussion: researchers who utilise the final data products (are there instances where PODs have been required, yet were unavailable or lost?), analytical facilities which generate data, databases and repositories, and research or data communities and infrastructure. The session will be conducted as an open-floor discussion with attendees, led by representatives of AuScope, Atlas of Living Australia, and the Australian Plant Phenomics Network. Key takeaways will inform data infrastructures on the state of POD storage, necessity and complexity across disciplines, and how this may be addressed to benefit the community.

 
