The Macro-View: Research Data and Researcher Files

Ai-Lin Soo¹, Dr Steve Quenette¹, Dr Rhys Francis³, Dr Max Wilkinson², Dr Nichola Burton²

¹Monash University, ²Australian Research Data Commons, ³Australian BioCommons

The Research Data Culture Conversation (RDCC) is an ongoing discussion among Monash University, the University of Melbourne, the University of New South Wales, the University of Queensland and the University of Sydney that aims to understand and improve research data culture. In 2021 the RDCC constructed a “Macro View” of their research data holdings (https://doi.org/10.26180/20235570.v1).

The primary findings of this first attempt at a Macro View were:

– The five universities held a total of 72 petabytes (PB) of unique research data in 2021

– Data holdings showed a compound growth rate of ~31% over the last six years

– Extrapolating to all universities, the 2021 volume of unique data is estimated at between 137 and 176 PB

In 2022 the RDCC, in partnership with the ARDC Data Retention and Institutional Underpinnings programmes, is extending the Macro View to more of Australia’s research data landscape, including national research infrastructures, CSIRO, Medical Research Institutes and a broader set of universities. That work is revealing a key distinction that may exist between the approach to retaining and curating researcher files and the approach to retaining and curating research data.

A quick recap of the Macro View will be given, then the floor will be opened to topics arising, such as:

  1. When and how do researcher files become research data?
  2. What are the key differences between curating files and curating data?
  3. How do research repositories and services such as FigShare compare on this topic?
  4. If data curation is content- or domain-specific, can institutions curate data independently?

Biography:

Ai-Lin Soo

Ai-Lin Soo has a background in Commerce, with a focus on sustainability and Biomedical Science, and is currently part of the Monash eResearch Centre team at Monash University. Ai-Lin is involved in a number of cross-cutting initiatives within the Centre, applying her project management skills to a diverse range of projects, from marketing and communications to research data management. Ai-Lin is also heavily involved in the Research Data Culture Conversation and manages the Monash University response to the ARDC Data Retention and Institutional Underpinnings programmes.

Nichola Burton Bio:

Dr Nichola Burton has a research background in quantitative psychology and now works as a Data Technologist in the ARDC Expertise team, coordinating activities and providing guidance around the management of sensitive data in research. Nichola manages the ARDC’s Institutional Underpinnings program.

Rhys Francis Bio:

Rhys has contributed to Australian eResearch through many activities, including developing the initial investment plan in eResearch for the National Collaborative Research Infrastructure Strategy, proposing the eResearch investments in the scale-up created by the Super Science Initiative, and serving as the Executive Director of the Australian eResearch Infrastructure Council for seven years. Since retiring he has developed a revised eResearch Framework for government, helped the University of Melbourne develop its Petascale Campus Initiative and facilitated the Research Data Culture Conversation among research-intensive universities. Today Rhys has returned to work as part of the team developing the Australian BioCommons.

Max Wilkinson Bio:

Max has a comprehensive background in research data management, research data governance and research infrastructure operations. Over the last three years he has established the Australian Research Data Commons Data Retention Project, which is designing scalable and sustainable investment models for nationally significant research data collections. Prior to this, he worked with New Zealand eScience Infrastructure (NeSI), the Council of New Zealand University Librarians (CONZUL) and MBIE. He worked in the UK for two decades, most recently as Head of Research Data and Network Services at University College London, Datasets Programme Manager at the British Library and Informatics Coordinator at Cancer Research UK. He received his PhD in Molecular Nephrology from UCL in 2003.
