Data or Debris?

Ai-lin Soo1, Dr. Rhys Francis2, Dr. Claire Rye3, Dr. Isabel Ceron4, Dr. Frankie Stevens5, Mr. Luc Betbeder-Matibet1

1UNSW Sydney, 2eResearch Futures, 3NeSI, 4Australian Academy of Social Science, 5eResearch Services, Tweed Heads Australia

Biography:

Rhys was an academic researcher in parallel and distributed computing through the 1980s. From 1990 through to 2005, his roles extended into strategic leadership in information and communication technologies for the CSIRO. From 2006, Rhys facilitated the development of a national investment plan in eResearch infrastructure for the Australian Government’s National Collaborative Research Infrastructure Strategy that shaped the foundations of the national e-infrastructure landscape visible in Australia. Today Rhys is part of the team developing the Australian BioCommons, which is accelerating the adoption of digital technology in Australian life science research; he also facilitates the Research Data Culture Conversation (researchdataculture.org).

Abstract:

Over the last three years, workshops by AeRO and at eResearch Australia have explored challenges in data stewardship and the data volume under management. During these workshops, the concept of ‘data debris’ emerged as a new element in our research data management narrative.

Of course, as knowledge advances, we expect a steady accretion of debris in the form of incorrect, revised or surpassed ideas, methods, experiments, techniques, instruments and papers. Can data be immune to this overarching reality? We could reasonably assume that some data will have sustained value (just as some ideas do) but the remainder will become debris; some quickly, some more slowly.

A panel will briefly outline how the ‘debris’ concept evolved, the reality of debris in research, the reasons data becomes debris, examples of data that isn’t debris, and relevant measurements of our ever-expanding data holdings.

BoF attendees will be invited to share their experience of the two primary outcomes for data (permanence or debris) and to suggest ways in which data destined for either track could be identified earlier in its life cycle.

We know that our institutions hold a growing corpus of uncurated and little-used data, a significant fraction of which is likely to be debris. And yet our community’s conversation about that data is only about its permanence. We seek to begin changing that conversation.

Everyone who took part in previous workshops, and everyone keen to focus our limited curation capacity on valuable data, is invited to attend.

 
