Journeying towards digital asset nirvana in CSIRO with scientists – data management challenges and tales from the trenches

Dr Jonathan Yu1, Dr David Lemon2, Mr Peter Fitch2, Mr Paul Box4, Dr Simon Cox1, Mr Benjamin Leighton1, Mr Andrew Freebairn2, Mr Ashley Sommer3, Mr Matthew Stenson3

1CSIRO, Clayton, Australia,

2CSIRO, Black Mountain, Australia,

3CSIRO, Brisbane, Australia,

4CSIRO, North Ryde, Australia

 

CSIRO Land and Water (CSIRO L&W) has been on a journey in search of ‘digital asset nirvana’. There is increasing recognition of the complex, fast-changing digital landscape and its influence on how science is undertaken today. Data used in scientific analyses are increasing in volume and complexity, and managing them across projects has become a key challenge across the organisation. Science workflows are also increasingly required to be agile, repeatable, reproducible, and reusable across projects and initiatives globally. Researchers and management therefore want to realise the value of the digital assets (internal and external) created and used for research projects in CSIRO L&W (see Figure 1).

Figure 1. Realising value from L&W Digital Assets

Digital asset management in CSIRO provides a particular context for scientific data management:

  • Scientific data is generated in the context of time-bound projects, most of which are not part of ongoing programs like those undertaken by agencies such as GA, BoM and ABS
  • Most of CSIRO’s proximate customers, including those from the private sector, tend not to be part of ‘the traditional research community’ at large
  • CSIRO provides tools for staff and partners to publish data via a CSIRO institutional data repository called the Data Access Portal (DAP). DAP is positioned primarily for publishing data for public access; however, it also supports other access arrangements. Challenges remain around managing the data lifecycle of internal research projects and around providing research project officers with sufficient incentives, know-how and tools, at low enough cost, for researchers to push data to DAP if and when appropriate.

Overall, these factors have contributed to a culture in which systematic management of scientific data assets has not been a high priority for most of the researchers who generate digital assets.

In this presentation, we share lessons from challenges and successes, recognising that the CSIRO L&W journey is complex and multi-dimensional: social, technical and informational. We discuss the influence of social architectures [1, 2] and their application in the CSIRO L&W journey. A key outcome was the establishment of a data council in L&W called the Digital Asset Management Committee (DAMC). DAMC has been designed as a standing committee that provides recommendations on digital asset management initiatives across projects. Through a project called Project DAMbusters, DAMC has enabled the CSIRO L&W unit to develop solutions for the priority areas that DAMC identified. Figure 2 describes the roles of DAMC and DAMbusters and their interaction with L&W staff, external partners, and the L&W Leadership Team (LWLT). We present informational and technical implementations developed via the DAMbusters project, such as a digital asset registry based on a customised CKAN implementation [3], which improves discovery of and access to digital assets and supports an audit capability across multiple sources. We also discuss future directions and next steps in CSIRO, and potential opportunities to collaborate with the broader eResearch community.

Figure 2. CSIRO L&W’s Digital Asset Ecosystem
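
As an illustration only, the kind of audit that a CKAN-based registry could support might be sketched as follows. This is a minimal sketch and not the DAMbusters implementation: the registry base URL, the list of required fields and the `source` field are all assumptions made for the example. Only the `package_search` endpoint of the CKAN Action API and the standard package fields `name`, `title`, `notes` and `license_id` come from CKAN itself.

```python
from collections import Counter
from urllib.parse import urlencode

# Hypothetical audit rule: metadata fields every registry record should fill in.
REQUIRED_FIELDS = ("title", "notes", "license_id")


def search_url(base_url, query, rows=100):
    """Build a CKAN Action API package_search URL (base_url is an assumption)."""
    params = urlencode({"q": query, "rows": rows})
    return f"{base_url.rstrip('/')}/api/3/action/package_search?{params}"


def audit(records):
    """Count records per source and list records missing required metadata."""
    per_source = Counter()
    incomplete = {}
    for rec in records:
        # 'source' is a hypothetical custom field naming the originating system.
        per_source[rec.get("source", "unknown")] += 1
        missing = [field for field in REQUIRED_FIELDS if not rec.get(field)]
        if missing:
            incomplete[rec["name"]] = missing
    return dict(per_source), incomplete


# Made-up records shaped like CKAN package dictionaries.
sample_records = [
    {"name": "soil-grid", "title": "Soil grid", "notes": "National soil grid",
     "license_id": "cc-by", "source": "DAP"},
    {"name": "stream-gauge", "title": "Stream gauge readings", "notes": "",
     "license_id": None, "source": "project-share"},
    {"name": "landcover", "title": "Land cover", "notes": "Classified imagery",
     "license_id": "cc-by"},
]

per_source, incomplete = audit(sample_records)
print(per_source)
print(incomplete)
```

In practice the records would come from the `result.results` list returned by a `package_search` call against the registry, rather than from hard-coded samples.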

References

  1. Box, Paul. Social Architecture: cultivating environmental data ecosystems. In: Jens Klump, Natalia Atkins, Nicholas Car, Simon Cox, et al., editors. Linking Environmental Data and Samples; 29 May – 2 June 2017; CSIRO Black Mountain, Canberra. CSIRO; 2017. 38-39.

  2. Box, Paul; Lemon, David. The Role of Social Architecture in Information Infrastructure: A report for the National Environmental Information Infrastructure (NEII). NEII Website – neii.gov.au: CSIRO; 2015. csiro:EP152134. https://doi.org/10.5072/83/5849a28b08365

  3. The CKAN Project, http://ckan.org, accessed 22/6/2018.

Biography:

Dr Jonathan Yu is a data scientist who researches information and web architectures, data integration, Linked Data, data analytics and visualisation, and applies his work in the environmental and earth sciences domain. He is part of the Environmental Informatics group in CSIRO Land and Water. He currently leads a number of initiatives to develop new approaches, architectures, methods and tools for transforming and connecting information flows across the environmental domain and the broader digital economy, within Australia and internationally.

About the conference

eResearch Australasia provides opportunities for delegates to engage, connect, and share their ideas and exemplars concerning new information-centric research capabilities, and how information and communication technologies help researchers to collaborate, collect, manage, share, process, analyse, store, find, understand and re-use information.
