Australian Data Archive as a Trusted Digital Repository, The Data Seal of Approval in the Australian Context

Dr Heather Leasor1, Dr Steven McEachern2

1Australian Data Archive, The Australian National University, Canberra, Australia, heather.leasor@anu.edu.au

2Australian Data Archive, The Australian National University, Canberra, Australia, steven.mceachern@anu.edu.au

 

INTRODUCTION

Data archives and funding agencies are increasingly interested in the certification of data archives and repositories as “trusted digital repositories”. This trustworthiness can encompass the ‘FAIR’ principles, but how do you ensure that ‘FAIR’ data comes from a trusted source? There has been recent interest in Australia in understanding certification models for Australian archives and repositories. In the social sciences, the Data Seal of Approval (DSA) established by DANS was the standard for certification until recently. A recent convergence among certification bodies towards a more coherent framework for certification has led the DSA and the International Council for Science World Data System (ICSU-WDS) to establish a Partnership Working Group within the Research Data Alliance. They have aligned their guidelines to create a single base level of certification [1]. The European Framework for Audit and Certification (EFAC) of Digital Repositories incorporates three levels of certification, which are detailed in Table 1; it is now used as the basis for certification within the Consortium of European Social Science Data Archives (CESSDA).

Table 1: Types of Assessments for Trusted Digital Repositories [2]

                      | Core | Extended | Formal
Standard              | Data Seal of Approval/ICSU-WDS | DIN 31644 | ISO (International Organization for Standardization)
Number of guidelines  | 16 | | 100+
Type of certification | Free self-assessment evaluated by 2 assessors and a board review | Paid assessment | Paid assessment

 

The Australian Data Archive (ADA) was established in 1981 to provide a national service for the collection and preservation of digital research data and to make these data available for secondary analysis by academic researchers and other users [3]. The ADA holds over 5000 datasets from around 1500 studies sourced from the academic, government and private sectors. The ADA has a long-standing relationship of trust with its users and depositors, but this trust has never been backed by a formal process or certification. The Australian National Data Service funded the ADA to undertake the DSA, to evaluate it against Australian requirements, and to implement selected changes to ADA policies, procedures and systems to enable DSA certification. This paper provides an overview of the process and outcomes of this project, the key findings of the evaluation of the EFAC model, and recommendations both for the DSA and for Australian and international repositories interested in certification.

PROCESS & OUTCOME

The Australian Data Archive undertook a self-review based on the 16 guidelines of the newly combined DSA/ICSU-WDS Data Seal of Approval. Initially we assessed the changes from the DSA to the combined DSA/ICSU-WDS guidelines and found that some of the new criteria were not represented in the former DSA assessment. We then carried out a self-assessment against the guidelines, which placed the ADA at compliance level 4 for 12 of the 16 guidelines and at level 3 for the remaining 4. The review process formerly had only one assessor, but the combined scheme now requires two assessors and a board review. One of the assigned reviewers believed that some of the self-assessed levels were not accurate and requested more information. The final outcome of the assessment was not complete at the time of submission of this extended abstract.

ASSESSING THE DSA/ICSU-WDS IN AUSTRALIA

The ADA was one of the first archives globally to undertake the combined DSA/ICSU-WDS certification, which left few completed examples on which to base our self-assessment.

The conclusions we have drawn from the criteria are not necessarily specific to Australia but reflect the diversity of the repository sector.

  • The complexity of institutions and repositories in the Australian context made assessment against the standard guidelines complex.
  • The national frameworks, infrastructure frameworks and funding frameworks used in Australia also lead to challenges in fulfilling requirements for the assessment and have unique relevance to Australian bodies.
  • Governance frameworks in Australia.
  • Specific areas in which we could improve our own organization and delivery of information to our community of interest.

The self-assessment process has led the ADA to redevelop some of its delivery systems. It has also assisted in shaping policy and procedure documents and has shown which items should be more clearly accessible in the public domain.

REFERENCES

  1. Research Data Alliance Repository Audit and Certification. https://www.rd-alliance.org/group/repository-audit-and-certification-dsa%E2%80%93wds-partnership-wg/outcomes/dsa-wds-partnership, accessed 22/06/2017.
  2. Digital Preservation Handbook. http://www.dpconline.org/handbook/institutional-strategies/audit-and-certification, accessed 26/06/2017.
  3. Australian Data Archive. http://ada.edu.au/, accessed 22/06/2017.

 

Biographies

Dr Heather Leasor is a data archivist at the Australian Data Archive. She managed the ADA's trusted digital repository project, which was funded by ANDS.

Dr Steven McEachern is Director and Manager of the ADA at ANU. He is responsible for the daily operations and the technical and strategic development of the archive. He has high-level expertise in survey methodology and data archiving, and for over 15 years he has been actively involved in the development and application of survey research methodology and technologies in the Australian university sector.

Orcid 0000-0001-7848-4912

Investigation into RDM solutions within a South African context

Mr Herman Jacobs1, Mr Andre Kleynhans2

1 University of Pretoria, Tshwane, South Africa, andre.kleynhans@up.ac.za

2 University of Pretoria, Tshwane, South Africa, herman.jacobs@up.ac.za

BACKGROUND

Research funders in South Africa are increasingly requiring researchers to publish their research findings and supporting data in an open access repository.  In its pursuit to remain a leading research university, the University of Pretoria has executed a project to investigate the most appropriate hardware and software infrastructure for meeting this requirement.

ABSTRACT

The scope of our project was to evaluate products (commercial and open source) which could be utilised as a Research Data Repository (RDR) as part of a total Research Data Management (RDM) solution at UP.

Figure 1: DATA FLOW within the RESEARCH DATA LIFE CYCLE

A total RDM solution includes all phases of the Research data life cycle, but for purposes of our project the focus was on identifying a potential RDR solution which could satisfy the requirements of the dissemination phase of the research data life cycle.

Given the scope of this initiative, various stakeholders within the university had to be consulted, along with some external entities. The project team agreed upon a methodology to gather information from all the stakeholders, and this input was used to generate detailed criteria which were incorporated into the product evaluation.

 

Selection criteria were broken down into categories, each with an agreed weight for evaluation purposes and its stake in the overall cost of implementation. From the beginning it was clear that an RDM/RDR solution is much more than merely software (the RDR) and hardware to host the data; it also requires a plethora of skillsets supporting the research initiative.
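A weighted evaluation of this kind can be illustrated with a minimal Python sketch. The category names, weights, product names and scores below are hypothetical placeholders chosen for illustration; they are not the criteria or figures used in the UP project.

```python
from typing import Dict

# Hypothetical category weights (assumed to sum to 1.0); not the project's actual values.
WEIGHTS: Dict[str, float] = {
    "functionality": 0.4,
    "cost": 0.3,
    "support_skills": 0.3,
}


def weighted_score(scores: Dict[str, float], weights: Dict[str, float] = WEIGHTS) -> float:
    """Return the weighted sum of per-category scores (each on a 0-10 scale)."""
    if set(scores) != set(weights):
        raise ValueError("scores and weights must cover the same categories")
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1.0")
    return sum(weights[category] * scores[category] for category in weights)


# Hypothetical candidate products and scores, for illustration only.
candidates = {
    "product_a": {"functionality": 8, "cost": 5, "support_skills": 6},
    "product_b": {"functionality": 6, "cost": 9, "support_skills": 7},
}

# Rank candidates from highest to lowest weighted score.
ranking = sorted(candidates, key=lambda name: weighted_score(candidates[name]), reverse=True)
```

Making the weights explicit in this way means that the ranking can be re-run when stakeholders revise the relative importance of a category.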

What initially started out as an institutional initiative has developed into something much more far reaching and has branched out into a national initiative.


Biographies

Andre Kleynhans is the Deputy Director: Capability Development at the University of Pretoria (UP). Our eResearch capability falls within the Unit for Academic IT, which reports to Andre.

Herman Jacobs is the eResearch Support Manager within the unit. The Unit for Academic IT was established just over two years ago to provide dedicated support to researchers and to aid the university in achieving its research outputs. It has been an exciting journey thus far.

Piloting the Research Data Manager system at UQ: DMRs, PiDs and tears of joy

Dr Andrew Janke1,2, Ms Helen Morgan2

1 Research Data Services/The University of Queensland, Brisbane, Australia, andrew.janke@uq.edu.au

2 The University of Queensland, Brisbane, Australia helen.morgan@uq.edu.au

INTRODUCTION

The University of Queensland has invested in data management systems and is building a reputation as a trusted provider of research data. This is for a number of reasons, but is driven by the need to ensure that all research at UQ abides by the Australian Code for the Responsible Conduct of Research[1], and that UQ research data meets the FAIR data principles [2].

The Research Data Manager (RDM) system has been developed with researchers and aims to be useful, enabling best practice research data management with minimal disruption to normal workflows. To meet this need a small project team designed and implemented a minimal viable metadata Data Management Record (DMR) based system [3, 4].

The system that has been built is centred on research projects rather than individuals, and looks to solve the working research data problem at a national level for research that involves UQ collaborators. This is achieved by defining access to data based upon AAF [5] credentials, thus easing the path for collaboration.

In order to further promote and ease collaboration the UQ RDM system is integrating with the Research Data Services (RDS) led Research Activity identifier (RAiD) project [6]. In the system each DMR has a unique, persistent identifier – a RAiD – associated with it. This will allow integration of the UQ system with both other institutions and service providers in Australia and in time internationally. The combination of a DMR with a RAiD allows for trusted published research outputs, tracking back from publications to the source research data and to the project that originally generated the data.
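The relationship between a DMR, its RAiD and published outputs can be sketched as a small data model. This is a minimal illustration in Python under stated assumptions: the class names, field names, identifier values and the `trace_project` helper are all hypothetical and do not describe the actual UQ RDM implementation or the RAiD service API.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DataManagementRecord:
    """Hypothetical minimal DMR: one record per research project."""
    raid: str                      # persistent project identifier (placeholder value below)
    project_title: str
    collaborators: List[str] = field(default_factory=list)  # federated login identities
    dataset_ids: List[str] = field(default_factory=list)

    def can_access(self, identity: str) -> bool:
        """Access is granted per project, based on the collaborator's federated identity."""
        return identity in self.collaborators


@dataclass
class Publication:
    title: str
    dataset_id: str                # the source dataset the publication cites


def trace_project(pub: Publication,
                  records: List[DataManagementRecord]) -> Optional[DataManagementRecord]:
    """Trace a publication back to the project record that holds its source data."""
    for record in records:
        if pub.dataset_id in record.dataset_ids:
            return record
    return None


# Illustrative values only; the identifier format is a placeholder, not a real RAiD.
dmr = DataManagementRecord(
    raid="raid-placeholder-0001",
    project_title="Example imaging study",
    collaborators=["alice@uq.example", "bob@partner.example"],
    dataset_ids=["dataset-42"],
)
paper = Publication(title="Example paper", dataset_id="dataset-42")
source = trace_project(paper, [dmr])
```

Keying access and provenance to the project record rather than to an individual is what allows a collaborator from another institution to be added without changing how the data are identified.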

THE PILOT

The new system is currently being piloted across the university with researchers, including HDR students and their supervisors. They are able to use the system in the course of their ongoing research projects and provide feedback to the project team to ensure the system works for their discipline-specific requirements.

We ask them to provide a small amount of information about how they currently manage their research data, whether they are using any existing UQ systems, whether there are any perceived ‘gaps’ and whether there is anything we can do to make it easier for them to work collaboratively. We then follow up with them after they have used the system to see how they are finding it.

THE REACTION

The reaction to the new pilot system has been unexpectedly emotional. We’ve had tears of joy, relief that the frustration is finally over, elation, threats of hugs, and dusty external USB drives flashed at us. So why the massive and emotional response?

The RDM system was designed from day one to solve real problems people were facing on a daily basis. Researchers, administrators and executive level staff all face challenges relating to research data and how it is managed. Fundamentally, all are engaged in the process of scientific advance which is aided by the highest quality of data, and they care that this data is FAIR into the long term.

This meant engaging with researchers and end users early in development and getting researchers to engage with the RDM system by making it all about them (but not really!). As such, the RDM system aims to allocate data within 5 minutes of a request being submitted and to avoid interrupting existing (good) workflows that have been developed over many years. The barrier to entry needed to be so low that it was easier to use the system than not to use it; this in turn dictates that the system can only collect minimal data.

CURRENT WORK

Based on evaluations of the system received to date, we are prioritizing two pieces of development work. One is enabling international collaborators to access the system; the other is developing workflows to allow for the automated archiving and publishing of research data.

This presentation will demonstrate the new RDM system in the context of the pilot, researchers’ responses, and how we are addressing them. We hope that by continuing to listen to researchers’ feedback, and to integrate solutions where they make sense, we will continue to receive the positive responses we have seen to date!

REFERENCES

  1. Australian Code for the Responsible Conduct of Research https://www.nhmrc.gov.au/guidelines-publications/r39
  2. FAIR data principles https://www.force11.org/group/fairgroup/fairprinciples
  3. ANDS blog post http://andscentral.blogspot.com.au/2017/05/dmrs-making-dmps-relevant-again.html
  4. ARMS paper – “accepted” http://www.arms2017.org.au/program.html
  5. http://aaf.edu.au
  6. raid.org.au

 


Biography

Andrew and Helen lead the RDM@UQ project. They have a broad combined skill set spanning Clinical Trials, Reproducible Research, Preclinical and Clinical Imaging, Project Management, Research Impact, Policy and Guideline Development, Research Data Management and BioInformatics.

Data management plans aren’t working and here’s how to fix them

Mr Nick Smale1, Ms Kathryn Unsworth2, Dr Daniel Barr3

1The University of Melbourne, Parkville, Australia, smalen@unimelb.edu.au

2Australian National Data Service, Caulfield East, Australia, kathryn.unsworth@ands.org.au

3RMIT, Melbourne, Australia, danielpbarr@gmail.com

ABSTRACT

Data management plans (DMPs) are increasingly seen as a key component of data infrastructure, placing growing demands on researchers and institutions in Australia and abroad.

The driving force behind the popularisation of DMPs in the UK and USA is compliance demands from funding agencies, where 86% of 7 UK Research Councils and 63% of 15 US funding bodies broadly require DMP submission with funding applications. Investigation of these requirements reveals that the predominant reason for them is to promote the sharing and reuse of research data by researchers, in order to maximise return on investment.

No major Australian funding body requires submission of DMPs, yet DMP mandates have permeated the Australian research landscape, where 37% of 38 Australian institutions mandate DMPs, and many further still strongly promote their use. This raises the question of why Australian institutions have embraced DMPs in the absence of funding body compliance requirements mirroring those overseas. Given that there is minimal evidence of DMP efficacy, why have DMPs become the go-to tool for advocates of good data practice in Australia? Further, if DMPs are to be used in this manner, the differing driving forces behind DMP use must be delineated. We have identified multiple institutional use-cases, highlighting the need for DMPs to be fit for intended purpose and questioning the view that a good DMP is that which encompasses the entire data lifecycle of a project.

This presentation will explore the international history of DMPs, describe institutional and international DMP mandates, discuss the evidence base for DMP efficacy, and consider potential alternative or complementary means of achieving the aims of DMPs.

REFERENCES

Unsworth, K., Smale, N. (2017). Overview of Australian institution and UK/USA funding body data management plan mandates. University of Melbourne. https://doi.org/10.4225/49/5986bde74f8f5


Biography

Nick Smale is a Project Officer in Research Ethics and Integrity at the University of Melbourne. Nick has facilitated the University’s roll-out of a flipped-blended research integrity training program. Nick trains researchers in data management, and from this has developed an interest in data management plan theory and practice. In 2016 Nick was awarded a Master of Biomedical Science with first class honours for his research into developing an objective method of cochlear implant programming.

http://orcid.org/0000-0003-3434-6192

Kathryn Unsworth is a Data Librarian with the Australian National Data Service (ANDS), based in Melbourne at Monash University’s Caulfield campus. Kathryn engages with a number of Australian universities, providing research data management related advice, support and training. Additionally, Kathryn works in partnership with institutions to deliver ANDS-funded projects. She has many RDM-related interests, including DMP implementations and their value in changing researcher behaviours and practices, IP and licensing issues for data, ethics and informed consent, and upskilling data librarians to transition into data science roles.

http://orcid.org/0000-0002-5407-9987

A National Research Data Management Strategy for South African Universities

Dr Dale Peters1, Dr Mark Hahnel2
1University of Cape Town, Cape Town, South Africa. dale.peters@uct.ac.za
2Figshare, London

ABSTRACT

Amidst a growing number of mandates for data sharing and reuse, South African universities are scrambling to provide services to the research community to facilitate compliance with the various requirements of numerous international funding agencies. The impending burden posed on institutions to fund this requirement was brought sharply into focus with the introduction of a similar mandate by the national research funding agency, the very lifeblood on which the academic enterprise is reliant.

With little incentive or support to meet the obvious crisis, a few research-intensive universities set out independently on the arduous journey of software evaluation and service development. This paper will outline that journey, and share the insights of a process of converging interest in a national strategy for research data management, led by DIRISA, the Data Intensive Research Initiative of South Africa, a component of the National Integrated Cyberinfrastructure System (NICIS).[1]

At a time when nationwide student protest action has highlighted the financial constraints on the higher education sector[2], the need to act collaboratively is imperative to sustain levels of excellence reflected in the worldwide university rankings – that conversely have driven South African universities into competition with one another in the past. The national research data management strategy signals a new way of thinking about library and IT services, that marks a clear divergence from traditional repository infrastructure development over the past twenty years.

While the concept of a shared data service is not unique[3], the process is exceptional in that a paradigm shift is enabled by a proposal for a Western Cape Data-Intensive Research Facility (WC DIRF) as a tier 2 node of the DIRISA network.[4] The node will be operated and managed by a consortium of Western Cape institutions, including the University of Cape Town (UCT) as the lead institution, the University of the Western Cape (UWC), the Cape Peninsula University of Technology (CPUT), Stellenbosch University (SU) and the Square Kilometre Array South Africa Project (SKA SA). The development of a strong relationship with figshare has raised awareness amongst consortium partners and South African universities of the development work that is needed to support researchers in achieving persistence and provenance of their research outputs.

REFERENCES

  1. https://www.dirisa.ac.za/aboutus/. Last accessed, 30 June 2017.
  2. Du Toit, André, and N. Chabani Manganyi, eds. Political violence and the struggle in South Africa. Springer, 2016.
  3. Kaye, John, Rachel Bruce, and Dom Fripp. “Establishing a shared research data service for UK universities.” Insights 30.1 (2017): 59.
  4. A Proposal for a Western Cape Tier 2 Data Intensive Research Facility, December 2016.

Biography

Dr Dale Peters is Director: UCT eResearch, providing leadership in engaging networked information technologies to enhance and support innovative practice in scientific research; promoting multi-institutional collaboration and trans- and inter-disciplinary research in the deployment of computational data infrastructures.

She was appointed in 2010 by the Minister of Science and Technology to convene the Work Group for the Data Intensive Research Initiatives of South Africa (DIRISA), towards the formalisation of the National Integrated Cyber-Infrastructure System (NICIS). She was appointed by the NRF in 2013 to lead the national team on the Belmont Forum e-Infrastructures and Data Management research action, and currently represents the Department of Science and Technology on the OECD project on “International Coordination of Data in Infrastructures for Open Science”.

Responsible for the planning, development and management of a number of national and international e-Infrastructure projects, Dale Peters commands a high level of technology awareness to inform strategy and policy development. She aims to advance the global vision for Open Access, Open Data and Open Science for research, development and innovation.

    About the conference

    eResearch Australasia provides opportunities for delegates to engage, connect, and share their ideas and exemplars concerning new information centric research capabilities, and how information and communication technologies help researchers to collaborate, collect, manage, share, process, analyse, store, find, understand and re-use information.
