Laying the foundation for Australian participation in international eResearch networks in Solid Earth and Environmental Science

Simon Cox1, Erin Robinson2, Adrian Burton3, Ben Evans4, Lesley Wyborn5, Tim Rawling6

1CSIRO Land and Water, Melbourne, Australia, simon.cox@CSIRO.au

2Earth Science Information Partners, Boulder, USA, erinrobinson@esipfed.org

3Australian Research Data Commons, Canberra, Australia, adrian.burton@ardc.org.au

4National Computational Infrastructure, Canberra, Australia, ben.evans@anu.edu.au

5National Computational Infrastructure, Canberra, Australia, lesley.wyborn@anu.edu.au

6AuScope Ltd, Melbourne, Australia, tim.rawling@unimelb.edu.au

 

Governments in Australia, the USA and Europe are investing significantly in solid Earth and environmental science eResearch infrastructures to facilitate the next generation of transdisciplinary research that addresses pressing geoscience and environmental science issues within the constraints of social impacts and sustainable development. These eResearch investments are developing best practices for both cyberinfrastructure development and data issues such as data management/stewardship, vocabularies and common data services. In more recent years there has also been an additional emphasis on publicly funded data collection and software development projects being in line with the FAIR principles (Findable, Accessible, Interoperable and Reusable) of Wilkinson et al. [1]. The FAIR principles are not easy to implement, particularly those for interoperability, and many find them challenging.

Fortunately, the rules for almost all research funding schemes are that developments, wherever possible, will be open source and that access will primarily be on merit. These rules create an ideal environment to share any developments in software, tools, data services, vocabularies, etc., particularly as general solid Earth and environmental science data and supporting infrastructures have many common and exploitable patterns that cross institutional, community, national and continental boundaries.

Some of the existing major solid Earth and Environmental eResearch Infrastructure initiatives in Australia, USA and Europe include:

  • AuScope, an Australian National Collaborative Research Infrastructure Strategy (NCRIS) funded capability which provides research infrastructure to the Australian solid Earth science communities;
  • The Australian Integrated Marine Observing System (IMOS) NCRIS capability has a portfolio of ten facilities that undertake systematic and sustained observing of Australia’s vast marine areas. Any data collected by these facilities can be discovered, accessed, and downloaded via the Australian Ocean Data Network;
  • The Australian Terrestrial Ecosystems Research Network (TERN) NCRIS capability provides open access for researchers to Australia’s land-based ecosystem monitoring infrastructure, data and research tools and thus helping to contribute to a broader understanding and long-term sustainable management of Australia’s ecosystems over three key themes of measurement: biodiversity, carbon & water, and land & terrain;
  • The National Computational Infrastructure (NCI), partly funded by NCRIS, with operational funding provided through a formal collaboration with CSIRO, the Bureau of Meteorology, The Australian National University, Geoscience Australia, the Australian Research Council and others. NCI has built a major integrated research data platform (10+ PBytes) of national reference data collections spanning climate, coasts, oceans, and geophysics through to astronomy, bioinformatics, and the social sciences domains. This platform is connected to a 1.7 PFlop HPC and services managed on tightly-integrated high-performance cloud infrastructure to support the next generation of data-intensive science;
  • The Australian Research Data Commons (ARDC), established in 2018, is building on and strengthening the work of ANDS, Nectar, and RDS that supported Australian researchers across multiple domains with the Data Enhanced Virtual Laboratory and Research Data Cloud programs. ARDC will focus on partnering with other capabilities to improve modern data-intensive, cross-disciplinary research in Australia within the context of global collaborative research. ARDC has a particular focus on sector-wide collaborative action, information sharing and community building;
  • Earth Science Information Partners (ESIP), a US-based independent forum for the Earth science data and technology communities which has built a community of practice in the USA, supported by NASA, NOAA and the USGS, to address topics such as data stewardship, data citation and documentation. It has become a brains trust and professional home for the Earth Science data and informatics community where both peer-led education and training and the co-development of conventions, practices and guidelines have helped make Earth science data more interoperable;
  • EarthCube was initiated by the US National Science Foundation (NSF) in 2011 to transform geoscience research by developing cyberinfrastructure to improve access, sharing, visualization, and analysis of all forms of geosciences data and related resources. As a community-governed effort, EarthCube’s goal is to enable geoscientists to tackle the challenges of understanding and predicting the complex and evolving solid Earth, hydrosphere, atmosphere, and environment systems. An important project is the EarthCube Council for Data Facilities, which seeks increased coordination, collaboration, and innovation in the acquisition, curation, preservation, and dissemination of geoscience data, tools, models, and services across existing and emerging geoscience data facilities;
  • European Plate Observing System (EPOS) is a European Union (EU) Horizon 2020 research and innovation program which supports integrated use of data products and facilities from distributed research infrastructures for European solid Earth science. EPOS brings together Earth scientists, national research infrastructures, ICT experts, decision makers, and the public to develop new concepts and tools for addressing questions concerning geo-hazards and those geodynamic phenomena relevant to the environment and human welfare; and
  • ENVRIplus, also an EU Horizon 2020 project, which brings together environmental and Earth system research infrastructures, projects and networks together with technical specialist partners to create a more coherent, interdisciplinary and interoperable cluster of environmental research infrastructures across Europe.

There are many parallels across these nine existing eResearch infrastructure initiatives, but currently they are somewhat disconnected. Each is focused more on national/continental scale issues, in part because most funding initiatives are nationally generated. Clearly there are common technological and science challenges that each is trying to solve in isolation, and although standards, vocabularies, formats, etc., are cohesive within each community, there are sufficient differences to make it hard to integrate data across them.

The time is ripe to synchronise efforts to create globally connected networks of solid Earth and environmental science data, information infrastructures, software and researchers with a goal of making scarce eResearch funding more effective by reducing duplication, increasing efficiency, and promoting partnerships and adoption across communities initially within the solid Earth and environmental sciences and then potentially to other domains. Already there are embryonic proposals to create integrated international networks to coordinate and harmonize these efforts. If Australia is to be part of the trend towards globalisation of high-quality solid Earth and environmental research projects then it needs to be a key collaborator in their development, to ensure Australian perspectives are included.

Already the recently formed ESIP/RDA Earth Space and Environmental Sciences Interest Group is starting to coordinate and harmonize efforts across the international solid Earth and environmental research community. In Australia, associated satellite activities are under way: for example, the Australian ESIP downunder (E2SIP) has formed an ESIP cluster in collaboration with the National Earth and Environment Sciences Facilities Forum.

An additional consideration is how to extend efforts from the research sector into government and industry initiatives and create a truly global network of solid Earth and environmental science data infrastructures to underpin fundamental research into global geoscience processes within the context of societal impacts and sustainable development. Currently many government/industry initiatives are poorly connected to equivalent activities in the research sector.

REFERENCES

  1. Wilkinson, M.D., Dumontier, M., Aalbersberg, IJ.J., Appleton, G., Axton, M., Baak, A., Blomberg, N., Boiten, J-W., Silva, Santos L.B. da, Bourne, P.E., Bouwman, J., Brookes, A.J., Clark, T., Crosas, M., Dillo, I., Dumon, O., Edmunds, S., Evelo, C.T., Finkers, R., Gonzalez-Beltran, A., Gray, A.J.G., Groth, P., Goble, C., Grethe, J.S., Heringa, J., Hoen, P.A.C. ‘t, Hooft, R., Kuhn, T., Kok, R., Kok, J.N., Lusher, S.J., Martone, M.E., Mons, A., Packer, A.L., Persson, B., Rocca-Serra, P., Roos, M., Schaik, R. van, Sansone, S-A., Schultes, E., Sengstag, T., Slater, T., Strawn, G., Swertz, M.A., Thompson, M., Lei, J. van der, Mulligen., E. van, Velterop, J., Waagmeester, A., Wittenburg, P., Wolstencroft, K.J., Zhao, J., Mons, B. (2016). The FAIR Guiding Principles for scientific data management and stewardship. Scientific Data 3. https://doi.org/10.1038/sdata.2016.18 Accessed 18 August, 2018.

Biography:

Simon has been researching standards for publication and transfer of earth and environmental science data since the emergence of the world wide web. Starting in geophysics and mineral exploration, he has engaged with most areas of environmental science, including water resources, marine data, meteorology, soil, ecology and biodiversity. He is principal- or co-author of a number of international standards, including Geography Markup Language, and Observations & Measurements. The value of these is in enabling data from multiple origins and disciplines to be combined more effectively, which is essential in tackling most contemporary problems in science and society. His current work focuses on aligning science information with the semantic web technologies and linked open data principles, and the formalization, publication and maintenance of controlled vocabularies and similar reference data.

Dr Cox is the author of over 40 journal articles, 30 technical specifications and international standards, and 150 conference papers.

Coordinated identifier infrastructures enabling Geoscience researchers to meet future directions in scholarly communications

Natasha Simons1, Julia Martin2, Mingfang Wu3, Adrian Burton4, Jens Klump5, Keith Russell6, Gerry Ryder7, Lesley Wyborn8, Tim Rawling9

1Australian Research Data Commons, Brisbane, Australia, natasha.simons@ardc.edu.au    

2Australian Research Data Commons, Canberra, Australia, Julia.Martin@ardc.edu.au

3Australian Research Data Commons, Melbourne, Australia, Mingfang.Wu@ardc.edu.au

4Australian Research Data Commons, Brisbane, Australia, Adrian.Burton@ardc.edu.au

5CSIRO Mineral Resources, Perth, Australia, jens.klump@csiro.au

6Australian Research Data Commons, Melbourne, Australia, Keith.Russell@ardc.edu.au

7Australian Research Data Commons, Adelaide, Australia, gerry.ryder@ardc.edu.au

8National Computational Infrastructure, ANU, Canberra, Australia, Lesley.Wyborn@anu.edu.au

9AuScope, Melbourne, Tim.Rawling@unimelb.edu.au

 

INTRODUCTION

In modern research, much of geoscience and equivalent investigations in the environmental sciences are based on observations and measurements of real-world phenomena, which can range from simple visual observations on small hand-sized physical samples to voluminous ex-situ measurements made using satellite or laboratory/sensor instruments. Information on samples, digital data and computational methods is rarely captured in traditional publications. Fifty years ago, most data that underpinned a scholarly publication could be represented in typeset tables, but with the advent of the digital age and the computerisation of instruments, the volumes of data collected became too large to present as tables within a paper. Data then at best became included as a supplement to the paper accessible by contacting the journal, or else could be obtained ‘by contacting the author’. Such approaches limit the ability to test the veracity and reproducibility of a publication and do not guarantee accessibility and persistence of input research artefacts into the future, nor do they ensure that the artefacts can be reused for purposes beyond the original use case. The Geoscience Paper of the Future was recently proposed to enable researchers to fully document, share, and cite all their research products including physical samples, data, software, and computational provenance [1], and at about the same time the Findable, Accessible, Interoperable and Reusable (FAIR) Principles [2] emerged. Today, publishers do not have a consistent way of citing data underpinning a publication, whilst details on how to reference/access physical specimens or software are rarely provided. Also, the interpretations of the FAIR principles can be quite inconsistent.

To address this complex issue, in 2017, a grant from the American Laura and John Arnold Foundation was awarded to the American Geophysical Union (AGU) and other partners (including AuScope, National Computational Infrastructure, the Australian Research Data Commons) to significantly improve the interconnection of data, samples, software and literature in the Earth and space sciences, based around the FAIR principles. The key objectives of the project are that:

  1. Publishers will follow consistent policies for sharing and citing data, samples and software used in the scholarly literature and will move from having these as supplements to the publication to using trusted repositories for publishing supporting research artefacts;
  2. Open repositories for Earth and environmental sciences will enable those policies and other data applications by providing persistent identifiers, rich metadata, and related services for the data, software and samples they hold;
  3. Geoscience researchers will know how to consistently share, document, and reference data, samples and software and use globally persistent identifiers to uniquely identify their research outputs.

These objectives finally provide a response to the inevitable change required in scholarly communication driven by the emergence of computers and the dawning of the age of digital data collection and curation fifty years ago, followed by the need for more complex software to process ever-increasing data volumes. However, effective implementation will require a significant cultural change in today’s research practices, many of which come from the pre-digital era. A critical component of the AGU-led project is promoting the value of citation with identifiers to researchers so that they know how to use them effectively in publications and ensure credit is acknowledged when credit is due.

PROMOTING THE VALUE OF IDENTIFIERS TO RESEARCHERS

Although identifiers have been commonplace for scholarly publications for some time and most Australian researchers have an ORCiD, few realise the power of using equivalent identifier systems for all their research artefacts including physical samples, software and data.

1. Advantages of Using Sample Identifiers

The International Geo Sample Number (IGSN), used on five continents to uniquely identify physical samples, allows researchers firstly to gain credit for sample collection and preparation, and secondly to trace where other analytical work is published on samples that they collected and curated. As the usage of IGSN grows it will also be possible to locate other samples from the same geographical features (e.g. a borehole or a remote island) to obtain a more complete overview of how new data generated by a researcher relates to existing data in the literature. Likewise, funders can trace where a sampling project they funded has resulted in high impact publications.

2. Advantages of Using Software Identifiers

Proper use of identifiers and citation for software means that a researcher can trace where their software has been used by others in publications and be acknowledged for this work. Further, by being able to search registers of appropriately described and cited software, researchers can also reduce the ‘Time to Science’ as they do not waste time rewriting complex code that already exists.

3. Advantages of Using Identifiers for Datasets that Underpin Publications

Increasingly the use of unique identifiers for data and proper citation of that data is being used for career advancement. For example, through linking of identifiers, a researcher is able to track usage of any of their datasets used in a high impact paper by other researchers and gain credit. In addition, a persistent identifier such as a DOI ensures long-term access to the dataset for enabling reproducibility of the current research and reuse for new research directions.
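
As a minimal sketch of what 'linking of identifiers' can look like in practice, the Python below builds resolvable links for DOIs and lists which publications cite a given dataset. The `Citation` record, the function names and all identifiers are hypothetical and chosen purely for illustration; the only external convention assumed is that a DOI can be resolved through the https://doi.org/ proxy.

```python
from dataclasses import dataclass


def doi_url(doi: str) -> str:
    """Return the resolvable URL for a DOI via the doi.org proxy."""
    return "https://doi.org/" + doi.strip()


@dataclass(frozen=True)
class Citation:
    """One link: a publication (citing) references a research artefact (cited)."""
    citing: str  # identifier of the publication
    cited: str   # identifier of the dataset, sample or software


def publications_using(artefact_id: str, citations: list[Citation]) -> list[str]:
    """List the publications that cite the given artefact, in sorted order."""
    return sorted({c.citing for c in citations if c.cited == artefact_id})


# Hypothetical identifiers, for illustration only.
links = [
    Citation(citing="10.9999/paper-a", cited="10.9999/dataset-1"),
    Citation(citing="10.9999/paper-b", cited="10.9999/dataset-1"),
    Citation(citing="10.9999/paper-b", cited="10.9999/software-7"),
]

for pub in publications_using("10.9999/dataset-1", links):
    print(doi_url(pub))
```

The same lookup would apply to samples or software if their identifiers were recorded as the `cited` value, which is why the record does not distinguish between artefact types.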

CURRENT ARDC INFRASTRUCTURES TO PERSISTENTLY IDENTIFY RESEARCH ARTEFACTS

Once researchers embrace the need for identifiers as part of their research ecosystem, they must have access to infrastructures that enable the persistent and unique identification of, and access to, their research artefacts throughout their career and beyond. Over the last 10 years, the Australian Research Data Commons (ARDC) and its predecessors have been building an infrastructure for data citation which helps researchers to publish FAIR data and ensures proper recognition and citation of their data in their own and any subsequent publications that also use their data. Details are available at https://www.ands.org.au/working-with-data/citation-and-identifiers/data-citation.

In the recent ARDC/AuScope/NCI funded Geosciences Data-enhanced Virtual Laboratory project, the ARDC has been working with the Geoscience community to develop equivalent persistent identifier systems for samples and software. Australian geoscience researchers can obtain access to IGSNs for their physical samples (specimens) here: http://www.auscope.org.au/igsn-info/ and information about citation for physical samples is here: http://www.ands.org.au/working-with-data/citation-and-identifiers/igsn. An ARDC guide for software citation is available here: https://www.ands.org.au/working-with-data/citation-and-identifiers/software-citation.

Combined, these efforts will ensure that Australian Geoscience researchers can meet the new demands that are now emerging from the Earth and space science publishers and enable moving towards the Geoscience Paper of the Future. The ARDC identifier systems recently developed for physical samples and software are easily portable to other physical sciences such as the environmental, marine and bio domains and will help ensure that research artefacts will be Findable and Accessible for current and future generations of researchers, and Reusable for purposes beyond those for which they were collected. It is accepted that Interoperability will still take some time, but plans are already being developed.

REFERENCES

  1. Gil, Y., David, C.H., Demir, I., Essawy, B.T., Fulweiler, R.W, Goodall, J.L., Karlstrom, L., Lee, H., Mills, H.J., Oh, J.H., Pierce, S.A., Pope, A., Tzeng, M.W., Villamizar, S.R., and Yu, X., 2016. Toward the Geoscience Paper of the Future: Best Practices for Documenting and Sharing Research from Data to Software to Provenance. Earth and Space Science, 3, 388-415. https://doi.org/10.1002/2015EA000136 Accessed 18 August 2018.
  2. Wilkinson, M.D., Dumontier, M., Aalbersberg, IJ.J., Appleton, G., Axton, M., Baak, A., Blomberg, N., Boiten, J-W., Silva, Santos L.B. da, Bourne, P.E., Bouwman, J., Brookes, A.J., Clark, T., Crosas, M., Dillo, I., Dumon, O., Edmunds, S., Evelo, C.T., Finkers, R., Gonzalez-Beltran, A., Gray, A.J.G., Groth, P., Goble, C., Grethe, J.S., Heringa, J., Hoen, P.A.C. ‘t, Hooft, R., Kuhn, T., Kok, R., Kok, J.N., Lusher, S.J., Martone, M.E., Mons, A., Packer, A.L., Persson, B., Rocca-Serra, P., Roos, M., Schaik, R. van, Sansone, S-A., Schultes, E., Sengstag, T., Slater, T., Strawn, G., Swertz, M.A., Thompson, M., Lei, J. van der, Mulligen., E. van, Velterop, J., Waagmeester, A., Wittenburg, P., Wolstencroft, K.J., Zhao, J., Mons, B. (2016). The FAIR Guiding Principles for scientific data management and stewardship. Scientific Data 3. https://doi.org/10.1038/sdata.2016.18 Accessed 18 August 2018.

Biography:

Natasha Simons is a Research Data Management Specialist with the Australian National Data Service. Located at Griffith University in Brisbane, Natasha serves on the Council of Australian University Librarians Research Advisory Committee and is an ORCID Ambassador. She is an author and reviewer of papers related to library and information management and co-authored a 2013 book on digital repositories. Natasha was the Senior Project Manager for the Griffith Research Hub, which won awards from Stanford University and VALA. She is an advocate for open data, open repositories and ORCID.

A game changer in the paradigm of the silo-based service delivery – The integrated project delivery of a secure digital data capture for clinical trials at University of Sydney

Dr Daniele Vicari1, Jessica  Cornock2

1Research Portfolio – SIH, University of Sydney, Sydney, Australia, daniele.vicari@sydney.edu.au

2ICT, University of Sydney, Sydney, Australia, jessica.cornock@sydney.edu.au

 

Abstract

With no unified system to collect clinical trial data, researchers were often making use of laborious workflows including convoluted spreadsheets and paper questionnaires. Aiming to provide a robust and supported data capture and survey capability for researchers, the Information and Communication Technology (ICT) and Research Portfolio started a collaboration in 2017 and ran a joint project enhancing the University’s installation of REDCap.

REDCap is an established, secure web application for building and managing online surveys and databases. It is ideal for collecting and managing participant data with features supporting longitudinal data collection, complex team workflows and exports to a range of statistical analysis programs.

Description

Most universities have been challenged to deliver the best infrastructure in the rapidly evolving research ecosystem, often with each sector working independently, thus creating a gap between the end users and professional/operational teams. To break the silo-based paradigm, ICT teamed up with other professional services in the research portfolio and took an Integrated Project Delivery approach to implementing a digital data capture platform for clinical trials and other studies. We will present here the strategy and outcomes of this initiative. The agile and joint decision-making allowed several key capabilities to be delivered from 2017 to 2018. For example:

  • REDCap was updated from version 5.x to 7.4 (now 8.3), providing three years’ worth of development updates.
  • REDCap is now available to any researcher with an Australian Access Federation (AAF) identity, increasing the ability to collaborate and decreasing the administrative burden on ICT support.
  • REDCap is now cloud-hosted on AWS, which has made it a high-availability system with expectations of minimal unscheduled downtime.
  • In-house training and one-to-one consultations have allowed the platform to be used widely, including for small clinical trials and other research projects (highlighted in figure 1).
  • Most importantly, a collaborative technical and business support structure has been established across ICT, Digital Research Support and Clinical Trials support.

Figure 1: REDCap project types in University. Translational research 1 (applying discoveries to the development of trials and studies in humans). Translational research 2 (enhancing adoption of research findings and best practices into the community). Repository (developing a data or specimen repository for future use by investigators).

Conclusion

All of these enhancements have led to a significant increase in the number of users and projects utilising this system.

The integration of teams such as ICT, Digital Research Support and Clinical Trials support proved an effective and agile way to deliver a clinical trial data capture tool while allowing many other researchers across diverse disciplines to make use of the platform. Researchers are able to collect their data in a structured database with proper access control (university credentials) and a rigorous audit trail, thus changing the culture of data collection while improving research integrity and compliance.

This Integrated Project Delivery method established active teamwork, and the team is now working on improving the maintenance and feature enhancement of REDCap, including automated upgrades, integration with other systems, and customised project developments.

In addition, service delivery can be rapidly optimized when research support serves as a conduit between the researchers (end users) and the service provider (ICT).


Biography:

Daniele has worked as a biochemistry researcher and has kept her passion for teaching and training while working in several educational institutions in Brazil, USA, Switzerland, and Australia. Taking advantage of her extensive experience as a mentor and teacher, she is currently supporting, training and advising staff and student researchers in how to use digital tools to achieve best research data management practices at the University of Sydney.

Understanding the ‘R’ in the FAIR Principles

Dr Robin Burgess1

1The University of New South Wales, Sydney, Australia, r.burgess@unsw.edu.au

 

Poster Summary

In 2015 the FAIR (Findable, Accessible, Interoperable and Reusable) principles were drafted. They are internationally recognised principles that are discipline-independent and support effective research data management. Benefits for researchers, as stated by the Australian National Data Service (ANDS) [1], are gaining maximum potential from a dataset, increasing the visibility and reproducibility of research, aligning with standards and attracting new research partnerships.

The focus of investigations of FAIR principles has been on understanding ‘F’ and ‘A’, and currently knowledge and skills in these areas are strong within Australian research infrastructure communities. Work on ‘I’ of the FAIR principles has been performed by the Australasian Repository Interoperability Working Group, which consisted of staff from Australian and New Zealand Universities, alongside members from ANDS and the Australasian Open Access Strategy Group (AOASG). A key recommendation from the group’s work was the need for common interoperability principles to be applied to repositories to help clarify the meaning of the FAIR principles [2]. Little attention, however, has been given to the ‘R’ in FAIR: understanding how to reuse data. Literature shows that consideration has been given predominantly to the sciences when it comes to reuse of data [3][4], with little attention to conditions and requirements for reusing data in the social sciences, arts and non-traditional research outputs (NTROs).

This poster highlights the importance of reuse of data and what needs to be considered and understood for data to be effectively reused. Focus will be on R1 of the FORCE11 FAIR data principles, which specifies ‘meta(data) have a plurality of accurate and relevant attributes’ [5].

Adoption of FAIR

Application of the FAIR principles is a robust approach towards the standardisation of data management. It is important to support institutions and researchers to understand and apply the principles appropriately, for example through the ANDS Self-Assessment Tool [6] and the current work of CAUL [7], which is looking at further understanding and applying the FAIR principles.

Planning for Data Reuse at UNSW

At UNSW, in support of reuse of data, the Research Data Management Planning (RDMP) tool [8] has a free text field asking the researcher about plans for reuse of their data. This is a descriptive approach with limited guidance for the researcher.  Not all researchers use the field and it appears that few have given much thought towards the reuse of data when commencing a research project. Planning for reuse of research data, using a tool such as an RDMP, could be a good starting point for application of the FAIR principles. An RDMP can include machine readable rights statements and provenance details, as well as guided opportunities for researchers to provide rich descriptive information which will better enable reproducibility of the research and reuse of data.
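
To illustrate how guided fields could replace a single free-text box, the sketch below models a reuse section of a data management plan with a machine-readable licence, provenance steps and a description, and reports which of these are still empty. It is not a description of ResData or any existing RDMP tool; the `ReusePlan` class and its field names are hypothetical, and the Creative Commons URL is only an example of a machine-readable rights statement.

```python
from dataclasses import dataclass, field


@dataclass
class ReusePlan:
    """Hypothetical reuse section of a research data management plan."""
    licence_url: str = ""  # machine-readable rights statement
    provenance: list[str] = field(default_factory=list)  # processing steps, in order
    description: str = ""  # descriptive context for future users


def missing_for_reuse(plan: ReusePlan) -> list[str]:
    """Return the names of reuse-related fields that are still empty."""
    gaps = []
    if not plan.licence_url:
        gaps.append("licence_url")
    if not plan.provenance:
        gaps.append("provenance")
    if not plan.description.strip():
        gaps.append("description")
    return gaps


# A plan with a licence but no provenance or description yet.
plan = ReusePlan(licence_url="https://creativecommons.org/licenses/by/4.0/")
print(missing_for_reuse(plan))
```

A check of this kind could prompt the researcher at planning time, rather than leaving reuse information to be reconstructed after the project ends.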

What’s next

A key component when considering reuse of data is associated with licensing, to clearly define conditions under which someone else can use the data (e.g. Creative Commons). However, there are other areas that need to be considered. These are related to the metadata associated with the data, particularly provenance information. A consideration for understanding data reuse is distinguishing between “use” and “reuse”. Knowledge needs to be shared between the producer or initial user of the data and researchers potentially reusing the data, to ensure a shared understanding of all facets of the data, including methods for generating and analysing the data, and the conditions under which these activities occurred. To be able to reuse research data, to answer new questions or to reproduce initial results, the researchers require richly described metadata that gives sufficient context about the research.

References

  1. ANDS ‘FAIR Data Principles’: (https://www.ands.org.au/working-with-data/fairdata)
  2. Link to a presentation given at the Repository Community Day, Brisbane (2017) about the working group. (http://www.caul.edu.au/sites/default/files/documents/cairss/repositoryevent2017ginny-natasha.pdf)
  3. Pasquetto, I.V., Randles, B.M. & Borgman, C.L., (2017). On the Reuse of Scientific Data. Data Science Journal. 16, p.8. DOI: http://doi.org/10.5334/dsj-2017-008
  4. Borgman C (2010). Research Data: Who will share what, with whom, when, and why?. Fifth China – North America library Conference. https://www.ratswd.de/download/RatSWD_WP_2010/RatSWD_WP_161.pdf
  5. The Future of Research Communication and e-Scholarship – The Data Fair Principles: https://www.force11.org/group/fairgroup/fairprinciples
  6. FAIR Self-Assessment Tool: https://www.ands-nectar-rds.org.au/fair-tool
  7. CAUL Programs and Projects: https://www.caul.edu.au/programs-projects/fair-affordable-open-access-knowledge
  8. ResData, the tool for creating research data management plans at UNSW. https://resdata.unsw.edu.au

FAIR Go: New resources to support FAIR data

Keith Russell1, Kerry Levett2, Richard Ferrers3, Andrew White4
1 Australian Research Data Commons, Melbourne, Australia, keith.russell@ardc.edu.au
2 Australian Research Data Commons, Adelaide, Australia, kerry.levett@ardc.edu.au
3 Australian Research Data Commons, Melbourne, Australia, richard.ferrers@ardc.edu.au
4 Australian Research Data Commons, Brisbane, Australia, andrew.white@ardc.edu.au

 

POSTER OBJECTIVE

The poster will showcase a number of resources and materials the ARDC (the Australian Research Data Commons) and partners developed around FAIR data in 2017-2018. These resources have helped increase the understanding of what the FAIR data principles are, outlined the steps that can be taken to make data more FAIR, and highlighted gaps in this space. Future plans for the ARDC FAIR agenda are proposed.

BACKGROUND

The FAIR Data Principles (Findable, Accessible, Interoperable, Reusable) were drafted at a Lorentz Center workshop in the Netherlands in 2015 and published in 2016 [1]. They have since received international recognition as a useful framework for considering data in a way that will enable maximum use and reuse not only by humans but also by machines. In 2017-18, ANDS-Nectar-RDS (now ARDC) undertook a number of activities to promote the FAIR data principles and to increase research community understanding around Australia about what FAIR means in practice. This included recommending actions that research organisations and research infrastructure organisations can take to make their data more FAIR.

WERE WE FAIR AWARE?

The ‘Are you FAIR aware?’ survey was conducted in October 2017 and attracted 249 responses [3]. It showed that, at that time, levels of awareness of the FAIR data principles varied, but the principles were considered useful and 78% of respondents would recommend them to colleagues. The gaps most frequently mentioned in making data FAIR were standardised data formats, provenance tools, ethics and access advice.

SELF-ASSESS THE FAIRNESS OF YOUR DATA

The popular FAIR self-assessment tool [4] was released in May 2018 and has already been adopted by CSIRO as well as a number of universities and projects. It is considered a good educational vehicle for improving the FAIRness of data because it highlights the steps that can be taken to make a data set more FAIR.

TRAINING MATERIALS AND RESOURCES RELATED TO FAIR

A collection of training materials [5] related to the FAIR principles, broken down by principle, was published. These support community members who want to deliver training and build understanding of the more detailed principles. The FAIR data infographic (Figure 1) links from each of the coloured squares to detailed advice on each aspect of FAIR. The collection is now available for sharing and expansion, and is being aligned internationally with other organisations working on data management training.

Figure 1: FAIR data infographic (CC-BY except F.A.I.R logos CC-BY-SA by Sangya Pundir [6])

 

AGU FAIR PROJECT

The ARDC is a partner (with NCI, AuScope and others) in the American Geophysical Union Enabling FAIR Data [7] project. In this project we are working with publishers, repository managers and researchers in the Australian Earth, space and environmental sciences so that authors submitting a publication are required to make their data FAIR and accessible from a recognised repository that can provide a persistent identifier and a landing page for each dataset cited. The project will also deliver supporting resources for researchers in the Earth, space and environmental sciences on how they can make their data FAIR.

NEXT STEPS IN 2018-2019

In 2018-2019, the ARDC will continue its work on the FAIR principles, providing further resources and support on what FAIR means in practice and emphasising topics such as provenance, ethics and access. The ARDC will also look at how software, tools and infrastructure can be improved to make it easy and seamless for researchers to make their data FAIR.

REFERENCES

  3. ANDS-Nectar-RDS, ‘FAIR data principles: how well known or understood are they?’, 26 March 2018. Available from: https://www.ands-nectar-rds.org.au/single-post/2018/03/26/FAIR-data-principles-how-well-known-or-understood-are-they, accessed 30 August 2018.
  4. ANDS-Nectar-RDS, FAIR self-assessment tool. Available from: https://www.ands-nectar-rds.org.au/fair-tool, accessed 30 August 2018.
  5. ANDS, FAIR Data Training. Available from: https://www.ands.org.au/working-with-data/fairdata/training, accessed 30 August 2018.
  6. FAIR data principles logo. Available from: https://commons.wikimedia.org/wiki/File:FAIR_data_principles.jpg, accessed 30 August 2018.
  7. COPDESS [Coalition for Publishing Data in the Earth and Space Sciences], Enabling FAIR Data Project. Available from: http://www.copdess.org/enabling-fair-data-project/, accessed 30 August 2018.

 

How to Get the Most out of the Research Data Alliance

Stefanie Kethers1, Andrew Treloar2, Mingfang Wu3

1 Australian Research Data Commons, Melbourne, Australia, stefanie.kethers@ardc.edu.au
2 Australian Research Data Commons, Melbourne, Australia, andrew.treloar@ardc.edu.au
3 Australian Research Data Commons, Melbourne, Australia, mingfang.wu@ardc.edu.au

 

The Research Data Alliance (RDA) was launched as a community-driven organisation in 2013 by the European Commission, the United States National Science Foundation and National Institute of Standards and Technology, and the Australian Government’s Department of Innovation, with the goal of building the social and technical infrastructure to enable open sharing of data.

With over 7000 individual members from more than 130 countries and over 50 organisational members and affiliates [1], RDA provides a neutral space where its members can come together through focused global Working and Interest Groups to develop and adopt infrastructure that promotes data-sharing and data-driven research, and to accelerate the growth of a cohesive data community that integrates contributors across domain, research, national, geographical and generational boundaries.

RDA offers a variety of benefits and advantages to its individual and organisational members, including:

  • international networking (RDA brings together experts in a variety of fields who would not usually meet, offering up new perspectives, lenses and opportunities, access to collegial consultancy resources, and information about international activities and initiatives),
  • strategic influence (enabling members to influence how standards and solutions are developed so they are aligned with their own situations, promoting Australian leadership, driving international efforts and strategy, including that of the RDA, and using RDA to amplify the message to their own stakeholders),
  • reputation enhancement (being seen as leaders, early adopters, and/or participating in the development of standards and protocols),
  • opportunities to acquire new skills, especially in data science,
  • support for development (by providing access to a large body of potential co-developers, potential adopters and their use cases, and the RDA framework for dissemination and promotion of the outputs),
  • access to RDA outputs (by providing access to developers committed to creating solutions for the adopters’ problems, which leads to adoptable outputs).

Joining RDA as an individual member is simple, does not cost anything, and provides access to all public RDA pages and the monthly RDA newsletter. Joining Working and Interest Groups does not create any commitment, and provides some additional information. However, many of the more significant benefits outlined above will only apply after some investment, e.g. actively contributing to or even chairing a Working or Interest Group, attending an RDA Plenary, or adopting an RDA output.

We have investigated the benefits of RDA for different categories of RDA members, distinguished by the roles they play (e.g., Working Group co-chair, organisational member, member of an RDA governance body) and investments they make in RDA. The results of our investigation, together with examples and case studies, are presented in this poster.

References

  1. RDA in a Nutshell, August 2018. Available from https://www.rd-alliance.org/sites/default/files/attachment/RDA_in_a_nutshell_Aug_2018.pptx (accessed 17 August 2018).

About the conference

eResearch Australasia provides opportunities for delegates to engage, connect, and share their ideas and exemplars concerning new information centric research capabilities, and how information and communication technologies help researchers to collaborate, collect, manage, share, process, analyse, store, find, understand and re-use information.
