Mirror Mirror: reflections of FAIR Data evolution across a national portfolio of projects

Dr Richard Ferrers1, Mr Keith Russell1

1Australian Research Data Commons, Melbourne, Australia

Introduction.

In late 2019, ARDC invested in a programme of 42 Data and Services Discovery projects focusing on discovering the elements required to create Transformative Data collections (32 projects) and the Institutional Role in a Data Commons (10 projects). One aspect of these projects was discovering how the data involved could be made more FAIR (Findable, Accessible, Interoperable and Reusable).

Method.

Projects were asked to self-assess the FAIRness of their data at the project start, at the project end, and in the expected state two years after the project. The results were included in the final reports and have now been collated. This poster provides a summary of the findings from those survey responses. FAIR was assessed through 14 questions. For each question, a scale rated the level of FAIRness.
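As an illustration of how such self-assessments could be collated, the sketch below turns 14 ratings into a single score per time point and compares project start with project end. The abstract does not state the scale used, so the maximum rating of 4, the function names and the example ratings are all assumptions made for illustration, not the survey's actual scoring method or data.

```python
from statistics import mean

# Assumed for illustration: every question uses the same 0..4 scale.
MAX_RATING = 4
NUM_QUESTIONS = 14  # the abstract states that FAIR was assessed through 14 questions


def fair_score(ratings):
    """Return the mean rating as a fraction of the assumed maximum (0.0 to 1.0)."""
    if len(ratings) != NUM_QUESTIONS:
        raise ValueError(f"expected {NUM_QUESTIONS} ratings, got {len(ratings)}")
    if any(r < 0 or r > MAX_RATING for r in ratings):
        raise ValueError("rating outside the assumed scale")
    return mean(ratings) / MAX_RATING


def improvement(assessment):
    """Change in FAIR score between project start and project end."""
    return fair_score(assessment["end"]) - fair_score(assessment["start"])


# Invented ratings for one hypothetical project; not taken from the survey.
example_project = {
    "start": [1] * NUM_QUESTIONS,
    "end": [2] * NUM_QUESTIONS,
    "two_years_after": [3] * NUM_QUESTIONS,
}

if __name__ == "__main__":
    for stage, ratings in example_project.items():
        print(stage, fair_score(ratings))
    print("improvement", improvement(example_project))
```

Normalising to a fraction of the maximum keeps projects comparable even if a different scale length is substituted for the assumed one.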

Results.

Key findings include: (1) while FAIR maturity varied substantially across projects, discipline did not appear to be a substantial indicator of FAIR maturity, and (2) across all projects, the types of FAIRness in practice were much more evenly rated than expected.

The poster will provide graphs showing the spread of FAIRness by project and by FAIR category, and how FAIRness improved over the course of the projects. Some projects had mature FAIRness at the beginning and added minor improvements, while others, the big improvers, started with low FAIR maturity and increased their FAIRness substantially.

Conclusion.

The poster presents a national programme of data projects through the lens of FAIR, across a range of disciplines and project types.


Biography:

Richard is a Research Data Specialist at the Australian Research Data Commons, where he works in the Engagements team with Victorian and Tasmanian universities and the Trusted Data Community.

Trusted Data Community – Reaching for Trust

Dr Richard Ferrers1, Ms Margie Smith2, Dr Natalia Atkins5, Dr Siddeswara Guru4, Ms Katina Toufexis3

1Australian Research Data Commons, Melbourne, Australia
2Geoscience Australia, Canberra, Australia
3University of Western Australia, Perth, Australia
4TERN, University of Queensland, Brisbane, Australia
5IMOS, Hobart, Australia

Introduction.

The Trusted Data Community is a collaborative activity of 21 organisations seeking to gain an international certification (CoreTrustSeal, CTS) of the trustworthiness of their respective data repositories. Since 2019, the community has been meeting regularly to discuss and share progress, and has gathered for workshops and writing sessions.

Method.

This BoF will bring together lead community members and conference guests to:

(1) strengthen the community relationships through online interaction;

(2) discuss different approaches and methods to assess requirements for CoreTrustSeal;

(3) share current work, successes and challenges facing us individually and communally; and

(4) allow the community to work together on challenging issues.

Results/Conclusion.

This BoF will bring the community together to share learnings with other organisations that are interested in CTS, and will give those organisations a chance to hear about and meet the community, widening the potential benefits of and interest in the existing community. The BoF will feature a small number of guest speakers, representing the broad range of community activity to date, who will briefly (3-5 min) share with delegates the progress they have made in learning about, costing and undertaking trusted data certification. In the second phase, community members will convene around, and report back on, one or two particularly challenging topics, allowing for in-depth discussion and the sharing of strategies and opportunities for progress.


Biography:

Richard is a Research Data Specialist at ARDC, based in Melbourne, where he works in the ARDC Engagements Team with Victorian and Tasmanian institutions.

https://orcid.org/0000-0002-2923-9889

Vocabularies, Vocabularies, Vocabularies: Which ones should I use? Which ones should I trust? Which Ones….?

Dr Adrian Burton1, Dr Simon Cox2, Dr Lesley Wyborn1, Mr Rowan Brownlee1

1Australian Research Data Commons, Canberra, Australia
2CSIRO, Clayton, Australia

Use of shared or harmonised terminology is required to maximise semantic interoperability, particularly across different domains and in multiple communities. The term ‘vocabulary’ is used to denote any semantic asset containing terms and information about those terms, including value sets (i.e., bag of terms, term list), controlled vocabularies, glossaries, thesauri and taxonomies, and potentially even concept maps, ontologies and knowledge graphs.

Online vocabularies and vocabulary services are proliferating, and users are uncertain as to which ones they should select. In 2019, a review of Research Vocabularies Australia for the Australian Research Data Commons (ARDC) noted multiple cases of semantically overlapping vocabularies, even within the same service. To improve utility, these vocabularies should be consolidated or trust metrics developed. Vocabulary metadata should clearly display governance information and, in particular, usage information, which is a proxy for trust.

Consolidation is difficult. Requirements from different applications and contexts can lead to the development of new vocabularies with what appears to be the same scope as existing vocabularies. However, we need guidelines to help users determine if an existing vocabulary meets their needs, and which ones are reliable and sustainable.

The BoF will start with short presentations to set the scene, followed by two breakout sessions to begin developing guidelines for the selection of vocabularies that are fit for purpose, authoritative, governed and persistent. The first will canvass what users need to know in order to make a selection. The second will explore the development of a “5-star vocab” ranking similar to Five Star Open Data.


Biography:

Adrian Burton is Director, Data, Policy and Services at the Australian Research Data Commons (ARDC). Adrian has provided strategic input into several national infrastructure initiatives and is active in building national policy frameworks to unlock the value in the research data outputs of publicly funded research.

Lesley Wyborn is an Adjunct Fellow at the National Computational Infrastructure at ANU and works part-time for the Australian Research Data Commons. She had 42 years’ experience at Geoscience Australia in scientific research and in geoscientific data management. She is currently Chair of the Australian Academy of Science ‘National Data in Science Committee’ and is on the American Geophysical Union Data Management Advisory Board and the Earth Science Information Partners Executive Board. She was awarded the Public Service Medal in 2014, the 2015 Geological Society of America Career Achievement Award in Geoinformatics and the 2019 US ESIP Martha Maiden Award.

Governance Models for Research Compute

Ms Jo Dalvean1, Dr Ian Thomas1, Mr Nick May1, Dr Steve Quenette2, Dr Christopher Adda3, Dr Bernard Meade4, Luc Betbeder-Matibet5

1RMIT University, Melbourne, Australia
2Monash University, Clayton, Australia
3La Trobe University, Bundoora, Australia
4The University of Melbourne, Parkville, Australia
5University of New South Wales, Sydney, Australia

Introduction

With the current challenges of increasing demand and reduced budgets, research institutions need careful accounting and optimisation of policies and processes. As new compute capabilities are implemented, allocating research compute to researchers has become more complex and urgent. These enabling technologies are key for research impact, but also require significant investment. This creates a need for responsive and equitable approaches to managing resources.

The purpose of this BoF is to enable experts to discuss challenges and strategies for handling governance of research computing allocations (such as HPC or cloud) to researchers, focussing primarily on policy, procedures and politics. Are there models that would allow this process to be fairer, more efficient and more transparent for users, institutions and providers? Is there any consensus on best practice for allocating scarce resources? Does this practice change for different classes of researchers and compute?

Methods

We start with an introduction to the topic, followed by some short talks by community leaders to help stimulate discussion. Most of the session will be devoted to group discussion on specific questions.

Results

We hope to uncover key principles for successful governance of research compute from the experiences of the participants and then organise suitable follow-on activities.

Conclusion

Now that advanced research compute has moved from a niche requirement to a key enabler of impact, best practice models for handling these expensive resources become crucial. We must move from ad-hoc heuristics and adaptations of existing processes to a more efficient and effective approach.


Biography:

Ms Jo Dalvean (https://orcid.org/0000-0002-7502-3202) is Implementation Lead for a project delivering improved research data and computing services at RMIT University. The role includes development of allocation, usage and reporting models for research compute services.

Dr Ian Thomas (https://orcid.org/0000-0003-1372-9469)

Mr Nick May (https://orcid.org/0000-0002-1298-1622)

Dr Steve Quenette (https://orcid.org/0000-0002-0368-7341)

Dr Christopher Adda (https://orcid.org/0000-0002-0905-8909)

Dr Bernard Meade (https://orcid.org/0000-0001-6740-522X)

Luc Betbeder-Matibet (https://orcid.org/0000-0003-4065-5784)

The social practice of online research networking

Dr Sara King1, Dr Linda Pearce2, Dr Diana Newport-Peace3, Dr Tully Barnett3, Ms Mary Filsell3, Ms Alexis Tindall4, Ms Ashley Dennis-Henderson4

1AARNet, Adelaide, Australia
2UniSA, Adelaide, Australia
3Flinders University, Adelaide, Australia
4University of Adelaide, Adelaide, Australia

The Adelaide Digital Humanities group is an informal group of university and Galleries, Libraries, Archives and Museums (GLAM) sector professionals with a shared interest in the Digital Humanities (DH). The group emerged in 2018 from ongoing engagement of a core of individuals in multiple DH and eResearch projects across the GLAM sector and three universities.

Momentum built through 2019 and would likely have developed further in 2020, even without COVID-19 and related challenges. But a shift to online has certainly changed how the group operates and provided new opportunities for online collaboration that we did not foresee. The demise of other projects and difficulties of national and international networking brought about a new energy and desire for local support, experimentation and discovery.

Considering this experience through Social Practice Theory, a number of factors underpin this success:

1) the purpose of the group was established and reinforced during lockdown;

2) the members of the group welcomed new ways of connecting;

3) the rules of the group extended to the online community through an inclusive culture and meeting etiquette.

In addition, the group leveraged the ongoing benefits of investment in eResearch and DH initiatives nationally and internationally, demonstrating that the value of such investment extends beyond the initial frame of work.

As the group moves to formalise itself as a fledgling community of practice, this BoF invites colleagues to reflect on and discuss the characteristics of successful online networking models, to share good practice and identify and discuss the pitfalls.


Biography:

Dr Sara King is the Training and Engagement Lead for AARNet. She is focused on outreach within the research sector, developing communities of interest around training, outreach and skills development in eResearch. She is currently working on creating reusable guidance information for Jupyter Notebooks and other AARNet services to be adapted for Carpentry training workshops. She is passionate about helping others develop the infrastructure and digital literacies required for working in a data-driven world, translating technology so it is accessible to everyone.

Expanding the knowledge base of WA’s marine environment by improving the Environmental Impact Assessment process and leveraging Australia’s eResearch infrastructure investment.

Mr Gordon Motherwell2, Mr Chris Gentle3,4, Mr Peter Brenton5, Mr Luke Edwards1,3

1Pawsey Supercomputing Centre, Perth, Australia
2Department of Water and Environmental Regulation, Perth, Australia
3Western Australian Marine Science Institution, Perth, Australia
4Western Australian Biodiversity Science Institute, Perth, Australia
5Atlas of Living Australia, Canberra, Australia

Until recently, Environmental Impact Assessment (EIA) practice was to use environmental information submitted with a proposal for the purpose of that proposal’s assessment only. The actual data was not submitted. As part of the Digital Environmental Impact Assessment effort, centralising the collection of, and access to, the data gathered and used for regulatory processes will reduce future approval and project delays, as well as effort, uncertainty and risk.

The Index of Marine Surveys for Assessments (IMSA) is a Department of Water and Environmental Regulation (the department) project, implemented in partnership with the Western Australian Marine Science Institution (WAMSI). IMSA is the first platform of its kind to deliver systematic capture and sharing of marine data collected as part of an EIA. With an estimated $50 million spent annually on marine surveys for EIAs in WA, IMSA was developed to centralise this data and make it publicly available. IMSA provides access to marine survey reports, metadata and map layers through the department’s BioCollect online portal (provided by Atlas of Living Australia), as well as processed data products and raw data packages (hosted at the Pawsey Supercomputing Centre).

This talk will showcase “eResearch in Action” by describing how Government and private industry can work together to bring many benefits to the community. IMSA was implemented quickly by re-using existing NCRIS-supported eResearch infrastructure, and will enable future opportunities to better understand and manage the marine environment.


Biography:

Luke undertakes various outreach, engagement, support and training activities to drive uptake of Pawsey services. He also currently works as the Data Manager for WAMSI (Western Australian Marine Science Institution) and Facility Manager for ASDAF (Australian Space Data Analysis Facility).

http://orcid.org/0000-0001-8590-3361

Research tool roadmap

Dr Christopher McAvaney2, Dr Jerry Lai1, Dr Yoong Goh1

1Intersect Australia, Sydney, Australia
2Deakin University, Melbourne, Australia

‘Should I use R or Python in my research?’, ‘Can Tableau do anything that Power BI can do?’ and ‘Is Excel able to process my 3 terabytes of data?’ are some of the questions researchers regularly ask when deciding which tools to use in their research. The primary reason for these questions is that people, in general, do not know enough about the pros and cons of these tools in relation to their research context. As a result, many researchers may select tools that are not well suited to their research timeline and budget.

To assist researchers with the selection process, we propose to design a roadmap for research tools. The roadmap would be structured as an interactive decision tree which can be implemented into an online recommender system.
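A decision tree of this kind can be sketched in a few lines of Python. The questions, answers and leaf recommendations below are placeholders invented for illustration; they are not the authors' roadmap or their rating results, and the names `Question`, `TREE` and `recommend` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Union


@dataclass
class Question:
    """An internal node: a question plus one branch per possible answer."""
    text: str
    branches: Dict[str, Union["Question", str]] = field(default_factory=dict)


def recommend(node, answers):
    """Follow the researcher's answers down the tree until a leaf is reached.

    `answers` maps question text to the chosen answer; a leaf is a plain string.
    """
    while isinstance(node, Question):
        choice = answers[node.text]
        node = node.branches[choice]
    return node


# Placeholder tree; the questions and leaves are illustrative assumptions only.
TREE = Question(
    "Is the dataset larger than available memory?",
    {
        "yes": "scripting language (for example R or Python)",
        "no": Question(
            "Is the main goal an interactive dashboard?",
            {
                "yes": "dashboard tool (for example Tableau or Power BI)",
                "no": "spreadsheet (for example Excel)",
            },
        ),
    },
)

if __name__ == "__main__":
    answers = {
        "Is the dataset larger than available memory?": "no",
        "Is the main goal an interactive dashboard?": "yes",
    }
    print(recommend(TREE, answers))
```

Keeping the tree as plain data, separate from the traversal function, means the same `recommend` function could serve an online recommender as the questions are revised.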

As the initial stage of the roadmap, we focus primarily on data collection, manipulation and visualisation tools that are commonly available to researchers. We began by designing a rating scale to measure the usability, complexity (technical threshold) and ubiquity of these tools in different research contexts. We are also considering other technical and financial aspects, such as operating platforms, licensing and availability of support within Deakin University. This roadmap will provide the wider eResearch community with a template for designing decision trees for other tools.

In this presentation, we will provide a progress update and details of the roadmap development at the initial stage which may lead to an online recommender system.


Biography:

Andrew (Yoong) has a strong background in computer science and mathematics. His work interests are natural language processing, data analysis and machine learning.

Jerry has a background in psychological science and statistics. Jerry works with multiple research teams at Deakin on survey design, statistical programming, data analytics and visualisation, and promoting statistical literacy.

Christopher McAvaney is Services Manager, eResearch at Deakin University. He is responsible for establishing an eResearch program of work at Deakin which includes REDCap. Christopher has extensive research administration experience at Deakin and is keen to leverage as much as possible from REDCap via its extensible design.

REDCap: data capture from humans with a valuable and sustainable platform. Challenges and opportunities for REDCap administrators

Dr Christopher McAvaney1, Mr Aidan Wilson2, Mr John Liman3, Dr Weisi Chen2, Mr Shaun Grady2, Dr David Jung4, Ms Yelena Fridgant5

1Deakin University, Waurn Ponds, Geelong, Australia
2Intersect Australia, Sydney, Australia
3Monash University, Melbourne, Australia
4University of New South Wales, Sydney, Australia
5Sydney Local Health District, Sydney, Australia

REDCap (Research Electronic Data Capture) is a secure web application for building and managing online surveys and databases that supports multi-centre clinical trials with appropriate security/audit trail features. REDCap was built by Vanderbilt University and offers free licences for not-for-profit organisations. It has an increasingly broad international community of 4k consortium members and 1.4m users.

REDCap is free software and is distributed to consortium members under a license that provides no warranties or provisions of support from Vanderbilt. As such, licensees are generally responsible for administering and supporting their own instances.

A growing number of universities and research institutions in the Australasia region have REDCap instances, and generally, local administrators are responsible for providing all user support in addition to maintaining the application and any integrations with other institutional systems. Often this can be challenging to research support personnel without systems administration experience.

This BoF session will bring together a number of administrators of REDCap instances at universities and research institutions around Australasia, to present and discuss the range of support models, and share experience in administration, governance, user support, training and integration with other institutional systems. Following these presentations the administrators will participate in an open discussion with each other and the audience.

This BoF will benefit representatives from institutions who wish to install REDCap and want to know what is required to adequately support it, as well as from institutions who have REDCap installations and are looking for ways to improve systems, support and integration.


Biography:

Christopher McAvaney is Services Manager, eResearch at Deakin University. He is responsible for establishing an eResearch program of work at Deakin which includes REDCap. Christopher has extensive research administration experience at Deakin and is keen to leverage as much as possible from REDCap via its extensible design.

Aidan Wilson is Intersect’s eResearch Analyst for ACU, and one of ACU’s REDCap administrators. He also delivered REDCap training and user support to ACU’s researchers.

John Liman is Helix’s Snr. Analyst Programmer at Monash University.  John is the Monash REDCap Administrator supporting researchers with REDCap design, training, project consultation and customisations.

Weisi Chen is Intersect’s eResearch Analyst for University of Technology Sydney, and one of his routine responsibilities is to act as the UTS REDCap administrator. With more than seven years of eResearch experience, Weisi has expertise in a broad range of eResearch technologies.

Shaun Grady is the Intersect eResearch Analyst at the University of Newcastle and a key provider of REDCap training and support at the University. Shaun originally trained as a Radiation Therapist, so he has a strong interest in healthcare data management and research, and will complete a Master of Information Technology in Data Science at Charles Sturt University in 2020.

David Jung is Research Data Support Officer at UNSW Sydney, and one of his roles is supporting researchers as a REDCap administrator.

Yelena Fridgant is a research data management consultant with over twenty years’ experience in data management. In 2015, Yelena was responsible for establishing the Sydney Local Health District’s district-wide REDCap installation, the first in a Local Health District in NSW. She is passionate about supporting clinicians with their research project methodology, workflows and database designs, and resolving their data management headaches.

Moving Data: Growing a National Data Transfer Platform

Mr Brian Flaherty1

1New Zealand eScience Infrastructure (NeSI), Auckland, New Zealand

Introduction

Leading up to 2019, regular engagement with research communities informed New Zealand eScience Infrastructure (NeSI) of growing requirements for a range of data management services, including repositories, access management, reusability, and privacy. Researchers were generating and transferring data at rates several times higher than in previous years.

Methods

In response, NeSI reviewed, revised, and launched a national Data Transfer Platform in early 2019 – operated in partnership with Globus, REANNZ and multiple research institutions – to enable researchers to move data to and from NeSI and other New Zealand research facilities quickly and easily. NeSI has also begun conversations with Australian AARNet colleagues to explore options for improving data movement between Australia and New Zealand.

Conclusion

By expanding the number of connected research institutions, NeSI’s national Data Transfer Platform will allow for wider support of trans-Tasman workflows and improved collaborations between international research instruments and compute facilities.

This presentation will share an update on the latest developments with New Zealand’s national Data Transfer Platform, as well as a future look at what’s to come.


Biography:

Brian Flaherty is the Product Manager, Data at New Zealand eScience Infrastructure (NeSI). He has a background in digital libraries, digital scholarship, research infrastructure & support, and discovery services.

Semi-auto generated reports from a large dataset for non-expert users

Dr Rebecca Handcock1,2, Professor Cameron Neylon2, Dr Richard Hosking1,2, Aniek Roelofs1,2, Dr James Diprose2, Associate Professor Lucy Montgomery2, Dr Alkim Ozaygen2, Dr Katie Wilson2, Dr Chun-Kai (Karl) Huang2

1Curtin Institute for Computation, Curtin University, Bentley, Australia
2Curtin Open Knowledge Initiative, Curtin University, Bentley, Australia

BACKGROUND

The Academic Observatory (AO) dataset contains more than 12 trillion pieces of information on university research, publications, and funding, collected by the Curtin Open Knowledge Initiative (COKI). This dataset is used by researchers and strategic decision makers to understand university performance.

AO data is stored in Google Cloud Platform, with data presentation typically via data dashboards. Many users require custom data extractions presented as traditional reports, yet may not have the technical expertise to extract the data.

METHOD

Our method of generating these reports from the AO dataset is inspired by literate programming concepts, using templated documents with code insertions. We use the “Precipy” Python library, with specific report parameters contained within a configuration file, and analytics functions for data processing and visualisation specified in a customisable analytics module.

The Python tools we developed for use with “Precipy” were designed for the domain context of the AO dataset. They cover managing data access, summarising tabular data, producing custom plots, and semi-automatically generating the blocks of expanded text that are common in such reports. These tools are combined with CSS and Markdown templates to control the final design and layout in PDF and HTML formats.
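The general pattern of a templated document filled in by analytics functions and driven by report parameters can be illustrated with the Python standard library alone. The sketch below deliberately does not use Precipy, whose API is not described in this abstract; `string.Template` stands in for the templating step, and the record fields, function names and data are invented for illustration.

```python
from string import Template

# Markdown-style template with placeholders; stands in for a real report template.
REPORT_TEMPLATE = Template(
    "# Report for $country\n\n"
    "Publications counted: $total\n"
    "Share with recorded funding: $funded_percent%\n"
)


def summarise(records, country):
    """Hypothetical analytics function: summarise the records for one country."""
    rows = [r for r in records if r["country"] == country]
    total = len(rows)
    funded = sum(1 for r in rows if r["funded"])
    funded_percent = round(100 * funded / total) if total else 0
    return {"country": country, "total": total, "funded_percent": funded_percent}


def render_report(records, config):
    """Fill the template using the parameters in `config` (a plain dict here)."""
    return REPORT_TEMPLATE.substitute(summarise(records, config["country"]))


# Invented records; not drawn from the Academic Observatory dataset.
RECORDS = [
    {"country": "AU", "funded": True},
    {"country": "AU", "funded": False},
    {"country": "NZ", "funded": True},
]

# One configuration per report allows the same template to be re-run per country.
reports = {c: render_report(RECORDS, {"country": c}) for c in ("AU", "NZ")}

if __name__ == "__main__":
    print(reports["AU"])
```

Because only the configuration changes between runs, producing a report for another country or another month requires no change to the template or the analytics function.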

RESULTS AND CONCLUSION

Our methodology facilitates generating multiple similar reports, such as reports on data from different countries, and repeated report runs, such as monthly summaries. It addresses the need to generate reports from the large, complex AO dataset for non-expert users.


Biography:

Rebecca Handcock is a Spatial Data Scientist with a PhD from the University of Toronto. Her research ranges from using remote sensing and sensor networks to monitor agriculture and water, to recent projects focusing on health, research evaluation and bibliometrics. Rebecca has previously spent 10 years as a research scientist at CSIRO, and has held roles within the academic sector including the University of Washington. She is part of Homeward Bound, a global initiative to foster leadership among women in STEMM fields.

    About the conference

    eResearch Australasia provides opportunities for delegates to engage, connect, and share their ideas and exemplars concerning new information centric research capabilities, and how information and communication technologies help researchers to collaborate, collect, manage, share, process, analyse, store, find, understand and re-use information.
