New Zealand Index of Multiple Deprivation – Providing an Interactive Platform for Automated Creation of Regional Reports, Geocoding and Qualtrics Integration over Multiple Domains

Miss Bianca Haux1

1Centre for eResearch, University of Auckland, Auckland, New Zealand, b.haux@auckland.ac.nz

The New Zealand Index of Multiple Deprivation (NZIMD)

Since the 1990s, socio-economic deprivation in New Zealand has been measured by combining Census data into the New Zealand Deprivation Index (NZDep). However, growing connectivity and increasing access to various routine data sets mean that an abundance of data is now available that is not included in this measure.

To address this gap, the area-based New Zealand Index of Multiple Deprivation (NZIMD) has been created by grouping 28 indicators of deprivation, including but not limited to the Census 2013, into seven Domains of deprivation: Employment, Income, Crime, Housing, Health, Education, and Geographical Access [1].

The Index of Multiple Deprivation was developed by the IMD team based at Epidemiology & Biostatistics at the School of Population Health: Dr. Daniel John Exeter, Dr. Jinfeng Zhao, Dr. Sue Crengle, Dr. Arier Chi Lun Lee and Michael Browne, with help and support from numerous individuals and organizations.

Making the Data accessible

Accessibility of the NZIMD data is key to reaching the full potential of the information contained in it. In order to make this information helpful, people have to be able to find and extract exactly that part of the data that is relevant to their field of expertise.

Therefore, an Interactive Map and Report website has been developed. It can be used to explore the geography of deprivation and its association with a given health or social outcome over the seven domains or a combination of all. The level of deprivation is colour coded to provide an immediate overview, and a graph highlights the rank of overall deprivation of the selected area as shown in Figure 1. These areas of interest can be chosen either by District Health Board (DHB), General Electoral District (GED), Territorial Authority (TA) or Region. Reports for the respective districts can be created automatically, including more in-depth statistics such as:

  • Chart of proportional distribution of the five deprivation quintiles across all domains.
  • Table of minimum, maximum and median deprivation ranks for data zones with Q5 IMD.
  • Colour coded map visualizations of deprivation distributions for each domain separately.
  • Explanatory text containing ranking and grading into least and most deprived domains in comparison to the overall IMD deprivation.

Automating this process saves considerable manual labour and makes the reports for each district publicly available. The application is intended for researchers, policy analysts and organisations who are interested in better understanding the socio-economic circumstances of the communities they serve, and it is also meant as an outreach tool for the general public.
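The report statistics listed above can be illustrated with a minimal sketch. It assumes that each data zone has an overall rank in which 1 is the least deprived zone and the highest rank is the most deprived, and that quintiles are equal fifths of that ranking; the function names and the input layout are hypothetical and are not taken from the website's source code.

```python
from statistics import median


def quintile(rank: int, n_zones: int) -> int:
    """Map a rank (assumed: 1 = least deprived) to a quintile from 1 to 5."""
    # Integer ceiling of rank * 5 / n_zones, capped at 5.
    return min(5, (rank * 5 + n_zones - 1) // n_zones)


def summarise(ranks_in_area: list, n_zones: int) -> dict:
    """Quintile shares for an area plus min/max/median rank of its Q5 zones."""
    if not ranks_in_area:
        raise ValueError("the selected area contains no data zones")
    quintiles = [quintile(r, n_zones) for r in ranks_in_area]
    # Proportion of the area's data zones that fall into each quintile.
    shares = {q: quintiles.count(q) / len(quintiles) for q in range(1, 6)}
    # Ranks of the zones in the most deprived quintile (Q5).
    q5 = [r for r in ranks_in_area if quintile(r, n_zones) == 5]
    q5_stats = {"min": min(q5), "max": max(q5), "median": median(q5)} if q5 else None
    return {"shares": shares, "q5": q5_stats}
```

Calling `summarise` once per domain would give the per-domain distributions that a report could then chart; how the website itself computes them is not described here.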

Geocoding

As part of the Interactive Report website a geocoding function has been implemented for single address search and batch conversion. The single search takes an address as input, highlights the data zone the address belongs to and draws the chart for the respective data zone. The report for that area can then be downloaded.

With the batch conversion, a file containing multiple addresses can be uploaded. The website converts the addresses to data zone IDs and creates a downloadable file containing the deprivation values for each entry.
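The batch conversion can be pictured roughly as follows. This is only a sketch under stated assumptions: the two lookup tables stand in for the real geocoder and the NZIMD data, and the column names, data zone IDs and values are all hypothetical.

```python
import csv
import io

# Hypothetical stand-ins for the geocoder and the NZIMD table.
ADDRESS_TO_ZONE = {"1 example street, auckland": "DZ0001"}
ZONE_TO_IMD = {"DZ0001": {"imd_rank": 120, "imd_quintile": 1}}

FIELDS = ["address", "data_zone", "imd_rank", "imd_quintile"]


def normalise(address: str) -> str:
    """Lower-case the address and collapse repeated whitespace."""
    return " ".join(address.lower().split())


def batch_convert(csv_text: str) -> str:
    """Read a CSV with an 'address' column and return a CSV with deprivation values."""
    reader = csv.DictReader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        zone = ADDRESS_TO_ZONE.get(normalise(row["address"]))
        values = ZONE_TO_IMD.get(zone, {})
        # Addresses that cannot be resolved are kept, with empty result columns.
        writer.writerow({
            "address": row["address"],
            "data_zone": zone or "",
            "imd_rank": values.get("imd_rank", ""),
            "imd_quintile": values.get("imd_quintile", ""),
        })
    return out.getvalue()
```

Keeping unresolved addresses in the output with empty columns is a design choice of this sketch, so that the output rows stay aligned with the uploaded file.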

Figure 1: Interactive NZIMD Report website with the Canterbury Region chosen, chart and geocoding on the left

Qualtrics Integration

The website also provides an API endpoint that lets users integrate the geocoding function with their own applications and retrieve the deprivation values programmatically. Considerable interest in using this functionality in Qualtrics surveys prompted the development of a Qualtrics module. The module supports anonymization: it takes the participant’s address as input, replaces it with the data zone ID, and embeds the associated deprivation values in the survey for evaluation purposes.
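The anonymization step might look like the sketch below. It does not reproduce the actual Qualtrics module or the website's API: the `geocode` callable is a hypothetical stand-in for a call to the geocoding endpoint, and the field names are assumptions made for illustration.

```python
from typing import Callable, Optional


def anonymise_response(
    response: dict,
    geocode: Callable[[str], Optional[dict]],
    address_field: str = "address",
) -> dict:
    """Return a copy of a survey response with the address replaced by zone data."""
    # Drop the address so that it never reaches the stored response.
    cleaned = {k: v for k, v in response.items() if k != address_field}
    result = geocode(response.get(address_field, ""))
    if result is None:
        cleaned["data_zone"] = None
        return cleaned
    cleaned["data_zone"] = result["data_zone"]
    # Embed each deprivation value under a prefixed field name.
    for name, value in result["deprivation"].items():
        cleaned[f"imd_{name}"] = value
    return cleaned


def fake_geocode(address: str) -> Optional[dict]:
    """Hypothetical geocoder used only to exercise the sketch."""
    if "example street" in address.lower():
        return {"data_zone": "DZ0001", "deprivation": {"overall_rank": 120}}
    return None
```

Passing the geocoder in as a parameter keeps the sketch independent of any particular endpoint.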

Tools and links

The Interactive Report Website can be accessed at https://imdmap.auckland.ac.nz.

It is based on R Shiny, uses the Leaflet library and is hosted in a rocker/geospatial Docker container on a NeCTAR instance. Source code for this application will be made publicly available on the Centre for eResearch GitHub account.

For the Qualtrics module, there is an existing open source repository, including deployment instructions, at https://github.com/UoA-eResearch/IMD_Qualtrics_Module.

Impact

Primarily, the NZIMD targets social and health research and is used to improve the understanding of service quality on a neighbourhood level, highlight the degree of disparities across population groups and inform agencies. It impacts the approach to policy making and can prompt system change as well as increase the responsiveness and equity of health and social services provision. At the same time, it is the foundation for a more consistent approach to reporting and monitoring the social climate in New Zealand now and in the future.

REFERENCES

  1. Exeter, Daniel John et al. “The New Zealand Indices of Multiple Deprivation (IMD): A New Suite of Indicators for Social and Health Research in Aotearoa, New Zealand.” Ed. Isil Ergin. PLoS ONE 12.8 (2017): e0181260. PMC. Web. 7 June 2018.

Biography:

Bianca Haux is a Research IT Specialist at the Centre for eResearch, University of Auckland, New Zealand. Her background is in computer science with a medical context, including experience in the automation and evaluation of multiple image processing methods. At the Centre for eResearch she enables researchers to develop, use and integrate applications for the visualization of their research data using Augmented Reality, Virtual Reality or webpages to achieve impact and understanding of the research purposes.

RAiD: A Persistent Identifier and Data Management Record for Research Activities

Ms Siobhann McCafferty1, Mr Ian Duncan2, Dr Andrew Janke3

1Australian Access Federation, Brisbane, Australia, mccafferty@aaf.edu.au

2Research Data Services (RDS), Brisbane, Australia, i.duncan@rds.edu.au

3University of Queensland, Brisbane, Australia, janke@uq.edu.au

DESCRIPTION

This poster presents the RAiD Persistent Identifier for research activities.

Persistent Identifiers (PIDs) are an essential tool of digital research data management and the evolving data management ecosystem. They allow for a clear line of sight along data management processes and workflows, more efficient collaboration and more precise measures of cooperation, impact, value and outputs.

The use of PIDs is becoming standard practice for institutions, funders and infrastructure providers. Standard identifiers have been developed for individuals [1] and outputs [2], but until now not for activities or projects.

RAiD was developed by The Australian Data Life Cycle Framework Project (DLCF) [3] in response to this need and is a freely available service and API.

To mint a RAiD, institutions can integrate the RAiD API into their services. Services that the RAiD API can be integrated with include:

  • Institutional DMP Tools
  • Virtual Labs
  • Ethics application systems
  • Storage request platforms
  • A standalone RAiD request portal

The ID consists of two parts: A RAiD number, and a RAiD DMR (Data Management Record).

The RAiD number is generated via the ANDS minting service and is packaged with Service Point minting information and a time date stamp.

The RAiD DMR contains related identifiers such as:

  • DOI
  • ORCID
  • ISNI
  • GRID
  • RAiD

It also records and stores information about tools and storage provisioned through RAiD-integrated services such as Cloudstor, Nectar Computing and AWS storage.
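The two-part structure described above can be sketched as a small data model. The class and field names below are hypothetical and chosen only to mirror the description (a RAiD number packaged with service point and timestamp, plus a Data Management Record); they are not the schema of the RAiD API.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class DataManagementRecord:
    """Related identifiers and provisioned services attached to an activity."""
    related_identifiers: dict = field(default_factory=dict)  # scheme -> list of values
    provisioned_services: list = field(default_factory=list)

    def link(self, scheme: str, value: str) -> None:
        """Attach a related identifier (e.g. a DOI or ORCID) under its scheme."""
        self.related_identifiers.setdefault(scheme, []).append(value)


@dataclass
class Raid:
    """A RAiD number with its minting information and its DMR."""
    number: str
    service_point: str
    minted_at: datetime
    dmr: DataManagementRecord = field(default_factory=DataManagementRecord)


# Placeholder values only; none of these identifiers are real.
raid = Raid(
    number="example-raid-0001",
    service_point="Example DMP Tool",
    minted_at=datetime(2017, 9, 8, tzinfo=timezone.utc),
)
raid.dmr.link("ORCID", "placeholder-orcid")
raid.dmr.link("DOI", "placeholder-doi")
raid.dmr.provisioned_services.append("Cloudstor")
```

Grouping identifiers by scheme lets one activity point to several researchers or outputs of the same kind.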

The RAiD identifier and service support data management across all phases of research by placing the research project or activity at the centre of research workflows, creating a chain of provenance, improving discovery and access, and ensuring that output is attributable and reportable. RAiD thereby follows the FAIR Principles [4] of making research data Findable, Accessible, Interoperable, and Re-usable.

RAiD [5] is a significant addition to eResearch PIDs and is being integrated into a selection of Institutions across Australia and New Zealand.

REFERENCES

  1. Available at: https://orcid.org/, accessed 08 September 2017
  2. Available at: https://www.doi.org/, accessed 08 September 2017
  3. Data Life Cycle Framework Project. Available at: https://www.dlc.edu.au/, accessed 08 September 2017
  4. FAIR Principles. Available at: https://www.force11.org/group/fairgroup/fairprinciples, accessed 08 September 2017
  5. Available at https://www.raid.org.au/, accessed 08 September 2017

Biographies 

Ian Duncan is the Director of Research Data Services (RDS).

Andrew Janke is the Informatics Fellow for the National Imaging Facility (NIF), Systems Architect, DLCF, Research Data Services (RDS) and Senior Research Fellow for the Centre for Advanced Imaging (CAI), University of Queensland.

Siobhann McCafferty is the Project Manager for the Data Life Cycle Framework Project.

 

ReDBox DLC and the Research Activity Portal

 

Mr Gavin Kennedy1

1QCIF, St Lucia, Australia

Title: ReDBox DLC and the Research Activity Portal
Synopsis: The RDS sponsored Research Activity ID (RAiD) is a persistent identifier in the Research Data Life Cycle, linking research activities to researchers, data infrastructure and services. The QCIF team have now integrated the RAiD into the ReDBox platform as a key identifier in its new data management features. This new ReDBox is the platform driving the Research Activity Portal (www.raportal.org.au), a RAiD service that we will demonstrate in this showcase session.
Format of demonstration: Live Online Demonstration with short Slide Show
Presenter(s): Gavin Kennedy, Outreach and Engineering Manager, QCIF
Target research community: Research Data Management
Statement of Research Impact: Through the ReDBox DLC researchers will have a flexible data management platform which gives them multiple channels for acquiring data infrastructure and services, managing data, collaborating around data and services, and publishing data. The RAiD provides a consistent key in ReDBox for tying research activities to people, infrastructure and data.
Request to schedule alongside particular conference session: Prefer not to have a lunchtime slot on the Wednesday or Thursday as we will be hosting a ReDBox user community meeting.
Any special requirements: Nothing additional. Note that the url given above will be available late next week, but the demonstration version of the portal can be seen at: http://dlcf-test.redboxresearchdata.com.au/

Biography:

Gavin Kennedy is an IT research and solutions expert with over 30 years’ experience in ICT, including the past 17 years in eResearch and ICT research. Gavin is the head of Outreach and Engineering at the Queensland Cyber Infrastructure Foundation (QCIF), where he is responsible for business development as well as leading QCIF’s Software Engineering team, who are the key developers of ReDBox, the popular research data management and publishing platform.

orcid.org/0000-0003-3910-0474

Open Science Framework – Tools for Reproducible Research

Dr Karmen Condic-Jurkic1,2

1The Australian National University, Acton, Australia, karmen.condic-jurkic@anu.edu.au

2Center for Open Science, Charlottesville, USA contact@cos.io  

The Open Science Framework (OSF) [1] is a free and open source project management tool developed by the Center for Open Science (COS) [2] in Charlottesville, USA to support researchers across the entire research lifecycle. The OSF is a versatile platform that helps with project management and organisation, collaboration, sharing and other research activities, including a preprint service. The OSF integrates many services already used by researchers to streamline their process and increase efficiency. As a flexible repository, it can store and archive research data, protocols, and materials. The main goal of the Center for Open Science is to increase the openness, integrity, and reproducibility of scholarly research, and a number of products and services have been developed to support these ideas.

The key features of OSF include:

  • Structured projects to access files, data, code, and protocols in one centralized location and easily build custom organization for your project;
  • Controlled access and ability to make projects fully or partially public or private, making it easy to collaborate and share with the community or just your team;
  • Enhanced workflow for an automated version control;
  • Persistent identifiers for projects and materials;
  • Preregistration of research projects;
  • Extending Research with automatic creation of a preprint or a meeting abstract;
  • Institutional repositories and multi-institutional projects.

Recently, more specialized services were included under the OSF umbrella to promote good scientific practice and to improve scientific discovery and transparency:

  • OSF Meetings: an easy way to share ideas and findings presented at academic meetings in forms of presentations, posters or conference proceedings;
  • OSF Institutions: a central hub on a dedicated OSF page for the institutional research to enhance collaboration and increase the visibility of research output;
  • OSF Preprints: a single search interface to a number of preprint sources to facilitate and accelerate new models of scholarly communication across multiple disciplines.

REFERENCES

  1. Open Science Framework. Available from: https://osf.io, accessed 31st Aug 2017.
  2. Center for Open Science. Available from: https://cos.io, accessed 31st Aug 2017.

Biography:

I received a master’s degree in chemistry from the University of Zagreb in Croatia in 2006, and my PhD in computational chemistry and biophysics was awarded in 2013 by Friedrich-Alexander University in Erlangen, Germany. I spent two years (2013-2015) as a postdoc at the University of Queensland and moved to Canberra in 2015 to take up another postdoc position at ANU. During my postdoc years, I became very interested in the open science and open data movement. In 2016, I became an ambassador for the Open Science Framework.

Australian Government Linked Data Working Group: collaborating across agencies to work together on systems, vocabularies, ontologies and datasets.

Mrs Brigette Metzler1, Dr Belinda Brown1, Mr Marco Wallenius1, Mr Nicholas Car1, Dr Armin Haller1

1Australian Government Linked Data Working Group, Hobart, Australia

Abstract:

 

The Australian Government Linked Data Working Group (AGLDWG) is a community of Commonwealth Government experts and champions, with invited non-voting participation of individuals, corporations and other entities. The working group was established in 2012.

The Principles on Open Public Sector Information and the Open Government National Action Plan are part of a whole-of-government change agenda: increased access to information and data should lead to more responsive services and decision-making, and improving the discovery, access and re-use of information adds value to it for the environment, society and economy.

The Productivity Commission’s Data Availability and Use Inquiry Report (2017) also makes the case for new legislation on data sharing and re-use, and for the need to link, merge and share data across government and with the private sector.

In this context, there is an ever increasing need to find new ways of working together. The AGLDWG works to lay the foundations for transformation by establishing standards for guidance and technical mechanisms to implement Linked Data. We also work to raise awareness of Linked Data practices and technologies, and communicate the benefits of increasing interoperability.

Our work has four main foci: systems, vocabularies, ontologies and datasets.

Data:

The key to Linked Data lies in the use of consistent digital curation standards and in using Uniform Resource Identifiers (URIs) to identify and name ‘things’, including abstract ones, assigning a URI to each ‘thing’ wherever possible. This allows us to create interoperability between datasets, thereby creating an environment conducive to both new insights and increased collaboration.
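The interoperability argument can be illustrated with a small sketch. It assumes statements are held as (subject, predicate, object) triples, the basic Linked Data model; all URIs, dataset names and values below are placeholders under example.org and do not refer to any real government dataset.

```python
# Two independently published datasets that happen to use the same URI
# for the same 'thing'. All URIs here are placeholders.
SITE = "http://example.org/id/site/42"

AGENCY_A = [
    (SITE, "http://example.org/def/name", "Lake Example"),
]
AGENCY_B = [
    (SITE, "http://example.org/def/sampledBy", "http://example.org/id/org/7"),
    ("http://example.org/id/site/43", "http://example.org/def/name", "Other Site"),
]


def describe(uri: str, *datasets: list) -> dict:
    """Collect every property stated about `uri` across the given datasets."""
    description: dict = {}
    for dataset in datasets:
        for subject, predicate, obj in dataset:
            if subject == uri:
                description.setdefault(predicate, []).append(obj)
    return description
```

Because both datasets name the site with the same URI, `describe(SITE, AGENCY_A, AGENCY_B)` merges their statements without any record matching or manual mapping between the two sources.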

Vocabularies:

Creating standardised vocabularies simultaneously provides the capacity to generate interoperability between datasets and discoverability of semantic data. Vocabularies, standardised in their structure and delivery according to Linked Data and Semantic Web principles, provide online, look-up-able definitions for things which can be used much more easily and powerfully than older vocabulary tools such as (paper) dictionaries, tables on web pages or XML code lists.

Systems:

Many systems have the capacity to operate with Linked Data. Some are dedicated to Linked Data, such as RDF Triplestores and the Linked Data API, and others function to facilitate Linked Data such as general website content management and web serving.

Ontologies:

A great number of foundational, or fundamental, ontologies have been produced to cater for concepts such as time (TIME ontology), simple authoring information (Dublin Core) and tracing changes to things over time (PROV-O, the provenance ontology). The AGLDWG has produced several ontologies, an example of which is one that defines properties for datasets within the data.gov.au catalogue.

Conclusion

The growth of public data provides for a future rich in opportunity. There will be growth in eResearch and Linked Data, including targeted projects to merge data for environmental, social and economic research.

 


Biographies

Dr Belinda Brown, Parks Australia, Canberra ACT, Australia:
Belinda works across multidisciplinary areas for science, environment, and information management; drawing on over 15 years of experience in the research and public sectors.
Belinda has a PhD in earth systems sciences, and started her career as a research scientist working on a range of international co-operative projects into the early development of southern ocean seaways and palaeo-climate around Antarctica. This laid the foundations for work in the science-information-policy interface; including amongst other things, working with the National Biodiscovery Working Group, the COAG National Science Working Group for Climate Change Adaptation, COAG Solar Thermal Technology Roadmap Committee, the United Nations Convention on Biological Diversity and the Global Strategy for Plant Conservation.
Belinda has an interest in enabling evidence based information for improved social, economic, and environmental outcomes; and works with colleagues to extend the value of public data, including the Linked Data and eResearch fields. Belinda is lead author and manager for the Knowledge Management Strategy for Parks Australia Environmental Information. Recent projects also include the Biomes of Australian Soil Environments (BASE) Project, a National Threatened Species Project, and a National Environmental Science Program Emerging Priorities project for the digital curation of long term monitoring datasets.

Mrs Brigette Metzler

Brigette Metzler is a senior quantitative user researcher with the Department of Human Services. She has previously worked in ontology and metadata management, including work on defining the links and the events that cause connections between the services the department delivers on behalf of the Australian government. Brigette is undertaking her PhD at the University of Queensland in Public Policy. Brigette is an active member of the Australian Government Language Data Management Working Group and is the co-chair of the Australian Government Linked Data Working Group.

Mr Nicholas Car:

Nicholas Car is the Data Architect at Geoscience Australia (GA). He coordinates data management within GA and works with GA’s partners to implement data sharing and use across institutional boundaries. His particular areas of interest are in data provenance and data semantics, both of which relate strongly to Linked Data and Semantic Web technologies. He has been an active participant in the Australian Government Linked Data Working Group for a number of years and chairs a number of other data and metadata groups in Australia and internationally, such as the Australian National Data Service-run Australian Vocabularies Special Interest Group and the Research Data Alliance’s Provenance Patterns Working Group.

Marco Wallenius
Marco Wallenius is a project officer in Commonwealth Information Management at the National Archives of Australia. Marco has been a member of the Australian Government Linked Data Working Group and is active in the data community.

Dr Armin Haller

Armin Haller is a Senior Lecturer at the Australian National University with a joint appointment in the Research School of Management and the Research School of Computer Science. Armin acts as the Manager of the Australian office of the W3C and he is Director of the MBA program. Previously, he was a Research Scientist at Data61 of CSIRO, Australia.
Dr Haller received his PhD from the National University of Ireland, Galway under the supervision of Manfred Hauswirth and Stefan Decker while he was working in DERI. His PhD thesis work was centred around the application of ontologies to workflow models. Most of this research has been carried out in the SFI funded m3pe project and in the EU funded SUPER project. Earlier, he was also involved in the WSMO and WSMX working groups.
His general research interests are Linked Data, ontology engineering and the Semantic Web in general. Currently he is chairing the Semantic Sensor Network Ontology working group of the W3C.

 

Utilising the International IGSN Persistent identifiers to Facilitate Online Access to Australian Physical Samples and to Data and Research derived from them

Dr Jens Klump1, Dr Lesley Wyborn2, Ms Irina Bastrakova3, Dr Anusuriya Devaraju1, Prof Dr Brent McInnes4, Dr Simon Cox5, Mr Ryan Fraser1

1CSIRO Mineral Resources, Kensington, Australia, jens.klump@csiro.au anusuriya.devaraju@csiro.au ryan.fraser@csiro.au

2NCI/ANU, Canberra, Australia, lesley.wyborn@anu.edu.au

3Geoscience Australia, Canberra, Australia, irina.bastrakova@ga.gov.au

4Curtin University, Bentley, Australia, directorjdlc@curtin.edu.au  

5CSIRO Land & Water, Clayton, Australia, simon.cox@csiro.au

Introduction

The collection of physical samples is the foundation of many research endeavours and is undertaken by many different entities (e.g., individual researchers, laboratories, government agencies, mining companies, citizens, museums, etc.) for multiple purposes. However, over time, a particular site can be resampled many times, either because the collector did not know that this area had previously been sampled, or because samples collected had not been properly curated and were no longer available. Furthermore, resampling may not be an option, due to cost, accessibility constraints, or timeliness. Researchers are increasingly realising that repositories of well-curated samples can be a treasure chest not only for the actual samples themselves but also for combination with prior observations that have already been made by a variety of instruments on the same sample.

To be reusable, physical samples must be systematically curated over the long term. Systematic sample curation, cataloguing, and persistent globally unique identification ensure that the existence of samples is known and allow data derived from them through laboratory and field tests to be linked to these samples [1]. This has already been demonstrated in databases for geochemistry and for hyperspectral remote sensing. In the example of hyperspectral remote sensing, links can be established between remote sensing data products and the samples that were used as ground truth for their calibration.

IGSN in Implementation

In an Australian collaboration, we used the IGSN (International Geo Sample Number, http://igsn.github.io) to identify samples in a globally unique and persistent manner. IGSN is interoperable with other persistent identifier systems such as DataCite, and the basic IGSN description metadata schema is designed to align with existing schemas, such as OGC Observations and Measurements (O&M) and DataCite, which makes crosswalks to other metadata schemas easy [2,3]. IGSN metadata are disseminated through the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH), allowing them to be aggregated in other applications such as portals (e.g. the Australian IGSN catalogue http://igsn.org.au) (Figure 1). Through this protocol, sample metadata can be made available in more than one schema.
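Harvesting over OAI-PMH can be sketched as follows. The `ListRecords` verb, the `metadataPrefix` and `resumptionToken` arguments, the `oai_dc` prefix and the XML namespace come from the OAI-PMH 2.0 specification; the base URL, the sample response and the identifiers in it are placeholders and do not describe any actual IGSN endpoint.

```python
from typing import Optional
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def list_records_url(
    base_url: str,
    metadata_prefix: str = "oai_dc",
    resumption_token: Optional[str] = None,
) -> str:
    """Build a ListRecords request; the prefix selects the metadata schema."""
    if resumption_token:
        # In OAI-PMH, resumptionToken is an exclusive argument.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return f"{base_url}?{urlencode(params)}"


def record_identifiers(response_xml: str) -> list:
    """Extract the header identifier of every record in a ListRecords response."""
    root = ET.fromstring(response_xml)
    path = ".//oai:record/oai:header/oai:identifier"
    return [element.text for element in root.findall(path, OAI_NS)]


# Placeholder response, trimmed to the elements used above.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record><header><identifier>oai:example.org:SAMPLE0001</identifier></header></record>
    <record><header><identifier>oai:example.org:SAMPLE0002</identifier></header></record>
  </ListRecords>
</OAI-PMH>"""
```

A harvester requesting the same records in a different schema would change only `metadata_prefix`, provided the repository advertises that prefix.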

The software for IGSN web services is based on components developed for DataCite and adapted to the specific requirements of IGSN. This cooperation in open source software development ensures sustainable implementation and faster turnaround times for updates.

IGSN, in particular in its Australian implementation, is characterised by a federated approach to system architecture and organisational governance giving it the necessary flexibility to adapt to particular local practices within multiple domains, whilst maintaining an overarching international standard.

IGSN in Australia

There are currently three IGSN allocating agents in Australia: Geoscience Australia, CSIRO and Curtin University, representing three different sectors respectively: government agencies, government research agencies, and academia [4]. In Australia, IGSN has also benefited from funding and support from the National Collaborative Research Infrastructure Strategy (NCRIS). For example, the Australian Research Data Services (RDS) Program has provided funding to help develop registration services and a common web portal that allows discovery of physical samples and sample collections at a national level (http://igsn.org.au) (Figure 1). The Australian National Data Service (ANDS) also plays an important role in this collaboration as a promoter for IGSN, facilitator for outreach to other domains that require sample identification (e.g., soils, mineral spectra, digital core specimens, insects), and as a host for the IGSN-related vocabulary service.

 

   Figure 1: Screen shot of the Australian IGSN Portal Demonstrator

 

As a result, the IGSN network enables common access to catalogues of unambiguously identified samples from different agents, which ultimately promotes collaboration across all Earth Science disciplines. It also increases the cost effectiveness of research by reducing the need to re-collect samples in the field. At the same time, it can help to increase the rigour of interdisciplinary science in that, provided there is still material left, the same sample can be analysed by multiple techniques and research groups, often over decades. Further, by extending the RDS Data Life Cycle Framework (https://www.dlc.edu.au/about) to include IGSN sample identifiers in Australia, funding agencies could even use the portal when reviewing grant proposals for expensive collection programs to ascertain how many samples from a particular area are already available in curated repositories, what data have been derived from them and published, and what data collection programs have already been funded in similar areas.

IGSN International Governance

IGSN is governed by an international organisation, the IGSN Implementation Organization e.V. (http://www.igsn.org). Membership in this organisation links the Australian IGSN community to the wider international community and at the same time allows it to act locally to ensure that the services offered are relevant to the needs of Australian researchers. This flexibility aids the integration of new disciplines into a global community of a physical samples information network.

REFERENCES

1. McNutt, M., K. A. Lehnert, B. Hanson, B. A. Nosek, A. M. Ellison, and J. L. King (2016), Liberating field science samples and data, Science, 351(6277), 1024–1026, doi:10.1126/science.aad7048.
2. Horsburgh, J. S. et al. (2016), Observations Data Model 2: A community information model for spatially discrete Earth observations, Environmental Modelling & Software, 79, 55–74, doi:10.1016/j.envsoft.2016.01.010.
3. Devaraju, A., J. F. Klump, S. J. D. Cox, and P. Golodoniuc (2016), Representing and Publishing Physical Sample Descriptions, Comp. Geosci., 96, 1–10, doi:10.1016/j.cageo.2016.07.018.
4. Wyborn, L. A. et al. (2017), Building an Internet of Samples: The Australian Contribution, in Geophysical Research Abstracts, vol. 19, pp. EGU2017-11497, Copernicus Society, Vienna, Austria.

 


Biography:

Jens Klump is the CSIRO Science Leader for Earth Science Informatics. As a member of CSIRO Mineral Resources, he is based in Perth, Western Australia. Jens’ field of research is the application of information technology to earth science questions. His research topics include data driven science and machine learning, virtual research environments, remotely operated instruments, programmatic access to data, high performance and cloud computing, and the development of system solutions for large geoscience projects.

Jens has degrees in geology and in oceanography from the University of Cape Town (UCT) and received his PhD in marine geology from the University of Bremen, Germany. He was part of the team that developed the foundations for what later became DataCite and subsequently applied the principles developed there to building the International Geo Sample Number (IGSN), a persistent identifier system for physical specimens. Jens has more than sixteen years of experience in designing and building research data infrastructures and has served on several committees working on related topics. Jens is the vice president of the IGSN Implementation Organisation and vice president of the EGU Earth and Space Sciences Division.

A survey of attitudes, perceptions and experiences around data sharing and the concept of open data in the Australian Earth Science community

Prof. Brent I.A. McInnes1, Prof. Joel Cutcher-Gershenfeld2

1John de Laeter Centre, Curtin University, Australia, b.mcinnes@curtin.edu.au

2Brandeis University, Boston, USA, joelcg@brandeis.edu

 

 

This work reports on the findings of a 2017 national survey of attitudes, perceptions and experiences around data sharing in the Australian Earth Sciences community. The survey, which is the first of its kind in Australia, provides a benchmark metric for the adoption and utilisation of open data concepts by Australian Earth Scientists and helps determine where Australia sits in the “open data” spectrum relative to counterparts in the United States and Europe.

A total of 249 Earth Science professionals from academic (69%), government (22%) and industrial/other organisations (9%) participated in the survey. The responses were evaluated on the basis of self-identification of gender, disciplinary focus (geoscience, eResearch and interdisciplinary) and age cohort.

Notable findings include:

  1. For all respondents, there is perceived to be a large gap between the importance of finding, accessing and using data, and the difficulty of actually doing so. Interdisciplinary researchers value finding, accessing, and using data within and across fields more than those who identify as being geoscientists, and see accessing data across disciplines as quite difficult.
  2. Women value finding, accessing, and using data within and across fields more than men. They also report data access as being more of a priority. The most senior cohort sees using data from other fields as less important than mid-career and early career individuals do.
  3. Both geoscience and interdisciplinary scientists perceive a lack of support from employers or colleagues for bridging across fields and disciplines. They also report a lack of support for open sharing and reuse of data.  In contrast, those whose primary identity is eResearch do experience such support from employers and colleagues. Interestingly, the lowest perceived support is among those with the most employment experience.
  4. The current state of geoscience eResearch infrastructure is not seen as sufficient to ensure effective data preservation. Confidence around eResearch concepts is low, except for respondents who identified as eResearch professionals. All agreed on the importance of improving mechanisms for crediting the usage of data, and that tenure/promotion policies are a substantial barrier to creating an open data environment.
  5. Sharing data on physical samples is seen as important by all, and very important by eResearch professionals; however, it is perceived as being hard to do. The actual sharing of physical samples is not seen as being as hard as sharing data on physical samples.
  6. Geoscientists and interdisciplinary scholars do not see leaders clarifying common directions and aligning efforts in sharing data, models, and software. In contrast, eResearch professionals report stronger leadership in their field and do see alignment of efforts in their work environment.
  7. There are perceived challenges around cooperation and open sharing of data within the Geosciences, within eResearch, and between the two. The challenges are even greater when it comes to end-user knowledge and training around accessing and contributing to eResearch open data ecosystems.

 


Biography:

Brent is the Director of the John de Laeter Centre (JdLC), a Curtin-based research infrastructure hub that operates $33M of research-grade analytical facilities and employs 25 staff to support research, education and training in the minerals, petroleum and environmental sectors.

Research ID: researcherid.com/rid/B-7408-2013
ORCID: orcid.org/0000-0002-2776-0574 

 

ReDBox – Tracking the Research Data Lifecycle

Mr Andrew White1

1QCIF, Brisbane, Australia, a.white@qcif.edu.au, gavin.kennedy@qcif.edu.au

 

DESCRIPTION

Institutions are increasingly looking for Research Data Management solutions to support the Research Data Lifecycle. ReDBox and the cloud-based ReDBox Lite are the leading Australian solutions that provide organisations with the ability to manage, describe, share and publish research data collections to systems such as Research Data Australia. These capabilities allow end-to-end tracking and support for the research data lifecycle. ReDBox has a flexible workflow system for cataloguing data sets and linking to them wherever they reside. The system evolved over time with considerable input from the research data community, and ongoing development and community support continue to make ReDBox suitable for a variety of institutional requirements. The poster outlines how ReDBox supports the Research Data Lifecycle, from Research Data Management Planning, to metadata harvesting, storage provision, collection management and ultimately publication through Research Data Australia and DataCite.


Biography:

Andrew White works for both QCIF and ANDS on activities related to national data management infrastructure, including the ReDBox Data Management System. His role includes project management support for a range of national projects QCIF is engaged with.

Andrew is also an ANDS Data Consultant whose primary role is to provide outreach support for ANDS activities and interests in Queensland. Based in Brisbane he works across a variety of the ANDS Programmes providing data management assistance as well as facilitating outreach activities across a number of Queensland Universities and Public Sector Institutions.

 

 

Creating an Open FAIR-way to Connect Researchers, Publishers and Data Repositories: a New AGU-led Initiative in the Earth and Space Sciences.

Shelley Stall1, Dr Lesley Wyborn2, Erin Robinson3, Dr Brooks Hanson4, Mark Parsons5, Dr Kerstin Lehnert6, Professor Joel Cutcher-Gershenfeld7, Professor Brian Nosek8

1American Geophysical Union, Washington, USA, sstall@agu.org 

2National Computational Infrastructure, Canberra, Australia, Lesley.wyborn@anu.edu.au

3Earth Science Information Partnership, Boulder, USA, erinrobinson@esipfed.org

4American Geophysical Union, Washington, USA, bhanson@agu.org

5Rensselaer Polytechnic Institute, University of Colorado, Boulder, USA, parsom3@rpi.edu

6Lamont-Doherty Earth Observatory of Columbia University, New York, USA, lehnert@ldeo.columbia.edu

7Heller School for Social Policy and Management, Brandeis University, Waltham, USA, joelcg@brandeis.edu

8Center for Open Science, Charlottesville, USA, nosek@cos.io

 

ABSTRACT

Open, accessible, and high-quality data and related data products and software are critical to the integrity of published research: they are key to ensuring transparency of research and to supporting reproducibility and repeatability. Unfortunately, not all research artifacts are saved in such a way that they can first be understood by other researchers reading the publication, and then be reused and repurposed in other research endeavors.

To address this, the American Geophysical Union and a set of partners representing the international Earth and space science community, including the Coalition for Publishing Data in Earth and Space Sciences (COPDESS), the Earth Science Information Partnership (ESIP), DataCite, the Research Data Alliance (RDA), and the Center for Open Science (COS), have been awarded a grant from the Laura and John Arnold Foundation. The grant funds the development of a collaborative solution across researchers, journals and repositories that will evolve the Earth and Space Science (ESS) publication process to include not just the publication, but all research inputs into that publication and related derived data products. The goal is a unified process that is efficient and standardised for researchers and supports their work from grant application through to publishing [1].

The aim of the project is to develop and implement a collaborative solution for researchers, journals and repositories that will connect publications in the Earth and space sciences with related data, samples and software in repositories, and then make these connections and data interoperable and discoverable across multiple publishers and repositories. A reference set of best practices will be developed for researchers, publishers, and repositories that will include: metadata and identifier standards; data services; common taxonomies; landing pages at repositories to expose the metadata and standard repository information; standard data citation; and standard integration into editorial peer review workflows.

The solution will include defining and managing the metadata and storage requirements for data and derived products, and incorporating the necessary changes into the submission process and workflows of each publisher. It will also provide support and oversight of the adoption process, best practices, and continued compliance with the requirements by both repositories and publishers, ensuring a sustainable, scalable solution.

The project will be based around the FAIR guidelines as developed by FORCE11.org [2], which seek to ensure that research artifacts that are input to and/or support the publication process will be Findable, Accessible, Interoperable, and Reusable (FAIR). Research artifacts can include datasets, images, video, software, scripts, models, physical samples, and other tools and technology: all are an integral part of modern-day research. Providing persistent identifiers for each artifact, and linking those identifiers to publications, supplies the supporting evidence for the scientific record and underpins its reproducibility and integrity.
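The idea of giving each research artifact a persistent identifier and linking those identifiers to a publication can be pictured with a small sketch. The following Python fragment is illustrative only: the class names, field names and identifier values are hypothetical placeholders, not part of the project, the FAIR guidelines, or any repository's actual schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class Artifact:
    """A research artifact held in a repository (hypothetical model)."""
    identifier: str  # persistent identifier; the values used below are placeholders
    kind: str        # e.g. "dataset", "software", "physical sample"


@dataclass
class Publication:
    """A publication together with the artifacts that support it (hypothetical model)."""
    doi: str
    related: List[Artifact] = field(default_factory=list)

    def link(self, artifact: Artifact) -> None:
        # Assumption for this sketch: an artifact without a persistent
        # identifier cannot be linked to the publication.
        if not artifact.identifier:
            raise ValueError("artifact needs a persistent identifier before linking")
        self.related.append(artifact)

    def identifiers_by_kind(self) -> Dict[str, List[str]]:
        # Group the linked identifiers by artifact type, in insertion order.
        grouped: Dict[str, List[str]] = {}
        for artifact in self.related:
            grouped.setdefault(artifact.kind, []).append(artifact.identifier)
        return grouped


# Placeholder identifiers, not real DOIs or IGSNs.
paper = Publication(doi="10.0000/example-paper")
paper.link(Artifact("10.0000/example-dataset", "dataset"))
paper.link(Artifact("IGSN:EXAMPLE0001", "physical sample"))
print(paper.identifiers_by_kind())
```

In this sketch the publication record, rather than each artifact, carries the links, so a reader starting from the paper can reach every supporting dataset or sample; a real implementation would follow whatever metadata and identifier standards the project agrees on.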

This project will build on existing work of COPDESS [3], ESIP [4], RDA [5], the scientific journals, and domain repositories to ensure that well-documented data, preserved in a repository with community-agreed metadata and data standards and supported by persistent identifiers, become part of the expected research products submitted in support of each publication. The solution will also ensure that data and derived products submitted in support of research have documentation that is machine readable and better meets the FAIR Data objectives.

In Australia, this initiative was supported by AuScope [6], the Australian National Data Service (ANDS) [7] and National Computational Infrastructure (NCI) [8]. The first meeting of the Advisory Board will be in Washington D.C. on 15 November 2017 and will be followed by a 2-day Stakeholder Workshop that will bring together repositories and journals/publishers to work on implementing standards and best practices.

REFERENCES

1. American Geophysical Union Coalition Receives Grant to Advance Open and FAIR Data Standards in the Earth and Space Sciences. Available from http://news.agu.org/press-release/agu-coalition-receives-grant-to-advance-open-and-fair-data-standards/ , accessed 30 August 2017.
2. The Force 11 FAIR data principles. Available from https://www.force11.org/group/fairgroup/fairprinciples , accessed 30 August 2017.
3. Coalition for Publishing Data in Earth and Space Sciences (COPDESS). Available from http://www.copdess.org/ , accessed 30 August 2017.
4. Earth Science Information Partnership (ESIP). Available from http://www.esipfed.org/ , accessed 30 August 2017.
5. Research Data Alliance (RDA). Available from https://www.rd-alliance.org/ , accessed 30 August 2017.
6. AuScope. Available from http://auscope.org.au/ , accessed 30 August 2017.
7. Australian National Data Service (ANDS). Available from http://www.ands.org.au/ , accessed 30 August 2017.
8. National Computational Infrastructure. Available from http://nci.org.au/ , accessed 30 August 2017.


Biography:

Lesley Wyborn is a geochemist by training and worked for BMR/AGSO/GA for 42 years in a variety of geoscience and geoinformatics positions. In 2014 she joined the ANU and currently has a joint adjunct fellowship with National Computational Infrastructure and the Research School of Earth Sciences. She has been involved in many NCRIS funded eResearch projects over the years. She is Deputy Chair of the Australian Academy of Science ‘Data for Science Committee’ and is co-chair of several RDA Interest Groups as well as a member of the AGU Earth and Space Science Executive Committee.

ORCID ID: http://orcid.org/0000-0001-5976-4943 

CODATA Commission on Standards

Dr Simon Cox1, Dr Lesley Wyborn2, Dr Marshall Ma3, Dr Simon Hodson4, Professor Geoffrey Boulton4

1CSIRO , Clayton South, Australia, simon.cox@csiro.au

2Australian National University, Canberra, 6200, lesley.wyborn@csiro.au

3University of Idaho, Moscow, USA, max@uidaho.edu

4CODATA, Paris, France, simon@codata.org, G.Boulton@ed.ac.uk

ABSTRACT

CODATA, the Committee on Data for Science and Technology, was established in 1966 by ICSU to promote and encourage, on a world-wide basis, the compilation, evaluation and dissemination of reliable numerical data of importance to all fields of science and technology.  CODATA has played a particular role in standardizing the values of some of the key physical constants – see http://www.codata.org/committees-and-groups/fundamental-physical-constants.

CODATA is concerned with all types of data resulting from experimental measurements, observations and calculations in every field of science and technology, including the physical sciences, biology, geology, astronomy, engineering, environmental science, ecology and others. Particular emphasis is given to data management problems common to different disciplines and to data used outside the field in which they were generated.

Researchers across the science disciplines, the humanities and the social sciences need to create integrated data platforms that interoperate across discipline boundaries, and enable access to data by a diversity of users. The use of shared models and vocabularies makes data more easily reusable, and thus more valuable.

The current landscape sees a variety of approaches to promulgating and maintaining community data models, formats, and vocabularies. These are generally organized within disciplines or groups of disciplines, with limited interoperability and linking between them. The emergence of the linked data paradigm, building on the key technologies of the World Wide Web, provides an opportunity to harmonize both tools and key content. The CODATA Commission on Standards aims to assist the science community to develop a coordinated approach, sharing best practices, and where necessary providing a platform for publication and governance of key cross-disciplinary ontologies and vocabularies.


Biography:

Simon Cox is a CSIRO research scientist, who has been working on standards related to environmental information since the dawn of the web era, through the Dublin Core Metadata Initiative, Open Geospatial Consortium, ISO/TC 211, INSPIRE, Research Data Alliance, Australian Government Linked Data Working Group and W3C. He was awarded the 2006 OGC Gardels Medal and presented the 2013 AGU Leptoukh Lecture.


About the conference

eResearch Australasia provides opportunities for delegates to engage, connect, and share their ideas and exemplars concerning new information centric research capabilities, and how information and communication technologies help researchers to collaborate, collect, manage, share, process, analyse, store, find, understand and re-use information.
