The National Research Data Cloud 2025

Rob Cook1, Peter Elford2

1Pangalax, Brisbane, Australia, rob.cook@pangalax.com

2AARNet, Canberra, Australia, Peter.Elford@AARNet.edu.au

 

Description

The Australian Research Data Cloud (ARDC), integrating and further developing the work of the ANDS, Nectar and RDS projects, is a priority recommendation of the 2016 Research Infrastructure Roadmap [1]. As a result, the Department of Education and Training has instigated a review, chaired by Professor Robyn Owens, to propose a future design for such a national research data cloud and a transition plan for moving from the current situation towards the new design.

In this BoF, Rob Cook will report on the current state of the review and its proposals for potential future design characteristics. His report will be followed by a panel session that discusses, and seeks feedback on, important questions that have arisen during the review. The panel discussion will be facilitated by Peter Elford, and the panel will include representatives of all the major stakeholder groups. The panellists will be asked to respond briefly, and then a number of key questions will be posed to the panel and the audience. The discussion is expected to explore how the future research data cloud could and should fit into the overall Australian e-Infrastructure landscape. There will be a particular focus on how the ARDC might interact with existing eResearch providers including NCI, Pawsey, AARNet, and the AAF, and with the universities, institutions and organisations that host the research communities we are collectively trying to serve. Attendees will be encouraged to consider what the ARDC will be like in 2025. The BoF will focus on what might be possible rather than on immediate next steps; its outputs will assist the current review process and provide some pointers to how future investment can deliver the greatest national research impact.

This BoF aligns to The Connected Researcher theme, but is likely to be of wide interest to all attendees.

[1] 2016 Research Infrastructure Roadmap, https://www.education.gov.au/2016-national-research-infrastructure-roadmap

 

 

Appreciating the ‘Method to the madness’ in Research; Optimizing the Madness via eResearch Technology

Miss Amanda Miotto1

1Griffith University, Nathan, Brisbane, Australia, a.miotto@griffith.edu.au

 

DESCRIPTION

This presentation aims to highlight the value of documenting research groups' processes: working alongside researchers to map out their workflow pipelines, with the goal of pre-emptively identifying potential issues and opening up possibilities for using technology to accelerate their research.

In our experience, most research groups work organically: small groups work on two or three problems at the same time, all interlinked, but with each group looking at unique questions and often for an answer that itself leads to more questions. It can be difficult to maintain efficient and methodical workflows when you are often heading in a number of directions at once, uncertain where your question will lead you.

Our aim is to gain an understanding about their workflows and highlight areas where technology can enable and accelerate research. Sitting with the researchers on the ground floor, we work together to gain an understanding around their research path, map their process and data workflow and expand their documentation.

Introducing someone with a fresh perspective, without assumptions, can bring new viewpoints to problems and offer ‘out of the box’ thinking. This can illuminate areas that previously needed to be complicated or flexible but, on re-evaluation, have stabilized and are ready for optimization. These interactions can also spark conversations about relevant emerging eResearch technology, which can open new avenues for outcomes and collaborations, and can highlight appropriate data management.

Mapping these processes and data flow can have further benefits. Having proper documentation can assist new staff coming into the team, technical groups needing current infrastructure information, offer transparency for managers and audits, encourage reproducible and responsible research and reduce the knowledge lost when contracts finish or students move on.

To complicate matters, even when researchers have the time to invest in their data management, it’s often difficult to know where to start. Solutions can err on the side of being either extremely broad or far too specialized and intricate. Then there is the paradox of when to implement a data management plan. At the beginning of a research project, there may not be enough information about future data to form a data management plan; further down the research lifecycle, however, there may be an overwhelmingly diverse stockpile of data to keep track of. These discussions on workflow lead to suggestions for data management resources and provide ready-made documentation for data librarians.

In this session, we will share our experiences and lessons learnt; moving on to an open discussion regarding the experiences of others. This talk would be of interest to researchers, managers and supporting staff.

 


Biography:

Amanda Miotto is an eResearch Support Specialist and Software Developer for Griffith University. She started off in the field of bioinformatics and learnt to appreciate the beauty of science before discovering the joys of coding. She is heavily involved in Software Carpentry, Hacky Hours and Research Bazaar, and has worked on platforms around HPC, microscopy and scientific database portals, as well as on engagement with research groups, highlighting relevant upcoming technologies.

The Astronomy Data and Computing Services (ADACS) Story

Dr Jenni Harrison1,2, Professor Andrew Rohl3

1Pawsey Supercomputing Centre, Australia

2CSIRO, Australia

3Curtin University, Bentley, Australia

 

Title: The Astronomy Data and Computing Services (ADACS) Story
Synopsis: ADACS has been established and is funded by Astronomy Australia Ltd (AAL). ADACS provides eResearch services tailored exclusively to the needs of the Australian astronomy community. Services are delivered via a unique partnership between Swinburne University, Curtin University and the Pawsey Supercomputing Centre. Through bespoke training, support and expertise, astronomers are being helped to maximise the scientific return from eResearch infrastructure.
Format of demonstration: Slide Show
Presenter(s): Dr Jenni Harrison, Director of Strategic Projects and Engagement, Pawsey Supercomputing Centre, and Professor Andrew Rohl, Director of Curtin Institute for Computation and Professor of Computational Science, Curtin University
Target research community: Astronomy, or anyone who may wish to use the ADACS model to deliver eResearch services to other communities.
Statement of Research Impact: ADACS was established only in March 2017, so it is too early to evaluate the impact of this initiative on research. ADACS will be evaluated in due course, with research impact considered.
Request to schedule alongside particular conference session: If possible, co-located with “National Programs and Partnerships”
Any special requirements: Standard AV, to allow two presenters with questions

 


 

Biographies:

Jenni is the Director of Strategic Projects and Engagement at the Pawsey Supercomputing Centre in WA. Jenni’s present responsibilities include leading projects in areas of national priority, such as astronomy, and as a result she currently co-directs the ADACS initiative. Jenni is also responsible for engagement and, correspondingly, is leading the Capital Refresh for the next generation of supercomputing, data and associated services for Pawsey, expected by 2020. For the 5 years before that, Jenni led the Data (and eResearch) Team at Pawsey. Prior to working in Australia, Jenni directed significant Digital Health Education and Research projects for the NHS in Scotland for approximately 5 years. Before this role, Jenni was the policy advisor in eResearch to the Ministry of Research, Science and Technology in New Zealand.

Andrew is the Director of the Curtin Institute for Computation and has been engaged in eResearch service delivery from its inception in Australia.  Prior to being the Executive Director of iVEC (now Pawsey), he was part of the grid computing program in the Australian Partnership for Advanced Computing.  As iVEC Executive Director, Andrew was a key contributor to attracting $80 Pawsey Centre Funding to iVEC.  Andrew is currently the independent Board Member on the NeSI.

AUS-SPECCHIO: taking spectroscopy data from the sensor to discovery

A/Prof. Laurie Chisholm1, Dr Cindy Ong1, Dr Andreas Hueni1

1University Of Wollongong, Wollongong, Australia

 

Title: AUS-SPECCHIO: taking spectroscopy data from the sensor to discovery
Synopsis: AUS-SPECCHIO is a national spectral information system supported by the Australian Government through the NCRIS Australian National Data Service. Funded as a data capture project, the system's mission is to collate, share and enable discovery of new and existing spectral libraries related to any earth and environmental feature. AUS-SPECCHIO is open source for the benefit of all proximal and remote sensing researchers. It was established in response to user demand, with functionality based upon extensive stakeholder consultation, feedback and testing. The system incorporates features such as: a metadata standard to improve interoperability and sharing, links to published best practice guides, mechanisms to house validation data associated with spectra, semi-automated operations such as automatic validation of airborne hyperspectral data, and a metadata export feed to ANDS RDA. The system is currently hosted by the University of Wollongong; a transition is planned to Geoscience Australia, where its use will extend to sensor calibration and meet the national call for validation of image products housed in Digital Earth Australia.
Format of demonstration: Video, Slide Show
Presenter(s): Laurie Chisholm, Associate Professor, University of Wollongong (presenter), Cindy Ong, Andreas Hueni
Target research community: Australian Proximal and Remote Sensing Community
Statement of Research Impact: Case studies from operational testing and use will be shown, demonstrating the capacity of the system to capture and manage an expanding range of spectroscopy research data. As the basis of a spectral information system, AUS-SPECCHIO benefits end users by greatly improving the management of existing and new data, increasing data quality by applying algorithms to a centralised and well-defined data pool, shortening acquisition-to-product/publication cycles, and supporting sensor calibration and satellite image product validation. The newly structured and enhanced version of AUS-SPECCHIO, including a robust metadata standard, has served as a model for international adoption.
Request to schedule alongside particular conference session:
Any special requirements: n/a

 


Biography:

A/Prof Laurie Chisholm has over 20 years of experience in remote sensing and spatial analysis in the environmental sciences. She is Project Leader for the ANDS DC-10 project to develop a national spectroscopy information system, “AUS-SPECCHIO”. She has particular expertise in the use of hyperspectral data to discriminate between plant species, and to assess the physiological effects of various stressors (fungal, nutrient, water) on spectral reflectance. Additional research interests focus on evaluating the impact of disturbance events on ecosystem function and resilience at the landscape scale using satellite imagery. She has been a participant in several TERN Auscover supersite field campaigns, conducting vegetation surveys in support of airborne remote sensing data acquisition. Currently Laurie is using multi-sensor remote sensing data to map invasive plant species for input into a novel mixed-methods cultural environmental research framework to address Natural Resource Management issues.

Collaborate, coordinate and thrive

Dr Markus Buchhorn1

1Australasian eResearch Organisations (AeRO), Canberra, Australia

 

ABSTRACT

The 2016 Research Infrastructure Roadmap [1] references a vision for eResearch that includes an Australian Research Data Cloud. Establishing it will require an ever-widening range of stakeholders and service providers to work together, and to coordinate and align their platforms under a broad framework. However, using it effectively across the entire national research endeavour will need much more than that. We will need the workforce to build, operate and support it. We will need the user community to be properly skilled and supported to take advantage of it. We will need the underpinning systems to be properly integrated with institutional services, with national and state scientific computing platforms, with international frameworks, and with emerging commercial services. We will need a rich and smooth flow of communication about the many services and benefits. We will need these services to be trustworthy and valued, and to increase their maturity as expectations continue to grow.

The Roadmap is largely silent on all these issues. The Members of AeRO though are collaborating hard to ensure that the many investments, from all sources, are properly coordinated, designed, deployed and operated, to ensure that researchers can thrive in the continuously growing data-driven research world. This presentation will discuss a range of activities to support AeRO Members to achieve these important goals, to seek input from the wider community, and to encourage more participation across the sector, ultimately to deliver a seamless and transformative experience for our research community.

 

REFERENCES

  1. Research Infrastructure Roadmap, available from: https://www.education.gov.au/2016-national-research-infrastructure-roadmap, accessed 15 June 2017.

 

Biography:

Markus is the Chief Executive Officer of AeRO.

Hacky Hour for eResearch Training, Engagement and Community Development

Dr Nick Hamilton1, Ms Belinda Weaver2

1University of Queensland, St Lucia, Australia

2Software and Data Carpentry, Brisbane, Australia

 

In the biosciences, as well as in many other fields of research, there are often relatively low levels of mathematical and statistical expertise, as well as a lack of basic computing, eResearch and data skills. This is despite such skills becoming increasingly important for cutting-edge research across many fields. To help fill this knowledge gap, we have been experimenting with running “Hacky Hours” at The University of Queensland for the last 18 months. These are weekly events, held in an outdoor cafe, where researchers who would like assistance with their research IT can come along and ask questions, or just work on whatever they are working on, in the company of other researchers who are into computing. A strong community of both helpers and researchers with questions has built up around this event, with many returning regularly. Often, a researcher with a question one week will come back as a helper another week.

During this period, we have been collecting data on all of our attendees and the types of problems they bring, as well as on our helpers and their interests. Typical questions include:

  • getting started with Python and R
  • software tools
  • how to access high performance computing
  • cloud data storage
  • tools for data cleaning and data visualisation.

The disciplines attendees belong to are very diverse and include: the biosciences, economics, psychology, humanities, languages, chemistry, mechanical engineering, nanotechnology, biomedical engineering, and ecology. There are also attendees from the library. Interestingly, a significantly larger number of women than men come to ask questions at Hacky Hour, though the helpers are approximately gender-balanced.

The Hacky Hour model of training and engagement offers a number of benefits.

The friendly, non-judgmental and informal environment encourages a greater diversity of participants than is often associated with research IT. For the University organisations that allow or encourage their employees to participate as helpers, there is the benefit of presenting a friendly face for IT and eResearch facilities. As time is limited to an hour a week, helpers are more willing to donate their time without fear of problems blowing out and leaving them stuck working on the problem. Hacky Hour often works as a referral service: while the Hacky Hour helpers may not have a solution, they may well know a person or organisation who could help. Similarly, the Hacky Hour helpers build up a knowledge base of common problems and their solutions, as well as resources such as R cheat sheets, short training courses, or good web sites on how to get started with Python. The helpers also gain valuable skills in helping the problem owners understand and define their problems and thus how to develop a solution. Often the problem may not be what the problem owner thinks it is, or the solution may be completely different from what the problem owner thought they needed. While occasionally a helper will solve a problem directly, the Hacky Hour ethos is much more about helping the problem owner develop the skills to find a path to the solution themselves. For the helpers, the informal discussions about solving problems are a good way to share high-level expertise with each other and keep up with the latest technical developments. More broadly, the helpers and problem owners have now become a community that can be drawn upon to help at or participate in other training or community events such as Software Carpentry bootcamps, HealthHack or Research Bazaar (ResBaz) events.

In this poster, we will outline our experiences with Hacky Hour, the strategies that we have taken to develop and maintain a community of helpers and attract a diverse range of problem owners, as well as the outcomes and benefits we have seen.


 

Biographies

Dr Nick Hamilton is the Institute Bio-Mathematician at the Institute for Molecular Bioscience (IMB), The University of Queensland, and holds a co-appointment with the Research Computing Centre at UQ. He gained a PhD in Pure Mathematics from the University of Western Australia in 1996 and was subsequently awarded Fellowships in Australia and Belgium. In 2002, Nick made the decision to change fields into the exciting new areas of computational biology and bioinformatics, returned to Australia, and subsequently took up a position within the ARC Centre of Excellence in Bioinformatics at The University of Queensland. In 2008 he was appointed as a Laboratory Head at IMB, and Institute Bio-Mathematician in 2014, where he continues to lead a group in bio-image informatics, mathematical modelling and data visualisation, developing methodologies to deal with the current deluge of data that new microscopy imaging technologies have enabled. He also has interests and has participated in many training and engagement models such as Hacky Hour, HealthHack, ResBaz and Software Carpentry, and has Chaired the Winter School in Mathematics and Computational Biology for the last 6 years.
https://orcid.org/0000-0003-0331-3427

Belinda Weaver is the Community Development Lead for Software and Data Carpentry, global organisations that aim to make researchers more productive and their research more reliable by teaching them computational and data skills. She was formerly the eResearch Analyst Team Leader for the Queensland Cyber Infrastructure Foundation, where she helped deliver cloud storage and solutions to Australian researchers. She was a key organiser of the very successful Brisbane Research Bazaar events in 2016 and 2017 – cross-institutional, community-building events that taught a range of digital skills to postgraduate students and early career researchers.

She helped inaugurate the weekly Hacky Hour drop-in research IT advice sessions at The University of Queensland. She is a certified Software Carpentry instructor and instructor trainer and has taught at many Software Carpentry workshops. She organised the two very successful Library Carpentry sprints in 2016 and 2017 which updated and extended the basic lessons. The 2017 hackathon pulled in more than 100 people across 13 sites in seven countries, including the British Library and the National Library of the Netherlands. She will take a Library Carpentry roadshow to staff at the national and state libraries of Australasia during July and August 2017.

Belinda has formerly worked as a librarian, repository manager, project manager, newspaper columnist, Internet trainer and in research data management. She tweets as @cloudaus (https://twitter.com/cloudaus).

The value of an Integrated eResearch Service Catalogue: a La Trobe University case study

Dr Ghulam Murtaza1, Ms Sheila Mukerjee2

1Intersect Australia Ltd, Sydney, Australia, ghulam@intersect.org.au

2La Trobe University, Bundoora, Australia, sheila.mukerjee@latrobe.edu.au

 

ABSTRACT

The provision of information on the full range of services available to researchers within a university is always challenging. IT groups, research offices and libraries within universities use websites or similar tools to communicate service offerings with varying degrees of effectiveness. Common problems include the large amount of information that needs to be communicated, the use of different terminology by different departments of the university, different methods to request services from different departments and where to find the most relevant and up to date information.

Universities, like other organisations, commonly adopt service desk approaches for their internal service delivery in areas such as IT, HR, facilities, etc. These involve the use of process-oriented tools and aim to achieve economies of scale within an organisation. Increasingly, universities are sourcing some services from external providers and the mix of internal/external service delivery is therefore changing over time. The key objective is to provide information to end users (researchers) that is comprehensive (incorporating both internal and external service delivery), user-friendly, and enables easy access. The researcher should not need to know about the “back office” arrangements used to provide the services.

A number of Intersect member universities are now attempting to improve the quality and effectiveness of the eResearch support information and access they provide to researchers through more integrated approaches. In particular, the Intersect eResearch Analysts in several member organisations are working with the local Research Office, IT and Library groups to create an “integrated service catalogue” for researchers. This involves using the local internal communications channel (e.g. intranet) to inform researchers and give them an opportunity to request the full range of eResearch support services, regardless of where the service elements are sourced. The consolidated list of services includes:

  • Training relevant to researchers, consultation and advice for researchers, grant assistance, research IT planning, research software development and research data management services
  • HPC, storage and compute services offered by all providers
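
As a rough illustration only, an integrated catalogue of this kind can be thought of as a single list of services in which the delivering group is recorded but never shown to the researcher. The sketch below assumes hypothetical service names, categories and provider labels; none of them are taken from the La Trobe or Intersect catalogue.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Service:
    name: str
    category: str
    provider: str  # "back office" detail, hidden from the researcher view


# Hypothetical entries, for illustration only.
CATALOGUE = [
    Service("Research data management advice", "Advice and training", "Library"),
    Service("HPC allocation", "Compute and storage", "External provider"),
    Service("Research storage", "Compute and storage", "University IT"),
]


def researcher_view(catalogue):
    """Group service names by category without exposing who delivers them."""
    view = {}
    for service in catalogue:
        view.setdefault(service.category, []).append(service.name)
    return view


def route_request(catalogue, service_name):
    """Single request entry point: triage looks up the delivering group."""
    for service in catalogue:
        if service.name == service_name:
            return service.provider
    raise KeyError(service_name)
```

The researcher sees only `researcher_view(CATALOGUE)`, while the triage step uses `route_request` to pass the request to whichever internal or external group delivers the service.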

This presentation shares the experience of collaboratively developing an integrated eResearch services catalogue for La Trobe University and Intersect services. The presentation will cover elements such as the architecture of integrated services, how external services were embedded within the university, and the behind-the-scenes, IT-led triaging process and service delivery model for eResearch services. We present an analysis of the metrics built around this integrated service catalogue to provide insights into the response of the research community and opportunities for new and improved services.


 

Biographies:

Dr Ghulam Murtaza is currently the Intersect Digital Research Analyst for La Trobe University. During his time at Intersect, Ghulam has worked with Australian Catholic University and La Trobe University, where he has led multiple eResearch initiatives including the efforts to embed Intersect services within local eResearch offerings. Ghulam is a published researcher and has previously held research and academic positions at many reputable institutions including UNSW, MAARCS institute of WSU, NEWT and Microsoft Research. Ghulam holds a Bachelor of Science (Honours) and a Master of Science in Computer Science from LUMS, Pakistan, and completed his PhD in Computer Science at the University of New South Wales (UNSW).

Sheila Mukerjee is Manager of Business Engagement for ICT at La Trobe University with a portfolio covering Research and Library. Her role of strategic partner and advisor covers strategy and business plans, future direction, major capital projects, business improvements and the sourcing and building of specialist technology capability for researchers. She has a keen interest in the way universities operate and strategise in the changing landscape of education with particular emphasis on technology and agility. She has published in the areas of data warehousing, student systems and agility in the education sector.

SKA Regional Centre Activities in Australasia

Slava Kitaeff1, Peter Quinn2, Andreas Wicenec3, Tao An4, Juan Carlos Guzman5

1 International Centre for Radio Astronomy Research/ Commonwealth Scientific and Industrial Research Organisation, Perth, Australia, slava.kitaeff@icrar.org / slava.kitaeff@csiro.au

2 International Centre for Radio Astronomy Research, Perth, Australia, peter.quinn@icrar.org

3 International Centre for Radio Astronomy Research, Perth, Australia, andreas.wicenec@icrar.org

4 Shanghai Astronomical Observatory, Shanghai, China, antao@shao.ac.cn

5 Commonwealth Scientific and Industrial Research Organisation, Perth, Australia, Juan.Guzman@csiro.au

 

Astronomy has a history and a tradition of using remote sites and space missions to gather large amounts of data. Australia hosts the SKA pathfinders ASKAP and MWA, and will host SKA1-Low, producing orders of magnitude more data than any astronomical instruments before them. All this data will be used by hundreds of scientists working in multiple institutions across the globe. Australia participates in and leads a number of major SKA science themes. In order to fully exploit the scientific potential of the instruments and enable ground-breaking scientific discoveries, the SKA community needs to build the expertise and develop the technologies to support the science teams during their scientific exploration of the data products released by the SKA Observatory.

While the SKA Observatory is responsible for generating calibrated data products, the production of more advanced data products, such as science-grade catalogues or very deep stacked image cubes, is within the scope of SKA Regional Centres (SRCs). The data volume and the individual sizes of datasets will be too large to be served via traditional data management models, making data-centric processing the preferred model for science data analysis. The data products need to be curated and served according to SKA policies. Multi-messenger data may need to be co-located and co-processed. Distributed science teams will need new tools, methods, frameworks and algorithms to maximise scientific productivity.

A three-year design study, the ERIDANUS Project, commenced in April 2017 with the aim of developing a prototype of the future infrastructure. The project will deploy a prototype data-intensive research infrastructure and middleware, between and within Australia and China, capable of addressing SKA-class data and processing challenges. The project will respond to the identified challenges, and will collaborate with the Advanced European Network for E-infrastructures for Astronomy with the SKA (AENEAS) project.

The poster outlines the current activities and future plans as undertaken by ICRAR, CSIRO and SHAO as part of the ERIDANUS project.


 

Biography

Dr Slava Kitaeff is the Project Engineer for the SKA Regional Centre and ERIDANUS National Project Lead at the International Centre for Radio Astronomy Research and CSIRO Astronomy and Space Science. Dr Kitaeff’s background is in radio astronomy instrumentation, high performance scientific computing and data management. http://linkedin.com/in/slavakitaeff/

Designing institutional services for persistent identifiers

Dr Maude Frances1, Dr Daniel Bangert1, Mr Harry Sidhunata1

1UNSW Sydney, Sydney, Australia, d.bangert@unsw.edu.au, m.frances@unsw.edu.au, h.sidhunata@unsw.edu.au

INTRODUCTION

In an increasingly diverse landscape of scholarly outputs and infrastructure, persistent identifiers (PIDs) have become a key technology in tackling issues of accessibility, integration, and interoperability. For researchers, PIDs improve the visibility and citability of research, allow work to be more easily tracked and measured, and demonstrate a level of curation and governance over objects.

PID service providers include CrossRef, DataCite, ORCID, FundRef, ISNI, and Ringgold. Several of these organisations have worked to improve integration across international research infrastructure through collaborative projects such as the ORCID and DataCite Interoperability Network (ODIN) and Technical and Human Infrastructure for Open Research (THOR), and by organising meetings about the future of open identifiers. Persistent identifiers also feature in the work of research community organisations such as the Research Data Alliance and FORCE11.

DESIGN AND WORKFLOWS

At UNSW Library, implementation of PIDs has focused on DOIs and ORCID iDs, seeking to leverage the international efforts that have already seen widespread adoption of these content and contributor identifiers by the scholarly community. PIDs flow through a number of systems and tools, including UNSW’s publication management system (an installation of Symplectic Elements) and library repositories for publications, faculty-based collections, and research data (instances of Fedora repositories).

The UNSW DOI Service is a web application that enables UNSW researchers to request a DOI for grey literature that has been deposited in the institutional repository (UNSWorks) or a faculty-based repository [1] [2]. These research outputs include theses, conference papers, conference presentations, conference posters, reports, working papers, and creative works. DOIs are minted through the Australian National Data Service, a member of DataCite. DOIs minted in the service point to an existing persistent link (Handle), which resolves to the landing page for the record. The landing page displays metadata about the record and includes links to download the publication file.

The service allows the user to search for grey literature in the repository, review publication details, enter any missing mandatory metadata, and agree to conditions for requesting a DOI. Once a request is submitted, it is reviewed by an administrator (UNSW Library staff), who either mints a DOI or declines the request. When a DOI is minted, it is added to the record metadata and, if requested, to the publication file.

An alternative workflow exists for ‘trusted partners’ who need to mint a large number of DOIs or are required to mint DOIs on a regular basis, for example a faculty, school or centre administrator responsible for an ongoing series of reports. Trusted partners are given access to a part of the web application that enables them to search for the relevant record and mint a DOI without external review (see Figure 1).
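The two request paths described above can be sketched as a small state model. This is an illustrative sketch only, not the UNSW DOI Service code: the class and function names, the `hdl:0000/...` handles, and the `10.9999` DOI prefix are all hypothetical placeholders.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional

PLACEHOLDER_PREFIX = "10.9999"  # hypothetical DOI prefix, not a real allocation


class Status(Enum):
    SUBMITTED = "submitted"  # waiting for review by an administrator
    MINTED = "minted"        # DOI assigned and recorded
    DECLINED = "declined"    # administrator declined the request


@dataclass
class DoiRequest:
    handle: str                      # existing persistent link for the record
    requester: str
    status: Status = Status.SUBMITTED
    doi: Optional[str] = None


def mint(request: DoiRequest) -> DoiRequest:
    """Assign a DOI derived from the handle suffix (an assumed naming rule)."""
    suffix = request.handle.rsplit("/", 1)[-1]
    request.doi = f"{PLACEHOLDER_PREFIX}/{suffix}"
    request.status = Status.MINTED
    return request


def submit(handle: str, requester: str, trusted: bool = False) -> DoiRequest:
    """Trusted partners mint immediately; other requests wait for review."""
    request = DoiRequest(handle, requester)
    return mint(request) if trusted else request


def review(request: DoiRequest, approve: bool) -> DoiRequest:
    """Administrator decision on a request that is still pending."""
    if request.status is not Status.SUBMITTED:
        raise ValueError("only submitted requests can be reviewed")
    if approve:
        return mint(request)
    request.status = Status.DECLINED
    return request
```

In this sketch a standard request stays in the submitted state until `review` is called, whereas `submit(..., trusted=True)` returns a request that already carries a DOI.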

INTEGRATIONS

The UNSW DOI Service is integrated with a Citation Builder tool that uses descriptive metadata (DC or MODS) to display a citation in Harvard style. The tool can also be configured to draw on metadata in different schemas and display citations in alternative styles. The purpose of citation display is twofold: for the user to confirm details of the record prior to requesting a DOI, and to give the user a citation for use after a DOI has been minted.
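As a rough illustration of how descriptive metadata might be turned into a citation string, the sketch below formats a Dublin Core style record in one common Harvard variant. It is not the Citation Builder itself: the function name, the dictionary keys, and the sample record are assumptions made for the example, and Harvard style differs in detail between institutions.

```python
def harvard_citation(record: dict) -> str:
    """Format a DC-like record as 'Authors (Year). Title. Publisher. DOI link'."""
    creators = record.get("creator", [])
    if not creators:
        authors = "Anon."
    elif len(creators) == 1:
        authors = creators[0]
    else:
        authors = ", ".join(creators[:-1]) + " and " + creators[-1]

    # Use the first four characters of the date as the year, or 'n.d.' if absent.
    year = record.get("date", "n.d.")[:4]

    parts = [f"{authors} ({year})", record["title"]]
    if record.get("publisher"):
        parts.append(record["publisher"])
    citation = ". ".join(part.rstrip(".") for part in parts) + "."

    # The DOI is only appended once one has been minted for the record.
    if record.get("identifier"):
        citation += f" https://doi.org/{record['identifier']}"
    return citation


# Hypothetical record used only to exercise the function.
example = {
    "creator": ["Lee, A.", "Ng, B."],
    "date": "2017-03-01",
    "title": "An example report",
    "publisher": "Example University",
}
before_minting = harvard_citation(example)
after_minting = harvard_citation({**example, "identifier": "10.9999/example"})
```

The two calls mirror the twofold purpose described above: `before_minting` lets the user confirm the record details, and `after_minting` is the citation that includes the DOI.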

Figure 1: UNSW DOI Service workflow for trusted partners

Integrations with other identifiers include the use of Handles for resolving DOIs, and mapping ORCID identifiers for contributors to the DataCite metadata schema. Integration between identifiers for content (DataCite DOIs) and contributors (ORCID iDs) results in a number of benefits for discovery and reporting:

  • ORCID identifiers are drawn from UNSW’s publication management system, ensuring that the identifier has been verified by the researcher.
  • ORCID identifiers are displayed in the repository, making the author’s ORCID profile, including other works listed in their profile, more visible and discoverable.
  • Repository metadata, including ORCID identifiers, are harvested by external aggregators (e.g. Trove, BASE), further exposing the ORCID profile.
  • By including ORCID iDs within DataCite metadata, authorship is unambiguously identified in a machine-readable, structured manner.
  • Works with a DataCite DOI can be easily added to the author’s ORCID profile via manual claim or auto-update from DataCite to ORCID.
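To make the machine-readable point concrete, the sketch below shows one way an ORCID iD could be checked and then placed in a DataCite-style creator element. It is an assumption-laden illustration rather than the UNSW mapping: the function names are hypothetical, the XML namespace is omitted for brevity, and the element and attribute names follow the DataCite Metadata Schema as commonly documented. The sample iD, 0000-0002-1825-0097, is the fictitious Josiah Carberry record that ORCID uses in its own documentation.

```python
import re
import xml.etree.ElementTree as ET

ORCID_PATTERN = re.compile(r"[0-9]{4}-[0-9]{4}-[0-9]{4}-[0-9]{3}[0-9X]")


def orcid_checksum_ok(orcid: str) -> bool:
    """Validate the final character using the ISO 7064 MOD 11-2 check."""
    if not ORCID_PATTERN.fullmatch(orcid):
        return False
    digits = orcid.replace("-", "")
    total = 0
    for ch in digits[:15]:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    expected = "X" if result == 10 else str(result)
    return digits[15] == expected


def datacite_creator(name: str, orcid: str) -> ET.Element:
    """Build a <creator> element carrying the ORCID iD as a nameIdentifier."""
    if not orcid_checksum_ok(orcid):
        raise ValueError(f"invalid ORCID iD: {orcid}")
    creator = ET.Element("creator")
    ET.SubElement(creator, "creatorName").text = name
    identifier = ET.SubElement(
        creator,
        "nameIdentifier",
        {"nameIdentifierScheme": "ORCID", "schemeURI": "https://orcid.org"},
    )
    identifier.text = orcid
    return creator


element = datacite_creator("Carberry, Josiah", "0000-0002-1825-0097")
xml_text = ET.tostring(element, encoding="unicode")
```

Checking the iD before it is written into the metadata is a design choice of this sketch; in the workflow described above, verification comes from the researcher confirming the iD in the publication management system.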

 

As emphasised by THOR, widespread use of, and trust in, PIDs throughout the research lifecycle depend largely on the ‘human infrastructure’ offered by these services. At UNSW, key messages for the research community focus on how PIDs make research easier to find and cite. In addition, they help track research impact through citations and altmetrics. It is also anticipated that PIDs will improve submission and reporting workflows by becoming a tool for auto-populating information and claiming records from other systems. Examples of this include publishers and repositories that offer authentication via ORCID and reference managers that support DOI lookup.

CONCLUSION

The design and delivery of PID services at UNSW Library are guided by the characteristics of ‘trusted identifiers’: unique, persistent, descriptive, interoperable, and governed [3]. Identifiers assigned to UNSW research outputs are interoperable, based on institutional sources of truth, and contain human- and machine-readable metadata. In the case of DOIs, persistent access to the resource is guaranteed by the library as the custodian of the identifiers and associated repository content. As identifiers become a part of each stage of the research lifecycle, the challenge for institutions will be to continue to effectively govern their assignment, follow standards set by community-driven efforts, and optimise their use for and by researchers.

REFERENCES

  1. Bangert, D. & Frances, M. (2016, November). Cite my thesis? DOIs for grey literature. CAUL Research Repositories 2016, Sydney. http://doi.org/10.5281/zenodo.165620
  2. Mitra, P., Sheng, M., Sidhunata, H., & Bangert, D. (2017). UNSW DOI Service [software]. Zenodo. http://doi.org/10.5281/zenodo.582692
  3. ODIN Consortium, Fenner, M., Thorisson, G., Ruiz, S., & Brase, J. (2013). D4.1 Conceptual model of interoperability. figshare. https://doi.org/10.6084/m9.figshare.824314.v1

Biography:

Dr Maude Frances leads the Library Repository Services unit in The University of New South Wales (UNSW Sydney) Library. She has led a number of research infrastructure projects in which the Library collaborated with researchers to develop workflows and services for curating research data and publications.

About the conference

eResearch Australasia provides opportunities for delegates to engage, connect, and share their ideas and exemplars concerning new information centric research capabilities, and how information and communication technologies help researchers to collaborate, collect, manage, share, process, analyse, store, find, understand and re-use information.
