The Astronomy Data and Computing Services (ADACS) Story

Dr Jenni Harrison1,2, Professor Andrew Rohl3

1Pawsey Supercomputing Centre, Australia

2CSIRO, Australia

3Curtin University, Bentley, Australia

Title The Astronomy Data and Computing Services (ADACS) Story
Synopsis ADACS has been established and is funded by Astronomy Australia Ltd (AAL). ADACS provides eResearch services tailored exclusively to the needs of the Australian astronomy community. Services are delivered through a unique partnership between Swinburne University, Curtin University and the Pawsey Supercomputing Centre. By offering bespoke training, support and expertise, ADACS helps astronomers maximise the scientific return from eResearch infrastructure.
Format of demonstration Slide Show
Presenter(s) Dr Jenni Harrison, Director of Strategic Projects and Engagement, Pawsey Supercomputing Centre and

Professor Andrew Rohl, Director of Curtin Institute for Computation and Professor of Computational Science, Curtin University

Target research community Astronomy, or anyone who may wish to use the ADACS model to deliver eResearch services to other communities.
Statement of Research Impact ADACS was only established in March 2017, and hence it is too early to evaluate the impact of this initiative on research. ADACS will be evaluated in due course, with research impact considered.
Request to schedule alongside particular conference session If possible co-located with “National Programs and Partnerships”

 

Any special requirements Standard AV, to allow two presenters with questions

Biographies 

Jenni is the Director of Strategic Projects and Engagement at the Pawsey Supercomputing Centre in WA. Jenni’s present responsibilities include leading projects in areas of national priority, such as astronomy, and as a result she currently co-directs the ADACS initiative. Jenni is also responsible for engagement and is correspondingly leading the Capital Refresh for the next generation of supercomputing, data and associated services for Pawsey, expected by 2020. For the previous 5 years, Jenni led the Data (and eResearch) Team at Pawsey. Prior to working in Australia, Jenni directed significant Digital Health Education and Research projects for approximately 5 years for the NHS in Scotland. Before this role, Jenni was the policy advisor in eResearch to the Ministry of Research, Science and Technology in New Zealand.

Andrew is the Director of the Curtin Institute for Computation and has been engaged in eResearch service delivery from its inception in Australia. Prior to being the Executive Director of iVEC (now Pawsey), he was part of the grid computing program in the Australian Partnership for Advanced Computing. As iVEC Executive Director, Andrew was a key contributor to attracting $80 million in Pawsey Centre funding to iVEC. Andrew is currently the independent Board Member on the NeSI.

RAiD: A Persistent Identifier and Data Management Record for Research Activities

Ms Siobhann McCafferty1, Mr Ian Duncan2, Dr Andrew Janke2

1Australian Access Federation, Brisbane, Australia, mccafferty@aaf.edu.au

2Research Data Services (RDS), Brisbane, Australia, i.duncan@rds.edu.au

3University of Queensland, Brisbane, Australia, janke@uq.edu.au

DESCRIPTION

This poster presents the RAiD Persistent Identifier for research activities.

Persistent Identifiers (PIDs) are an essential tool of digital research data management and the evolving data management ecosystem. They allow for a clear line of sight along data management processes and workflows, more efficient collaboration and more precise measures of cooperation, impact, value and outputs.

The use of PIDs is becoming standard practice for institutions, funders and infrastructure providers. Standard identifiers have been developed for individuals [1] and outputs [2], but previously not for activities or projects.

RAiD was developed by The Australian Data Life Cycle Framework Project (DLCF) [3] in response to this need and is a freely available service and API.

To mint a RAiD, institutions can integrate the RAiD API into their services. Services that the RAiD API can be integrated with include:

  • Institutional DMP Tools
  • Virtual Labs
  • Ethics application systems
  • Storage request platforms
  • A standalone RAiD request portal

The ID consists of two parts: a RAiD number and a RAiD DMR (Data Management Record).

The RAiD number is generated via the ANDS minting service and is packaged with Service Point minting information and a date and time stamp.

The RAiD DMR contains related identifiers such as:

  • DOI
  • ORCID
  • ISNI
  • GRID
  • RAiD

It also records and stores information about tools and storage provisioned through RAiD-integrated services such as Cloudstor, Nectar Computing and AWS storage.
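The two-part structure described above can be pictured as a small record type. The following is only an illustrative sketch: the class names, field names and sample values are assumptions made for this example and are not taken from the RAiD API or its actual schema.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict, List


@dataclass
class DataManagementRecord:
    """Hypothetical model of a RAiD DMR: related identifiers plus provisioned services."""

    related_identifiers: Dict[str, List[str]] = field(default_factory=dict)
    provisioned_services: List[str] = field(default_factory=list)

    def add_identifier(self, scheme: str, value: str) -> None:
        # Group identifiers by scheme (e.g. "DOI", "ORCID", "ISNI", "GRID", "RAiD").
        self.related_identifiers.setdefault(scheme, []).append(value)


@dataclass
class RaidRecord:
    """Hypothetical model of the full identifier: RAiD number plus its DMR."""

    raid_number: str       # placeholder value below, not a real RAiD
    service_point: str     # the service that requested the mint
    minted_at: datetime    # date and time stamp packaged with the number
    dmr: DataManagementRecord = field(default_factory=DataManagementRecord)


record = RaidRecord(
    raid_number="example-raid-0001",
    service_point="Example Institutional DMP Tool",
    minted_at=datetime(2017, 9, 8, tzinfo=timezone.utc),
)
record.dmr.add_identifier("ORCID", "0000-0000-0000-0000")  # placeholder iD
record.dmr.add_identifier("DOI", "10.0000/example")        # placeholder DOI
record.dmr.provisioned_services.append("Cloudstor")
```

Keeping the number and the DMR separate mirrors the description above: the number stays fixed once minted, while the record can keep accumulating identifiers and services as the activity progresses.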

The RAiD identifier and service support data management across all phases of research by placing the research project or activity at the centre of research workflows, creating a chain of provenance, improving discovery and access, and ensuring that output is attributable and reportable. RAiD thus follows the FAIR Principles [4] of making research data Findable, Accessible, Interoperable, and Re-usable.

RAiD [5] is a significant addition to eResearch PIDs and is being integrated into a selection of Institutions across Australia and New Zealand.

REFERENCES

  1. Available at: https://orcid.org/, accessed 08 September 2017
  2. Available at: https://www.doi.org/, accessed 08 September 2017
  3. Data Life Cycle Framework Project. Available at: https://www.dlc.edu.au/, accessed 08 September 2017
  4. FAIR Principles. Available at: https://www.force11.org/group/fairgroup/fairprinciples, accessed 08 September 2017
  5. Available at: https://www.raid.org.au/, accessed 08 September 2017

Biographies 

Ian Duncan is the Director of Research Data Services (RDS).

Andrew Janke is the Informatics Fellow for the  National Imaging Facility (NIF), Systems Architect, DLCF,  Research Data Services (RDS) and Senior Research Fellow for the Centre for Advanced Imaging (CAI) University of Queensland.

Siobhann McCafferty is the Project Manager for the Data Life Cycle Framework Project.

 

Designing institutional services for persistent identifiers

Dr Maude Frances1, Dr Daniel Bangert1, Mr Harry Sidhunata1

1UNSW Sydney, Sydney, Australia, d.bangert@unsw.edu.au, m.frances@unsw.edu.au, h.sidhunata@unsw.edu.au

INTRODUCTION

In an increasingly diverse landscape of scholarly outputs and infrastructure, persistent identifiers (PIDs) have become a key technology in tackling issues of accessibility, integration, and interoperability. For researchers, PIDs improve the visibility and citability of research, allow work to be more easily tracked and measured, and demonstrate a level of curation and governance over objects.

PID service providers include CrossRef, DataCite, ORCID, FundRef, ISNI, and Ringgold. Several of these organisations have worked to improve integration across international research infrastructure through collaborative projects such as the ORCID and DataCite Interoperability Network (ODIN) and Technical and Human Infrastructure for Open Research (THOR), and by organising meetings about the future of open identifiers. Persistent identifiers also feature in the work of research community organisations such as the Research Data Alliance and FORCE11.

DESIGN AND WORKFLOWS

At UNSW Library, implementation of PIDs has focused on DOIs and ORCID iDs, seeking to leverage the international efforts that have already seen widespread adoption of these content and contributor identifiers by the scholarly community. PIDs flow through a number of systems and tools, including UNSW’s publication management system (an installation of Symplectic Elements) and library repositories for publications, faculty-based collections, and research data (instances of Fedora repositories).

The UNSW DOI Service is a web application that enables UNSW researchers to request a DOI for grey literature that has been deposited in the institutional repository (UNSWorks) or a faculty-based repository [1] [2]. These research outputs include theses, conference papers, conference presentations, conference posters, reports, working papers, and creative works. DOIs are minted through the Australian National Data Service, a member of DataCite. DOIs minted in the service point to an existing persistent link (Handle), which resolves to the landing page for the record. The landing page displays metadata about the record and includes links to download the publication file.

The service allows the user to search for grey literature in the repository, review publication details, enter any missing mandatory metadata, and agree to conditions for requesting a DOI. Once a request is submitted, it is reviewed by an administrator (UNSW Library staff) who either mints a DOI or declines the request. When a DOI is minted, it is added to the record metadata and, if requested, to the publication file.

An alternative workflow exists for ‘trusted partners’ who need to mint a large number of DOIs or are required to mint DOIs on a regular basis, for example a faculty, school or centre administrator responsible for an ongoing series of reports. Trusted partners are given access to part of the web application that enables them to search for the relevant record and mint a DOI without external review (see Figure 1).
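The two workflows can be summarised as a small state machine. This is a minimal sketch under stated assumptions, not the UNSW DOI Service code: the names `DoiRequest`, `submit` and `review`, the status values, and the placeholder Handles and DOIs are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Set


class Status(Enum):
    SUBMITTED = "submitted"
    MINTED = "minted"
    DECLINED = "declined"


@dataclass
class DoiRequest:
    handle: str                      # existing persistent link the DOI will point to
    requester: str
    status: Status = Status.SUBMITTED
    doi: Optional[str] = None


def _mint(request: DoiRequest, doi: str) -> DoiRequest:
    # Stand-in for the real minting call; here we only record the result.
    request.doi = doi
    request.status = Status.MINTED
    return request


def submit(request: DoiRequest, trusted_partners: Set[str], doi: str) -> DoiRequest:
    """Trusted partners mint immediately; everyone else waits for review."""
    if request.requester in trusted_partners:
        return _mint(request, doi)
    return request


def review(request: DoiRequest, approve: bool, doi: str) -> DoiRequest:
    """An administrator either mints a DOI or declines a pending request."""
    if request.status is not Status.SUBMITTED:
        raise ValueError("only pending requests can be reviewed")
    if approve:
        return _mint(request, doi)
    request.status = Status.DECLINED
    return request


trusted = {"centre-admin"}
pending = submit(DoiRequest("hdl:0000/example-1", "researcher"), trusted, "10.0000/example-1")
direct = submit(DoiRequest("hdl:0000/example-2", "centre-admin"), trusted, "10.0000/example-2")
reviewed = review(pending, approve=True, doi="10.0000/example-1")
```

In this sketch the only difference between the two paths is whether `submit` mints straight away, which matches the description of trusted partners bypassing external review.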

INTEGRATIONS

The UNSW DOI Service is integrated with a Citation Builder tool that uses descriptive metadata (DC or MODS) to display a citation in Harvard style. The tool can also be configured to draw on metadata in different schemas and display citations in alternative styles. The purpose of citation display is twofold: for the user to confirm details of the record prior to requesting a DOI, and to give the user a citation for use after a DOI has been minted.

Figure 1: UNSW DOI Service workflow for trusted partners

Integrations with other identifiers include the use of Handles for resolving DOIs, and mapping ORCID identifiers for contributors to the DataCite metadata schema. Integration between identifiers for content (DataCite DOIs) and contributors (ORCID iDs) results in a number of benefits for discovery and reporting:

  • ORCID identifiers are drawn from UNSW’s publication management system, ensuring that the identifier has been verified by the researcher.
  • ORCID identifiers are displayed in the repository, making the author’s ORCID profile, including other works listed in their profile, more visible and discoverable.
  • Repository metadata, including ORCID identifiers, are harvested by external aggregators (e.g. Trove, BASE), further exposing the ORCID profile.
  • By including ORCID iDs within DataCite metadata, authorship is unambiguously identified in a machine-readable, structured manner.
  • Works with a DataCite DOI can be easily added to the author’s ORCID profile via manual claim or auto-update from DataCite to ORCID.
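As a concrete illustration of the machine-readable form mentioned in the list above, the sketch below builds a single `creator` fragment in the style of the DataCite metadata schema, where an ORCID iD is carried in a `nameIdentifier` element. It is a simplified fragment only (no XML namespace, no other mandatory DataCite properties), the function name is hypothetical, and the name and iD are placeholders.

```python
import xml.etree.ElementTree as ET


def creator_element(name: str, orcid: str) -> ET.Element:
    """Build a DataCite-style <creator> fragment that carries an ORCID iD."""
    creator = ET.Element("creator")
    ET.SubElement(creator, "creatorName").text = name
    name_id = ET.SubElement(
        creator,
        "nameIdentifier",
        {"nameIdentifierScheme": "ORCID", "schemeURI": "https://orcid.org/"},
    )
    name_id.text = orcid
    return creator


# Placeholder contributor; a real record would use the verified iD
# drawn from the publication management system.
element = creator_element("Researcher, Example", "0000-0000-0000-0000")
xml_text = ET.tostring(element, encoding="unicode")
print(xml_text)
```

Because the scheme is named explicitly in the attribute, a harvester can tell the value is an ORCID iD without parsing the name string.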

 

As emphasised by THOR, widespread use and trust in PIDs throughout the research lifecycle depends largely on the ‘human infrastructure’ offered by these services. At UNSW, key messages for the research community focus on how PIDs make research easier to find and cite. In addition, they help track research impact through citations and altmetrics. It is also anticipated that PIDs will improve submission and reporting workflows by becoming a tool for auto-populating information and claiming records from other systems. Examples of this include publishers and repositories that offer authentication via ORCID and reference managers that support DOI lookup.

CONCLUSION

The design and delivery of PID services at UNSW Library is guided by the characteristics of ‘trusted identifiers’: unique, persistent, descriptive, interoperable, and governed [3]. Identifiers assigned to UNSW research outputs are interoperable, based on institutional sources of truth, and contain human and machine-readable metadata. In the case of DOIs, persistent access to the resource is guaranteed by the library as the custodian of the identifiers and associated repository content. As identifiers become a part of each stage of the research lifecycle, the challenge for institutions will be to continue to effectively govern their assignment, follow standards set by community-driven efforts, and optimise their use for and by researchers.

REFERENCES

  1. Bangert, D. & Frances, M. (2016, November). Cite my thesis? DOIs for grey literature. CAUL Research Repositories 2016, Sydney. http://doi.org/10.5281/zenodo.165620
  2. Mitra, P., Sheng, M., Sidhunata, H., & Bangert, D. (2017). UNSW DOI Service [software]. Zenodo. http://doi.org/10.5281/zenodo.582692
  3. ODIN Consortium, Fenner, M., Thorisson, G., Ruiz, S., & Brase, J. (2013). D4.1 Conceptual model of interoperability. figshare. https://doi.org/10.6084/m9.figshare.824314.v1

Biography:

Dr Maude Frances leads the Library Repository Services unit in The University of New South Wales (UNSW Sydney) Library. She has led a number of research infrastructure projects in which the Library collaborated with researchers to develop workflows and services for curating research data and publications.

ReDBox DLC and the Research Activity Portal

 

Mr Gavin Kennedy1

1QCIF, St Lucia, Australia

Title ReDBox DLC and the Research Activity Portal
Synopsis The RDS sponsored Research Activity ID (RAiD) is a persistent identifier in the Research Data Life Cycle, linking research activities to researchers, data infrastructure and services. The QCIF team have now integrated the RAiD into the ReDBox platform as a key identifier in its new data management features. This new ReDBox is the platform driving the Research Activity Portal (www.raportal.org.au), a RAiD service that we will demonstrate in this showcase session.
Format of demonstration Live Online Demonstration with short Slide Show
Presenter(s) Gavin Kennedy, Outreach and Engineering Manager, QCIF
Target research community Research Data Management
Statement of Research Impact Through the ReDBox DLC researchers will have a flexible data management platform which gives them multiple channels for acquiring data infrastructure and services, managing data, collaborating around data and services, and publishing data. The RAiD provides a consistent key in ReDBox for tying research activities to people, infrastructure and data.
Request to schedule alongside particular conference session Prefer not to have a lunchtime slot on the Wednesday or Thursday as we will be hosting a ReDBox user community meeting.
Any special requirements Nothing additional. Note that the url given above will be available late next week, but the demonstration version of the portal can be seen at: http://dlcf-test.redboxresearchdata.com.au/

Biography:

Gavin Kennedy is an IT research and solutions expert with over 30 years’ experience in ICT, the past 17 of them in eResearch and ICT research. Gavin is the head of Outreach and Engineering at the Queensland Cyber Infrastructure Foundation (QCIF), where he is responsible for business development as well as leading QCIF’s Software Engineering team, who are the key developers of ReDBox, the popular research data management and publishing platform.

orcid.org/0000-0003-3910-0474

Open Science Framework – Tools for Reproducible Research

Dr Karmen Condic-Jurkic1,2

1The Australian National University, Acton, Australia, karmen.condic-jurkic@anu.edu.au

2Center for Open Science, Charlottesville, USA contact@cos.io  

The Open Science Framework (OSF) [1] is a free and open source project management tool developed by the Center for Open Science (COS) [2] in Charlottesville, USA to support researchers across the entire research lifecycle. The OSF is a versatile platform that helps with project management and organisation, collaboration, sharing and other research activities, including a preprint service. The OSF integrates many services already used by researchers to streamline their process and increase efficiency. As a flexible repository, it can store and archive research data, protocols, and materials. The main goal of the Center for Open Science is to increase the openness, integrity, and reproducibility of scholarly research, and a number of products and services have been developed to support these ideas.

The key features of OSF include:

  • Structured projects to access files, data, code, and protocols in one centralized location and easily build custom organization for your project;
  • Controlled access and ability to make projects fully or partially public or private, making it easy to collaborate and share with the community or just your team;
  • Enhanced workflow with automated version control;
  • Persistent identifiers for projects and materials;
  • Preregistration of research projects;
  • Extending Research with automatic creation of a preprint or a meeting abstract;
  • Institutional repositories and multi-institutional projects.

Recently, more specialized services were included under the OSF umbrella to promote good scientific practice and to improve scientific discovery and transparency:

  • OSF Meetings: an easy way to share ideas and findings presented at academic meetings in forms of presentations, posters or conference proceedings;
  • OSF Institutions: a central hub on a dedicated OSF page for the institutional research to enhance collaboration and increase the visibility of research output;
  • OSF Preprints: a single search interface to a number of preprint sources to facilitate and accelerate new models of scholarly communication across multiple disciplines.

REFERENCES

  1. Open Science Framework. Available from: https://osf.io, accessed 31st Aug 2017.
  2. Center for Open Science. Available from: https://cos.io, accessed 31st Aug 2017.

Biography:

I received a master’s degree in chemistry in 2006 from the University of Zagreb in Croatia, and my PhD in computational chemistry and biophysics was awarded in 2013 by Friedrich-Alexander University in Erlangen, Germany. I spent 2 years (2013-2015) as a postdoc at the University of Queensland, and moved to Canberra in 2015 to take another postdoc position at ANU. During my postdoc years, I have become very interested in the open science and open data movements. In 2016, I became an ambassador for the Open Science Framework.

Australian Government Linked Data Working Group: collaborating across agencies to work together on systems, vocabularies, ontologies and datasets.

Mrs Brigette Metzler1, Dr Belinda Brown1, Mr Marco Wallenius1, Mr Nicholas Car1, Dr Armin Haller1

1Australian Government Linked Data Working Group, Hobart, Australia

Abstract:

 

The Australian Government Linked Data Working Group (AGLDWG) is a community of Commonwealth Government experts and champions, with invited non-voting participation of individuals, corporations and other entities. The working group was established in 2012.
The Principles on Open Public Sector Information and the Open Government National Action Plan are part of a whole-of-government change agenda: increased access to information and data should lead to more responsive services and decision-making, and add value to information for the environment, society and economy by improving its discovery, access and re-use.
The Productivity Commission Data Availability and Use Inquiry Report (2017) also makes the case for bringing in new legislation for data sharing and re-use, and for the need to link, merge and share data across government and with the private sector.
In this context, there is an ever-increasing need to find new ways of working together. The AGLDWG works to lay the foundations for transformation by establishing standards for guidance and technical mechanisms to implement Linked Data. We also work to raise awareness of Linked Data practices and technologies, and communicate the benefits of increasing interoperability.
Our work has four main foci: systems, vocabularies, ontologies and datasets.

Data:

The key to Linked Data lies in the use of consistent digital curation standards and using Uniform Resource Identifiers to identify and name abstract ‘things’ (and where possible assigning a URI to that ‘thing’). This allows us to create interoperability between datasets, thereby creating an environment conducive to both new insights and increased collaboration.
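A small worked example may help show why shared URIs create interoperability. The sketch below models two datasets as plain sets of subject-predicate-object triples, without any RDF library. All `example.org` URIs, the `managedBy` property and the labels are invented for illustration; only the `rdfs:label` property URI is a real, published term.

```python
from typing import Dict, Set, Tuple

Triple = Tuple[str, str, str]

RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
MANAGED_BY = "https://example.org/def/managedBy"   # hypothetical property
PARK = "https://example.org/id/park/1"             # hypothetical URI for a 'thing'
AGENCY = "https://example.org/id/agency/7"         # hypothetical URI

# Two datasets published independently, both naming the park by the same URI.
dataset_a: Set[Triple] = {
    (PARK, RDFS_LABEL, "Example National Park"),
}
dataset_b: Set[Triple] = {
    (PARK, MANAGED_BY, AGENCY),
    (AGENCY, RDFS_LABEL, "Example Agency"),
}


def describe(graph: Set[Triple], subject: str) -> Dict[str, str]:
    """Collect every predicate/object pair recorded about one subject."""
    return {p: o for s, p, o in graph if s == subject}


# Because both datasets use the same URI, merging is a plain set union:
# no record matching or key reconciliation is needed.
merged = dataset_a | dataset_b
facts = describe(merged, PARK)
manager_label = describe(merged, facts[MANAGED_BY])[RDFS_LABEL]
```

After the union, a single lookup on the park's URI returns facts contributed by both datasets, and following the `managedBy` link reaches the agency's own description.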

Vocabularies:

Creating standardised vocabularies simultaneously provides the capacity to generate interoperability between datasets and discoverability of semantic data. Vocabularies, standardised in their structure and delivery according to Linked Data and Semantic Web principles, provide online, look-up-able definitions for things which can be used much more easily and powerfully than older vocabulary tools such as (paper) dictionaries, tables on web pages or XML code lists.

Systems:

Many systems have the capacity to operate with Linked Data. Some are dedicated to Linked Data, such as RDF Triplestores and the Linked Data API, and others function to facilitate Linked Data such as general website content management and web serving.

Ontologies:

A great number of foundational, or fundamental, ontologies have been produced to cater for concepts such as time (TIME ontology), simple authoring information (Dublin Core) and tracing changes to things over time (PROV-O, the provenance ontology). The AGLDWG has produced several ontologies, an example of which is one that defines properties for datasets within the data.gov.au catalogue.

Conclusion

The growth of public data provides for a future rich in opportunity. There will be growth in eResearch and Linked Data, including targeted projects to merge data for environmental, social and economic research.

 


Biographies

Dr Belinda Brown, Parks Australia, Canberra ACT, Australia:
Belinda works across multidisciplinary areas for science, environment, and information management; drawing on over 15 years of experience in the research and public sectors.
Belinda has a PhD in earth systems sciences, and started her career as a research scientist working on a range of international co-operative projects into the early development of southern ocean seaways and palaeo-climate around Antarctica. This laid the foundations for work in the science-information-policy interface; including amongst other things, working with the National Biodiscovery Working Group, the COAG National Science Working Group for Climate Change Adaptation, COAG Solar Thermal Technology Roadmap Committee, the United Nations Convention on Biological Diversity and the Global Strategy for Plant Conservation.
Belinda has an interest in enabling evidence based information for improved social, economic, and environmental outcomes; and works with colleagues to extend the value of public data, including the Linked Data and eResearch fields. Belinda is lead author and manager for the Knowledge Management Strategy for Parks Australia Environmental Information. Recent projects also include the Biomes of Australian Soil Environments (BASE) Project, a National Threatened Species Project, and a National Environmental Science Program Emerging Priorities project for the digital curation of long term monitoring datasets.

Mrs Brigette Metzler

Brigette Metzler is a senior quantitative user researcher with the Department of Human Services. She has previously worked in ontology and metadata management, including work on defining the links and the events that cause connections between the services the department delivers on behalf of the Australian government. Brigette is undertaking her PhD at the University of Queensland in Public Policy. Brigette is an active member of the Australian Government Language Data Management Working Group and is the co-chair of the Australian Government Linked Data Working Group.

Mr Nicholas Car:

Nicholas Car is the Data Architect at Geoscience Australia (GA). He coordinates data management within GA and works with GA’s partners to implement data sharing and use across institutional boundaries. His particular areas of interest are in data provenance and data semantics, both of which relate strongly to Linked Data and Semantic Web technologies. He has been an active participant in the Australian Government Linked Data Working Group for a number of years and chairs a number of other data and metadata groups in Australia and internationally, such as the Australian National Data Service-run Australian Vocabularies Special Interest Group and the Research Data Alliance’s Provenance Patterns Working Group.

Marco Wallenius
Marco Wallenius is a project officer in Commonwealth Information Management at the National Archives of Australia. Marco has been a member of the Australian Government Linked Data Working Group and is active in the data community.

Dr Armin Haller

Armin Haller is a Senior Lecturer at the Australian National University with a joint appointment in the Research School of Management and the Research School of Computer Science. Armin acts as the Manager of the Australian office of the W3C and is Director of the MBA program. Previously, he was a Research Scientist at Data61 of CSIRO, Australia.
Dr Haller received his PhD from the National University of Ireland, Galway under the supervision of Manfred Hauswirth and Stefan Decker while he was working in DERI. His PhD thesis work was centred around the application of ontologies to workflow models. Most of this research has been carried out in the SFI funded m3pe project and in the EU funded SUPER project. Earlier, he was also involved in the WSMO and WSMX working groups.
His general research interests are Linked Data, ontology engineering and the Semantic Web. Currently he is chairing the Semantic Sensor Network Ontology working group of the W3C.

 

Field Acquired Information Management Systems Project: FAIMS Mobile, a customisable platform for data collection during field research

A/Prof. Shawn Ross1, Dr Adela Sobotkova1, Dr Brian Ballsun-Stanton1

1Macquarie University, Sydney, Australia

Title Field Acquired Information Management Systems Project: FAIMS Mobile, a customisable platform for data collection during field research
Synopsis FAIMS Mobile is open-source, customisable software designed specifically to support field research across many domains. It allows offline collection of structured, text, multimedia, and geospatial data on multiple Android devices, and is built around an append-only datastore that provides complete version histories. It includes customisable export to existing databases or in standard formats. Finally, it is designed for rapid prototyping and easy redeployability to reduce the costs of implementation. Developed for ‘small data’ disciplines, FAIMS Mobile is designed to collect heterogeneous data of various types (structured, free text, geospatial, multimedia) produced by arbitrary methodologies. Customised by an XML-based domain specific language, it supports project-specific data models, user interfaces, and workflows, while also addressing problems shared across field-based projects, such as provision of a mobile GIS, data validation, delivery of contextual help, and automated synchronisation across multiple devices in a network-degraded environment. It also promotes synthetic research and improves transparency and reproducibility through the production of comprehensive datasets that can be mapped to vocabularies or ontologies as they are created.
Format of demonstration Slides / screenshots
Presenter(s) A/Prof Shawn A Ross, Director of Data Science and eResearch, Macquarie University and Co-Director, FAIMS Project.

Dr Adela Sobotkova, Research Associate, Department of Ancient History, Macquarie University and Co-Director, FAIMS Project.

Dr Brian Ballsun-Stanton, Research Associate, Department of Ancient History, Macquarie University and Technical Director, FAIMS Project.

Target research community Researchers in fieldwork disciplines where people (rather than automated sensors) collect data, e.g., archaeology, biology, ecology, geosciences, linguistics, oral history, etc.
Statement of Research Impact FAIMS Mobile has changed users’ daily practice. Case studies indicate that users benefit from the increased efficiency of fieldwork (the time saved by avoiding digitisation more than offsets the time required to implement the system). Born-digital data avoided problems with delayed digitisation, which often occurred long after field recording when the context of records had been forgotten. Researchers reported more complete, consistent, and granular data, and that information could be exchanged more quickly between field researchers and lab specialists, facilitating the evaluation of patterns for meaning. They also observed that the process of moving from paper to digital required comprehensive reviews of field practice, during which knowledge implicit in existing systems became explicit and data was modelled carefully for the first time.
Request to schedule alongside particular conference session  
Any special requirements Nothing special.
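The synopsis above mentions an append-only datastore that provides complete version histories. The following is a minimal sketch of that general idea, not the FAIMS Mobile implementation: the class names, fields and sample values are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass(frozen=True)
class Revision:
    record_id: str
    values: Dict[str, Any]
    author: str
    sequence: int  # position in the log, used to order revisions


class AppendOnlyStore:
    """Every write appends a new revision; nothing is updated or deleted."""

    def __init__(self) -> None:
        self._log: List[Revision] = []

    def write(self, record_id: str, values: Dict[str, Any], author: str) -> Revision:
        # Copy the values so later changes by the caller cannot alter history.
        revision = Revision(record_id, dict(values), author, len(self._log))
        self._log.append(revision)
        return revision

    def history(self, record_id: str) -> List[Revision]:
        """Return all revisions of one record, oldest first."""
        return [r for r in self._log if r.record_id == record_id]

    def current(self, record_id: str) -> Optional[Revision]:
        """Return the latest revision of a record, or None if it is unknown."""
        revisions = self.history(record_id)
        return revisions[-1] if revisions else None


store = AppendOnlyStore()
store.write("find-001", {"material": "ceramic"}, author="device-A")
store.write("find-001", {"material": "ceramic", "count": 3}, author="device-B")
latest = store.current("find-001")
```

Because corrections are stored as new revisions, the earlier state of a record and who wrote it remain available, which is the property that supports transparency and reproducibility.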

Biography:

Shawn A Ross (Ph.D. University of Washington, 2001) is Associate Professor of History and Archaeology and the Director of Data Science and eResearch at Macquarie University.  A/Prof Rossʼs research interests include the history and archaeology of pre-Classical Greece, oral tradition as history (especially Homer and Hesiod), the archaeology of the Balkans (especially Thrace), Greece in its wider Mediterranean and Balkan context, and the application of information technology to research. Since 2009, the focus of A/Prof Rossʼs work has been fundamental archaeological research in Bulgaria. He is a Research Associate at the American Research Center in Sofia, Bulgaria, and supervises the Tundzha Regional Archaeological Project (http://www.tundzha.org), a large-scale archaeological survey and palaeoenvironmental study in central and southeast Bulgaria. Since 2012 A/Prof Ross has also directed the Field Acquired Information Management Systems (FAIMS) project (http://www.faims.edu.au/) aimed at developing data capture, management, and archiving resources for researchers in fieldwork-based disciplines. Previously, A/Prof Ross worked at the University of New South Wales (Sydney, Australia) and William Paterson University (Wayne, New Jersey).

Facilitating Research Connections across the Pacific

Ms Celeste Anderson1

1Pacific Wave, La Mirada, United States, celestea@pacificwave.net, dave@cenic.org, jhess@cenic.org

Description

Since 1996, the Pacific Wave Internet Facility has leveraged high-speed connections across the Pacific Ocean to provide researchers with the ability to connect to other international networks using the exchange infrastructure.  Recent developments such as the 100G upgrade of the AARNet SX-Transport connections from Australia and New Zealand to the West Coast of the United States, the testing of the AutoGOLE NSI + MEICAN pilot [1], and trial of SDX features are enhancing the capabilities of the facility to support international research.  This poster will show the status of these efforts and explain how researchers in the Australasia region might leverage these resources to further their work.

Pacific Wave

Pacific Wave is a distributed international network peering facility for the Pacific Rim and beyond with peering points for network connections in Seattle, Sunnyvale, Palo Alto, and Los Angeles. Supporting 29 networks representing over 47 countries, Pacific Wave is a joint project between the Corporation for Education Network Initiatives in California and the Pacific Northwest Gigapop.

Pacific Wave continues to enhance, upgrade and evolve its facility to support more 100G connections, develop and test new technologies and provide instrumentation for measurement, monitoring, analysis and visualization.  New Advanced Services include connectivity with the Pacific Research Platform (PRP), AutoGOLE dynamic circuit provisioning, experimental inter-domain SDX collaborations, and access to 100Gbps Data Transfer Nodes (DTNs) to accelerate large data transfers over long-haul high-capacity international links.

SDN/SDX and Automated GOLE (AutoGOLE)

SDX Nodes are present in Los Angeles and in Seattle, connected at 100G to the common exchange and are reachable by any Pacific Wave participant network. Intel x86 control nodes are positioned in Seattle and Los Angeles, with an additional SDN switching node to be added in Sunnyvale. The AutoGOLE fabric delivers dynamic network services between other GLIF Open Lambda Exchanges (GOLEs).  The project is experimenting with new path finding and signaling algorithms as well as network modeling for optimizations [2].

 

Pacific Research Platform

The Pacific Research Platform (PRP), a project underway with funding from the United States’ National Science Foundation (NSF) to UC San Diego and UC Berkeley, is a science-driven, high-capacity, data-centric “freeway system” on a large regional scale. The PRP project recently held its first National Research Platform workshop to explore how the initial pilot project can be expanded, and efforts are underway to establish a Global Research Platform (GRP) leveraging SDN and SDX concepts [3]. The Platform uses Pacific Wave resources for part of its infrastructure.

Pacific Islands Research and Education Network (PIREN) and Guam Open R&E Exchange

The PIREN project, run by the University of Hawaii and its partners, has made significant progress in fostering research and education (R&E) network capacity to interconnect Pacific Islands with each other and with the global R&E network fabric, building on previous projects and relationships. Hawaii has played an important role in furthering scientific research and cultural exchange in the region, with the mainland, and internationally. It is also involved in the creation of the Guam Open R&E Exchange (GOREX), following on the build of the new SEA-US subsea system and plans for 100G connections from the island [4].

REFERENCES

1. AutoGOLE MEICAN Pilot. Available from: https://wiki.rnp.br/display/secipo/AutoGOLE+MEICAN+Pilot, accessed 31 Aug 2017.

2. Hess, J., Pacific Wave: SDN/SDX. Available from: https://docs.google.com/presentation/d/1qgjPUmjGeYwUCZFjeO-DpPF1odnmSmzG6d85WPkDKOU/edit#slide=id.g1ceddacd70_0_0, accessed 31 Aug 2017.

3. Mambretti, J., Toward a Global Research Platform (GRP), in Presentations of the first National Research Platform Workshop, Pacific Research Platform Project, Bozeman, Montana, United States. Available from: http://prp.ucsd.edu/presentations/nrp/s5-0-mambretti-grp-intro-presentation-august-2017.ppt/view, accessed 31 Aug 2017.

4. Lassner, D., View from the Pacific. Available from: https://www.youtube.com/watch?v=-uQHqf_O06o, accessed 31 Aug 2017.


Biography:

Celeste Anderson is Director of the Networking Engineering and Communications Infrastructure group in the Information Technology division of the University of Southern California (USC), which includes the Los Nettos Regional Network, the USC campus network, USC colocation services (external) and international internet exchanges. Anderson also serves as Director for Customer Relations for the Pacific Wave Internet exchange on behalf of the Corporation for Education Network Initiatives in California (CENIC), and works on the NSF-funded International Research Network Connections (IRNC) Pacific Wave Extension project. Anderson co-chairs the “PerfClub” support group for perfSONAR measurement implementations and has participated in various working groups within CENIC, Internet2 and Educause.

HPC at The University of Sydney: balancing user experience with system utilisation

Dr Stephen Kolmann1

1The University Of Sydney, Sydney, Australia stephen.kolmann@sydney.edu.au

The University of Sydney’s High Performance Computing cluster, called Artemis, first came online with 1344 cores of standard compute, 48 cores with high memory and 5 nodes with 2 K40 GPUs each. These resources were made available as one large resource pool, shared by all users, with job priority determined by PBS Professional’s fairshare algorithm [1]. This, coupled with three-month maximum walltimes, led to high system utilisation. However, wait times were long, even for small, short jobs, resulting in a sub-optimal end-user experience.

To help cater for strong demand and improve the end user experience, we expanded Artemis to 4264 cores. We knew this expansion would help lower wait times, but we did not rely on this alone. In collaboration with our managed service provider (Dell Managed Services at the time, now NTT Data Services), we designed a queue structure that still caters for a heterogeneous workload, but lowers wait times for small jobs, at the expense of some system utilisation. Figure 1 shows how we partitioned compute resources on Artemis to achieve this balance.

Figure 1: Distribution of Artemis’s compute cores. The left pie chart shows the coarse division of all Artemis’s compute cores, and the right pie chart shows the nominal distribution of compute cores within the shared area of the left pie chart.

The cores are divided into three broad categories: condominiums, strategic allocations and shared cores.

  • Condominiums are compute nodes that we manage on behalf of condominium owners
  • Strategic allocations are dedicated to research groups who won access via a competitive application process
  • Shared cores are available to any Sydney University researcher who wants Artemis access

The shared cores are further sub-divided into separate resource pools that cater for different sized jobs. This division was made to segregate small, short jobs from large, long-running jobs. The idea behind this partitioning is that small, short jobs should start quickly, while larger, longer-running jobs should be willing to tolerate longer wait times.
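
As a rough illustration of this kind of partitioning, the Python sketch below routes a job to the first resource pool whose core and walltime limits it fits. The pool names and limits are hypothetical, chosen only for illustration. They are not the actual Artemis queue configuration, and in practice this routing would be done by the scheduler's own queue settings rather than by user code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pool:
    """A shared resource pool with upper limits on job size and length."""
    name: str
    max_cores: int
    max_walltime_hours: float


# Hypothetical limits (assumptions, not the real Artemis values).
# Ordered from most to least restrictive, so a small, short job
# lands in the first (fastest-turnaround) pool that can hold it.
POOLS = [
    Pool("small", max_cores=4, max_walltime_hours=4),
    Pool("medium", max_cores=24, max_walltime_hours=48),
    Pool("large", max_cores=288, max_walltime_hours=504),
]


def route_job(cores: int, walltime_hours: float) -> str:
    """Return the name of the first pool whose limits the job satisfies."""
    for pool in POOLS:
        if cores <= pool.max_cores and walltime_hours <= pool.max_walltime_hours:
            return pool.name
    raise ValueError("job exceeds the limits of every shared pool")
```

Under these assumed limits, a 1-core, 1-hour job is routed to the "small" pool, while a 1-core, 24-hour job falls through to "medium" because its walltime exceeds the small pool's limit. Keeping long jobs out of the small pool is what lets short jobs start quickly, at the cost of some cores occasionally sitting idle.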

This poster will explore our experience with this queue structure and how it has impacted metrics such as job wait times, system utilisation and researcher adoption.

REFERENCES

1. PBS Professional Administrator’s Guide, p. 165. Available from: http://www.pbsworks.com/documentation/support/PBSProAdminGuide12.pdf, accessed 1 Sep 2017.


Biography:

Stephen Kolmann is currently working at The University of Sydney as an HPC Specialist, where he provides end-user HPC documentation and training and acts as a consultant for internal HPC-related projects. He completed a PhD in Computational Chemistry, where he made extensive use of HPC facilities both at The University of Sydney and NCI.

Parks Australia Collaborations across e-Research systems: Biomes of Australian Soil Environments (BASE) project

Dr Belinda Brown1, Dr Andrew Bissett2, Professor Andrew Young3, Dr Anna Fitzgerald4, Dr Andrew Gilbert4

1Parks Australia, Canberra, Australia, Belinda.Brown@environment.gov.au

2CSIRO, Hobart, Australia, Andrew.Bissett@csiro.au

3National Research Collections Australia, CSIRO, Canberra, Australia, Andrew.Young@csiro.au

4Bioplatforms Australia Ltd, Sydney, Australia, afitzgerald@bioplatforms.com, agilbert@bioplatforms.com

 

Environmental information is a strategic asset of Parks Australia and is at the heart of management decisions. The way in which information is collected, described, managed, stored and used is critical to business needs.

The Knowledge Management Strategy for Parks Australia Environmental Information supports information management objectives, now and into the future, to help build the knowledge needed to protect and conserve Australia’s biodiversity, as well as engage with stakeholders and national research infrastructure partners.

Parks Australia contributes environmental information to national and international networks, including, amongst others: the Atlas of Living Australia (ALA), Terrestrial Ecosystem Research Network (TERN) and the Global Biodiversity Information Facility (GBIF).

The agency also contributes to strategic partnerships and projects. The Biomes of Australian Soil Environments (BASE) Project is a recent example with partners from CSIRO and Bioplatforms Australia, with contributions to a national e-Research database. The project collaboration marshalled complementary partners around Australia to pool time and resources to collect under a national sampling framework, including Parks Australia reserves. The project was an opportunity to develop a national environmental and soil microbial diversity framework, to enable new continental baseline information on soil microbial communities, which are primary drivers of soil ecological processes such as nutrient and carbon cycling.

BASE was developed in an open data framework, and is the first Australian soil microbial diversity database. BASE links environmental and soil data across bio-geographic regions, including Commonwealth reserves. Its database provides a reference for comparative analysis across different datasets and regions. The database provides a platform to grow and evolve over time.  It provides a basis to link with other databases and tools, and national e-research infrastructure networks, such as those at the Atlas of Living Australia and Bioplatforms Australia. It provides a new baseline for investigations into the largely un-quantified role of soil microbial diversity in broad scale patterns of plant species abundance, and ecosystem resilience.

Partnerships and collaboration models such as this provide a basis for adding value to multi-disciplinary data and for enabling science. Information sharing and collaborative practice will continue to expand and change as more organisations work together for multi-disciplinary and integrated outcomes. With policy initiatives emerging across the public sector for data integration, sharing and re-use, national infrastructure and e-Research collaborations continue to be an important component in building high-value datasets for targeted science, services, policies and programs.


Biography:

Belinda works across multidisciplinary areas for science, environment, and information management, drawing on over 15 years of experience in the research and public sectors.

Belinda has a PhD in earth systems sciences, and started her career as a research scientist working on a range of international projects into the development of southern ocean seaways and palaeo-climate around Antarctica. This laid the foundations for her work at the science-data-policy interface, including, amongst other things, working with the National Biodiscovery Working Group, the COAG National Science Working Group for Climate Change Adaptation, the COAG Solar Thermal Technology Roadmap Committee, the UN Convention on Biological Diversity and the Global Strategy for Plant Conservation.

Belinda is also lead author and manager for the Knowledge Management Strategy for Parks Australia Environmental Information, and its implementation. Belinda has an interest in enabling evidence based information for improved social, economic, and environmental outcomes; and works with colleagues to extend the value of public data, including Linked Data and eResearch. Recent projects include the Biomes of Australian Soil Environments (BASE) Project, a National Threatened Species Project, and a National Environmental Science Program Emerging Priorities project for the digital curation of long term monitoring datasets.

    About the conference

    eResearch Australasia provides opportunities for delegates to engage, connect, and share their ideas and exemplars concerning new information centric research capabilities, and how information and communication technologies help researchers to collaborate, collect, manage, share, process, analyse, store, find, understand and re-use information.
