Harnessing The Wisdom of Crowds: The SWARM Project

Prof. Richard O. Sinnott1 & the SWARM Team1

1University of Melbourne, rsinnott@unimelb.edu.au

 

The Smartly-assembled, Wiki-style Argument Marshalling (SWARM) project was funded by the US Intelligence Advanced Research Projects Activity (IARPA) as part of the Crowdsourcing Evidence, Argumentation, Thinking and Evaluation (CREATE) program. The project formally commenced in January 2018 and has been awarded up to $19m. SWARM is one of four projects funded globally through the IARPA CREATE program. These projects are tasked with supporting improved reasoning to aid the intelligence community by leveraging the wisdom of crowds. Whilst previous IARPA programs have demonstrated the benefits of leveraging the wisdom of crowds to obtain improved answers, the actual reasoning and deliberation behind what makes a good answer remains unclear. Understanding this is the primary goal of SWARM.

The evaluation of the SWARM platform and the other platforms is currently being undertaken by an independent crowd managed by IARPA and their Test & Evaluation team. This crowd will be organised into separate teams. Each team will be assigned a set of questions that require reasoning and evaluation to arrive at the most highly regarded answers over several months, i.e. those answers (hypotheses) with the best reasoning and presentation.

This presentation will cover the overarching goals of SWARM and the underpinning technical solutions that have been developed, including the mobile applications developed to encourage crowd participation. The talk will also briefly cover early (non-funded) SWARM work exploring the extent to which deep learning approaches can be used to automate the assessment of collective reasoning.


Biography:

Professor Richard O. Sinnott is the Director of eResearch at the University of Melbourne and Chair of Applied Computing Systems. In these roles he is responsible for all aspects of eResearch (research-oriented IT development) at the University. He has been lead software engineer/architect on an extensive portfolio of national and international projects, with specific focus on those research domains requiring finer-grained access control (security). He has over 300 peer-reviewed publications across a range of applied computing research areas.

Streamlining Collaboration for the Murchison Widefield Array Telescope Project with the Australian Access Federation and eduGAIN

Greg Sleap1, Alison Lynton3, John Scullen4, Scott Koranda5, Randall Wayth1, Adam Beardsley2, Benjamin Oshrin5, Heather Flanagan5

1Curtin Institute of Radio Astronomy, greg.sleap@curtin.edu.au, r.wayth@curtin.edu.au

2Arizona State University, Adam.Beardsley@asu.edu  

3Curtin University, A.Lynton@curtin.edu.au  

4Australian Access Federation, john.scullen@aaf.edu.au   

5Spherical Cow Group, New York, NY, skoranda@sphericalcowgroup.com, benno@sphericalcowgroup.com, hlflanagan@sphericalcowgroup.com

 

The Murchison Widefield Array (MWA) is a low-frequency radio telescope operating between 80 and 300 MHz. It is located at the Murchison Radio-astronomy Observatory (MRO) in Western Australia, the planned site of the future Square Kilometre Array (SKA) lowband telescope, and is one of three telescopes designated as a Precursor for the SKA. Initially developed by an international collaboration including partners from Australia, Canada, India, New Zealand, and the United States, and led by the Curtin University node of the International Centre for Radio Astronomy Research (ICRAR), today the MWA includes partners from more than 20 institutions around the world including China and Japan.

To streamline collaboration and facilitate access to data and resources, MWA staff deployed an identity management infrastructure built on the foundation of federated identity. By leveraging the existing investment Curtin University had made in federated identity infrastructure and the Australian Access Federation (AAF), MWA published its federated services into the worldwide eduGAIN interfederation and enabled single sign-on (SSO) access using home organization credentials for collaborators throughout the world.

This presentation discusses the issues encountered, processes engaged, and challenges faced when implementing the MWA eduGAIN solution from four unique perspectives:

  1. The MWA data manager and project scientist charged with enabling secure and scalable access to a growing collaboration with partners throughout the world.
  2. The Senior Systems Engineer at Curtin University tasked with facilitating access to AAF services including eduGAIN for MWA resources.
  3. AAF staff enabling subscribers to connect internationally with eduGAIN.
  4. Consultants providing technical input on scalable and sustainable federated identity architecture in support of international collaboration.

Biographies:

Scott Koranda specializes in identity management architectures that streamline and enhance collaboration for research organizations. https://orcid.org/0000-0003-4478-9026

Greg Sleap has been the Murchison Widefield Array (MWA) Data Manager since mid-2016, planning, developing and supporting the systems which allow astronomers around the world to utilise the MWA’s extensive data archive. https://orcid.org/0000-0003-0134-3884

Alison Lynton has worked for Curtin University as a Senior Systems Engineer since 2001, specialising in Unix. She has a passion for advocating for researchers' needs within her organisation. https://orcid.org/0000-0002-6236-1915

John Scullen joined AAF in February 2016 to lead the development of new processes and tools in the Next Generation AAF project. His role has since expanded to oversee the AAF’s project portfolio.

From the soil sample jar to society: an example of collating and sharing scientific data

Hannah Mikkonen1, Ian Thomas2, Paul Bentley3, Andrew Barker4, Suzie Reichman5

1 RMIT University, Melbourne, Australia, hannah.mikkonen@student.rmit.edu.au

2 RMIT University, Melbourne, Australia, ian.edward.thomas@rmit.edu.au

3 CDM Smith, Melbourne, Australia, bentleypd@cdmsmith.com

4 CDM Smith, Melbourne, Australia, barkerao@cdmsmith.com

5 RMIT University, Melbourne, Australia, suzie.reichman@rmit.edu.au

 

Introduction

Background concentrations of metals and elements in soil are the natural geogenic concentrations. Soil data on background metal/element concentrations is important for assessments of agricultural health and productivity, ecological risk, mineral exploration, and pollution. However, soil surveys and the associated collection and chemical analysis of soil samples are time-consuming and financially expensive. Soil survey datasets are therefore a valuable resource for other scientists, land assessors and policy makers.
A website, “The Victorian Background Soil Database” (http://doi.org/10.4225/61/5a3ae6d48570c), and an interactive map titled “Soil Explorer” were developed to present and share the results of a background soil survey of Victorian soils. The database and map were developed by RMIT researchers in collaboration with data scientists at CDM Smith, the Environment Protection Authority Victoria (EPA Victoria) and the Australian Contaminated Land Consultants Association, with help from the RMIT eResearch team. Soil Explorer is a Shiny [2] web-based application (built on the R language) for visualising the data. The app provides an interactive platform that integrates individual soil data points, soil statistics, and spatial groupings of geology and region for the background soil data.

The data collation process involved collection of soil samples from across Victoria, collation of soil sample data from publicly available environmental assessment reports, screening of the quality of the collated data, and calculation of summary statistics. The data communication process involved development of an interactive map using Shiny, licensing of the dataset, minting of a DOI, deployment of the Shiny application onto a secure and reliable server, launch of the website, and recording of website use with Google Analytics.

This presentation will describe how soil scientists, eResearch support staff and the environmental industry worked together to tackle the cross-disciplinary barriers and challenges involved in collecting, analysing, visualising, and communicating data using a web-based Shiny dashboard written in the R language.

Understanding what the end user wants

The need for a background soil database was identified by members of the Australian Contaminated Land Consultants Association (ACLCA), who identified misclassification of soil (due to a lack of understanding of background concentrations) as a potential cause of unsustainable disposal of natural soils to landfill. ACLCA approached the RMIT researchers to develop a HazWaste Fund proposal that was ultimately successful. Throughout the project, ACLCA and EPA Victoria (as the HazWaste Fund administrator) played key roles in scoping the project and ensuring the methods and deliverables were relevant to industry and delivered in usable forms. One of the advantages of this project was that the research was undertaken by a student who also worked in the environmental assessment industry, supervised by a researcher who had previously worked in the environmental regulatory sector.

Methods

The project development was handled using an agile development and deployment approach, with two-week “sprints” of allocated work tracked on an online task board. Changes to site source code during development were communicated between different collaborators using a source control repository.

The website, maps and summary-statistic sheets were scripted and automated using the R language [1]. R was adopted for several reasons. First, all of the statistical analysis could be automated, including the output of 126 separate summary-statistic sheets. Second, several R packages facilitate the generation of HTML dashboards and formatted reports (e.g. shiny [2], leaflet, crosstalk, sf [3], rmarkdown [4], knitr [5]). Finally, R is open source, which allows the code to be edited by people from different industries and institutions.
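
To make this concrete, below is a minimal sketch of how such a dashboard can be assembled with shiny and leaflet. The data frame and its columns (site, lon, lat, as_mg_kg) are hypothetical placeholders, not the actual Victorian Background Soil Database schema:

  library(shiny)
  library(leaflet)

  # Hypothetical example records; the real app reads the curated survey dataset.
  soil <- data.frame(
    site     = c("Bendigo-01", "Gippsland-07"),
    lon      = c(144.28, 147.10),
    lat      = c(-36.76, -38.10),
    as_mg_kg = c(12.0, 3.5)   # arsenic concentration, mg/kg
  )

  ui <- fluidPage(
    titlePanel("Background soil concentrations (sketch)"),
    leafletOutput("map")
  )

  server <- function(input, output, session) {
    output$map <- renderLeaflet({
      leaflet(soil) %>%
        addTiles() %>%   # base map tiles
        addCircleMarkers(~lon, ~lat,
          label = ~paste0(site, ": As ", as_mg_kg, " mg/kg"))
    })
  }

  shinyApp(ui, server)

Sourcing this script serves an interactive point map in the browser, which is the same mechanism Soil Explorer uses at a larger scale.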

Following emerging best practice for dataset publication, steps were taken to ensure that the data was both accessible and had a potentially larger reach. A Digital Object Identifier (DOI) was minted so that the dataset can easily be referenced in publications and can be discovered through records created in Research Data Australia. Google Analytics has been used to assess site traffic and allow us to better understand users' interests. Beyond these automated metrics, an online form was embedded on the website to allow visitors to request further information and initiate conversations with the authors. These steps were based on the requirement that the site serve as a starting point for further discussion and collaboration.

App deployment was managed using Bitbucket for version control. The app is hosted on an RMIT server running Red Hat, with SSL browser security. This project was a 'proof of concept' for research translation and for the communication of environmental science using digital platforms that are bespoke/customised to the research project. The R language (and the packages therein) provides a complete coding environment, from data processing and analysis through to data visualisation and reporting (both PDF and web-based), giving researchers a single environment in which to undertake and communicate their research. One outcome of this project is a roadmap to share with other researchers at RMIT, with the purpose of introducing researchers to new tools and techniques for enhancing and communicating their research practice.

Presentation of point data rather than models

Soil data is increasingly being presented as modelled spatial layers, often without clear communication of the accuracy of, and confidence in, the predicted information. Background concentrations of metals in soil can vary 100-fold within a single soil sample. Therefore, at this stage it was considered most relevant to simply present the results and provide summary statistics that clearly describe the data variability.
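
As a sketch of the kind of summary reported, the snippet below computes distribution-describing statistics per geological group rather than a single modelled surface; the data frame and column names (soil, geology, as_mg_kg) are hypothetical:

  library(dplyr)

  # Hypothetical columns: `geology` (spatial group), `as_mg_kg` (arsenic, mg/kg).
  soil %>%
    group_by(geology) %>%
    summarise(
      n          = n(),
      median     = median(as_mg_kg, na.rm = TRUE),
      p05        = quantile(as_mg_kg, 0.05, na.rm = TRUE),
      p95        = quantile(as_mg_kg, 0.95, na.rm = TRUE),
      fold_range = max(as_mg_kg, na.rm = TRUE) / min(as_mg_kg, na.rm = TRUE)
    )

Reporting percentiles and the fold range makes the large within-group variability visible to users, where a single mean would hide it.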

Next Steps

There are three key directions for further research: first, expand the dataset and user interface to meet the needs of not just the environmental assessment industry but also the agricultural, mining and research sectors; second, assess how to merge local data with national datasets; and third, develop predictive spatial models for background concentrations.

Acknowledgements

The authors would like to acknowledge the financial support of the HazWaste Fund (project S43-0065) and the Australian Contaminated Land Consultants Association (ACLCA) Victoria. We also acknowledge and thank the R project for freely available statistical computing (http://www.r-project.org).

References

  1. R Core Team, 2016. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria.
  2. Chang, W., Cheng, J., Allaire, J., Xie, Y., McPherson, J., 2017. shiny: Web Application Framework for R. R package version 1.0.5. CRAN.
  3. Pebesma, E., Bivand, R., Racine, E., Sumner, M., Cook, I., Keitt, T., Lovelace, R., Wickham, H., Ooms, J., Müller, K., 2018. sf: Simple Features for R. R package version 0.6-3. CRAN.
  4. Allaire, J., Horner, J., 2017. Package 'markdown', version 0.8. CRAN.
  5. Xie, Y., 2018. knitr: A General-Purpose Package for Dynamic Report Generation in R. R package version 1.20. CRAN.

Biography:

Ian Thomas (https://orcid.org/0000-0003-1372-9469) is a software developer and system administrator at the Research Capability Unit at RMIT University. He has worked in data curation for output of high-performance computing systems, microscopy data for materials, and screen media objects (film and television). His current work is in high-performance computing, containerized research workflows and in cloud-based platforms in support of eResearch applications.

Earth and Environment Science Information Partners – parallel pathways on opposite sides of the globe

Simon J D Cox1, Lesley Wyborn2, Adrian Burton3, Erin Robinson4, Tim Rawling5

1CSIRO Land and Water, Melbourne, Australia, simon.cox@csiro.au

2National Computational Infrastructure, ANU, Canberra, Australia, lesley.wyborn@anu.edu.au

3Australian Research Data Commons, Canberra, Australia, adrian.burton@ands.org.au

4Earth Science Information Partners, Boulder, USA, erinrobinson@esipfed.org

5AuScope, Melbourne, Australia, tim.rawling@unimelb.edu.au

 

DESCRIPTION

Addressing research problems in earth and environmental science usually requires combining data from multiple sources. This is facilitated by the use of common practices, vocabularies, interfaces and standards. The newly formed Earth and Environment Science Information Partners (E2SIP) provides a forum for coordinating these in the Australian research community. E2SIP is modelled on the US Earth Science Information Partners (ESIP).

Over the last 20 years ESIP has built a community of practice in the USA through regular meetings, workshops, and online forums to examine and develop emerging technologies. Education and training are a key aspect of ESIP's work, and conventions, practices and standards developed through ESIP have been influential internationally. E2SIP has been established through liaison with ESIP to support similar functions in Australia. E2SIP will use ESIP's collaboration platforms, will convene workshops, courses and hackathons, and will develop guidance and best practices tailored for the Australian community. ESIP has been supported by NASA, NOAA, USGS and various foundations and scholarly organizations. E2SIP is working with the National Earth and Environment Sciences Facilities Forum, which provides a common voice to government on behalf of long-term science infrastructure.

AUDIENCE

This BoF session aims to introduce E2SIP to those members of the community who are not already involved. It will highlight the opportunities that involvement will bring and will also give participants a voice in the evolution of the group.

STRUCTURE

A series of lightning talks from organisations committed to the formation of E2SIP will set the scene, followed by a facilitated discussion with a panel comprising the nascent E2SIP membership. The focus of the discussion will be the prioritisation of focus areas for E2SIP's attention and the development of a broad engagement and science strategy for the group. This discussion will allow participants to design E2SIP so that it meets local needs and, whilst drawing from the ESIP parent model, is specifically applicable to Australian earth and environmental science data issues.

OUTCOMES

The BoF will provide an engagement opportunity for E2SIP members, enable those unfamiliar with the program to better understand the opportunities it may provide, and allow potential new members to become involved in the discussions that will frame the activities of the group. At the end of the session, E2SIP will produce a series of priority areas agreed in the session that will inform future planning as the E2SIP cluster is established.

This work is in partnership with the US Earth Science Information Partners (ESIP).


Biography:

Simon has been researching standards for the publication and transfer of earth and environmental science data since the emergence of the world wide web. Starting in geophysics and mineral exploration, he has engaged with most areas of environmental science, including water resources, marine data, meteorology, soil, ecology and biodiversity. He is principal- or co-author of a number of international standards, including Geography Markup Language and Observations & Measurements, that have been broadly adopted in Australia and internationally. The value of these is in enabling data from multiple origins and disciplines to be combined more effectively, which is essential in tackling most contemporary problems in science and society. His current work focuses on aligning science information with semantic web technologies and linked open data principles, and on the formalization, publication and maintenance of controlled vocabularies and similar reference data.

Rapid analysis of extreme coastal events through enhanced data and tools

Ron Hoeke1, Claire Trenham2, Julian O’Grady3, Robert Davy4, Rebecca Gregory5, Kathy McInnes6, Mark Hemer7

1CSIRO Oceans & Atmosphere, Aspendale, Australia, Ron.Hoeke@csiro.au

2CSIRO Oceans & Atmosphere, Aspendale, Australia, Claire.Trenham@csiro.au

3CSIRO Oceans & Atmosphere, Aspendale, Australia, Julian.O'Grady@csiro.au

4CSIRO IM&T, Canberra, Australia, Robert.Davy@csiro.au

5CSIRO Oceans & Atmosphere, Aspendale, Australia, Rebecca.Gregory@csiro.au

6CSIRO Oceans & Atmosphere, Aspendale, Australia, Kathleen.Mcinnes@csiro.au

7CSIRO Oceans & Atmosphere, Hobart, Australia, Mark.Hemer@csiro.au

 

Coastal communities and maritime activities face a number of risks from coastal hazards associated with storm waves and flooding. These risks are generally increasing due to sea level rise, but other factors define local risk, including: astronomical tides; severe weather-induced storm surges (caused by low atmospheric pressure and strong onshore winds); wind and swell waves caused respectively by local and remote weather systems; and other variations in local sea levels that occur on intraseasonal to interannual time scales. Efforts to better understand and predict coastal sea levels have led to a proliferation of both modelled and observed data. The Sea-level Rise, Waves, and Coastal Extremes team in CSIRO's Climate Science Centre (CSC) maintains and archives many of these datasets. However, some datasets were stored on tape in a data structure that made time-series analysis prohibitively slow, analysis code was held by individual researchers rather than managed in a version control system, and the datasets were not regularly updated with the most recent data. There was therefore no systematic and computationally economical means of accessing the data to better understand contributions to extreme coastal sea levels, whether for multiple locations or for large gridded datasets.

As part of the National Environmental Science Programme's (NESP) Earth Systems and Climate Change Hub, and with the support of the CSIRO eResearch Collaboration Projects programme, we are developing a suite of software (based on Python and R) in a shared code repository and improving our data structures to deliver improved science and services (see Figure 1). The eResearch project is also scoped to facilitate the delivery of future projects. Initially we focussed on the CAWCR Wave Hindcast [1,2], but we are currently adding other datasets pertaining to tides, sea level, and storm surge, based on both observations and models.

Figure 1: Flow chart showing path from original data structure to value-added data delivery.

In this presentation, we discuss how we have solved a number of problems that held us back from rapid and repeatable analysis, and show a sea level prediction tool we have developed for use by the wider community. In particular, we will discuss:

  • Data restructuring (wave hindcast)
  • Data aggregation: co-locating all relevant datasets into disk-based cloud devices
  • Software development (migration of code into a version control system and subsequent restructuring)
  • Updating the data with automated monthly jobs
  • Examples of how the resulting rapid-access data can be quickly and effectively analysed through statistics/extreme value analysis and viewed using online tools (e.g. Rshiny)

These improvements allow maintenance of up-to-date scientific datasets and rapid assessment of output for various applications [3]. For example, we now have the capability to perform extreme value analysis (e.g. full generalised Pareto fitting) of gridded wave hindcast output in parallel. An example is illustrated by Figure 2, which shows the extent of extreme waves during an East Coast Low event that occurred in June 2016 [4]. Prior to this work, extracting the data, calculating the statistics, and combining them over the grid was computationally infeasible. As a result of the improved data structure and performance, we are now able to produce these results in approximately 20 minutes.
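
To illustrate the per-cell computation, the sketch below fits a generalised Pareto distribution to each grid cell's significant wave height series in parallel and derives a 50-year return level. It is a minimal example, assuming the hindcast has already been reshaped into per-cell time series (hs_by_cell, a hypothetical list of numeric vectors) with hourly sampling; it uses the ismev package, not the team's actual pipeline code:

  library(ismev)     # gpd.fit: maximum-likelihood GPD fitting
  library(parallel)  # mclapply: fork-based parallelism over grid cells

  npy <- 365.25 * 24   # observations per year (hourly hindcast assumed)

  # Fit a GPD above the cell's 95th percentile and return the 50-year
  # return level (Coles 2001, eq. 4.13).
  rl50_cell <- function(hs) {
    hs    <- hs[!is.na(hs)]
    u     <- quantile(hs, 0.95)
    fit   <- gpd.fit(hs, threshold = u, npy = npy, show = FALSE)
    sigma <- fit$mle[1]   # scale parameter
    xi    <- fit$mle[2]   # shape parameter
    zeta  <- fit$rate     # proportion of observations exceeding u
    u + (sigma / xi) * ((50 * npy * zeta)^xi - 1)
  }

  # hs_by_cell: hypothetical list holding one Hs series per grid cell
  rl50 <- unlist(mclapply(hs_by_cell, rl50_cell, mc.cores = 8))

Normalising an event's maximum Hs by rl50, as in Figure 2, then reduces to an element-wise division over the grid.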

Figure 2: Example of extreme storm event analysis for the East Coast Low that caused extensive erosion and coastal flooding along much of the east coast of NSW and Tasmania between 4 and 6 June 2016. The map on the left shows the maximum significant wave height (Hs) over the duration of the event (several days), normalised by the 50-year average return interval; i.e. map colours greater than 1 indicate storm wave heights greater than the 1-in-50-year event.

We also discuss how these improvements are being integrated into a number of coastal hazard assessment tools. One example of how we will make data available to users is through the Rshiny software package: using Rshiny, we are developing an interactive webpage for accessing the extreme water level dataset generated by the rapid-access pipeline software for coastal locations around Australia.

REFERENCES

  1. Durrant T, Greenslade D, Hemer M, Trenham C. Global Wave Hindcast focussed on the Central and South Pacific. CAWCR Tech. Rep. 70, 1–54. 2014.
  2. Hemer M, Zieger S, Durrant T, O’Grady J, Hoeke RK, McInnes KL, Rosebrock U. A revised assessment of Australia’s national wave energy resource. Renewable Energy. 2016. doi:10.1016/j.renene.2016.08.039
  3. Davy R, Hoeke RK, Trenham C, O'Grady J, Hemer M, Gregory R, McInnes K. Fast time series analysis of wave hindcast data. In: Collaborative Conference on Computational and Data Intensive Science; Melbourne. 2018.
  4. Hoeke RK, Stephenson A, McInnes K, Davy R, O’Grady J, Hemer M, Williams G. A multivariate statistical retrospective of two high-impact coastal events. Australasian Coasts & Ports 2017: Working with Nature. 2017:593.

Biography:

Claire joined CSIRO Marine & Atmospheric Research in 2011 working on ocean wave climate modelling. After a period working as a Senior Research Data Services Specialist for the National Computational Infrastructure (NCI) in Canberra between 2014 and 2017, she returned to CSIRO’s Oceans & Atmosphere division in 2017. She currently works in the Sea level rise, waves and coastal extremes team alongside the regional climate team as part of the Climate extremes and projections group. Claire is heavily involved in climate data and preparation for CMIP6, as well as regional climate modelling, data processing, making improvements to data and software to enhance science capabilities, and supervising student volunteers in data digitisation. In a past life she was a radio astronomer, and is also a qualified high school maths/science teacher.

ORCID: http://orcid.org/0000-0003-4258-9936

A Collaborative Australian Characterisation Informatics Strategy, Impact and Current Work

Wojtek James Goscinski1, David Abramson2, Jason Andrade3, Caroline Fuery4, Graham Galloway5, Nick Hauser6, Andrew Janke7, Jafar Lie8, Andrew Mehnert9, Ryan Sullivan10, Veah Tapat11, Roger Wepf12, Lance Wilson13

1 MASSIVE and Monash University, Clayton, Australia, wojtek.goscinski@monash.edu

2 University of Queensland, St Lucia, david.abramson@uq.edu.au

3 University of Wollongong, Wollongong, jason@uow.edu.au

4 Australian Microscopy and Microanalysis Research Facility and University of Sydney, Sydney, caroline.fuery@sydney.edu.au

5 National Imaging Facility and University of Queensland, St Lucia

6 ANSTO, Lucas Heights, nha@ansto.gov.au

7 National Imaging Facility and University of Queensland, St Lucia, andrew.janke@uq.edu.au

8 MASSIVE and Monash University, Clayton, Australia, jafar.lie@monash.edu

9 Centre for Microscopy, Characterisation and Analysis, AMMRF, NIF, University of Western Australia, Perth, andrew.mehnert@uwa.edu.au

10 University of Sydney, Sydney, ryan.sullivan@sydney.edu.au

11 National Imaging Facility and University of Queensland, St Lucia, v.tapat@uq.edu.au

12 University of Queensland, St Lucia, r.wepf@uq.edu.au

13 MASSIVE and Monash University, Clayton, Australia, lance.wilson@monash.edu

 

CHARACTERISATION INFORMATICS

Australian characterisation infrastructure encompasses a wide selection of instruments and capabilities that are united by the need to address common informatics challenges. The multi-modal and distributed nature of the research, science and supporting instruments is a challenge that has united the Australian characterisation community to successfully coordinate key informatics initiatives nationally.

Characterisation refers to the general process of probing and measuring the structures and properties of materials at the micro, nano and atomic scales. It is essential across the natural, agricultural, physical, life and biomedical sciences and engineering. Characterisation facilities, as outlined in the 2016 National Research Infrastructure Roadmap, provide researchers in Australian universities, research centres and industries with critical infrastructure, including both instrumentation and expertise, to enable quality research outcomes in an efficient and cost-effective manner. These facilities are a key capability underpinning flagship Australian research collaborations, including ARC Centres of Excellence, which are both significant users of and partners in the development of future characterisation techniques and applications. The Australian characterisation community and our partners bring together thousands of researchers who are driving the future of Australian imaging and innovation.

Characterisation has become a capability where informatics infrastructure, expertise and best practice are essential to turning data into new discoveries. As a collective, the Australian characterisation community shares a number of significant informatics challenges, and is working together to plan and implement strategies to overcome these challenges.

OUTLINE

This presentation will be a high-level outline of the Australian characterisation informatics landscape, and will include:

  • An overview of A Collaborative Characterisation Informatics Strategy [1], a common outline of the challenges and requirements across major characterisation facilities. This document was produced by a writing group with representatives from Monash University, AMMRF, ANSTO and NIF, based on the outcomes of a series of open Characterisation Informatics workshops. It is intended to guide characterisation informatics strategy and investment.

 

Figure 1: The Australian characterisation community provides a wide range of techniques that are applied across a variety of scientific domains. Common across these are a set of shared informatics challenges.

  • Outcomes of the Characterisation Informatics program of work, funded under NCRIS ANDS, NeCTAR and RDS, including: an overview of Characterisation Virtual Laboratory (CVL) usage, research impact and economic impact [2]; and reusable components of this infrastructure and how they have been applied to capture, analyse, visualise and publish data. CVL has supported over 2,000 researchers, and has worked with or is working with 26 facilities to integrate over 100 instruments with a total value of around $250 million.
  • Current work under the Characterisation Data Enhanced Virtual Laboratory, which is structured to make a significant contribution toward addressing the three major challenges identified in A Collaborative Australian Characterisation Informatics Strategy, and includes:

○    A National Characterisation Data Capability: a broad program of work with wide-scale impact for Australian characterisation users. This includes infrastructure (data management and data analysis), expertise and FAIR data practice. In particular, it will develop a model for CVL Desktop federation and deployability across 3+ nodes. This will provide a stable, standardised and tuned data processing environment across key Australian data-generating hubs.

○    Domain-focused Characterisation Data Projects: focus areas that will uplift specific data-driven areas of need, demonstrate the application of the broader DeVL in an exemplar field, and apply existing infrastructure and best practice.

REFERENCES

  1. A Collaborative Characterisation Informatics Strategy, https://goo.gl/nxUspi
  2. K Sweeny, M Fridman, B Rasmussen, Estimating the value and impact of Nectar Virtual Laboratories, https://nectar.org.au/wp-content/uploads/2016/06/Estimating-the-value-and-impact-of-Nectar-Virtual-Laboratories-2017.pdf

Biography:

Dr Wojtek James Goscinski is the coordinator of MASSIVE, a national high performance computing facility for data science, and Associate Director at the Monash eResearch Centre a role in which he leads teams to develop and implement digital strategies to nurture and underpin next-generation research. He holds a PhD in Computer Science, a Bachelor of Design (Architecture), and a Bachelor of Computer Science.

Seamap Australia: a National Marine Habitat Portal for the Australian Continental Shelf

Peter Walsh1, Emma Flukes2, Vanessa Lucieer3, Claire Butler4, Mark Hepburn5, Craig Johnson6

1Institute for Marine and Antarctic Studies (IMAS), University of Tasmania, Hobart, Australia, Peter.Walsh@utas.edu.au

2Institute for Marine and Antarctic Studies (IMAS), University of Tasmania, Hobart, Australia, Emma.Flukes@utas.edu.au

3Institute for Marine and Antarctic Studies (IMAS), University of Tasmania, Hobart, Australia, Vanessa.Lucieer@utas.edu.au

4Institute for Marine and Antarctic Studies (IMAS), University of Tasmania, Hobart, Australia, Claire.Butler@utas.edu.au

5Condense Pty Ltd, Hobart, Australia, mark@condense.com.au

6Institute for Marine and Antarctic Studies (IMAS), University of Tasmania, Hobart, Australia, Craig.Johnson@utas.edu.au

 

The Australian National Data Service (ANDS) High Value Collections (HVC) program funded the establishment of an Australian marine habitat classification scheme, spatial database and web portal, which we have named Seamap Australia [1].

Seamap Australia provides marine habitat data for the Australian continental shelf, synthesised into a spatial dataset using a new classification scheme and made available via www.seamapaustralia.org. In a national first, Seamap Australia combines data from around Australia under a single classification scheme, enhanced with biodiversity, environmental pressure, physical and other data, to become a truly multidisciplinary resource.

More specifically, Seamap Australia represents a number of key collaborative outcomes:

  • the collation of all national benthic habitat mapping data into a single platform, complete with metadata records and Web Mapping and Web Feature Services, harvested by the Australian Ocean Data Network (AODN), Research Data Australia (RDA) and other data discovery services (a sketch of consuming one such service follows this list);
  • the synthesis of these datasets into one spatial data product using the newly developed classification scheme for the Australian continental shelf, made available through the ANDS Research Vocabularies Australia service;
  • a web portal combining the Seamap Australia habitat data with multiple other third-party data (including seafloor imagery, biodiversity surveys, environmental pressures, commercial fishing effort, and physical data) to create an invaluable multidisciplinary resource for marine managers, researchers, industry and the general community; and
  • a number of innovative web interface features to enhance usability and management of data provenance.
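
As a sketch of how a published Web Mapping Service layer of this kind can be consumed, the R snippet below overlays a WMS layer on an interactive leaflet map. The endpoint URL and layer name are hypothetical placeholders, not the actual Seamap Australia services:

  library(leaflet)

  # Hypothetical WMS endpoint and layer name, for illustration only.
  wms_url <- "https://geoserver.example.org/geoserver/wms"

  leaflet() %>%
    addTiles() %>%   # base map
    addWMSTiles(
      wms_url,
      layers  = "seamap:habitat_example",   # placeholder layer name
      options = WMSTileOptions(format = "image/png", transparent = TRUE)
    ) %>%
    setView(lng = 147, lat = -42, zoom = 6)   # centred on Tasmania

The same service URLs recorded in the metadata can equally be loaded into desktop GIS tools, which is what makes the WMS/WFS publication model useful beyond the portal itself.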

The University of Tasmania was well positioned to develop this spatial project. Previous IMAS projects, including SeaMap Tasmania, Redmap Australia and Reef Life Survey, have generated awareness of what is possible when spatial data are made publicly available. We anticipate that Seamap Australia will enable national-scale cross-disciplinary studies of continental shelf habitats. We hope that by bringing all of the marine habitat mapping datasets into one portal, and by highlighting the custodians of these data, institutions will be encouraged to work collaboratively to address nationwide solutions.

The Seamap Australia team have recently won the Big Data Innovation and Public Sector and Government categories of the 2018 Tasmanian iAwards, and will progress to the National iAwards later this year.

REFERENCES

  1. Butler, C., Lucieer, V., Walsh, P., Flukes, E. and Johnson, C. (2017) Seamap Australia [Version 1.0]: the development of a national benthic marine classification scheme for the Australian continental shelf. Institute for Marine and Antarctic Studies, University of Tasmania. 52 pp.

Biography:

Peter Walsh began his career working in fire behaviour and ecology research for the Department of Conservation and Land Management (CALM) in Western Australia. Given an opportunity to diversify, he began writing eResearch applications for managing research data before progressing to the role of Operations Manager in CALM's Data Centre. In the late 1990s, he started an enterprise developing software for monitoring data centre infrastructure, which went on to sell in six languages to over 30 countries worldwide. In 2007, Peter joined the Bluenet Project at the University of Tasmania, eventually taking up his current position as Manager, Data and Information Systems at the Institute for Marine and Antarctic Studies in 2011. https://orcid.org/0000-0003-4998-0188

Cloudification of IMOS and the Australian Ocean Data Network – Beyond Lift & Shift

Dr Peter Blain1, Philip Bohm1, Leigh Gordon1, Craig Jones1, Alex McKeown1, Cameron Moloney1, George Sattler1, Angus Scheibner1

1IMOS-AODN, Hobart, Australia, peter.blain@utas.edu.au, bohm@utas.edu.au, leigh.gordon@utas.edu.au, craig.jones@utas.edu.au, alex.mckeown@utas.edu.au, cameron.moloney@utas.edu.au, george.sattler@utas.edu.au, angus.scheibner@utas.edu.au

 

BACKGROUND

The Integrated Marine Observing System (IMOS) is an Australian Government infrastructure project. IMOS is an integrated national array of observing equipment that monitors the open oceans and coastal marine environment around Australia, covering physical, chemical and biological variables. All IMOS data is freely and openly available for the benefit of Australian marine and climate science. IMOS observations are guided by societal needs for improved ocean information, and focused through science planning undertaken collaboratively across the Australian marine and climate science community. The IMOS information infrastructure is built on open source applications such as GeoNetwork and GeoServer, extended by open source projects governed by IMOS.

The IMOS information infrastructure has been leveraged to build the Australian Ocean Data Network (AODN), an interoperable online network of marine and coastal data resources. The AODN includes IMOS data; data from six Commonwealth agencies, namely the Commonwealth Scientific and Industrial Research Organisation (CSIRO), the Bureau of Meteorology (BOM), the Royal Australian Navy (RAN), the Australian Institute of Marine Science (AIMS), Geoscience Australia (GA) and the Australian Antarctic Division (AAD); and additional data from universities, state-based agencies and the private sector. The AODN provides discovery and access services, primarily through the AODN Portal. The marine data collections published in the AODN Portal are wide-ranging, and all are available to the public; they cover approximately two hundred parameters in different ocean environments, collected from ocean-going ships, robots, moorings and other platforms. Contributing organisations publish data in the AODN by providing ISO 19115-compliant metadata marked up with OGC-compliant web services. The metadata is harvested from contributors and stored in a GeoNetwork instance that backs the AODN Portal. The AODN Portal is a consumer of web services, including OGC CSW, which is used to query the catalogue; it also consumes the web services defined in the metadata, using these to provide visualisation, subsetting, and data download to end users.
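
To make the catalogue workflow concrete, the sketch below issues a CSW GetRecords request from R using the standard OGC CSW 2.0.2 key-value-pair encoding. The endpoint URL is a hypothetical GeoNetwork placeholder rather than the actual AODN catalogue address:

  library(httr)

  # Hypothetical GeoNetwork CSW endpoint, for illustration only.
  csw <- "https://catalogue.example.org/geonetwork/srv/eng/csw"

  # CSW 2.0.2 GetRecords with a CQL full-text constraint.
  resp <- GET(csw, query = list(
    service                     = "CSW",
    version                     = "2.0.2",
    request                     = "GetRecords",
    typeNames                   = "csw:Record",
    resultType                  = "results",
    elementSetName              = "summary",
    constraintLanguage          = "CQL_TEXT",
    constraint_language_version = "1.1.0",
    constraint                  = "AnyText LIKE '%temperature%'"
  ))

  stop_for_status(resp)                       # fail loudly on HTTP errors
  records_xml <- content(resp, as = "text")   # XML listing matching records

Parsing records_xml (e.g. with the xml2 package) yields the metadata records, including the OGC service endpoints a portal can use for visualisation and download.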

CLOUDIFICATION

IMOS began running its information infrastructure in the cloud at the inception of NeCTAR in 2011. In 2015, IMOS extended its move to the cloud by hosting core production systems on AWS, while continuing to use NeCTAR for research and development. The initial migration to the cloud was a classic "lift and shift" operation, made relatively easy by configuration management tools such as Chef. The result of this migration was not, however, a truly cloud-optimised architecture; it looked more like a traditional physical-server design that happened to be running in the cloud. Over the last two years IMOS has transformed its infrastructure to leverage the advantages of cloud computing. This presentation discusses the architectural design patterns and cloud computing services that IMOS has put into effect, including serverless architectures for big data aggregations and infrastructure-as-code approaches for spinning up fully configured test and development replicas of complex production environments.

SERVERLESS ARCHITECTURE

Serverless computing is beneficial in several situations, including when compute utilisation is lumpy. IMOS has such a use case related to the subsetting and aggregation of large gridded geospatial data collections. A single processing job may take an hour or more to complete, followed by several hours of no jobs at all. The serverless approach chosen by IMOS can run aggregation workloads at any scale. It automatically provisions compute resources without the need to install or manage the host framework, which has led to a faster and more reliable service for end users.

INFRASTRUCTURE AS CODE

The main IMOS software stack comprises open source applications such as GeoServer, GeoNetwork, THREDDS, the AODN Geospatial Portal and Talend, as well as object storage and a relational database. IMOS requires multiple instances of this stack for use cases such as production, testing, development and demonstration. The stacks must be quick to run up and tear down, and each component must scale automatically without limit. This presentation includes a discussion of how this was achieved at IMOS.


Biography:

Alex McKeown is a Software Engineer at the Australian Ocean Data Network (AODN), a facility of the Integrated Marine Observing System (IMOS). He holds a Bachelor of Computing (Honours) from the University of Tasmania and has a background in working with Sensor Web technologies.

OpenHeart Project: an open source community designing heart pumps

Miss Amanda Miotto1, Jo Pauls2

1Griffith University, Brisbane, Australia, a.miotto@griffith.edu.au
2The University of Queensland, Brisbane, Australia, j.pauls@uq.edu.au

 

Abstract

A medical engineer presented us with a unique challenge: his team wanted to collaborate worldwide on designing 3D-printable heart pumps, while keeping them open source and affordable for developing countries.

This presentation will provide an overview of the project, the processes and approach involved, learnings from the project, and suggestions on how it could apply to other projects.

Details

Mechanical Circulatory Support (MCS) pumps are used to treat patients with advanced heart failure: a mechanical pump is surgically implanted to supplement or replace the blood flow generated by the native heart. Proprietary pumps are often completely unaffordable for citizens of developing countries. With advances in 3D printing technology, the ability to print an open source heart pump has made this life-saving treatment more attainable.

Currently, research is often undertaken in isolation within each laboratory, limiting inter-laboratory collaboration and thus the full potential of the MCS field. There was a clear need and untapped potential for improved collaborative efforts, improved education and standardisation, and subsequent improvement of research quality and outcomes within the field.

The system needed to be easy to use and learn, adaptable for different technologies, low-cost and sustainable long term. The repository needed to include not just the files needed for the 3D printers, but also testing data, testing code and human-readable information about the mechanics.

Combining a workflow of technologies with sustainable solutions, we were able to provide engineers with a way to share their work with a low barrier to entry. This open repository platform was named OpenHeart. The platform incorporates web-based version control and documentation mechanisms, with attached collaborative tools providing networking opportunities, discussion of research questions, and online courses and educational material for upskilling.

Coupled with this was a dual licence protecting both the programmatic code and the intellectual designs and testing data, using a Creative Commons licence (CC-BY 4.0) and an open source licence (BSD-3-Clause). This protects not just the designs of the heart pumps themselves but also the code that resides in the 3D-printable files. Paired with the licence are unique URLs and DOI minting for publication purposes, encouraging reuse and transparency.

This system offers researchers in developing countries access to current developments in the MCS field with a low barrier to entry. By sharing existing solutions (e.g. experimental set-up, equipment design, data analysis strategy) it will be possible to save research time and money while giving emerging researchers a head start.


Biography:

Amanda Miotto is a Senior eResearch Analyst and Software Developer at Griffith University. She started off in the field of bioinformatics and learnt to appreciate the beauty of science before discovering the joys of coding. She is also heavily involved in Software Carpentry, Hacky Hours and ResBaz, and has developed platforms for HPC, microscopy and scientific database portals.

Dr. Jo P. Pauls is a postdoctoral research fellow at the Innovative Cardiovascular Engineering and Technology Laboratory (ICETLAB), Critical Care Research Group (CCRG) at The Prince Charles Hospital, Brisbane, Australia, and the School of Medicine at The University of Queensland, Brisbane, Australia. His research interests focus on the development and evaluation of mechanical circulatory support devices and physiological control systems. In addition, Dr. Pauls is interested in open science and data sharing.
