Medical Imaging: Federation and Compute

Chris Albone1, Ryan P Sullivan1,2

1Information and Communications Technology, University of Sydney, Sydney Australia

2Core Research Facilities, DVC-R, University of Sydney, Sydney Australia

chris.albone, ryan.sullivan@sydney.edu.au

 

SUMMARY

XNAT is an imaging data platform that has been rapidly gaining popularity throughout Australian research institutions and facilities, and worldwide [1]. It has been adopted as part of the National Imaging Facility (NIF) Trusted Data Repository (TDR) program to provide a standard framework for medical imaging and data provenance.

Similar efforts are underway on the computational component with the Characterization Virtual Labs (CVL) under the Data Enhanced Virtual Lab (DeVL) program funded by NRDC, providing a workbench dedicated to neuroimaging. NIF@UQ has also been working on a DICOM2Cloud project to facilitate automated anonymization of data for computation on public cloud environments.

The University of Sydney is using XNAT as a key component of our Imaging Data Service, and has combined it with compute on our HPC and VRD, as well as the CVL and GUI informatics pipeline platforms. We are also a participant in the C-DeVL program, developing a Windows version of the CVL workbenches. Research is inherently multi-institutional, and projects will span multiple repositories and computation infrastructures. We would like to raise the natural question of how these aligned projects might be federated.

FORMAT

We propose a 60 min roundtable with representatives of institutions with XNAT, or who are looking at deploying XNAT systems. The roundtable will discuss the following:

  1. What is the current status of deployments, and what are the plans for the immediate future? (20 min)
  2. What might XNAT federation look like? Federated metadata search? Federated data search? (15 min)
  3. CVL is being federated. What about other characterization and informatics workflow platforms? Shared repository of Singularity/Docker pipelines to use in XNAT and/or HPC? (15 min)
  4. Should a standard anonymization toolset be adopted when transferring between these repositories and centers of compute? (10 min)

Biography:

Dr Sullivan is a biophysicist with an interest in neural implants. His research led him into software development for automatic characterization of implants and neural tissue. Dr Sullivan joined the University of Sydney in 2017 where he now works on eResearch projects focusing on characterization domains.

Delivering Software Solutions to Astronomy Researchers

Prof. Jarrod Hurley1, Dr Jenni Harrison2, Dr Rory Smith4, Dr Greg Poole3

1Swinburne University of Technology, Melbourne, Australia, jhurley@swin.edu.au

 2Pawsey Supercomputing Centre, Perth, Australia, jenni.harrison@pawsey.org.au

3Swinburne University of Technology, Melbourne, Australia, gpoole@swin.edu.au

4Monash University, Melbourne, Australia

 

DESCRIPTION

Astronomy Data and Computing Services (ADACS) was established in early 2017 by Astronomy Australia Limited (AAL) to empower the national astronomy community to maximize the scientific return from their data and eResearch infrastructure. ADACS is delivered through a partnership that has been created between Swinburne University, Curtin University and the Pawsey Supercomputing Centre – comprising Melbourne-based and Perth-based nodes.

A key element of ADACS is to provide professional software development and data management services to astronomy researchers. By developing partnerships between researchers and computational scientists – adding professional software engineering, project management, system analysis and design methodologies to projects – we aim to address the quality and performance benchmarks that can be lacking in platforms developed by researcher-only teams. Targeted eResearch fields include big-data analysis and processing, optimization of workflows for high-performance computing, parallel computing techniques, data-sharing and dissemination, large-scale visualization and construction of real-time data analysis platforms. The long-term goal is to provide a central hub for resources in these areas that can be accessed by astronomers – embedding the service within projects to develop nationally significant software pipelines and data platforms.

The proposed session is aligned with the Research Software Engineering stream and specifically the Software as a Service and Development Methods, Tools & Practices themes.

Astronomers apply for ADACS services through a merit-based allocation scheme that operates twice per year. Applicants first submit an expression of interest, then work with an ADACS member to develop the technical requirements for their project, and finally submit a full proposal, which is considered by an independent Time Allocation Committee drawn from the national astronomy community by AAL. Proposals are ranked on merit and matched against the developer weeks (and expertise) available (generally the equivalent of 2-3 full-time developers per year). To date the scheme has focused on short to mid-range projects, with an over-subscription rate of 300-500%. This clearly demonstrates a need for such services and a lack of provision in the past. Projects have ranged from developing graphics processing unit (GPU) algorithms for speeding up gravitational wave analysis to enhancing the user interface and back-end for citizen science projects.
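The matching of ranked proposals against available developer weeks can be illustrated with a minimal sketch. Everything in it is hypothetical: the `Proposal` fields, the merit scores, the week counts and the greedy rule of skipping a proposal that does not fit are assumptions made for illustration, not a description of how the Time Allocation Committee actually decides.

```python
from dataclasses import dataclass


@dataclass
class Proposal:
    title: str    # hypothetical project label
    merit: float  # hypothetical merit score assigned by the committee
    weeks: int    # developer weeks requested


def allocate(proposals, capacity_weeks):
    """Fund proposals in descending merit order while developer weeks remain.

    Assumption: a proposal that does not fit in the remaining weeks is
    skipped and the next-ranked proposal is considered.
    """
    funded = []
    remaining = capacity_weeks
    for proposal in sorted(proposals, key=lambda p: p.merit, reverse=True):
        if proposal.weeks <= remaining:
            funded.append(proposal)
            remaining -= proposal.weeks
    return funded, remaining


def oversubscription_percent(proposals, capacity_weeks):
    """Requested developer weeks as a percentage of the weeks available."""
    requested = sum(p.weeks for p in proposals)
    return 100 * requested / capacity_weeks


# Invented example round: 40 developer weeks available, 120 requested.
ROUND = [
    Proposal("A", merit=4.5, weeks=20),
    Proposal("B", merit=4.0, weeks=30),
    Proposal("C", merit=3.5, weeks=15),
    Proposal("D", merit=3.0, weeks=55),
]

if __name__ == "__main__":
    funded, remaining = allocate(ROUND, capacity_weeks=40)
    print([p.title for p in funded], remaining)
    print(oversubscription_percent(ROUND, capacity_weeks=40))
```

In this invented round the requested weeks are three times the capacity, which corresponds to the lower end of the over-subscription range quoted above.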

A key aspect for the success of these projects is a constructive working relationship between the computational scientists and the researcher (the sponsor), and how this operates within a project management framework. Such a framework can be a foreign environment to a researcher (e.g. working in sprints, providing user stories upfront), so there is a need to be adaptable on both sides. In this session we aim to explore this relationship, providing ADACS case studies from both perspectives, with the aim of sharing our experiences to date and starting discussions with others who have worked through similar experiences. We are also interested in discussions around how we ensure the long-term sustainability of such schemes, how the ongoing needs of delivered projects should be managed and how we can prioritise larger-scale technically challenging projects while still meeting the needs of the general research community.

The proposed BoF session is intended to be 40 minutes in duration.

The proposed format of the session is a mixture of presentations, contributed talks and facilitated discussion as follows:

  • Introduction to ADACS and the methodology behind the delivery of software as a service to researchers within the national astronomy community [10 mins];
  • Case study of an ADACS development project from the developer perspective, focusing on project management, techniques applied and delivered outcomes [10 mins];
  • Case study of an ADACS development project from the researcher perspective, focusing on the science goals, project interaction experience and application of the delivered outcome [10 mins];
  • Facilitated discussion on the developer-researcher relationship when delivering software as a service to a research community, including best practice and lessons learnt to date [10 mins].

The case study from a researcher perspective will be sourced from the pool of completed ADACS projects as a contributed talk. The introduction, case study from a developer perspective and discussion will be provided/convened by the listed convenors/presenters who are all ADACS members.

The targeted audience for the session includes researchers and technical staff with an interest in bringing professional software development practices into the methodology of the scientific research community. The audience need not be astronomy specific. In fact, a primary goal for the facilitated discussion is to initiate conversations aimed at translating support services across scientific domains, promoting collaboration and skill sharing between like-minded entities.

ADDITIONAL INFORMATION

An example of a success story for an already completed ADACS project can be found here:

Below are two examples of ADACS projects currently under development. Both are on track for completion by end of June 2018.

  1. An Automated Data Reduction Pipeline for AAO Data Central

Lead Researcher – Simon O’Toole (Australian Astronomical Observatory: AAO)

Development – ADACS Perth node

Summary: This project will create a data reduction pipeline using Python and Django to manage the CLI functions of the 2DFDR application through a RESTful API. The API will be accessible by the AAO team in addition to their partners.

  2. GPU Acceleration of Gravitational Wave Signal Models

Lead Researcher – Rory Smith (Monash)

Development – ADACS Swinburne node

Summary: This project will develop a CUDA-based GPU implementation of highly parallelizable gravitational-wave signal models to alleviate the computational bottleneck in Laser Interferometer Gravitational-Wave Observatory (LIGO) parameter estimation codes. Parameter estimation is an optimal tool for gravitational-wave signal detection, but its current high cost prohibits its use as a search pipeline.


Biography:

Professor Jarrod Hurley has led the supercomputing program at Swinburne for the past decade and is the manager of the NCRIS-funded OzSTAR national facility. Hurley obtained his PhD in astrophysics at the University of Cambridge (UK) before research positions as a Hubble Fellow at the American Museum of Natural History (USA) and at Monash University. Hurley has a strong research background in computational astrophysics, specialising in realistic N-body simulations of star cluster evolution. He was a founding member of the Astronomy Supercomputing Time Allocation Committee (ASTAC), a former steering committee member for the Australian National Institute for Theoretical Astrophysics (ANITA), a member of the Astronomy eResearch Advisory Committee (AeRAC) for Astronomy Australia Limited (AAL) and manager of the Swinburne node of the Astronomy Data and Computing Services (ADACS) initiative. Hurley is passionate about creating an HPC environment that readily adopts new technology and enables researchers to meet their research goals.

Untitled Article

Mr Kheeran Dharmawardena1, Mr Paul Box2

1Atlas of Living Australia, Melbourne, Australia, Kheeran.Dharmawardena@csiro.au

2CSIRO, Black Mountain, Australia, Paul.J.Box@csiro.au

A BoF on Problem and Solution Patterns in Community Development was held at the eResearch Australasia conference in 2017. During this session a number of patterns were identified and described. This BoF will build upon the work done at eResearch 2017 and further explore the idea of a pattern library in Community Development around research information infrastructures.

Information infrastructure used by research, comprising systems, data, processes and the people providing this infrastructure (the provider community), has evolved to underpin specific communities (user communities) with specialised software and hardware requirements. Underpinning research user communities is challenging: software and data in cutting-edge areas advance quickly, meaning that software infrastructure can fast become irrelevant; research is naturally competitive, which makes collaboration a finely tuned balance; and building models for sustainability is difficult.

A pattern language is a method of describing good design practices or patterns of useful organization. Through a set of interconnected patterns, it attempts to express a deeper understanding of the relationships between different patterns.

Some patterns identified to-date include:

  • Connecting rowing and steering – governance is the decision-making process that sets the ‘rules of the game’ to ‘steer’ collective activity. Individual organisations and people do the heavy lifting, the ‘rowing’, to achieve agreed outcomes. If there is a real or perceived inability to influence decision outcomes in governance mechanisms, there is likely to be a disincentive to act to achieve the outcomes, particularly where collaborative efforts are in-kind, volunteered effort rather than centrally funded.
  • Pigs and chickens – decision rights should be allocated in ways that are appropriate to the needs of the community and the respective roles of individual actors. Assigning decision authority (the ‘decider’ role, as opposed to decision input roles) can give more voice in collective decision making to those who have more skin in the implementation game, i.e. the ‘pigs’.
  • Understanding and leveraging Coalitions of the Willing (COWs) – What incentivizes the folks who drive and contribute to initiatives? How can this be replicated and scaled up?
  • Working with frenemies – It is difficult to navigate the various individual and organisational (dis)incentives for collaboration within a competitive environment, which hamper eResearch adoption and growth.

There are sure to be many more patterns.

This interactive BoF will look at these socio-technical challenges and seek to identify emergent problem and solution patterns for building communities that help underpin research communities in the use of information systems.


Biography:

Paul Box leads a CSIRO research team developing interoperable systems of systems or ‘Information Infrastructure’. Paul has worked for more than 25 years in the geospatial information technology field.

More recently, Paul has focused attention on addressing the social rather than technical challenges of building Information Infrastructure. Coherent integrated approaches to addressing the social, institutional and economic challenges of infrastructure development are being elaborated through ‘social architecture’.

Mr. Kheeran Dharmawardena, MBA, BComp, is the Program Manager at the Atlas of Living Australia. Kheeran has over two decades of experience in the delivery of many ICT services within the higher education and research sector, including infrastructure delivery, service delivery, data management, IT & enterprise architecture and eResearch. He has a special interest in the socio-technical challenges involved in the delivery of effective services.

(orcid.org/0000-0002-4292-7475)

Trusted Data Repositories: From Pilot Projects to National Infrastructure

Keith Russell1, Andrew Mehnert2,3, Heather Leasor4, Mikaela Lawrence5

1Australian Research Data Commons

2National Imaging Facility, Australia, andrew.mehnert@uwa.edu.au

3Centre for Microscopy, Characterisation and Analysis, The University of Western Australia, Perth, Australia

4Australian National University

5CSIRO

 

DESCRIPTION

In FY 2016/17, ANDS funded the Trusted Data Repository program. This aimed to look at how to provide more trusted storage through three projects chosen to examine a number of dimensions:

  • NIF: multi-institutional (UQ, MU, UNSW, UWA), image/non-image instrument data, data generating facilities
  • ADA: single institution (ANU), social science data, data holding facility with a national role
  • CSIRO: single institution (not a university), range of data types, institutional data store

The primary focus of the program was on the trustedness of the repository containers, not on the data they contained. In other words, Trusted (Data Repositories), not (Trusted Data) Repositories. However, the NIF project considered both aspects: (1) the requirements necessary and sufficient for a basic NIF trusted data repository service; and (2) the NIF Agreed Process (NAP) to obtain trusted data from NIF instruments.

The main challenges addressed across the program were how to:

In this BoF, the projects will present what they learned by undertaking this journey and reflect on how to generalize what they learned to the national context (noting that NIF is a national facility, ADA is a national repository, and CSIRO is a national agency).

Following this there will be an open discussion about next steps, including how to expand this initial set of projects to a national infrastructure of trusted data repositories serving a range of domains.

Format

The BoF will be a mix of presentation of content via slides (contributed by Love, McEachern and Mehnert), followed by an open discussion among all those presenting (facilitated by Treloar).

Timing

0-10: Overview of Trusted Data Repository (TDR) program run in 2017 and international relevance

10-40: Three ten-minute presentations, one from each of the pilot TDR projects

50-60: Role of Trusted Data Repositories in the NRDC

60-75: Open discussion

75-80: Next steps

Upskilling library staff: from zero to heroes

Steven Chang1, Rachel Salby2, Janice Chan3, Julie Toohey4, Susannah Bacon5

1La Trobe University, Melbourne, Australia, s.chang@latrobe.edu.au

2La Trobe University, Melbourne, Australia, r.salby@latrobe.edu.au

3Curtin University, Perth, Australia, janice.chan@curtin.edu.au

4Griffith University, Gold Coast, Australia, julie.toohey@griffith.edu.au

5Australian Research Data Commons (ARDC), Canberra, Australia susannah.bacon@ardc.edu.au

 

BACKGROUND

Information professionals and library staff are taking on an increasingly central role in developing research data management services. This trend means institutions have an imperative to upskill staff and empower them to cultivate expertise in this area. The wider library and research data community has emphasized the need for librarians to develop solid research data management skills, as seen in the Ithaka S+R US Library Survey 2016 [1], the European Union’s RECODE: Open Access to Research Data [2], initiatives such as the ALIA Research Data specialization, and other literature.

Key skillsets include supporting researchers and institutions with data discovery, data citation, data storage, data formats, collaboration, research data management plans, DOIs, FAIR principles, copyright and intellectual property, sensitive data, metadata management, data retention, open access, publishing and sharing data. It is vital that library staff with no prior research data management or research background can gain the knowledge to feel confident supporting and advising researchers on best practice.

A common challenge in upskilling staff comes from overcoming staff resistance and fear of new technology, according to the literature [3, 4, 5]. The experience from University of California, Berkeley, illustrated that, while many libraries have made concerted efforts to train staff in research data management, the success of these programs depends on how closely aligned staff feel with the training [6]. It is therefore important that training programs allow staff to feel comfortable with this new domain of skills and engage staff with hands-on experience that they can relate to their work.

In Australia, a variety of programs have been used, including the 23 Research Data Things developed by the Australian National Data Service (ANDS, now known as the Australian Research Data Commons) and Library Carpentry workshops, delivered through a variety of mediums including online modules, face-to-face classes, and blended learning. La Trobe University has developed an extensive evidence-based upskilling model that goes beyond single training sessions and incorporates a series of hands-on training sessions, living documentation, encouragement of ambassadors and champions, and development of a participatory community of practice. These initiatives have encouraged librarians to become ambassadors, create communities of practice, and involve library voices in events such as ANDS webinars, Research Bazaar (ResBaz) events, and Research Support Community Days. We will discuss the outcomes of these programs for staff knowledge and confidence at this session.

SESSION FORMAT

This is a one-hour ‘Birds of a Feather’ session focusing on informal discussion and reflection. The conveners will chair the session, which will also include four brief lightning talks.

Each lightning talk will provide an overview of the ways each institution has trained library staff in research data management and overcome barriers that prevent staff from embracing these new roles. Attendees are encouraged to informally share their own experiences and reflections. The focus will be on collectively sharing reflections on best practice. We also want to hear about a range of outcomes, including poor take-up of programs and key challenges faced. An assigned facilitator will take notes reviewing the main discussion points, and collate these to circulate after the conference is over.

This session is targeted at both library leadership and managers, as well as newcomers to data librarianship, plus any others who are interested in identifying the best approaches to learning and teaching research data management for support personnel.

 

REFERENCES

  1. Wolff-Eisenberg, C. (2017). US Library Survey 2016. New York: Ithaka. https://doi.org/10.18665/sr.303066
  2. RECODE Project Consortium. (2014). RECODE: Policy Recommendations for Open Access to Research Data. Luxembourg: Office for Official Publications of the European Communities. DOI: 10.5281/zenodo.50863
  3. Blessinger, K. & Hrycaj, P. (2013). Workplace Culture in Academic Libraries: The Early 21st Century. Boston: Emerald Publishing.
  4. Matteson, M. & Hines, S. (2017). Emotion in the Library Workplace. Boston: Emerald Publishing.
  5. Edwards, McClean, & Cleave (2016, February 10). “Have you tried turning it off and on again?” Exploring a state-wide ICT skills training project for Victorian public library staff. VALA2016, Melbourne. Melbourne: VALA. https://www.vala.org.au/vala2016-proceedings/vala2016-session-10-edwards/
  6. Wittenberg, J., Sackmann, A. & Jaffe, R. (2018). Situating Expertise in Practice: Domain-Based Data Management Training for Liaison Librarians. Journal of Academic Librarianship, 44(3), 323-329. https://doi.org/10.1016/j.acalib.2018.04.004

 


Biographies:

Rachel Salby is Acting Senior Coordinator, Research Data at La Trobe University Library. She has particular expertise in research data management, project management, and library systems. She is a passionate advocate for improving researcher access to data and information.

https://orcid.org/0000-0002-8955-3589

Steven Chang is Research Data Outreach Officer at La Trobe University Library. He is interested in open scholarship, systematic review methodology, research data management, and health librarianship. Steven comes from a medical librarian background, and is the former editor of the publication Health Inform.

https://orcid.org/0000-0002-3193-7969

Janice Chan is Coordinator, Research Services at Curtin University, Perth, Western Australia. Janice’s experience is in repository management and scholarly communications. She is interested in open research, metrics and impact assessment, research data management, library-led publishing, data analysis and visualisation, and innovative practice in library service delivery.

https://orcid.org/0000-0001-7300-3489

Julie Toohey has worked in academic libraries for 23 years and is currently the Health Discipline Librarian at Griffith University, Gold Coast campus. Julie is passionate about research data management practices and is in the process of publishing her first co-authored journal article. Throughout 2016, Julie co-facilitated the Australian National Data Service’s 23 Things (research data) Health and Medical Data Community Group webinar series and is a member of the Queensland University Libraries Office of Cooperation (QULOC) Research Support Working Party.

https://orcid.org/0000-0002-4249-8180

Susannah Bacon has worked for ANDS (now ARDC) since 2010, where she has been involved in community building, outreach and training, website content delivery, webinars and all aspects of communications. She was a key person in the delivery of the internationally renowned 23 (research data) Things program, which was directed specifically at the librarian community, and she continues to ensure that it remains relevant and up to date. She holds a Bachelor of Agricultural Science, a Graduate Diploma in land rehabilitation and a Master’s in Social Research.

https://orcid.org/0000-0002-8606-0703

Scientific workflow uptake – What are the challenges?

Siddeswara Guru1, Minh Dinh1, David Abramson1, Gareth Price1, Damien Watkins2, Lachlan Hetherton2, Alan Both3

1University of Queensland, Brisbane, Australia, s.guru@uq.edu.au, m.dinh1@uq.edu.au, david.abramson@uq.edu.au, g.price@imb.uq.edu.au

2Data61 CSIRO, Melbourne, Australia, damien.watkins@data61.csiro.au, Lachlan.hetherton@data61.csiro.au

3RMIT University, Melbourne, Australia, alan.both@rmit.edu.au  

DESCRIPTION

A scientific workflow is a series of well-defined, coordinated, structured activities that define a particular investigation or experiment process in a scientific context [1]. Workflows are useful in science because they enable scientists to:

  • describe, manage, share and execute scientific analyses;
  • provide a high-level abstract view of scientific computation, while hiding underlying details;
  • interface with distributed computing environments;
  • capture a complete workflow as an artefact and make it a reusable entity [2];
  • capture provenance information for further analysis and knowledge re-use.
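To make the last two points concrete, here is a minimal sketch of a workflow expressed as an ordered list of named steps, with a provenance record captured for each step. It does not use Kepler, Galaxy, Workspace or any other workflow system; the step functions, the `run_workflow` helper and the shape of the provenance record are all hypothetical and chosen only for illustration.

```python
import hashlib
import json
from datetime import datetime, timezone


def digest(value):
    """Short content hash of a JSON-serialisable value."""
    payload = json.dumps(value, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()[:12]


def run_workflow(steps, data):
    """Run the steps in order and record one provenance entry per step."""
    provenance = []
    for name, func in steps:
        result = func(data)
        provenance.append({
            "step": name,
            "input_digest": digest(data),
            "output_digest": digest(result),
            "finished_at": datetime.now(timezone.utc).isoformat(),
        })
        data = result  # the output of one step feeds the next
    return data, provenance


# Hypothetical analysis steps, invented for this example.
def drop_missing(values):
    return [v for v in values if v is not None]


def normalise(values):
    peak = max(values)
    return [v / peak for v in values]


def summarise(values):
    return {"n": len(values), "mean": sum(values) / len(values)}


# The workflow itself is plain data, so it can be stored, shared and re-run.
STEPS = [
    ("drop_missing", drop_missing),
    ("normalise", normalise),
    ("summarise", summarise),
]

if __name__ == "__main__":
    result, provenance = run_workflow(STEPS, [2.0, None, 4.0, 8.0])
    print(result)
    for entry in provenance:
        print(entry["step"], entry["input_digest"], "->", entry["output_digest"])
```

Because each entry stores digests of its input and output, the chain of records can later be checked to confirm that the output of one step was the input of the next, which is the kind of information a provenance-aware workflow system keeps for re-use and reproducibility.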

In an eResearch 2017 BoF session, we provided an overview presentation of some of the scientific workflow management systems (SWMSs) (e.g., Kepler, Galaxy, Workspace) used in different science disciplines. Notably, an interactive Q&A panel discussed the motivations and the use cases of scientific workflows, how to choose the right tool for a particular application, and how to develop a community around workflow management systems.

While some SWMSs have proven their success in improving the rate of scientific discovery, overall uptake of scientific workflows for eResearch is still limited. In this year’s BoF, we address the challenges in the uptake of these SWMSs from the perspectives of domain scientists, eResearch analysts, workflow engine developers and decision makers. In particular, we will address technical issues in the following areas:

  • developing workflows and subsequent tools;
  • debugging individual workflow components and the workflow as a whole;
  • leveraging cloud resources and capabilities;
  • scheduling workflow jobs in the cloud;
  • provenance tracking and propagation;
  • platforms to use and run workflows;
  • reproducibility challenges;
  • deploying and sharing workflows.

BoF Details:

  • Short presentations from domain scientists and eResearch analysts on their experience in developing and using workflow management systems including Kepler, Galaxy, KNIME, and Workspace. An open discussion on challenges in operationalising some of the complex processes using workflows and lessons from different tools. The BoF will conclude with a concrete plan to improve scientific workflow practice for knowledge sharing and capacity building.
  • The BoF session will run for 60 minutes. The first 20 minutes are allocated to an introduction to the BoF and short presentations, the next 30 minutes to a panel discussion on the challenges of uptake, and the final 10 minutes to future coordination and planning.

REFERENCES

  1. Talia, D. Workflows Systems for Science: Concepts and Tools. ISRN Software Engineering, 2013.
  2. Guru, S.M., I.C. Hanigan, H.A. Nguyen, E. Burns, J. Stein, W. Blanchard, D. B. Lindenmayer, and T. Clancy, Development of a cloud-based platform for reproducible science: the case study of IUCN Red List of Ecosystems Assessment. Ecological Informatics, 2016.

Biography:

Siddeswara Guru is a program lead for the TERN data services capability. He has experience in the development of domain-specific research e-infrastructure capabilities.

How to Choose the ‘Right’ Repository for Research Data

Shawn A Ross1, Steven McEachern2, Peter Sefton3, Brian Ballsun-Stanton4

1Macquarie University, Sydney, Australia, shawn.ross@mq.edu.au

2Australian National University, Canberra, Australia,steven.mceachern@anu.edu.au

3University of Technology Sydney, Australia, peter.sefton@uts.edu.au

4Macquarie University, Sydney, Australia, brian.ballsun-stanton@mq.edu.au

DESCRIPTION

In Australia, multi-institutional, domain-specific information infrastructure projects, such as those funded through the Australian National eResearch Collaboration Tools and Resources (NeCTAR) program, are typically open-source software (OSS). National infrastructure such as AARNet’s CloudStor, built on ownCloud, is also OSS. Even publications repositories and data management planning tools are often OSS (DSpace, DMPonline, DMPTool, ReDBox, etc.). The trend in institutional research data repository software amongst institutions that prefer not to build an in-house solution, however, appears to favour proprietary software (e.g., Figshare). OSS repository platforms (e.g., Dataverse, CKAN) are much less popular in Australia than in Europe and North America. Dataverse, for example, has 33 installations on five continents containing 76 institutional ‘Dataverses’ (some installations house more than one), but Australia has only one installation or institutional Dataverse (the Australian Data Archive) [1]. By contrast, Figshare has been or is being implemented by at least five Australian universities [2], with others actively considering it.

This BoF session examines the reasons why institutions choose proprietary software versus OSS for research data infrastructure. We compare the practical advantages, disadvantages, and considerations around each approach. We propose for discussion the idea that the advantages of proprietary software are overstated, as is the burden of implementing and administering OSS. For example, costs like requirements analysis, systems integration and engagement, outreach, and training (which together likely account for the majority of a software project’s budget) are similar whether the software is proprietary or OSS. The burden of deploying and maintaining modern OSS platforms, eased by approaches like containerisation and automation, is lower than in the past. SaaS options for OSS are also sometimes overlooked. Proprietary software, moreover, is not always an ‘out-of-the-box’ turn-key solution at universities, especially in the case of specialised software for research (as opposed to commodity software). As such, it may require the creation of separate but interoperable systems to fill gaps in capacity, dramatically raising costs. Conversely, the flexibility and capabilities of an OSS solution are often neglected: if a feature is missing or inadequate, it can be built (often with support from the community) and made available for reuse, without having to work around the edges of a proprietary system. The Australian Data Archive, for example, has added significant new features to Dataverse to support mediated access to sensitive data, which are available to other users. However, a deeper exploration of the tradeoffs and demands of both approaches in the context of specialised academic software is warranted. The focus of the discussion will be practical, but it may extend to the potential impact of various software business models on research data, a core output and asset of universities.

The session will be 60 minutes in duration. It will include brief presentations by the organisers based on their experience, followed by open discussion. Audience participation is essential: we encourage a candid exchange of experience with either proprietary software or OSS for research data management at various institutions, so that we can learn from each other’s successes and challenges. The outcome will be information to guide decision making around repository platform procurement at universities.

REFERENCES

  1. The Dataverse Project. Available from: https://dataverse.org/ accessed 22 June 2018. See also https://dataverse.org/metrics accessed 22 June 2018.
  2. The University of Adelaide. Available from: https://www.adelaide.edu.au/figshare/. The University of Melbourne. Available from: https://melbourne.figshare.com/. Monash University. Available from: https://monash.figshare.com/. La Trobe University. Available from: https://latrobe.figshare.com/. Federation University Australia (planned). Available from: https://federation.edu.au/staff/governance/projects/current-projects.

Biographies:

Shawn Ross (Ph.D. University of Washington, 2001) is Associate Professor of History and Archaeology and Director of Data Science and eResearch at Macquarie University. A/Prof Ross’s research interests include the history and archaeology of pre-Classical Greece and the Balkans, and the application of information technology to research. He supervises a large-scale landscape archaeology and palaeo-environmental study in central and southeast Bulgaria. Since 2012, he has also directed a large information infrastructure project developing data capture and management systems for field research. Previously, A/Prof Ross worked at the University of New South Wales (Sydney, Australia) and William Paterson University (Wayne, New Jersey).

Steve McEachern is Director and Manager of the Australian Data Archive at the Australian National University, where he is responsible for the daily operations and technical and strategic development of the data archive. He has high-level expertise in survey methodology and data archiving, and has been actively involved in development and application of survey research methodology and technologies over 15 years in the Australian university sector. Steve holds a PhD in industrial relations from Deakin University, as well as a Graduate Diploma in Management Information Systems from Deakin University, and a Bachelor of Commerce with Honours from Monash University. He has research interests in data management and archiving, community and social attitude surveys, organisational surveys, new data collection methods including web and mobile phone survey techniques, and reproducible research methods. Steve has been involved in various professional associations in survey research and data archiving over the last 10 years.

Peter Sefton is the Manager, eResearch Support at the University of Technology, Sydney (UTS). Before that he was in a similar role at the University of Western Sydney (UWS). Previously he ran the Software Research and Development Laboratory at the Australian Digital Futures Institute at the University of Southern Queensland. Following a PhD in computational linguistics in the mid-nineties, he has gained extensive experience in the higher education sector in leading the development of IT and business systems to support both learning and research. At UTS Peter is leading a team which is working with key stakeholders to implement university-wide eResearch infrastructure, including an institutional data repository, as well as collaborating widely with research communities at the institution on specific research challenges. His research interests include repositories, digital libraries, and the use of the Web in scholarly communication.

Brian Ballsun-Stanton is Solutions Architect (Digital Humanities) for the Macquarie University Faculty of Arts with a PhD from UNSW in Philosophy. He is working with researchers from across Australia to deploy digital technologies and workflows for their research projects. He has developed a new methodology (The Social Data Flow Network) to explore how individuals in the field understand the nature of data. Brian’s current research interests are in exploring the Philosophy of Science’s interactions with the Open Data movement, and building tools for rapid analysis and bulk manipulation of large ancient history corpora.

The Australian Research Data Commons – Building on the foundations of ANDS, Nectar and RDS to become a transformational investment

Convener: Ian Duncan1

1Acting Executive Director, ARDC, QLD, ian.duncan@ardc.edu.au

 

DESCRIPTION

A key recommendation in the 2016 National Research Infrastructure Roadmap [1] was to bring together the ANDS, Nectar and RDS projects into a single entity. This was achieved in June 2018 with the establishment of the Australian Research Data Commons (ARDC).

As part of the establishment of the ARDC, a strategic plan has been developed which sets out the strategic intent for the next 5 years. This plan has been developed in consultation with key partners and the community, with the intent of building on the strengths of ANDS, Nectar and RDS as a transformational investment. The ARDC will partner to facilitate a coherent research environment that enables Australia’s researchers to find, access, contribute to and effectively use leading data-intensive eResearch infrastructure. The aim is to maximise research quality and impact by developing a world-leading data advantage, facilitating accelerated innovation, fostering collaboration for borderless research, and enhancing researchers’ ability to translate their research into benefits for society.

This BoF will set out the roadmap for the next 5 years and provide an opportunity for the community to engage with the ARDC as it embarks on this journey as well as examine the opportunities for the ARDC to partner with NCRIS facilities, institutions and research communities.

Proposed Format

The session will start with a short overview presentation to set the scene, followed by a series of round tables focused on the ARDC strategic pillars, and will conclude with a panel discussion and summary.

REFERENCES

  1. Commonwealth Department of Education and Training, 2016 National Research Infrastructure Roadmap, 2016.

Biography:

Ian is the Acting Executive Director of the Australian Research Data Commons (ARDC).  He has held roles including Director of the RDS NCRIS project, has led programs within ANDS, and has been Associate Director of Enterprise Support and Associate Director of Infrastructure and Operations at the University of Queensland, as well as founding, running, and selling his own Internet Payment Gateway company and ISP, and working for the Shell Oil Company and National Australia Bank. He has a degree in Economics and Politics, is married to a Professor researching Alzheimer’s Disease, has two fantastic teenage kids, and sees opportunities for collaboration, reinforcement, support, and excellence throughout the research sector and feels immensely positive and optimistic about the impact the ARDC, together with our partners, can bring about.

Describe, Manage and Discover Research Software

Dr Mingfang Wu1, Dr Jens Klump2, Ms Sue Cook2, Dr Carsten Friedrich2, Dr David Lescinsky3, Dr Lesley Wyborn4, Paola Petrelli5, Margie Smith3, Geoffrey Squire2

1 Australian Research Data Commons, mingfang.wu@ardc.edu.au

2 CSIRO, jens.klump@csiro.au, Sue.Cook@csiro.au, Carsten.Friedrich@data61.csiro.au

3 Geoscience Australia, David.Lescinsky@ga.gov.au

4 National Computational Infrastructure, lesley.wyborn@anu.edu.au

5 CLEX, Centre of Excellence for Climate Extremes

 

DESCRIPTION

Software is pervasive in research. A UK Research Software Survey of 1000 randomly chosen researchers [2] shows that more than 90% of researchers acknowledge that software is important for their own research, and about 70% say their research would not be possible without it. In a separate study, Carver et al. [3] examined 40 papers published in Nature from January to March 2016 and found that 32 of them explicitly mentioned software. These surveys provide evidence that software plays an important role in research, and hence that software should be treated in the same way as other research inputs and outputs that are part of the record of research, such as research data and paper publications. Most importantly, to enable research reproducibility, any software that underpins research should be discoverable and accessible.

Beyond making software discoverable and accessible, best practice in open source software also recommends choosing an Open Source licence that complies with third-party dependencies to clarify the legal framework for reuse and distribution of the source code.   Furthermore, the long-term sustainability of an Open Source project is supported by clear and transparent communication and processes describing how developers can contribute to the project and how these contributions are governed. It is important that the community (both developers and software users) is involved early in the software development process, to ensure that developed software is more reusable and sustainable [4].

Current international initiatives working to make research software reproducible and reusable can be summarized in three areas:

  1. Open research and scholarly communication.  Working groups/projects (e.g. the FORCE11 Software Citation Implementation WG, the RDA Software Source Code Interest Group and the CodeMeta Project), repositories and catalogues (e.g. DataCite, Zenodo and Code Ocean), as well as publishers (e.g. Journal of Open Source Software, Nature, Elsevier), are setting up software dissemination, cataloguing, discovery and review processes.
  2. Sustainable software. Working towards Sustainable Software for Science: Practice and Experiences (WSSSPE) and Research Software Sustainability Institutes in the UK, the US and elsewhere are encouraging sustainable software development by exchanging experiences and providing training courses.
  3. Sustainable community. The Research Software Engineering Association (and its chapters) has been advocating for career paths and funding for research software engineers. Parallel initiatives on citation metrics and credit models, by communities such as the FORCE11 Software Citation Implementation Working Group, research groups and publishers, should ensure appropriate credit for contributions to software [1].

We propose a 60-minute BoF session that will mix presentations with round-table discussions. We will first provide an overview of international initiatives and activities across the above three areas, followed by three lightning talks on software description, curation and publishing workflows. The presentations will be followed by a round-table group discussion on current practices and the barriers people face in managing and describing software. The outcome of this discussion will be actions for various software interest or working groups, including an Australian software citation Interest Group.

This work is being done in partnership with the Earth Science Information Partners (ESIP) of the US, in particular the ESIP Software and Services Cluster. ESIP is supported by NASA, NOAA, USGS and 110+ member organizations.

REFERENCES

  1. Smith A. M., Katz D. S., Niemeyer K. E., FORCE11 Software Citation Working Group. (2016) Software Citation Principles. PeerJ Computer Science 2:e86. DOI:10.7717/peerj-cs.86.
  2. Hettrick, S. J., et al. (2014). UK Research Software Survey 2014 [Data set]. doi:10.5281/zenodo.14809
  3. Carver, J.C., Gesing, S., Katz, D. S., Ram, K., and Weber, N., (2018). Conceptualization of a US Research Software Sustainability Institute (URSSI), in Computing in Science & Engineering, vol. 20, no. 3, pp. 4-9, May./Jun. 2018. doi:10.1109/MCSE.2018.03221924
  4. Jiménez RC, Kuzak M, Alhamdoosh M et al., (2017). Four simple recommendations to encourage best practices in research software [version 1; referees: 3 approved]. F1000Research 2017, 6:876 (doi:10.12688/f1000research.11407.1)

Biographies:

Mingfang Wu is a senior business analyst at ANDS/Nectar/RDS.  https://orcid.org/0000-0003-1206-3431

Jens Klump is a geochemist by training and OCE Science Leader Earth Science Informatics in CSIRO Mineral Resources.  Follow him on Twitter as @snet_jklump.

Sue Cook is a Data Librarian with the Research Data Support team of CSIRO Information Management and Technology.

Carsten Friedrich is a Research Team Leader at CSIRO Data61.  At CSIRO he worked in a variety of areas including Cloud Computing, Cyber Security, Virtual Laboratories, and Scientific Software Registries.

David Lescinsky is currently the team lead of GA’s Informatics Team and is responsible for facilitating and managing GA’s eResearch projects.

Lesley Wyborn currently has a joint adjunct fellowship with NCI.  She is Chair of the Australian Academy of Science ‘Data for Science Committee’ and on the AGU Data Management Advisory Board and the Steering Committee of the AGU-led FAIR Data Publishing Project.

BoF for RSEs: Recognition and Career Development for Researchers who Code

Ms Kerri Wait1, Dr Rebecca Lange2, Ms Amanda Miotto3, Dr Manodeep Sinha4,5, Dr Jens Klump6, Mr Rowland Mosbergen7, Dr Steven Manos8, Ms Heidi Perrett9

1 Monash eResearch Centre, Monash University, Australia, kerri.wait@monash.edu

2 Curtin Institute for Computation, Curtin University, Perth, Australia, rebecca.lange@curtin.edu.au

3 eResearch Services Griffith/QCIF, Griffith University, Nathan, Australia, a.miotto@griffith.edu.au

4 Centre for Astrophysics & Supercomputing, Swinburne University of Technology, Australia, msinha@swin.edu.au

5 ARC Centre of Excellence for All Sky Astrophysics in 3 Dimensions (ASTRO 3D)

6 CSIRO Mineral Resources, Perth, Australia, jens.klump@csiro.au

7 University of Melbourne, Parkville, Australia, rowland.mosbergen@nectar.org.au

8 University of Melbourne, Australia,  smanos@unimelb.edu.au

9 eResearch Support Services, Griffith University, Nathan, Australia, h.perrett@griffith.edu.au 

 

DESCRIPTION

This BoF aims to build awareness of the Research Software Engineer (RSE) community and to identify volunteers to help implement the practical solutions for the RSE community that have been identified at the pre-conference workshop. If you are an academic or researcher who codes but lacks the recognition and metrics needed to progress your academic career, a professional software engineer working in the research space, or a system administrator who maintains research systems, you should attend.

The term RSE was originally coined by the UK RSE Association (rse.ac.uk), which says the following about RSEs: “A growing number of people in academia combine expertise in programming with an intricate understanding of research. Although this combination of skills is extremely valuable, these people lack a formal place in the academic system.”

Inspired by the success of the RSE Association in the UK, we are seeking to establish an Australasian Chapter of the RSE Association. Together with international bodies and with support from national organisations such as AeRO, NeSI, CAUDIT, the Australian Research Data Commons (ARDC), and research institutions, we aim to campaign for the recognition and adoption of the RSE role within the research ecosystem. Alongside this, appropriate recognition, reward and career opportunities for RSEs are needed. We plan to organise regular events to allow RSEs to meet, exchange knowledge and collaborate on methods to create these opportunities.

We ran an initial Expression of Interest survey in 2017 among Australian and New Zealand researchers and found that the majority of respondents prioritised: (1) increased recognition of the RSE role, (2) more appropriate, consistent and representative position descriptions and KPIs, and (3) community development through regular events and gatherings.

Please join us at this event to actively work on how we can grow this community and advocate for others. Together, we can build a sustainable community that benefits research software engineers, and ultimately contributes to more efficient and reproducible research.

FORMAT

A quick presentation covering “Who is an RSE?”, the problems that RSEs face, the status of different institutions with regard to employing RSEs, and suggested solutions will kick off the BoF. There will then be a breakout session to identify who would like to volunteer their time for which solution.

REFERENCES

  • Research Software Engineers Association. Available from: http://rse.ac.uk/, accessed 6th June

Biography:

Kerri Wait is an HPC Consultant at Monash University. As an engineer, Kerri has a keen interest in pulling things apart and reassembling them in novel ways. She applies the same principles to her work in eResearch, and is passionate about making scientific research faster, more robust, and repeatable by upskilling user communities and removing entry barriers. Kerri currently works with the neuroscience and bioinformatics communities.

About the conference

eResearch Australasia provides opportunities for delegates to engage, connect, and share their ideas and exemplars concerning new information centric research capabilities, and how information and communication technologies help researchers to collaborate, collect, manage, share, process, analyse, store, find, understand and re-use information.
