Supporting the discovery and reuse of workflows in bioinformatics

Dr Ove Johan Ragnar Gustafsson1, Dr Tracy Chew2, Dr Georgina Samaha2, Dr Sarah Beecroft3, Dr Marco de la Pierre3, Dr Rosemarie Sadsad2, Mark Gray3, Dr Gareth Price4, Lisa Phippard1, Dr Nigel Ward5, Prof Andrew Lonie1

1Australian BioCommons, University of Melbourne, Melbourne, Australia, 2Sydney Informatics Hub, University of Sydney, Sydney, Australia, 3Pawsey Supercomputing Centre, Perth, Australia, 4Galaxy Australia, Brisbane, Australia, 5Australian BioCommons, Brisbane, Australia

Bioinformatics workflows connect software packages into multi-step processes that are used to transform raw data into a form that can be analysed and interpreted to answer a variety of research questions in life science. Some examples include workflows for genome assembly, variant calling, metagenomics and proteome quantitation.

Each workflow represents intellectual property, with significant time and effort required during development, configuration and maintenance. To support FAIR principles, workflows should be 1) published, 2) discoverable, 3) reusable and 4) citable, as this allows for sharing of best practice approaches, accelerated science, reduced duplication of effort, and recognition for bioinformaticians.

To facilitate this ideal, the Australian BioCommons has established a presence on the WorkflowHub registry. Workflows tested on Australian national computational infrastructures (i.e. National Computational Infrastructure (NCI), Pawsey Supercomputing Centre, QRISCloud and Galaxy Australia) are now registered on WorkflowHub, improving their findability and reusability. Using WorkflowHub also aligns with community requirements for integration with git development repositories and the ability to mint digital object identifiers (DOIs).

Since March 2021, 31 workflows have been registered. In total, the workflows have been viewed more than 16,000 times and downloaded more than 250 times, and are now being cited in publications. Together, these observations demonstrate growing interest in reusable workflows. The BioCommons is working with our communities to further expand their use of registries, and to ensure that these registries align well with research processes in the life sciences.


Biography:

Johan is part of the community engagement team in the Australian BioCommons. He has a background in mass spectrometry and proteomics, with a PhD in Biochemistry focused on mass spectrometry imaging (MSI). Since joining the Australian BioCommons in 2019, Johan has led engagements with the proteomics, metabolomics and bioinformatics workflows communities, as well as the Bioplatforms Australia Framework Initiatives. He led the creation of the ToolFinder service, and has coordinated the contribution of best practice BioCommons workflows to the WorkflowHub registry.
