Mr Derek Benson¹, Dr Tim Ho², Dr P. Scott Chandry³, Dr Glenn Mellor⁴
¹CSIRO, Pullenvale, Australia
²CSIRO, Clayton, Australia
³CSIRO, Werribee, Australia
⁴CSIRO, Coopers Plains, Australia
Galaxy is a workflow platform that enables scientists to connect powerful computational analysis tools into pipelines that can be offloaded to high-performance computing (HPC) systems. This work demonstrates how we applied Galaxy, through a genomic analysis pipeline, to a scientific problem important to Australia’s red meat industry.
As part of an eResearch Collaboration project, we allocated 20% of a full-time equivalent (FTE) over a 6-month period to work with a research project team performing genomic analyses for a partner in Australia’s red meat industry. During the project, we integrated the tools required by the pipeline into the Galaxy service to create a reproducible genomic analysis workflow.
The workflow deploys a bacterial characterisation pipeline that supports CSIRO’s work with the meat industry. Sequence data from isolates is processed on CSIRO’s Galaxy platform using genomic analysis tools. The process departs from the traditional genus / species / serotype approach and facilitates improved bacterial hazard characterisation. The pipeline includes quality control, assembly of bacterial genomes, and searching for and reporting on genes and virulence factors to build a risk profile for predicting foodborne disease potential. It makes extensive use of the HPC facility at CSIRO to speed up processing, with HPC resources dynamically matched to the size of the input data and the tools being used.
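The idea of matching HPC resources to input size and tool can be illustrated with a minimal sketch. This is not the rule set used on CSIRO's Galaxy platform: the tool names, size thresholds, weights, and caps below are all hypothetical, chosen only to show the shape of such a rule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resources:
    cores: int
    memory_gb: int
    walltime_hours: int


# Hypothetical relative cost of each pipeline stage (assumed, not from the source).
TOOL_WEIGHT = {"quality_control": 1, "assembly": 4, "gene_search": 2}

# Hypothetical per-job caps for the cluster.
MAX_CORES = 32
MAX_MEMORY_GB = 128


def pick_resources(tool: str, input_bytes: int) -> Resources:
    """Choose a resource request from the tool and the size of its input."""
    size_gb = input_bytes / 1024**3

    # Pick a baseline from the input size (thresholds are illustrative).
    if size_gb < 1:
        base = Resources(cores=2, memory_gb=4, walltime_hours=1)
    elif size_gb < 10:
        base = Resources(cores=4, memory_gb=8, walltime_hours=4)
    else:
        base = Resources(cores=8, memory_gb=16, walltime_hours=12)

    # Scale cores and memory by the tool's weight; unknown tools get weight 1.
    weight = TOOL_WEIGHT.get(tool, 1)
    return Resources(
        cores=min(base.cores * weight, MAX_CORES),
        memory_gb=min(base.memory_gb * weight, MAX_MEMORY_GB),
        walltime_hours=base.walltime_hours,
    )


if __name__ == "__main__":
    # A 5 GiB input sent to the (hypothetical) assembly stage.
    print(pick_resources("assembly", 5 * 1024**3))
```

In a sketch like this, small quality-control jobs stay on modest allocations while large assembly jobs request more cores and memory, which is the kind of matching the paragraph above describes.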
This work used a mixture of high performance computing and storage resources to support a genomic analysis pipeline. The CSIRO Galaxy platform hides the underlying infrastructure complexities, allowing researchers to focus on creating reproducible science.