Streamlining software deployment and access: two container stories
Marco De La Pierre1, Audrey Stott1, Sarah Beecroft1, Vanessa Sochat2, Paolo Di Tommaso3, Todd Gamblin2
1 Pawsey Supercomputing Research Centre, Perth, WA, Australia
2 Lawrence Livermore National Labs, Livermore, CA, United States of America
3 Seqera Labs, Barcelona, Spain
Abstract
Scientific software stacks are becoming increasingly large and complex, which makes their deployment especially challenging. For example, bioinformatics tools tend to have many dependencies and require frequent updates, which complicates provisioning on shared resources such as supercomputing facilities, where builds for multiple architectures are required. Software containers (e.g. Singularity) can significantly ease the deployment process, but they require a non-trivial amount of expertise to use, and some creativity to integrate smoothly into existing workflows.
In this talk, we present two projects that we have been developing with international collaborators over the past two years to address these challenges. The first is Singularity Registry HPC (SHPC), which enables automated deployment of containers as modules, allowing users to interact with the containerised applications as if they were ordinary executables. The second is Wave, a service from the highly successful Nextflow workflow engine, which allows for on-the-fly augmentation of software containers to add desired packages before provisioning. Wave also includes support for package managers such as Spack, thereby enabling the deployment of thousands of custom-compiled, architecture-optimised scientific packages.
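To illustrate what interacting with a container "as if it were an executable" can mean in practice, the following is a minimal sketch, not SHPC's actual implementation. It generates a small shell wrapper that forwards its arguments to a command inside a container image via `singularity exec`. The function name `render_wrapper`, the image path and the choice of `samtools` are hypothetical and used for illustration only.

```python
import shlex


def render_wrapper(image_path: str, command: str, runtime: str = "singularity") -> str:
    """Return the text of a shell wrapper that runs `command` inside `image_path`.

    Illustrative sketch only: the image path and command are assumptions,
    and real module systems typically also set up environment variables.
    """
    # Quote each component so that paths with spaces stay a single argument.
    exec_line = "exec {rt} exec {img} {cmd} \"$@\"".format(
        rt=shlex.quote(runtime),
        img=shlex.quote(image_path),
        cmd=shlex.quote(command),
    )
    return "\n".join([
        "#!/bin/sh",
        "# Forward all arguments to the containerised command.",
        exec_line,
        "",
    ])


# Hypothetical image location and tool name.
script = render_wrapper("/opt/containers/samtools.sif", "samtools")
print(script)
```

If a wrapper like this were saved as `samtools` in a directory that a module adds to `PATH`, a user would type `samtools view ...` without needing to know that a container is involved.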
Both projects are publicly available, providing tangible benefits to Pawsey researchers as well as bioinformatics communities such as the Australian BioCommons. More broadly, these projects can have a much wider national and international reach in many scientific domains.
Biography
Marco is a Supercomputing Applications Specialist at the Pawsey Supercomputing Research Centre in Perth, Australia. He holds a PhD in Materials Science with a theoretical and computational focus. Since joining Pawsey in 2018, he has engaged with researchers in the fields of computational materials science, computational chemistry, bioinformatics and, more recently, quantum computing. He is experienced in software development, workflow automation, software containers, scientific software stack management, and training.