Finding sustainable and user-accepted solutions to the storage space and file count needs of software environments

Dr Marlies Hankel¹, Miss Shelly Harris¹

¹The University of Queensland, Brisbane, Australia

Biography:

Dr Marlies Hankel has a master's degree in mathematics and a PhD in computational chemistry. She moved to The University of Queensland in 2004 and has held positions as a senior research fellow and lecturer in the Australian Institute for Bioengineering and Nanotechnology and the School of Mathematics. Her research focused on energy storage materials, and she taught numerical methods. She built her first high-performance computing cluster in 2004 and has built four clusters over the years. She enjoys teaching and helping others, and moved full time to the Research Computing Centre (RCC) in 2021. Her role at RCC covers skills training, user support, and the development of operational procedures.

Abstract:

Situation

Software environments (SEs), installed via conda, Python or pip, provide a convenient way for users to install software for the life sciences and artificial intelligence research domains. However, some SEs include many dependencies, databases or models, and can quickly grow to millions of files and terabytes of data. In a high-performance computing (HPC) environment, available space is limited, and both the number of files and the space for user data are constrained by quotas. As a result, many users, especially those new to HPC, run into trouble with file system quotas.
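To make the scale of the problem concrete, the following is a minimal sketch of how a user might measure the footprint of a single SE before it collides with a quota. The function names `env_footprint` and `over_quota`, and any limits passed to them, are hypothetical and chosen for illustration only; real quota accounting differs between file systems (for example, directories may count towards the file limit, and hard-linked files may be counted once rather than per link), so the numbers should be read as an estimate.

```python
import os


def env_footprint(root):
    """Return (file_count, total_bytes) for all files found under root.

    Illustrative helper, not part of any HPC or conda tooling.
    """
    file_count = 0
    total_bytes = 0
    # os.walk does not follow symlinked directories by default.
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # lstat measures a symlink itself rather than its target.
                total_bytes += os.lstat(path).st_size
            except OSError:
                # The file vanished or is unreadable; skip it.
                continue
            file_count += 1
    return file_count, total_bytes


def over_quota(file_count, total_bytes, max_files, max_bytes):
    """Return True if either assumed limit is exceeded."""
    return file_count > max_files or total_bytes > max_bytes
```

A user could call `env_footprint` on the directory that holds an environment and pass the result to `over_quota` together with the file and byte limits that apply at their facility.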

Task

During this Birds of a Feather (BoF) session, we will discuss the types of user spaces provided at different HPC facilities and the common problems users encounter with them when installing SEs. Our goal is to identify which measures are currently in use and how they could be combined or improved to provide a sustainable way forward for users and HPC providers.

Action

We will discuss commonly employed solutions such as increasing user quotas, providing centrally installed SEs, databases and models, and educating users. Participants will give an overview of their setup and outline the barriers they face. We will then discuss how these common measures can be improved and better utilised.

Result

The intended outcome of this BoF is to identify sustainable solutions to the space and file count problems of SEs that users would adopt more readily.

 
