Optimizing Scratch Filesystems for HPC and AI Workloads

Mr Christopher Schlipalius1, Ms Gin Tan2, Ms Sarah Walters3, Mr Paul Hiew4, Mr Gary Mancuso5, Mr Chris Hines2

1The Pawsey Supercomputing Centre, Perth, Australia, 2Monash eResearch Centre, Melbourne, Australia, 3University of Queensland, Brisbane, Australia, 4NSCC, Singapore, Singapore, 5Hitachi Vantara, Melbourne, Australia

Biography:

Chris is an experienced presenter and Storage Manager with over 25 years of experience working with and managing block storage, SANs, tape and POSIX filesystems for large data holdings at The Curtin University of Technology and The Pawsey Supercomputing Centre.

He is a member of the SC25 Committee and has presented at and organised a number of workshops on Storage and Filesystems in Australia, the USA, Germany and Singapore.

He also presented at the main Spectrum Scale Usergroup at SC18 on "Improving Spark workload performance with Spectrum Conductor on Spectrum Scale (for scratch)".

He ran the Versity User Group in Atlanta in 2025 and the Spectrum Scale Usergroup in Singapore and in cities across Australia. He also served on the Spectrum Scale Usergroup worldwide organising committee and was an IBM Champion for 5 years.

Abstract:

This half-day technical workshop examines the types of scratch filesystems used to support high-performance computing (HPC) and AI workloads.

Attendees will gain insight into multiple scratch filesystem and storage technologies through a set of four real-world scenarios built on challenging, IO-intensive HPC algorithms.

Vendors will describe their filesystem using the IO500 benchmark, then present their results for the four applications, highlighting performance before and after.

Demonstrations and guided exercises will show how the "after" results were achieved.

The scratch technologies investigated include Lustre, DAOS, VAST, Weka, and GPFS.

The session will highlight the strengths, use cases, and trade-offs of the different technologies and tuning approaches, and examine how each vendor and technology addresses the needs of each use case.

Participants will walk away with a deeper understanding of scratch storage strategies aligned with domain-specific computing challenges.

Target Audience:

HPC systems architects, storage administrators, AI infrastructure engineers, data scientists, and research computing professionals.

Preparation description:

The plan is to provide the four cases (or similar) ahead of time to those demonstrating, along with a requirement to run IO500 on each test scratch system as a baseline that frames the environment.

The four subject domain algorithm examples and data sets:

1. Radio Astronomy (Pawsey reference data set example)

2. Bioinformatics (Cryo-EM or similar)

3. AI (e.g. LLM training)

4. Popular choice (forecasting model, or genomic sequencing and analysis)

Co-presenters from companies, along with technical leaders, will help facilitate this workshop.