Mr Andrew Elwell1, Ms Boney Davis1
1Pawsey Supercomputing Research Centre, Kensington, Australia
Biography:
Andrew has been a Systems Administrator for over 20 years and has been using Linux since kernel 2.0 was released. Although he has a degree in biochemistry, he realised that sitting at a warm computer was preferable to working in a cold-room doing enzyme preparations.
He joined Pawsey just as the first Cray hardware was being delivered, moving from Europe where he had been working at CERN on the LHC computing grid software stack.
He has an MSc in Computer Security and is a CISSP. His interests include systems monitoring, and his role has expanded from HPC system administrator to site security lead. When he's not at work, he relaxes by tinkering with home automation systems and electronics projects.
https://orcid.org/0000-0002-7485-6077
Abstract:
We describe the requirements for system and event logging, including those imposed by legislation, and highlight the problems with relying on on-node copies alone. A brief review of some of the available products and solutions is given, leading to the choice of the software used by Pawsey. We describe the architecture chosen to allow for future scaling, and the configuration used, showing how this can be replicated at other sites with minimal overhead. We further explore issues found with the vendor product as shipped, and explain the development work and workarounds undertaken to optimise log quality. Finally, we demonstrate typical reporting, dashboard and investigation workflows that can be achieved with this tooling.