ARDC Nectar Research Cloud 101 for Beginners

Conveners: Dr Glenn Moloney5, Wilfred Brimblecombe1
Presenters: Andy Botting2, Sam Morrison3, Jake Yip4

1Australian Research Data Commons (ARDC) (built from ANDS, Nectar, RDS), wilfred@nectar.org.au
2Australian Research Data Commons (ARDC) (built from ANDS, Nectar, RDS), andrew.botting@unimelb.edu.au
3Australian Research Data Commons (ARDC) (built from ANDS, Nectar, RDS), sam.morrison@unimelb.edu.au
4Australian Research Data Commons (ARDC) (built from ANDS, Nectar, RDS), jake.yip@unimelb.edu.au
5Australian Research Data Commons (ARDC) (built from ANDS, Nectar, RDS), glenn.moloney@nectar.org.au

GENERAL INFORMATION

  • Half day workshop
  • Includes a hands-on component
  • Maximum of 30 people

DESCRIPTION

This workshop will provide an introduction to using the Australian Research Data Commons (ARDC) Nectar Research Cloud. It is for people who have no or limited experience in setting up and using Virtual Machines (VMs) in any cloud environment.

The workshop will have a “hands on” component so please bring your laptop.

The following topics/activities will be covered:

  1. What is the ARDC Nectar Research Cloud and how does it compare with commercial offerings? (At a technical usage level only – we will not go into cost comparisons etc.) 20 minutes
  2. What is the Research Cloud good for? And what is it not good for? 10 minutes
  3. Access to the Research Cloud – how to get a share of the Nectar Research Cloud. 20 minutes
    1. How to request access
    2. How to request resources
    3. How to request support to help you deploy and use your VMs
  4. Exercise – spin up a VM on the Nectar Research Cloud and install some software on it (see the sketch after this list). 120 minutes
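For orientation, the same exercise can also be done programmatically. The following is a minimal sketch using the OpenStack SDK for Python (the Nectar Research Cloud is built on OpenStack); the cloud name, image, flavour, network and keypair names are placeholders, not Nectar-specific values, and the workshop itself uses the Nectar dashboard.

```python
import openstack

# Connect using credentials from clouds.yaml or OS_* environment variables.
# The cloud name "nectar" is an assumption for this sketch.
conn = openstack.connect(cloud="nectar")

# Placeholder names: substitute an image, flavour, network and keypair
# that exist in your project.
image = conn.compute.find_image("Ubuntu 22.04 LTS")
flavor = conn.compute.find_flavor("m3.small")
network = conn.network.find_network("my-project-network")

server = conn.compute.create_server(
    name="nectar-101-demo",
    image_id=image.id,
    flavor_id=flavor.id,
    networks=[{"uuid": network.id}],
    key_name="my-keypair",  # keypair must already exist in the project
)

# Block until the VM reaches ACTIVE state.
server = conn.compute.wait_for_server(server)
print(f"VM {server.name} is {server.status}")
```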

WHO SHOULD ATTEND

Researchers or research support people who have perhaps heard of cloud computing or of the Nectar Research Cloud and would like an introduction.

WHAT TO BRING

Bring your laptop. Required prerequisite knowledge – a moderate understanding of Unix or any other software development environment. If you are not from an Australian or New Zealand university, you will need an AAF account to gain access for the hands-on component.


BIOGRAPHIES

Andy Botting – Senior Engineer at the Australian Research Data Commons (ARDC), Nectar Research Cloud. I’m a cloud-native Systems Engineer with a background in Linux and HPC. Specialities: Linux, Android, Puppet, OpenStack and AWS.

Wilfred Brimblecombe – ICT Manager at the Australian Research Data Commons (ARDC), Nectar Research Cloud, is an IT management veteran with over 20 years of leadership experience across various organisations.

Sam Morrison – Senior Engineer at the Australian Research Data Commons (ARDC), Nectar Research Cloud.  Specialties: Linux system administration, Python/Django web programming, Security, Openstack cloud technologies.

Jake Yip – DevOps Engineer at Australian Research Data Commons (ARDC), Nectar Research Cloud. Specialities: Puppet, OpenStack, Networks, DevOps and Security.

 

Galaxy architecture and deployment experiences: a case study in how to build complex analysis systems for data-focussed science

Mr Simon Gladman1, Mr Derek Benson4, Dr Jeff Christiansen2, Dr Gareth Price3, A/Prof. Andrew Lonie5

1Melbourne Bioinformatics, University of Melbourne, Melbourne, Australia, simon.gladman@unimelb.edu.au
2Queensland Cyber Infrastructure Foundation, Brisbane, Australia, j.christiansen@uq.edu.au
3Queensland Facility for Advanced Bioinformatics, Brisbane, Australia, g.price@qfab.org
4Research Computing Centre, University of Queensland, Brisbane, Australia, d.benson@imb.uq.edu.au
5Melbourne Bioinformatics, University of Melbourne, Melbourne, Australia, alonie@unimelb.edu.au

GENERAL INFORMATION

  • Half day Workshop
  • includes a hands-on component
  • Up to 20 attendees

DESCRIPTION

Galaxy (https://galaxyproject.org) is a widely used, highly capable bioinformatics analysis platform. It provides users with a large library of analysis and visualization tools, reference datasets, interfaces to global databases, and evolving workflow capabilities that provide provenance and reproducibility. Users build complex analysis jobs in a highly accessible interface, which are then deployed via a scheduler to underlying computational resources. Galaxy has a relatively sophisticated approach to managing user jobs to compute resources and can, for instance, be configured to schedule jobs to disparate HPC and/or cloud resources depending on the job characteristics.
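Galaxy exposes this job-building machinery through an API as well as the web interface. As a minimal illustration of how an analysis is handed to Galaxy's scheduler programmatically, the sketch below uses BioBlend, the Python client for the Galaxy API; the URL, API key, input file and workflow ID are placeholders rather than Galaxy Australia specifics.

```python
from bioblend.galaxy import GalaxyInstance

# Placeholder URL and API key; a real key is generated in the user's Galaxy preferences.
gi = GalaxyInstance(url="https://usegalaxy.org.au", key="YOUR_API_KEY")

# Create a history and upload an input dataset (placeholder file name).
history = gi.histories.create_history(name="demo-analysis")
upload = gi.tools.upload_file("reads.fastq", history["id"])
dataset_id = upload["outputs"][0]["id"]

# Invoke an existing workflow (placeholder ID); Galaxy's job configuration
# decides which underlying compute resource each job is scheduled to.
invocation = gi.workflows.invoke_workflow(
    "WORKFLOW_ID",
    inputs={"0": {"src": "hda", "id": dataset_id}},
    history_id=history["id"],
)
print("Invocation state:", invocation["state"])
```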

In this workshop we will explore the architecture of Galaxy Australia (http://usegalaxy.org.au), understanding how it is architected to deploy jobs from a common front end to compute resources in Queensland and Victoria. Jobs have access to a common multi-hundred-terabyte reference dataset collection that is intelligently mirrored in real time from the US-based  Galaxy Main (http://usegalaxy.org) using the CernVM file system (https://cernvm.cern.ch/portal/filesystem). We will explore the technologies, cover our experiences of how they work in practice, and discuss the ambitions of a global Galaxy infrastructure network that can leverage the efforts of a global community to maintain and support critical data and software resources.

OUTLINE OF WORKSHOP CONTENT:

  1. Overview of Galaxy. Technical overview of the componentry of Galaxy as a software platform and as a workflow generation and deployment system. 30 minutes
  2. Galaxy Australia architecture. Overview of the Galaxy Australia architectural and deployment model. 30 minutes
  3. Underlying technologies. Detailed exploration of the job distribution and data sharing technologies being used for Galaxy Australia. 90 minutes
  4. Galaxy ‘World’ – roadmap discussion. How can multiple instances of Galaxy make use of complex, high-maintenance resources, including a dependency-free tool library and growing global reference datasets, whilst appearing as a seamless experience to non-expert users? 30 minutes

WHO SHOULD ATTEND

Research infrastructure staff interested in complex, distributed software systems and cutting-edge technologies for job and data distribution.

WHAT TO BRING

A laptop, no special software required. We hope to demonstrate some of the technologies being used in Galaxy.


BIOGRAPHY

Andrew Lonie is Director of Melbourne Bioinformatics, Director of the EMBL Australia Bioinformatics Resource (EMBL-ABR: http://embl-abr.org.au), and an associate professor at the Faculty of Medicine, Dentistry and Health Sciences at the University of Melbourne, where he coordinates the MSc (Bioinformatics). Andrew directs a group of bioinformaticians, computational biologists and HPC specialists within Melbourne Bioinformatics and EMBL-ABR to collaborate with and support life sciences researchers in a variety of research projects across Australia.

 

Pawsey Supercomputing Centre – Engaging for the Future

Dr Neil Stringfellow1, Dr Daniel Grimwood1

1Pawsey Supercomputing Centre, Kensington, Australia

 

ABSTRACT

The Pawsey Supercomputing Centre continues to evolve and grow.  Recent developments at Pawsey include the new advanced technology testbed called Athena, as well as the expansion of the Zeus commodity Linux cluster.  The Athena testbed includes Intel Xeon Phi “Knight’s Landing” processors as well as Nvidia “Pascal” GPUs.

I will also touch on the longer term vision for Pawsey and the Federal infrastructure roadmap.

 


Biography:

Dr Neil Stringfellow is the Executive Director of the Pawsey Supercomputing Centre.

Neil has led the Pawsey Supercomputing Centre since 2013, overseeing the operational launch of the Magnus Cray supercomputer. Neil joined Pawsey from the Swiss National Supercomputing Centre (CSCS), where he was involved in application and science support, the management of strategic partnerships and Switzerland’s flagship supercomputing systems.

How to Improve Fault Tolerance on HPC Systems?

Dr Ahmed Shamsul Arefin1

1Scientific Computing, Information Management & Technology, Commonwealth Scientific and Industrial Research Organisation (CSIRO), Canberra, Australia

INTRODUCTION

HPC systems of today are complex systems made from hardware and software that were not necessarily designed to work together as one complete system. Therefore, in addition to regular maintenance-related downtimes, hardware errors such as voltage fluctuation, temperature variation, electrical breakdown and manufacturing defects, as well as software malfunctions, are common in supercomputers. There are several ways to achieve some fault tolerance, including (but not limited to) checkpoint/restart, programming-model-based fault tolerance and algorithm-based theoretical solutions, but in real life none of these covers all of the situations mentioned above [1]. In this work, we review a few of the checkpoint/restart methods and report our experience with them, including two practical solutions to the problem.

TEST DESCRIPTION

On a small cluster consisting of only two nodes (master and compute), constructed using Bright Cluster Manager (BCM) ver. 7.3, we deployed a couple of existing methods (see below) that can support operating-system and software-level fault tolerance. Our experience is described below. In our model cluster, each node had 16 CPU cores in 2 x Intel Xeon E5-2650 CPUs @ 2.00 GHz, at least 64 GB RAM, a 500 GB local HDD, and Ethernet/InfiniBand connections for networking. Slurm ver. 16.05 was installed on both nodes as part of BCM’s provisioning, along with SLES 12 SP1 as the operating system. The master was set to act as both the head and login node as well as the job scheduler node.

OUTCOMES

Our first test candidate was BLCR (Berkeley Lab Checkpoint/Restart). It allows programs running on Linux to be checkpointed, i.e., written entirely to a file, and later restarted. BLCR can be useful for jobs killed unexpectedly due to power outages or exceeding run limits. It requires neither instrumentation nor modification of user source code, nor recompilation. We were able to install it on our test cluster and successfully checkpointed a basic Slurm job consisting of only one statement: sleep 99. However, we faced the following difficulties during its pre- and post-installation:

  • The BLCR website [2] lists an obsolete version of the software (ver. 0.8.5, last updated on 29 Jan 2013) which does not work with newer kernels such as the one shipped with SLES 12 SP1. We had to obtain a newer version, along with a software patch, from the BLCR team to get it installed on the test cluster. Slurm had to be recompiled and reconfigured several times following the BLCR installation, requiring ticket interactions with the Slurm, BCM and BLCR teams. Unfortunately, in the end even the latest BLCR installation did not work on SLES 12 SP2, again due to a newer kernel version (>4.4.21).
  • MPI incompatibility: BLCR, when installed along with Slurm, only works with MVAPICH2 and does not support the InfiniBand network [2]. This MPI variant was not available in our HPC applications directory and therefore could not be tested.
  • Not available to interactive jobs: the BLCR + Slurm combination could not checkpoint/restart interactive jobs properly.
  • Not available to GPU jobs: BLCR did not support GPU or Xeon Phi jobs.
  • Not available to licensed software: talking to a licence server after a checkpoint/restart did not work.

Our next candidate was DMTCP (Distributed MultiThreaded CheckPointing). This application-level tool claims to transparently checkpoint single-host or distributed MPI computations in user space, with no modifications to user code or to the OS. As of today, Slurm cannot be integrated with DMTCP [2,4]. It was also noted that, after a checkpoint/restart, computation results can become inconsistent when compared against the non-checkpointed output, as validated by a different CSIRO-IMT (User Services) team member.

Our next candidate was CRIU (Checkpoint/Restore In Userspace). It claims to freeze a running application, or at least a part of it, and checkpoint it to persistent storage as a collection of files. It works in user space (at the application level) rather than in the kernel. However, it provides no Slurm, MPI or InfiniBand support as of today [2,5]. We therefore did not attempt it on the test cluster.

PRACTICAL SOLUTIONS

After reviewing some of the publicly available tools that could potentially support HPC fault tolerance, we decided to propose the following practical solutions to the CSIRO HPC users.

CHECKPOINT AT THE APPLICATION LEVEL

In this method, the user’s application (based on [1]) is set up to explicitly read and write checkpoints. Only the data needed for recovery is written, and checkpoints need to be taken at “good” times. However, this can result in a higher time overhead, such as several minutes to perform a single checkpoint on larger jobs (an increase in program execution time), as well as user-level development time. Therefore, users may need further support when applying this solution.
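As a minimal illustration of the pattern (not CSIRO production code), the Python sketch below writes only the state needed for recovery, checkpoints at chosen intervals, and resumes from the last checkpoint when the job is restarted.

```python
import os
import pickle

CHECKPOINT = "state.pkl"

def load_state():
    """Resume from the last checkpoint if one exists, else start fresh."""
    if os.path.exists(CHECKPOINT):
        with open(CHECKPOINT, "rb") as f:
            return pickle.load(f)
    return {"step": 0, "partial_result": 0.0}

def save_state(state):
    """Write the checkpoint atomically so a crash mid-write cannot corrupt it."""
    tmp = CHECKPOINT + ".tmp"
    with open(tmp, "wb") as f:
        pickle.dump(state, f)
    os.replace(tmp, CHECKPOINT)

state = load_state()
for step in range(state["step"], 1_000_000):
    state["partial_result"] += step * 1e-6   # stand-in for the real computation
    state["step"] = step + 1
    if step % 10_000 == 0:                   # checkpoint only at "good" times
        save_state(state)

save_state(state)
print("Final result:", state["partial_result"])
```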

NO-CHECKPOINT, KILL-REQUEUE THE JOB

Slurm supports job preemption: the act of stopping one or more “low-priority” jobs to let a “high-priority” job run. When a high-priority job has been allocated resources that have already been allocated to one or more low-priority jobs, the low-priority job(s) are preempted (suspended). The low-priority job(s) can resume once the high-priority job completes. Alternatively, the low-priority job(s) are killed, requeued and restarted using other resources. To validate this idea, we deployed a new “killable job” partition implementing a “kill and requeue” policy. This solution was successfully tested by killing low-priority jobs and can be further tuned to automatically requeue lost jobs following a disaster, resulting in improved fault tolerance on our HPC systems.
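As a sketch of how a user would target such a partition, the example below submits a low-priority batch job from Python; the partition name and batch script are placeholders, and sbatch's --requeue flag simply marks the job as eligible to be requeued after preemption or node failure.

```python
import subprocess

# Submit a low-priority job to the (hypothetically named) "killable" partition.
# --requeue tells Slurm the job may be requeued, and restarted elsewhere,
# if it is preempted by a high-priority job or lost to a node failure.
result = subprocess.run(
    [
        "sbatch",
        "--partition=killable",   # placeholder partition name
        "--requeue",
        "--time=24:00:00",
        "my_job.sh",              # placeholder batch script
    ],
    capture_output=True,
    text=True,
    check=True,
)
print(result.stdout.strip())  # e.g. "Submitted batch job 123456"
```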

CONCLUSION AND FUTURE WORK

Using a test cluster, we investigated and evaluated some of the existing methods to provide better fault tolerance on our HPC systems. Our observations suggest that program-level checkpointing would be the best way to safeguard a running code/job, but at the cost of development time. However, we have not yet attempted a few other methods that could potentially provide an alternative solution, such as SCR (Scalable Checkpoint/Restart for MPI), FTI (Fault Tolerance Interface for hybrid systems), Docker, Charm++ and so on; these remain as future work.

REFERENCES

[1] DeBardeleben et al. Practical Fault Tolerance on Today’s HPC Systems. Tutorial, SC15, 2015.

[2] Hargrove, P. H. & Duell, J. C. (2006). Berkeley Lab Checkpoint/Restart (BLCR) for Linux clusters. Journal of Physics: Conference Series, 46(1), 494. IOP Publishing.

[3] Rodríguez-Pascual, M. et al. (2016). Checkpoint/restart in Slurm: current status and new developments. Slurm User Group Meeting (SLUG) 2016.

[4] Ansel, J., Arya, K. & Cooperman, G. (2009). DMTCP: Transparent checkpointing for cluster computations and the desktop. In IEEE International Symposium on Parallel & Distributed Processing (IPDPS 2009), pp. 1–12. IEEE.

[5] Emelyanov, P. CRIU: Checkpoint/Restore In Userspace, July 2011.


Biography:

Dr Ahmed Arefin works as an IT Advisor for HPC Systems at Scientific Computing, IM&T, CSIRO. In the past, he was a Specialist Engineer for the HPC systems at the University of Southern Queensland. He completed his PhD and postdoc in the area of HPC and parallel data mining at the University of Newcastle and has published articles in PLOS ONE and Springer journals as well as IEEE-sponsored conference proceedings. His primary research interest is the application of high performance computing to data mining, graphs/networks and visualization.

https://orcid.org/0000-0002-9290-3551

ASCI, a Service Catalog for Docker

Mr John Marcou1, Robbie Clarken1, Ron Bosworth1, Andreas Moll1

1Australian Synchrotron, Melbourne, Victoria, robbie.clarken@synchrotron.org.au, john.marcou@synchrotron.org.au, ron.bosworth@synchrotron.org.au, andreas.moll@synchrotron.org.au

 

MOTIVATION

The Australian Synchrotron Computing Infrastructure (ASCI) is a platform that gives users easy access to a remote desktop for processing their experiment data. Users at every experiment station, or beamline, can start and connect to their own desktop environment with a web browser, find their specific processing applications, and access their experiment data.

ASCI acts as a Service Catalog for Docker. It is used to start remote interactive desktops or processing services which run inside Docker containers.

KEY FEATURES

  • Remote Desktop instances
  • Web-browser based
  • User home folders
  • Desktop sharing feature between users
  • CUDA and OpenGL on NVIDIA support
  • Run multiple sessions on multiple nodes in parallel
  • Label-based scheduler to distribute the load on cluster

ARCHITECTURE

ASCI is a stateless micro-services solution. Every component runs within a Docker container. The infrastructure is based on the CoreOS Container Linux operating system, which provides the main tooling to run containerized applications. This operating system is deployed using Terraform, an infrastructure management tool that supports multiple providers and allows automated machine deployment. With ASCI, we use Terraform manifests to deploy CoreOS Container Linux to virtual machines on a VMware vSphere cluster, or as a stateless operating system on bare metal using CoreOS Matchbox.

THE CONTROL PLANE

The ASCI Control Plane provides two main components:

  • the ASCI Web UI contacts the ASCI API to interact with ASCI and start Desktops
  • the ASCI API contacts the Docker API on the compute nodes to schedule and manage new ASCI instances

All the proxy elements are used to route requests within the infrastructure in order to reach the Web-UI or a specific Desktop. The ASCI Admin UI is a convenient way to customize the scheduler and list the running ASCI instances.

THE DESKTOP

The user connects to a web interface which lists the environments available for creation. When a desktop is requested, the ASCI API schedules the Docker container on the cluster. When the user connects to a desktop, the ASCI API generates a one-time password which is delivered to the user’s web browser to establish a VNC connection over WebSockets (noVNC).
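As a simplified sketch of the kind of call the ASCI API makes against the Docker API, the example below uses the Docker SDK for Python; the image name, environment variable, VNC port and volume paths are placeholders rather than ASCI's actual configuration.

```python
import secrets
import docker

client = docker.from_env()  # talks to the Docker daemon on the compute node

otp = secrets.token_urlsafe(16)  # one-time password for the VNC session

container = client.containers.run(
    "asci/desktop-base:latest",          # placeholder environment image
    detach=True,
    environment={"VNC_PASSWORD": otp},   # hypothetical variable name
    ports={"5901/tcp": None},            # let Docker pick a host port for VNC
    volumes={"/data/home/alice": {"bind": "/home/alice", "mode": "rw"}},
)

container.reload()  # refresh metadata to read the assigned host port
host_port = container.ports["5901/tcp"][0]["HostPort"]
print(f"Desktop running in {container.short_id}, VNC on host port {host_port}")
```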

The desktop instance runs in a Docker container. A base image is built with standard tools and libraries shared by every environment, such as the NVIDIA, CUDA and VirtualGL libraries, and the graphical environment (MATE).

This image is used as the parent for every child environment, which provides specialised scientific applications.

Users can store their documents in their home folder. The sharing feature allows users to share their desktop with others.

 

 

Figure: ASCI architecture

 

OPENGL SUPPORT

Supporting OpenGL is complex since the Xorg implementation doesn’t allow multiple desktops attached to the same GPU to process GLX instructions. A solution is the VirtualGL approach. Under this system there is a single GPU-attached Xorg server, called 3DX, and multiple non-GPU desktops, called 2DX. When an application started on a 2DX desktop needs to process a GLX instruction, the VirtualGL library catches and forwards the instruction to the 3DX server for processing on the GPU.

Figure: VirtualGL architecture


INFRASTRUCTURE SERVICES

ASCI relies on the following infrastructure services:

  • DNSmasq provides DNS resolution for the ASCI DNS sub-domain
  • BitBucket is the Git repository manager used in this environment
  • ASCI delivered applications are built as RPM packages which are stored on a local YUM Repository
  • Autobuild is an application which builds Docker images on new commit events and pushes them to the Docker Registry
  • Docker Registry stores the Docker images. These images are downloaded as needed by the Docker hosts
  • Matchbox is the PXE server that provides boot-on-LAN. It is configurable via an API and is used to boot ASCI workers over the network

 The monitoring solution is built with these components:

  • JournalBeat and MetricBeat run on every monitored system and collect logs and metrics to send to the logging database
  • HeartBeat monitors a list of services and reports their state in the logging database
  • ElasticSearch is a search engine used to store and index logs and metrics
  • Kibana is a Web interface for ElasticSearch
  • ElastAlert, the alert manager, watches logs and metrics to trigger alerts and notifications based on rules

 


Biography:

I work at the Australian Synchrotron as a DevOps/HPC engineer.

HPC at The University of Sydney: balancing user experience with system utilisation

Dr Stephen Kolmann1

1The University Of Sydney, Sydney, Australia stephen.kolmann@sydney.edu.au

The University of Sydney’s High Performance Computing cluster, called Artemis, first came online with 1344 cores of standard compute, 48 cores with high memory and 5 nodes with 2 K40 GPUs each. These resources were made available as one large resource pool, shared by all users, with job priority determined by PBS Professional’s fairshare algorithm.1 This, coupled with three-month maximum walltimes, led to high system utilisation. However, wait times were long, even for small, short jobs, resulting in sub-optimal end user experience.

To help cater for strong demand and improve the end user experience, we expanded Artemis to 4264 cores. We knew this expansion would help lower wait times, but we did not rely on this alone. In collaboration with our managed service provider (Dell Managed Services at the time, now NTT Data Services), we designed a queue structure that still caters for a heterogeneous workload, but lowers wait times for small jobs, at the expense of some system utilisation. Figure 1 shows how we partitioned compute resources on Artemis to achieve this balance.

Figure 1: Distribution of Artemis’s compute cores. The left pie chart shows the coarse division of all Artemis’s compute cores, and the right pie chart shows the nominal distribution of compute cores within the shared area of the left pie chart.

The cores are divided into three broad categories: condominiums, strategic allocations and shared cores.

  1. Condominiums are compute nodes that we manage on behalf of condominium owners
  2. Strategic allocations are dedicated to research groups who won access via a competitive application process
  3. Shared cores are available to any Sydney University researcher who wants Artemis access

The shared cores are further sub-divided into separate resource pools that cater for different sized jobs. This division was made to segregate small, short jobs from large, long-running jobs. The idea behind this partitioning is that short, small jobs should start quickly, but larger, longer-running jobs should be willing to tolerate longer wait times (see the sketch below).
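To illustrate the distinction, the sketch below submits one small and one large request with PBS Professional's qsub from Python; the resource figures and script names are placeholders, and the routing to the appropriate resource pool happens server-side.

```python
import subprocess

def submit(select, walltime, script):
    """Submit a PBS Professional job and return the job ID string."""
    result = subprocess.run(
        ["qsub", "-l", f"select={select}", "-l", f"walltime={walltime}", script],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()

# A small, short job: routed to the small-job pool, so it should start quickly.
small_id = submit("1:ncpus=2:mem=8gb", "01:00:00", "small_job.pbs")

# A large, long job: routed to the large-job pool and may wait longer.
large_id = submit("4:ncpus=24:mem=120gb", "120:00:00", "large_job.pbs")

print(small_id, large_id)
```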

This poster will explore our experience with this queue structure and how it has impacted metrics such as job wait times, system utilisation and researcher adoption.

REFERENCES

1. PBS Professional Administrators Guide, p. 165. Available from: http://www.pbsworks.com/documentation/support/PBSProAdminGuide12.pdf, accessed 1 Sep 2017.


Biography:

Stephen Kolmann is currently working at The University of Sydney as an HPC Specialist, where he provides end-user HPC documentation and training and acts as a consultant for internal HPC-related projects. He completed a PhD in Computational Chemistry, where he made extensive use of HPC facilities both at The University of Sydney and at NCI.

Challenges of Real-time Processing in HPC Environments: The ASKAP Experience

Mr Eric Bastholm1, Mr Juan Carlos Guzman1

1CSIRO, Kensington, Australia eric.bastholm@csiro.au juan.guzman@csiro.au

Overview

CSIRO Astronomy and Space Science operates the Australian Square Kilometre Array Pathfinder (ASKAP), a state-of-the-art radio telescope located at the Murchison Radio-astronomy Observatory (MRO) in Western Australia. The data rates produced are significant enough that the data must be processed in real time in a multi-user high performance computing (HPC) environment. This presents a number of challenges with respect to resource availability and the performance variability resulting from interactions of many parallel software and hardware components. These issues include: disk performance and capacity, inter-node and intra-node parallel communication overheads, and process isolation.

Typically, supercomputing facilities are used for batch processing of large established data sources. Small intermittent processing delays resulting from temporary resource contention issues do not affect these processes as the total compute time is the dominating factor and small wait times are insignificant. However, for real-time processing of high data ingest rates these small and unpredictable contentions for resources can be problematic. In our case, even delays of seconds can have a negative effect on our ability to reliably ingest the data.

We have learned much from addressing these challenges, and our experience and solutions will provide valuable input to the radio astronomy community, especially for larger projects, like the Square Kilometre Array, where the same challenges will present at even larger scales.

Disk Performance

A key component in the system is the high-performance, high-capacity Lustre file system. Our findings suggest that even though the hardware can be shown to perform at the required specification (10 GB/s sustained read/write), in practice other factors come into play which can lower that rate significantly or introduce a higher variance in the rate, lowering its average. These include: the number of active writers, the I/O patterns used, reliance on external code libraries with unknown I/O behaviours, and the remaining capacity of the disk.

Inter-process Performance

To ingest and process high data rates it is necessary to parallelise the implementation. We have found that there are complicated issues surrounding process interaction which cause performance degradation in memory access, inter-process communication and threading. This manifests as dead regions in the processing stream which can result in lost data, errors, or a reduction in output quality.

Process Isolation

Our experience indicates that it is important to isolate key processes, especially those responsible for data ingest, from the rest of the system as much as possible. In principle this seems fairly straightforward, but in practice it is non-trivial because the HPC environment is a complex mesh of devices, software, and networks which essentially prevents total isolation of any process.


Biography:

Eric Bastholm

Team Leader ASKAP Science Data Processor

Leads a team of software developers with backgrounds in astronomy, software engineering and data science whose purpose is to develop and test the data reduction software pipelines for the Australian Square Kilometre Array Pathfinder (ASKAP) radio telescope managed by CSIRO.

Teaching High Throughput Computing: An International Comparison of Andragogical Techniques

Mr Lev Lafayette1

1University Of Melbourne, Parkville, Australia

 

The importance of High Throughput Computing (HTC), whether delivered through high-performance or cloud-enabled systems, is a critical issue for research institutions, as data metrics are increasing at a rate greater than the capacity of user systems [1]. As a result, nascent evidence suggests higher research output from institutions that provide access to HTC facilities. However, the necessary skills to operate HTC systems are lacking in the very research communities that would benefit from them. Apart from reducing research output, this also places additional unnecessary pressure on system administrators, who need to respond to uneducated user requests.

With empirical data spanning several years correlating training and utilization from the Victorian Partnership for Advanced Computing (VPAC), the University of Melbourne, and Goethe University Frankfurt am Main, theoretical and practical issues are raised. As advanced adult learners, the postgraduate researchers involved in the course offerings from these institutions should be an ideal fit for andragogical teaching techniques [2]. However, exploring the practices and course offerings of the listed institutions indicates that a combination of experience and cultural factors reinforces the notion of a continuum between pedagogical and andragogical techniques, but also raises the question of whether there are correlative stages for the components within this continuum or whether they are independent.

The results of this international investigation with variant teaching methods, classes, and results, provide opportunities for other institutions to initiate or review their own training programmes and create the conditions for dynamic HTC user communities and increased informed utilization of their systems.

REFERENCES

  1. Martin Hilbert, Priscila López. The World’s Technological Capacity to Store, Communicate, and Compute Information, Science Vol. 332 no. 6025 1 April 2011, pp. 60-65.
  2. Malcolm Shepherd Knowles, Elwood F. Holton, Richard A. Swanson, The Adult Learner: The Definitive Classic in Adult Education and Human Resource Development, Gulf Publishing Company, 1998

Biography

Lev Lafayette is the HPC Support and Training Officer at the University of Melbourne. Prior to that he worked for several years at the Victorian Partnership for Advanced Computing in a similar role. An active advocate of HPC and Linux, he has presented at numerous conferences and user groups on these subjects over the past decade.

Maximising Research Potential with Multipurpose Hardware

Miss Sara Drieman1

1eRSA, Thebarton, Australia

 

eResearch hardware is typically purchased with academic grant funding, resulting in large scale solutions using current market high end hardware. Due to the periodic nature of the funding and the continuous advances in technology, solutions purchased today can often appear archaic in even a year’s time. Without the funding to upgrade, this tends to lead to hardware solutions exceeding warranty and falling short of researcher needs towards the end of the lifecycle. In addition, funding is increasingly uncertain and grants may not be available to replace end of life infrastructure.

This was the cycle eRSA needed to break out of in order to accommodate the ever changing researcher needs. With new hardware required for both HPC and Cloud infrastructure, the problem had to be approached uniquely in order to provide both. Fortunately the solution was made possible with hardware leasing and by selecting hardware to meet both needs based on demand. Rather than the traditional “build it and they will come” approach and only focusing on a single service capability, we were able to create a multipurpose cluster with 8 nodes being used for Cloud and 12 for HPC.

As this solution is easily scalable, the cluster can grow to meet demand, rather than building a large underutilised platform. This provides the flexibility of increasing each service as required as well as meeting the researchers needs as they evolve. By leasing the hardware from the vendor, we are able to stay ahead of the hardware lifecycle, while ensuring capabilities progress as the technology does.

Of course, constant change of hardware components can lead to a high overhead in system administration. This challenge is handled with configuration management tools, ensuring new nodes need minimal human input to join the cluster.

This presentation details how we configured the multipurpose cluster, an overview of its technical specifications, how we overcame the challenges of increased overheads, and other cost-saving aspects we adopted; all to ensure we provide our users with a service that allows them to do what they do best, leaving the technical know-how to us.

 


Biography

Sara Drieman is the Operations Team Leader at eResearch SA, where she works with researchers as well as commercial users. Having an extensive background in both IT Support and System Administration roles, Sara takes pride in helping users adopt technologies as well as improving services for her team and eRSA clients.

ASCI: An interactive desktop computing platform for researchers at the Australian Synchrotron

Mr Robbie Clarken1, Mr John Marcou1, Mr Ron Bosworth1, Dr Andreas Moll1

1Australian Synchrotron, Melbourne, Australia

 

MOTIVATION

The volume and quality of scientific data produced at the Australian Synchrotron continues to grow rapidly due to advancements in detectors, motion control and automation. This means it is critical that researchers have access to computing infrastructure that enables them to efficiently process and extract insight from their data. To facilitate this, we have developed a compute platform to enable researchers to analyse their data in real time while at the beamline as well as post-experiment by logging in remotely. This system, named ASCI, provides a convenient web-based interface to launch Linux desktops running inside Docker containers on high-performance compute hardware. Each session has the user’s data mounted and is preconfigured with the software required for their experiment.

ARCHITECTURE

ASCI consists of a cluster of high performance compute nodes and a number of supporting applications. These include an application for launching and managing instances (asci-api), a web interface (asci-webui) and a proxy for relaying connections (asci-proxy).

 

Figure 1: Sequence for creating an ASCI desktop session

Users connect to ASCI by logging in to the web interface and selecting an environment appropriate for processing their data. The webui will send a request for a new instance of that environment type to the asci-api. The asci-api selects the best compute node to launch the instance on based upon the requirements of the environment and the load on the cluster. Once it has picked a node, the asci-api launches a Docker container based upon the requested environment. The user is then presented with an icon representing the running session and they can connect to this desktop from their web browser.

When the user initiates a connection, a VNC session is created inside the Docker instance with a one-time password. This password is used to launch a NoVNC connection in the user’s browser and the user is presented with their desktop and can commence analysing their data.

DOCKER AND ASCI ENVIRONMENTS

Docker containers are a technology for creating isolated process environments on Linux. We chose this technology for the ASCI user environments because they deliver almost identical performance compared with running on the bare-metal operating system, while enabling multiple users to simultaneously use the node. Docker also enables us to create predefined environments, tailored with the applications required for different types of experiments. Each environment is based on a Docker image which is defined by a text file outlining how to prepare the desktop. These image recipes support inheritance, enabling us to have a base image which installs the ASCI infrastructure applications and then child images with the specialised scientific software for the different experiments.

WEB INTERFACE

An early goal of the ASCI project was to allow users to connect to desktop sessions through their standard browser rather than requiring them to run a specialised application. This greatly lowers the barrier to entry to using the system and allows users to access it from any operating system, including tablets and mobiles.

We built the web interface using Flask for the server and React for generating the front-end. For rendering the desktops in the browser, we utilise NoVNC which delivers a VNC connection over WebSockets. This results in a responsive interface that runs on all platforms, including mobile and tablet.
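As a minimal illustration of this Flask pattern (not the actual ASCI endpoints), the sketch below shows a single route that accepts a request for a new instance and returns its details; the route name and launcher function are hypothetical stand-ins.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

def launch_instance(environment: str, user: str) -> dict:
    """Hypothetical stand-in for the asci-api call that schedules a Docker container."""
    return {"id": "abc123", "environment": environment, "user": user, "state": "running"}

@app.route("/api/instances", methods=["POST"])
def create_instance():
    # The front-end posts the chosen environment and the requesting user.
    payload = request.get_json()
    instance = launch_instance(payload["environment"], payload["user"])
    return jsonify(instance), 201

if __name__ == "__main__":
    app.run(port=5000)
```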

VIRTUALGL

In order to provide GPU hardware acceleration to multiple ASCI instances on one node, we need to use a modification of the traditional X architecture. Usually, the X server has direct access to the GPU hardware, and this allows graphical applications to execute OpenGL instructions. The challenge when running multiple desktops on a single node is that multiple X servers cannot share direct access to the same GPU hardware.

To address this, we run a single X server directly on the node; this is known as the 3DX server. Then every ASCI instance runs its own internal 2DX server which handles graphical applications, such as the desktop environment. When applications make use of the GPU, we launch them with special environment variables which cause them to load VirtualGL libraries in place of the standard OpenGL libraries. The VirtualGL libraries catch and forward all OpenGL instructions on to the 3DX server, which then executes them on the GPU.
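Conceptually, that launch step looks like the sketch below, which runs an OpenGL application through VirtualGL's vglrun wrapper; the display numbers and application are placeholders, and the real ASCI launcher may set different variables.

```python
import os
import subprocess

env = os.environ.copy()
env["DISPLAY"] = ":1"   # the instance's own 2DX server (placeholder display number)

# vglrun preloads the VirtualGL libraries so GLX calls are intercepted and
# redirected to the GPU-attached 3DX server (assumed here to be :0).
subprocess.run(["vglrun", "-d", ":0", "glxgears"], env=env, check=True)
```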

DEPLOYMENT

Every component of the ASCI system runs inside its own Docker container. This enables us to precisely define the environment of each application, such as the operating system and dependencies, and to easily reproduce the applications on different hosts. It also means that when a developer tests the application on their local machine, they are doing so in the same environment as it will run in production. To facilitate deploying updates we created an application called Autobuild which receives notifications from our Bitbucket server whenever a tag is added to an application’s git repository. When Autobuild sees a new tag it clones the code from Bitbucket and uses Docker to build an image for the application based on a Dockerfile in the repository. The built image is then pushed to our internal Docker registry ready for deployment.
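A simplified sketch of the build-and-push step Autobuild performs is shown below, using the Docker SDK for Python; the registry address, repository path and tag are placeholders, not the production configuration.

```python
import docker

client = docker.from_env()

REGISTRY = "registry.example.org"   # placeholder internal registry
REPO_PATH = "/tmp/asci-webui"       # placeholder clone of the tagged commit
TAG = "1.2.3"                       # the git tag that triggered the build

# Build the image from the Dockerfile in the repository ...
image, build_logs = client.images.build(
    path=REPO_PATH, tag=f"{REGISTRY}/asci-webui:{TAG}"
)

# ... and push it to the internal registry, ready for deployment.
for line in client.images.push(
    f"{REGISTRY}/asci-webui", tag=TAG, stream=True, decode=True
):
    if "status" in line:
        print(line["status"])
```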

MONITORING

To monitor ASCI we use a collection of open source tools known as the Elastic Stack. This includes a database, Elastic Search, for capturing logs and metrics, and the front-end website, Kibana, for viewing logs and creating dashboards. To harvest logs we have the applications inside the docker containers log to standard out and configure docker to forward the logs to journald. A utility called Journalbeat then collects the logs and sends them to an Elastic Search pipeline based on the source of the log. The pipeline parses the log and ingests the output into Elastic Search. For alerting, we have an application called ElastAlert monitor the Elastic Search database and trigger a Slack notification based on certain rules. This enables us to be instantly alerted whenever an error occurs or in the case of unusual user behaviour on the website which may be indicative of an attack on the system.
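As an illustration of the kind of query a Kibana dashboard or ElastAlert rule runs against the logging database, the sketch below searches Elasticsearch over its REST API for recent error-level log lines; the address and index pattern are placeholders.

```python
import requests

ES_URL = "http://elasticsearch.example.org:9200"   # placeholder address

# Ask for the most recent log lines mentioning "error" from the last 15 minutes.
query = {
    "size": 10,
    "sort": [{"@timestamp": {"order": "desc"}}],
    "query": {
        "bool": {
            "must": [{"match": {"message": "error"}}],
            "filter": [{"range": {"@timestamp": {"gte": "now-15m"}}}],
        }
    },
}

response = requests.post(f"{ES_URL}/journalbeat-*/_search", json=query, timeout=10)
response.raise_for_status()
for hit in response.json()["hits"]["hits"]:
    print(hit["_source"].get("@timestamp"), hit["_source"].get("message"))
```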

CONCLUSION

ASCI is now in use at the Australian Synchrotron for processing data being produced at the Medical Imaging, X-ray Fluorescence Microscopy and Micro-crystallography beamlines. The simple web interface and tailored environments provide an easy and intuitive platform for users to process their data and the automated build systems allow fast and painless deployment of updates. Future upgrades to the system will include supporting alternative interfaces to the environments, such as Jupyter Notebooks, and integrating a batch job submission system to distribute processing tasks across multiple nodes.


Biography

Robbie Clarken is a Scientific Software Engineer at the Australian Synchrotron. Robbie has a BSc in Nanotechnology from Flinders University and in previous roles at the Synchrotron has been a Particle Accelerator Operator and Robotics Controls Engineer. Currently he helps researchers at the Synchrotron extract insight from their data by developing data processing systems.
