Mr John Marcou¹, Robbie Clarken¹, Ron Bosworth¹, Andreas Moll¹
The Australian Synchrotron Computing Infrastructure (ASCI) is a platform that gives users easy access to a remote desktop for processing their experiment data. Users at every experiment station, or beamline, can start and connect to their own desktop environment from a web browser, find their specific processing applications, and access their experiment data.
ASCI acts as a Service Catalog for Docker. It is used to start remote interactive desktops or processing services which run inside Docker containers.
- Remote Desktop instances
- Web-browser based
- User home folders
- Desktop sharing feature between users
- CUDA and OpenGL on NVIDIA support
- Run multiple sessions on multiple nodes in parallel
- Label-based scheduler to distribute the load on cluster
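The label-based scheduling listed above can be illustrated with a minimal sketch. This is not the ASCI scheduler itself: the `Node` class, the `pick_node` and `schedule` functions, the label names, and the least-loaded tie-break are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical compute node: a name, its labels, and its current load."""
    name: str
    labels: dict
    running: int = 0  # number of instances currently placed on this node


def pick_node(nodes, required):
    """Return the least-loaded node carrying every required label, or None."""
    candidates = [
        n for n in nodes
        if all(n.labels.get(key) == value for key, value in required.items())
    ]
    if not candidates:
        return None
    # Ties on load are broken by name so the choice is deterministic.
    return min(candidates, key=lambda n: (n.running, n.name))


def schedule(nodes, required):
    """Place one instance and return the chosen node's name."""
    node = pick_node(nodes, required)
    if node is None:
        raise LookupError(f"no node matches labels {required!r}")
    node.running += 1
    return node.name
```

For example, a desktop that requires `{"gpu": "nvidia"}` would only ever be placed on nodes carrying that label, and repeated requests would spread across them as their load counters grow.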
ASCI is a stateless micro-services solution. Every component runs within a Docker container. The infrastructure is based on the CoreOS Container Linux operating system, which provides the main tooling to run containerized applications. This operating system is deployed using Terraform, an infrastructure management tool that supports multiple providers and allows automated machine deployment. With ASCI, we use Terraform manifests to deploy CoreOS Container Linux to virtual machines on a VMware vSphere cluster, or as a stateless operating system on bare metal using CoreOS Matchbox.
THE CONTROL PLANE
The ASCI Control Plane provides two main components:
- the ASCI Web UI contacts the ASCI API to interact with ASCI and start Desktops
- the ASCI API contacts the Docker API on the compute nodes to schedule and manage new ASCI instances
All the proxy elements are used to route requests within the infrastructure so that they reach the Web UI or a specific Desktop. The ASCI Admin UI is a convenient way to customize the scheduler and list the running ASCI instances.
The user connects to a web interface which lists the environments available for creation. When a desktop is requested, the ASCI API schedules the Docker container on the cluster. When the user connects to a desktop, the ASCI API generates a one-time password, which is delivered to the user’s web browser to establish a VNC connection over WebSockets (noVNC).
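The one-time-password step can be sketched as follows. This is an illustrative model only, not the ASCI API: the `OneTimePasswords` class, its method names, the 30-second lifetime, and the in-memory store are assumptions.

```python
import secrets
import time


class OneTimePasswords:
    """Hypothetical in-memory store of single-use, short-lived passwords."""

    def __init__(self, ttl_seconds=30.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested
        self._pending = {}   # password -> (desktop_id, expiry time)

    def issue(self, desktop_id):
        """Create a random password bound to one desktop."""
        password = secrets.token_urlsafe(16)
        self._pending[password] = (desktop_id, self._clock() + self._ttl)
        return password

    def redeem(self, password, desktop_id):
        """Accept a password at most once, for its own desktop, before expiry."""
        # pop() removes the entry, so a second attempt always fails.
        entry = self._pending.pop(password, None)
        if entry is None:
            return False
        issued_for, expiry = entry
        return issued_for == desktop_id and self._clock() <= expiry
```

In such a design, the API would call `issue` when the user opens a desktop, and the component terminating the VNC connection would call `redeem` before accepting it.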
The desktop instance is running in a Docker container. A base image is built with standard tools and libraries, shared by every environment, such as the NVIDIA, CUDA and VirtualGL libraries, and the graphical environment (MATE).
This image is used as the parent for every child environment, each of which provides specialised scientific applications.
Users can store their documents in their home folder. The sharing feature allows users to share their Desktop with others.
Supporting OpenGL is complex since the Xorg implementation doesn’t allow multiple desktops attached to the same GPU to process GLX instructions. A solution is the VirtualGL approach. Under this system there is a single GPU-attached Xorg server, called 3DX, and multiple non-GPU desktops, called 2DX. When an application started on a 2DX desktop needs to process a GLX instruction, the VirtualGL library catches the instruction and forwards it to the 3DX server for processing on the GPU.
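As a rough illustration of the 2DX/3DX split, the sketch below builds the command line and environment for launching an application through VirtualGL's `vglrun` wrapper. The helper name `vgl_launch`, the display numbers, and the use of `glxgears` are assumptions for the example; how ASCI actually launches applications is not described here.

```python
import os


def vgl_launch(app_argv, display_2d, display_3d=":0", base_env=None):
    """Return (argv, env) for running an application under VirtualGL.

    display_2d: the non-GPU desktop (2DX) where the window appears.
    display_3d: the GPU-attached X server (3DX) used for OpenGL rendering.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["DISPLAY"] = display_2d  # the application's window goes to the 2DX desktop
    # vglrun -d selects the X display used for 3D rendering.
    argv = ["vglrun", "-d", display_3d, *app_argv]
    return argv, env
```

The returned pair could then be passed to `subprocess.Popen(argv, env=env)` inside the desktop container.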
ASCI relies on the following infrastructure services:
- DNSmasq provides DNS resolution for the ASCI DNS sub-domain
- BitBucket is the Git repository manager used in this environment
- ASCI-delivered applications are built as RPM packages which are stored on a local YUM Repository
- Autobuild is an application which builds Docker images on each new commit event and pushes them to the Docker Registry
- Docker Registry stores the Docker images. These images are downloaded as needed by the Docker hosts
- Matchbox is the PXE server that provides network boot. It is configurable via an API. This system is used to boot ASCI workers over the network
The monitoring solution is built with these components:
- JournalBeat and MetricBeat run on every monitored system and collect logs and metrics to send to the logging database
- HeartBeat monitors a list of services and reports their state to the logging database
- ElasticSearch is a search engine used to store and index logs and metrics
- Kibana is a Web interface for ElasticSearch
- ElastAlert, the alert manager, watches logs and metrics to trigger alerts and notifications based on rules
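The idea of rule-based alerting on logs can be sketched with a simple frequency rule: raise an alert when enough matching events arrive within a time window. This is a conceptual sketch, not ElastAlert's implementation or configuration format; the `FrequencyRule` class and its parameters are assumptions, and timestamps are assumed to arrive in non-decreasing order.

```python
from collections import deque


class FrequencyRule:
    """Hypothetical rule: alert on num_events matches within timeframe_seconds."""

    def __init__(self, num_events, timeframe_seconds):
        self.num_events = num_events
        self.timeframe = timeframe_seconds
        self._times = deque()  # timestamps of recent matching events

    def observe(self, timestamp):
        """Record one matching event; return True if an alert should fire."""
        self._times.append(timestamp)
        # Drop events that have fallen out of the sliding window.
        while self._times and timestamp - self._times[0] > self.timeframe:
            self._times.popleft()
        if len(self._times) >= self.num_events:
            self._times.clear()  # reset so one burst triggers a single alert
            return True
        return False
```

A rule such as `FrequencyRule(3, 60)` would fire on the third matching log entry seen within one minute, then start counting again.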
I work at the Australian Synchrotron as a DevOps/HPC engineer.