Mr. Jake Carroll1, Prof. David Abramson1
1University of Queensland, QLD, Australia
Biography:
David Abramson is a Professor of Computer Science at the University of Queensland. He has been involved in computer architecture and high performance computing research since 1979. He has held appointments at Griffith University, CSIRO, RMIT and Monash University.
Abramson has expertise in High Performance Computing, distributed and parallel computing, computer architecture and software engineering.
He has produced more than 230 research publications, and some of his work has been integrated into commercial products. One of these, Nimrod, has been used widely in research and academia globally and is also available as a commercial product, EnFuzion, from Axceleon.
His world-leading work in parallel debugging is sold and marketed by Cray Inc, one of the world’s leading supercomputing vendors, as a product called ccdb.
He is a Fellow of the Association for Computing Machinery (ACM), the Institute of Electrical and Electronics Engineers (IEEE), the Australian Academy of Technology and Engineering (ATSE), and the Australian Computer Society (ACS).
Abramson is currently the president of AeRO.
Abstract:
The scientific protocols, experiments and instruments that generate data are now an integral part of the research lifecycle. It follows that almost every scientific research institution will require a Research Data Storage System (RDSS). However, RDSS implementations vary significantly due to factors such as cost, geography, workloads, policy, risk tolerance and available technical skills. Further, RDSS may be on premises, in public cloud infrastructure or a mixture of both.
In a previous paper (and a presentation at eResearch Australasia 2023) we outlined 10 key features of an RDSS and defined an abstract, high-level Research Data Reference Architecture (RDRA). This reference architecture was rehearsed and revised in an AeRO forum in 2023.
In this talk we survey six significant RDSS implementations, describing their context, communities and what makes them unique. We analyse each implementation against the RDRA and illustrate the significant flexibility it provides for building or procuring such systems. We contribute a new meta-model for describing an RDSS implementation and describe a higher-level structure – the Research Data Storage System Aggregator (RDSS-A).
The work will be of value to groups charged with building or procuring an RDSS for their organisation and will provide a survey of some of the most significant international implementations.