Managing data provenance for gravitational wave data analysis

Mr Patrick Clearwater1,2,3, Dr Surya Nepal3, Prof Andrew Melatos1,2

1The University of Melbourne, Melbourne, Australia

2ARC Centre of Excellence for Gravitational Wave Discovery (OzGrav), Australia

3Data61, CSIRO, Marsfield, Australia


First detected in 2015 by the Laser Interferometer Gravitational-wave Observatory (LIGO), gravitational waves have opened an exciting new area of astronomy, allowing astrophysicists to directly observe hitherto-invisible objects in the sky. However, this breakthrough brings with it many challenges in managing the terabytes of data LIGO produces. Many of the expected future detections will come from complex signal processing algorithms that search large parameter spaces. To make such a detection scientifically credible, it is crucial to understand data provenance: what original source data were used, and exactly how they were processed. In particular, without a thorough record of what was done to produce a scientific result, that result can be neither reproduced nor extended by other researchers.

One approach to solving this problem is a virtual laboratory: an environment that knows what data are available and understands the semantics of the tools used to process them. As well as addressing the data provenance problem, this approach allows the existing LIGO software suite to be repackaged in an easy-to-use form that is accessible to astronomers who are not necessarily gravitational wave domain experts. We discuss some of the work that has been done to apply these techniques to gravitational wave astronomy, as well as how a gravitational wave virtual laboratory could solve the data provenance problem.
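
As a purely illustrative sketch (it does not describe the LIGO software suite or any existing virtual laboratory), a provenance record for one processing step might capture the two things named above: which source data were used and exactly how they were processed. The class name, field names, tool name and parameter below are all hypothetical. Inputs are identified by SHA-256 digests of their contents, so two runs share a record identifier only if the tool, its version, its parameters and its inputs all match.

```python
import hashlib
import json
from dataclasses import dataclass, asdict


def digest(data: bytes) -> str:
    """Return the SHA-256 hex digest of some input bytes."""
    return hashlib.sha256(data).hexdigest()


@dataclass(frozen=True)
class ProvenanceRecord:
    """Hypothetical record of a single processing step (not a LIGO format)."""
    tool: str            # name of the analysis tool that was run
    tool_version: str    # exact version, so the step can be repeated
    parameters: dict     # every parameter passed to the tool
    input_digests: dict  # input name -> SHA-256 digest of its contents

    def record_id(self) -> str:
        # Serialise with sorted keys so that equal records always
        # produce the same identifier.
        canonical = json.dumps(asdict(self), sort_keys=True)
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def make_record(tool, tool_version, parameters, inputs):
    """Build a record from a tool description and a mapping of name -> bytes."""
    return ProvenanceRecord(
        tool=tool,
        tool_version=tool_version,
        parameters=dict(parameters),
        input_digests={name: digest(blob) for name, blob in inputs.items()},
    )


if __name__ == "__main__":
    # Hypothetical tool name, parameter and input, for illustration only.
    record = make_record(
        "example_search", "1.0", {"f_min_hz": 50.0}, {"strain": b"example bytes"}
    )
    print(record.record_id())
```

Storing such a record alongside each result would let another researcher check that they are starting from the same inputs and settings before trying to reproduce or extend it.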


Patrick Clearwater is a PhD student in the Astrophysics group, School of Physics, at the University of Melbourne. His research interests lie in analysing gravitational wave data to find evidence of signals at very low signal-to-noise ratio, such as those expected from neutron stars, as well as building tools to make such searches computationally efficient, easy to use, and reproducible. He is a member of the ARC Centre of Excellence for Gravitational Wave Discovery (OzGrav) and the LIGO Scientific Collaboration.