Mr Michael Milton¹
¹Walter and Eliza Hall Institute of Medical Research, Parkville, Australia
Biography:
Michael Milton has worked as a bioinformatician and research software engineer in Melbourne for the past 9 years, with an ongoing focus on developing high-quality open-source software. Michael is currently part of WEHI's Research Computing Platform, which aims to facilitate medical research software through project collaborations, software development and training. https://orcid.org/0000-0002-8965-2595
Abstract:
Metadata needs to play a key role in WEHI's new research data management system. We must ensure that future researchers can identify the files they need from a huge collection of past data, along with the process that originally created them. Building a system that supports this involves many difficult decisions, which this talk will explore.
We will first examine some metadata standards that were considered as the basis for the system. In particular, we will explain the advantages of linked data that encouraged us to build a metadata system on RO-Crate.
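To make the linked-data point concrete, the following is a minimal sketch of an RO-Crate metadata document built with only Python's standard library. The dataset, file and person are made-up examples, not WEHI data. The overall shape (a flat `@graph` of entities that refer to each other by `@id`) follows the RO-Crate 1.1 specification, and the `resolve` helper is a hypothetical name introduced only for this illustration.

```python
# Hypothetical example entities; names and identifiers are illustrative only.
crate = {
    "@context": "https://w3id.org/ro/crate/1.1/context",
    "@graph": [
        {
            # The metadata descriptor points at the root dataset.
            "@id": "ro-crate-metadata.json",
            "@type": "CreativeWork",
            "conformsTo": {"@id": "https://w3id.org/ro/crate/1.1"},
            "about": {"@id": "./"},
        },
        {
            "@id": "./",
            "@type": "Dataset",
            "name": "Example imaging run",
            "author": {"@id": "#researcher-1"},
            "hasPart": [{"@id": "image_001.tiff"}],
        },
        {"@id": "image_001.tiff", "@type": "File", "encodingFormat": "image/tiff"},
        {"@id": "#researcher-1", "@type": "Person", "name": "Example Researcher"},
    ],
}

# Index entities by @id so that references can be followed like links.
by_id = {entity["@id"]: entity for entity in crate["@graph"]}


def resolve(reference):
    """Follow an {"@id": ...} reference to the entity it points at."""
    return by_id[reference["@id"]]


# Walk from the descriptor to the root dataset, then to its author and files.
root = resolve(by_id["ro-crate-metadata.json"]["about"])
author_name = resolve(root["author"])["name"]
part_formats = [resolve(part)["encodingFormat"] for part in root["hasPart"]]
```

Because every entity is addressed by an identifier rather than nested in place, the same person or file can be referenced from many datasets without duplicating its description.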
When a new dataset is processed by the system, baseline metadata is generated based on prior process modelling. This step is powered by RdfCrate, a WEHI-developed Python package. Next, the metadata is reviewed by the scientists involved in generating the data, which requires a solution for graphically summarising and editing RO-Crates. Finally, the system validates the final crate using profiles built with SHACL, allowing the enforcement of minimum metadata standards.
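The generate, review and validate steps above can be sketched in plain Python. This does not use the RdfCrate API or SHACL: the process model, the `baseline_graph` and `missing_properties` helpers, and the example values are all hypothetical, and the required-property table is only a simple stand-in for what a SHACL profile would express. The properties required of the root dataset mirror those that RO-Crate 1.1 itself requires.

```python
# Hypothetical process model: defaults known in advance for each kind of run.
PROCESS_MODEL = {
    "microscopy": {
        "encodingFormat": "image/tiff",
        "description": "Raw microscope image",
    },
}

# Hypothetical minimum standard: properties each entity type must carry.
REQUIRED = {
    "Dataset": ["name", "description", "datePublished", "license"],
    "File": ["name", "encodingFormat"],
}


def baseline_graph(process, file_names):
    """Generate baseline entities for a run, before any scientist reviews them."""
    defaults = PROCESS_MODEL[process]
    files = [
        {"@id": name, "@type": "File", "name": name, **defaults}
        for name in file_names
    ]
    root = {
        "@id": "./",
        "@type": "Dataset",
        "name": f"{process} run",
        "hasPart": [{"@id": f["@id"]} for f in files],
    }
    return [root, *files]


def missing_properties(graph):
    """Return (entity id, property) pairs that violate the minimum standard."""
    problems = []
    for entity in graph:
        for prop in REQUIRED.get(entity["@type"], []):
            if not entity.get(prop):
                problems.append((entity["@id"], prop))
    return problems


graph = baseline_graph("microscopy", ["a.tiff", "b.tiff"])
problems_before_review = missing_properties(graph)

# Simulate the review step: a scientist fills in what cannot be inferred.
graph[0].update(
    description="Test slide",
    datePublished="2024-01-01",
    license="https://creativecommons.org/licenses/by/4.0/",
)
problems_after_review = missing_properties(graph)
```

In this sketch the baseline crate fails validation until the reviewer supplies the description, publication date and licence, which is the gap the review step is meant to close.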
Once a crate is safely in the research data system, the metadata is used for several purposes. These include exporting image data crates to OMERO with attached metadata, searching past crates with the SPARQL query language, and exporting metadata into a format for database or journal submission.
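As one illustration of the export use case, the sketch below flattens the file entities of a crate graph into CSV using Python's standard library. The column choice and the `crate_files_to_csv` helper are assumptions made for this example; real database or journal submission formats will differ, and this is not the system's actual exporter.

```python
import csv
import io


def crate_files_to_csv(graph, columns=("@id", "name", "encodingFormat")):
    """Flatten the File entities of a crate graph into CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(columns), lineterminator="\n")
    writer.writeheader()
    for entity in graph:
        if entity.get("@type") == "File":
            # Missing properties become empty cells rather than errors.
            writer.writerow({col: entity.get(col, "") for col in columns})
    return buffer.getvalue()


# Hypothetical crate contents, for illustration only.
example_graph = [
    {"@id": "./", "@type": "Dataset", "name": "Example imaging run"},
    {
        "@id": "image_001.tiff",
        "@type": "File",
        "name": "image_001.tiff",
        "encodingFormat": "image/tiff",
    },
    {"@id": "notes.txt", "@type": "File", "name": "notes.txt"},
]

csv_text = crate_files_to_csv(example_graph)
```

Only the columns requested are written, so the same graph can feed several target formats by changing the `columns` argument.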