Vocabularies, Vocabularies, Vocabularies: Which ones should I use? Which ones should I trust? Which Ones….?

Dr Adrian Burton1, Dr Simon Cox2, Dr Lesley Wyborn1, Mr Rowan Brownlee1

1Australian Research Data Commons, Canberra, Australia
2CSIRO, Clayton, Australia

Use of shared or harmonised terminology is required to maximise semantic interoperability, particularly across different domains and multiple communities. The term ‘vocabulary’ is used here to denote any semantic asset containing terms and information about those terms, including value sets (i.e., bags of terms, term lists), controlled vocabularies, glossaries, thesauri, and taxonomies, and potentially even concept maps, ontologies, and knowledge graphs.

Online vocabularies and vocabulary services are proliferating, and users are uncertain as to which ones they should select. In 2019, a review of Research Vocabularies Australia for the Australian Research Data Commons (ARDC) noted multiple cases of semantically overlapping vocabularies, even within the same service. To improve utility, these vocabularies should be consolidated, or trust metrics should be developed. Vocabulary metadata should clearly display governance information and, in particular, usage information, which is a proxy for trust.

Consolidation is difficult. Requirements from different applications and contexts can lead to the development of new vocabularies with what appears to be the same scope as existing vocabularies. However, we need guidelines to help users determine if an existing vocabulary meets their needs, and which ones are reliable and sustainable.

The BoF will start with short presentations to set the scene, followed by two breakout sessions to begin developing guidelines for selecting vocabularies that are fit for purpose, authoritative, governed, and persistent. The first session will canvass what users need to know in order to make a selection. The second will explore development of a “5-star vocab” ranking similar to the Five Star Open Data scheme.


Adrian Burton is Director, Data, Policy and Services at the Australian Research Data Commons (ARDC). Adrian has provided strategic input into several national infrastructure initiatives and is active in building national policy frameworks to unlock the value in the research data outputs of publicly funded research.

Lesley Wyborn is an Adjunct Fellow at the National Computational Infrastructure at ANU and works part-time for the Australian Research Data Commons. She had 42 years’ experience at Geoscience Australia in scientific research and geoscientific data management. She is currently Chair of the Australian Academy of Science ‘National Data in Science Committee’ and is on the American Geophysical Union Data Management Advisory Board and the Earth Science Information Partners Executive Board. She was awarded the Public Service Medal in 2014, the 2015 Geological Society of America Career Achievement Award in Geoinformatics, and the 2019 US ESIP Martha Maiden Award.
