CEOS Development Environment

Vocabulary Interoperability: Speaking the Same Language


Interoperability of Earth observation (EO) data and services depends not only on shared technical standards but also on a common understanding of language and meaning. Ensuring that data, messages and services are not only transmitted but also correctly understood across different stakeholders is fundamental to achieving interoperability in EO. Without a shared understanding of terms, concepts and relationships, interoperability remains limited, hampered by inconsistencies, misunderstandings and other integration challenges.

‘Lost in Translation’ Paper by Peter Strobl, Emma Woolliams and Katrin Molch

For this reason, Vocabulary (Semantics) was identified as one of the five factors of interoperability described in the CEOS Interoperability Handbook, published in 2025. The Vocabulary chapter of the handbook gives recommendations on how to provide interoperable semantics and on how to develop an interoperable thesaurus.

Semantics deals with general aspects of meaning and with the relationships between terms and concepts in a domain, while vocabularies such as thesauri, glossaries, terminologies, ontologies, taxonomies and controlled vocabularies provide standardised definitions that facilitate common understanding. Standardised vocabularies enable diverse entities to describe, classify and relate data and services in a way that is human- and machine-readable and reusable across the whole domain.

CEOS began working on this topic in 2021 with a cross-disciplinary Terminology Task Group. Its work led to the publication of a peer-reviewed paper, ‘Lost in Translation: The Need for Common Vocabularies and an Interoperable Thesaurus in Earth Observation Sciences’, which highlighted the challenges of maintaining a consistent vocabulary in the EO domain. EO is a highly interdisciplinary field, spanning expertise from sensor development to data processing and decision-making support, so a wide range of experts and communities have independently developed vocabularies to define critical terminology. When compared, these terminologies can be inconsistent, controversial and superficial, which limits the ability to communicate expertise effectively. Hence the need for common vocabularies that are consistent, interrelated, understandable, educational and updateable. The work also found that many existing vocabularies lacked structure, cross-links and version control, making them difficult to navigate and prone to misinterpretation.

Building on this work, the CEOS Interoperability Handbook recommendations describe how consistent, interrelated, understandable, educational and updateable vocabularies can be constructed. The 14 recommendations across the Semantic and Thesaurus subsections can be found below. Review and provide feedback on GitHub.

CEOS has already started implementing these recommendations by developing the CEOS EO Glossary – a GitHub-based community resource. CEOS will continue building out this resource and invites everyone to contribute new terms or propose revisions to existing ones. 

Term classification on the CEOS EO Glossary

ID | Semantic Recommendations
SEM#1 | Terms and definitions should be collected into the CEOS Earth observation glossary on GitHub.
SEM#2 | Capability should be provided to enable public comment and discussion on existing and new terms and definitions.
SEM#3 | Enable version control and change management at the individual term level and link to historical and alternative definitions.
SEM#4 | The use of project- or document-specific vocabularies (e.g., in the form of ‘terms and definitions’ chapters) should be discouraged. Source (via URL), maintain and develop all terms that serve, or might serve, in more than one context in the online, shared repository.
SEM#5 | Community members should promote the common thesaurus, including through ISO/TC 211, OGC, WMO, GEO and other stakeholders in Earth System Sciences, to strive for domain-wide adoption.
SEM#6 | Common online repositories for abbreviations and acronyms should be used. Agreed metadata fields with unified and binding lists of options should be included. Keywords from controlled vocabularies that allow lookup of keyword information via Linked Data principles (e.g., HTTP URI de-referencing or SPARQL interfaces) are preferred. The use of GCMD controlled keywords is encouraged.
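
The two Linked Data lookup routes named in SEM#6 can be illustrated with a minimal Python sketch. It only prepares the requests and sends nothing. It assumes the vocabulary is published using SKOS, which the recommendation does not state, and the example.org URI is a placeholder rather than a real CEOS or GCMD address. The function names are hypothetical.

```python
from urllib.request import Request

# W3C SKOS namespace; assuming the vocabulary is modelled in SKOS.
SKOS_NS = "http://www.w3.org/2004/02/skos/core#"


def _check_uri(uri: str) -> str:
    """Reject strings that are not plain HTTP(S) URIs or that could
    break out of the <...> brackets in a SPARQL query."""
    if not uri.startswith(("http://", "https://")) or any(c in uri for c in '<>" \n\t'):
        raise ValueError(f"not a usable HTTP URI: {uri!r}")
    return uri


def build_dereference_request(keyword_uri: str) -> Request:
    """Prepare (but do not send) a GET that asks for RDF rather than HTML."""
    return Request(
        _check_uri(keyword_uri),
        headers={"Accept": "text/turtle, application/rdf+xml;q=0.9"},
        method="GET",
    )


def build_sparql_query(keyword_uri: str) -> str:
    """Build a query for the preferred label and, if present, the definition."""
    uri = _check_uri(keyword_uri)
    return (
        f"PREFIX skos: <{SKOS_NS}>\n"
        "SELECT ?label ?definition WHERE {\n"
        f"  <{uri}> skos:prefLabel ?label .\n"
        f"  OPTIONAL {{ <{uri}> skos:definition ?definition . }}\n"
        "}"
    )


# Placeholder keyword URI, for illustration only.
EXAMPLE_URI = "https://example.org/vocab/radiance"
request = build_dereference_request(EXAMPLE_URI)
query = build_sparql_query(EXAMPLE_URI)
```

Passing `request` to `urllib.request.urlopen`, or posting `query` to a vocabulary's SPARQL endpoint, would perform the actual lookup. Which endpoint to use depends on the vocabulary provider and is left open here.
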
ID | Thesaurus Recommendations
THES#1 | The terms used in the thesaurus should be consistent and divided into classes such as Base, Core, Controversial and High Impact. ‘Base Terms’ should have cross-community agreement and should not have circular or ambiguous definitions. ‘Core Terms’ should use the ‘Base Terms’ consistently and may be given minor tweaks with approval from the identified committee. ‘Controversial Terms’ should have qualifiers attached, with links to the discussions that led to each qualifier. ‘High Impact Terms’ should be approved by a specialist committee and should be linked to a document providing details of the term.
THES#2 | The definition of a term may not contain the term itself or other circular definitions (e.g., where term A is defined using term B and term B is defined using term A). A clear set of base terms should be used.
THES#3 | The terms used in the thesaurus should have clear and mappable relationships with other terms (parent, sibling, child). Overlaps between terms that are supposed to delineate more generic concepts (siblings) should be avoided or minimised.
THES#4 | Definitions have to be kept unambiguous and short, and written in a form such that they can replace the term in a sentence.
THES#5 | Explanations should be given in a separate ‘Notes’ section, and examples in a separate ‘Examples’ section. Both complement the definition and should not be included as part of the main definition.
THES#6 | Every definition should have an accompanying ‘Sources’ section, where all source documents are listed or a link to the register maintained by the source is provided, wherever possible as URLs.
THES#7 | Thesaurus terms should be version controlled at the individual term level.
THES#8 | Where a term is deemed ‘controversial’, contradictory definitions can be provided, but only with clear links to the alternative definitions and explanations of the context in which each is used.
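
The circularity rule in THES#2 lends itself to an automated check. The Python sketch below is one possible approach, not the method used by the CEOS EO Glossary. It assumes a hypothetical `Term` record holding the sections named in THES#1, THES#5 and THES#6, and it treats a definition as "using" another term when that term appears in it as a whole word. The three sample definitions are invented for illustration and are not CEOS definitions.

```python
import re
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List, Set


class TermClass(Enum):
    """The four classes listed in THES#1."""
    BASE = "Base"
    CORE = "Core"
    CONTROVERSIAL = "Controversial"
    HIGH_IMPACT = "High Impact"


@dataclass
class Term:
    """Hypothetical term record; the field names are assumptions."""
    name: str
    definition: str
    term_class: TermClass
    notes: List[str] = field(default_factory=list)     # THES#5
    examples: List[str] = field(default_factory=list)  # THES#5
    sources: List[str] = field(default_factory=list)   # THES#6, ideally URLs


def referenced_terms(term: Term, glossary: Dict[str, Term]) -> Set[str]:
    """Glossary names that occur as whole words in the term's definition.
    Plain matching only: plurals and other inflections are not detected."""
    found = set()
    for other in glossary:
        pattern = r"\b" + re.escape(other) + r"\b"
        if re.search(pattern, term.definition, flags=re.IGNORECASE):
            found.add(other)
    return found


def find_circular_definitions(glossary: Dict[str, Term]) -> List[List[str]]:
    """Depth-first search over the 'is defined using' graph.
    Each reported cycle is a list of names that ends where it starts."""
    graph = {name: referenced_terms(t, glossary) for name, t in glossary.items()}
    cycles: List[List[str]] = []
    state: Dict[str, str] = {}  # name -> "visiting" or "done"

    def visit(name: str, path: List[str]) -> None:
        state[name] = "visiting"
        path.append(name)
        for nxt in sorted(graph[name]):
            if state.get(nxt) == "visiting":
                # Back edge: nxt is already on the current path.
                cycles.append(path[path.index(nxt):] + [nxt])
            elif nxt not in state:
                visit(nxt, path)
        path.pop()
        state[name] = "done"

    for name in sorted(graph):
        if name not in state:
            visit(name, [])
    return cycles


# Invented sample entries: 'image' and 'pixel' are defined in terms of each other.
EXAMPLE_GLOSSARY = {
    "image": Term("image", "array of pixel values recorded by a sensor", TermClass.CORE),
    "pixel": Term("pixel", "smallest element of an image", TermClass.BASE),
    "sensor": Term("sensor", "device that responds to incoming radiation", TermClass.BASE),
}

print(find_circular_definitions(EXAMPLE_GLOSSARY))  # [['image', 'pixel', 'image']]
```

A definition that contains its own term is reported as a one-step cycle, so the same check covers both cases named in THES#2. A real glossary would need more careful matching, for example handling plurals and multi-word terms.
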