ENVO frequently has shadow classifications between zones and other branches such as landform.
An example is "desert", which is in both the zone and landform branches:
Things in the zone branch are often, but not consistently, labeled "area". For example, the class "rocky desert" is defined as "A desert plain characterized by a surface veneer of rock.", which would lead us to expect it in this branch:
[] ENVO:00000191 ! solid astronomical body part
  [i] ENVO:01001886 ! landform
    [i] ENVO:01001884 ! surface landform
      [i] ENVO:01001357 ! desert
However, it's in this separate branch:
[] ENVO:01001199 ! terrestrial environmental zone
  [i] ENVO:01000752 ! area of barren land
    [i] ENVO:00000097 ! desert area
      [i] ENVO:00000172 ! sandy desert
      [i] ENVO:00000173 ! rocky desert
      [i] ENVO:00000183 ! stony desert
It's not clear when curators should use one branch or another. What we see right now is people picking a mix of these, but this means that the hierarchies don't line up, and when groups build faceted browsing tools, "rocky desert" samples don't roll up under "desert" samples.
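To make the roll-up problem concrete, here is a minimal sketch in Python. It uses only the is-a edges from the two listings above, keyed by label for readability. It assumes the three specific deserts are direct children of "desert area"; the function names `ancestors` and `rolls_up_under` are made up for this illustration and are not part of any ENVO tooling.

```python
# is-a edges (child -> parent) taken from the two listings above.
# Assumption: sandy, rocky and stony desert are direct children of "desert area".
IS_A = {
    "desert": "surface landform",
    "surface landform": "landform",
    "landform": "solid astronomical body part",
    "sandy desert": "desert area",
    "rocky desert": "desert area",
    "stony desert": "desert area",
    "desert area": "area of barren land",
    "area of barren land": "terrestrial environmental zone",
}


def ancestors(term):
    """Return the is-a ancestors of `term`, nearest first."""
    result = []
    while term in IS_A:
        term = IS_A[term]
        result.append(term)
    return result


def rolls_up_under(term, facet):
    """True if a sample annotated with `term` would appear under `facet`."""
    return term == facet or facet in ancestors(term)


print(rolls_up_under("rocky desert", "desert"))       # False
print(rolls_up_under("rocky desert", "desert area"))  # True
```

A faceted browser that groups samples by is-a ancestry behaves like `rolls_up_under`: a sample annotated with "rocky desert" is found under "desert area" but not under "desert".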
This is repeated elsewhere, for tundra, wetlands, grasslands, etc.
It looks like many of the area terms were added to provide precise equivalents to NLCD (National Land Cover Database) terms. There is an argument that a land-cover-based classification system should be separate, since it reflects a different perspective and use case, e.g. annotating remote sensing data.
The NLCD mapped terms are in bold here, interwoven with existing terms:
It looks like grouping classes such as "wetland area" were added to provide some kind of structure for the NLCD terms, which causes concept duplication.
It's not clear from the original request whether the use case dictated that these be modeled as a distinct branch or woven into the existing ENVO hierarchy.
I propose that we make all of this more consistent and less confusing for users, by picking one of the following strategies:
Merge concepts
Continue to have separate branches, but have this be more systematic and better documented
Separate out alternative classification schemes into orthogonal mapped ontologies
Merge concepts
We would merge "X area" into X where an existing X term exists
If no "X" exists, then pick "X environment" or "X ecosystem" where these exist
This would mean, e.g., that "area of woody wetland" would become a subclass of the existing "wetland ecosystem"
I propose we also make the naming more consistent, so "area of woody wetland" would simply be named "woody wetland"
We would keep the NLCD names as tagged synonyms. This is very consistent with what we do with other ontologies
This is my favored approach. It makes for a simpler ontology
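The merge rule above could be sketched as follows. This is only an illustration: `merge_target` and the `EXISTING` label set are hypothetical, and trying "X environment" before "X ecosystem" is an assumption, since the proposal does not rank them.

```python
# Hypothetical set of labels already in the ontology (illustration only).
EXISTING = {"desert", "wetland ecosystem", "grassland ecosystem"}


def merge_target(label, existing=EXISTING):
    """Return the existing term an "X area" / "area of X" term would merge into.

    Returns None when the label is not an area term, or when no "X",
    "X environment" or "X ecosystem" exists and a curator must decide.
    """
    if label.startswith("area of "):
        base = label[len("area of "):]
    elif label.endswith(" area"):
        base = label[: -len(" area")]
    else:
        return None
    # Assumed preference order: exact "X" first, then environment, then ecosystem.
    for candidate in (base, base + " environment", base + " ecosystem"):
        if candidate in existing:
            return candidate
    return None


print(merge_target("desert area"))            # desert
print(merge_target("wetland area"))           # wetland ecosystem
print(merge_target("area of woody wetland"))  # None
```

The last case returns None because no "woody wetland" term exists yet. Under the proposal it would be renamed "woody wetland" and placed under "wetland ecosystem" by a curator rather than merged automatically.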
Separate branches, but systematic
Here we would keep separate branches, but make the naming and the coordination across branches consistent. We would have clear and simple top-level documentation, plus inline documentation in the ontology, that specifies what each separate branch is for. There would be clear use cases for when one branch should be picked over another.
As well as providing simple curator documentation, we'd need to work with external groups to make sure that reporting standards and submission tools pick from the correct branch. For example, when would someone submitting metagenomic sample data to INSDC use "grassland area" vs "grassland ecosystem"?
Naming should be consistent, so that it's clear when one is picking an area vs an ecosystem. I suggest a rule that if a term has the suffix "area" then all its is-a children should also be suffixed this way.
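The suffix rule lends itself to an automated check. The sketch below is one possible form of it, not an existing ENVO QC check. `suffix_violations` is a hypothetical helper, and the edges are the zone-branch listing from earlier in this issue, assuming the three specific deserts are direct children of "desert area".

```python
# is-a edges (child -> parent) from the zone branch listing in this issue.
# Assumption: the three specific deserts are direct children of "desert area".
ZONE_IS_A = {
    "sandy desert": "desert area",
    "rocky desert": "desert area",
    "stony desert": "desert area",
    "desert area": "area of barren land",
    "area of barren land": "terrestrial environmental zone",
}


def suffix_violations(is_a, suffix=" area"):
    """Return sorted (child, parent) pairs where the parent label carries
    the suffix but the direct is-a child does not."""
    return sorted(
        (child, parent)
        for child, parent in is_a.items()
        if parent.endswith(suffix) and not child.endswith(suffix)
    )


for child, parent in suffix_violations(ZONE_IS_A):
    print(f'"{child}" is a child of "{parent}" but lacks the suffix')
```

With these edges, the check flags the three specific deserts under "desert area". It only looks at the suffix form, so prefix-style names such as "area of barren land" would need a separate rule.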
Separate out alternative classification schemes
Recognize that it is hard, if not impossible, to superimpose multiple alternative classification schemes in one ontology. Pick one broad and uncontroversial scheme for ENVO, work with other groups to make ontological representations of the alternative schemes, and then map between the systems.
The NLCD-related area classes discussed above were added in 2017: #458 (comment)