From bb0d15fae3242949eacfc1a179f3dffe15dee340 Mon Sep 17 00:00:00 2001 From: Matthew Brush Date: Tue, 13 Feb 2024 10:53:49 -0700 Subject: [PATCH] Update knowledge_level_agent_type_specification.md update links to external implementation support documents. Added clarifying test to term examples. --- ...nowledge_level_agent_type_specification.md | 20 +++++++------------ 1 file changed, 7 insertions(+), 13 deletions(-) diff --git a/ImplementationGuidance/Specifications/knowledge_level_agent_type_specification.md b/ImplementationGuidance/Specifications/knowledge_level_agent_type_specification.md index ee43885..c35552c 100644 --- a/ImplementationGuidance/Specifications/knowledge_level_agent_type_specification.md +++ b/ImplementationGuidance/Specifications/knowledge_level_agent_type_specification.md @@ -54,12 +54,12 @@ Longer term we will define distinct properties and enumerations that classify ag 2. `knowledge_level`: Biolink edge property describes the level of knowledge expressed in a statement, based on the reasoning or analysis methods used to generate the statement, or the scope or specificity of what the statement expresses to be true. Permissible values are defined in the **biolink:KnowledgeLevelEnum** enumeration: - - `knowledge_assertion`: a statement of purported fact that is put forth by an agent as true, based on assessment of direct evidence. Assertions are likely but not definitively true. - - `logical_entailment`: a statement reporting a conclusion that follows logically from premises that are established facts or knowledge assertions, aka a 'Deductive Inference'. (e.g. fingernail part of finger, finger part of hand → fingernail part of hand)). - - `prediction`: a statement of a possible fact based on more probabilistic (non-deductive) forms of reasoning over indirect forms of evidence, that lead to more speculative conclusions. 
- - `statistical_association`: a statement that reports concepts representing variables in a dataset to be statistically associated in the context of a particular cohort or dataset (e.g. “Metformin Treatment (variable 1) is correlated with Diabetes Diagnosis (variable 2) in EHR dataset X”). - - `observation`: a statement reporting (and possibly quantifying) a phenomenon that was observed to occur - absent any analysis or interpretation that generates a statistical association or supports a broader conclusion or inference. - - `not_provided`: the knowledge level/type is not provided, typically because it cannot be determined from available information. + - `knowledge_assertion`: a statement of purported fact that is put forth by an agent as true, based on assessment of direct evidence. Assertions generally have a high confidence of being true based on the strength of evidence supporting them. + - `logical_entailment`: a statement reporting a conclusion that follows logically from premises, which are typically well-established facts or knowledge assertions (e.g. fingernail part of finger, finger part of hand → fingernail part of hand). Logical entailments are based on deductive inference, and generally have a high degree of confidence when based on sound premises and inference logic. + - `prediction`: a statement of a possible fact based on more probabilistic (non-deductive) forms of reasoning over indirect forms of evidence, that lead to more speculative conclusions. Predictions often have a lower degree of confidence because of the indirect nature of the evidence and reasoning supporting them. + - `statistical_association`: a statement that reports concepts representing variables in a dataset to be statistically associated in the context of a particular cohort or dataset (e.g. “Metformin Treatment (variable 1) is correlated with Diabetes Diagnosis (variable 2) in EHR dataset X”). 
These associations are inherently true in that they simply report the results of some statistical analysis, but do not interpret these data to draw broader conclusions about general types in the domain of discourse. + - `observation`: a statement reporting (and possibly quantifying) a phenomenon that was observed to occur - absent any analysis or interpretation that generates a statistical association or supports a broader conclusion or inference. Observation statements are also inherently true in that they simply report what an agent observed - without any interpretation or inference. + - `not_provided`: the knowledge level/type for a statement is not provided, typically because it cannot be determined from available information. - NOTE that the notion of a 'level' of knowledge can in one sense relate to the strength of a statement - i.e. how confident we are that it says something true about our domain of discourse. Here, we can generally consider Knowledge Assertions to be stronger than Entailments to be stronger than Predictions. But in another sense, 'level' of knowledge can refer to the scope or specificity of what a statement expresses - on a spectrum from context-specific results of a data analysis, to generalized assertions of knowledge or fact. Here, Statistical Associations and Observations represent more foundational statements that are only slightly removed from the data on which they are based (the former reporting the direct results of an analysis in terms of correlations between variables in the data, and the latter describing phenomena that were observed/reported to have occurred). @@ -81,11 +81,5 @@ Longer term we will define distinct properties and enumerations that classify ag "attribute_source": "infores:molepro" } -3. The main challenge in applying this standard concerns selecting appropriate agent type and knowledge level terms for a given Edge. Here we offer three forms of guidance: - - a. 
More detailed descriptions of agent type and knowledge level terms can be found [here](https://docs.google.com/document/d/1_Iol_nQhONsRyQp6ibDUBbtiY0zp7Txbs7mg6xSMXSU/edit#heading=h.1ptdqc6t27xt). - - b. Specific guidance for assigning agent type and knowledge level to Edges in the context of Translator knowledge graphs - including mappings between Knowledge Provider resources and the types of agent type and knowledge level values relevant to each, can be found [here](https://docs.google.com/document/d/1_Iol_nQhONsRyQp6ibDUBbtiY0zp7Txbs7mg6xSMXSU/edit#heading=h.1ptdqc6t27xt) - soon to be imported as Appendix II in this document). - - c. A catalog of examples illustrating how agent type and knowledge level terms are applied to annotate the diverse kinds of Edges provided in Translator KGs, can be found [here](https://docs.google.com/document/d/1_Iol_nQhONsRyQp6ibDUBbtiY0zp7Txbs7mg6xSMXSU/edit#heading=h.g44g32y7i8lo). +3. The main challenge in applying this standard concerns selecting appropriate agent type and knowledge level terms for a given Edge. To assist KPs in this task, a [Supplemental Guidance document](https://docs.google.com/document/d/140dtM5CjWM97JiBRdAmDT-9IKqHoOj-xbE_5TWkdYqg/edit) provides additional implementation support beyond the base specification above. This includes clarification of key distinctions, tips for proper term selection, and a corpus of examples illustrating how agent type and knowledge level terms are applied to the diverse kinds of Edges provided in Translator knowledge graphs.
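
For implementers, the six permissible `knowledge_level` values listed in this patch could be checked before an Edge is emitted. The following is a minimal sketch, not part of the specification: the class name `KnowledgeLevel` and the helper `parse_knowledge_level` are hypothetical names introduced here for illustration, and only the six string values are taken from the text above.

```python
from enum import Enum


class KnowledgeLevel(Enum):
    """The six values listed for biolink:KnowledgeLevelEnum in the text above.

    The Python class itself is an illustrative assumption, not a Biolink API.
    """

    KNOWLEDGE_ASSERTION = "knowledge_assertion"
    LOGICAL_ENTAILMENT = "logical_entailment"
    PREDICTION = "prediction"
    STATISTICAL_ASSOCIATION = "statistical_association"
    OBSERVATION = "observation"
    NOT_PROVIDED = "not_provided"


def parse_knowledge_level(value: str) -> KnowledgeLevel:
    """Hypothetical helper: return the matching member or raise ValueError."""
    try:
        return KnowledgeLevel(value)
    except ValueError:
        allowed = ", ".join(member.value for member in KnowledgeLevel)
        raise ValueError(
            f"unknown knowledge_level {value!r}; expected one of: {allowed}"
        ) from None
```

A caller might run `parse_knowledge_level("prediction")` on each Edge's value and treat a `ValueError` as a signal to fix the source data, rather than silently falling back to `not_provided`.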