From 2e11f13b4ebb627e3c2b0480f6e9fc9cb0ae8ad7 Mon Sep 17 00:00:00 2001 From: Andreas Kollegger Date: Mon, 30 Sep 2024 09:50:32 +0100 Subject: [PATCH] initial notes about gram notation, to be used for compact human and machine readable descriptions of each pattern --- src/content/docs/appendices/notation.md | 153 ++++++++++++++++-------- src/content/docs/concepts/template.md | 4 +- src/content/docs/guides/template.md | 4 +- src/content/docs/reference/template.md | 5 +- src/content/docs/tutorials/template.md | 5 +- 5 files changed, 115 insertions(+), 56 deletions(-) diff --git a/src/content/docs/appendices/notation.md b/src/content/docs/appendices/notation.md index 0250a63..f8774a7 100644 --- a/src/content/docs/appendices/notation.md +++ b/src/content/docs/appendices/notation.md @@ -3,14 +3,31 @@ title: Graph Pattern Notation description: Compact notation for describing knowledge graph structures --- - ![Graph pattern railroad diagram](../../../assets/images/railroad/pattern.svg) *Graph pattern railroad diagram, a comma-separated list of PatternElement* -## About Notation +## About Graph Pattern Notation + +The GraphRAG pattern catalog uses a data notation called `gram` to describe +logical graph structures called patterns that are composed of nodes, relationships +and subjects. + +The Gram notation is intended to be self-descriptive and explicit, able to +represent data and structures that are often implicit in physical graph models. +For example, paths are present in any connected graph; however, storing path-level +information isn't normally supported. You can find paths, even store paths, but +there is no way to "say something" about a path. + +Gram starts with a notion of "subjects" as a self-describing data structure +in two parts: -The GraphRAG pattern catalog uses a graph notation to describe logical graph structures -called patterns that are composed of nodes, relationships and subjects. +1. 
intrinsic attributes - similar to an object, but with explicit identifier + and descriptive labels +2. associated elements - zero or more sequential elements that are related + to the subject + +That's enough to be explicit about any graph structure, from a single node +up to a complex component. ![PatternElement is a Node, Relationship or Subject](../../../assets/images/railroad/pattern-element.svg) *PatternElement is a Node, Relationship or Subject* @@ -18,89 +35,131 @@ called patterns that are composed of nodes, relationships and subjects. ### Nodes ![Node is delimted by parentheses](../../../assets/images/railroad/node.svg) -*Node is delimted by parentheses* +*Node: delimited by parentheses* Nodes are individual records in a graph. `(a)` - - a single node identified as `a` -`(a:Thing)` - - - a single node labeled as a `Thing` - -`(a:Thing:Special)` +`(c:Chunk)` + - a single node labeled as a `Chunk` +`(s:Summary:Synthesized)` - single node with two labels -`(a:Thing {k:"v"})` - - labeled node with a record defining property `k` to be `"v"` +`(c:Chunk {text:"something something"})` + - labeled node with a record defining property `text` to be a string value -`(a:Thing {k::string})` - - record declaring `k` to be a `string`, but with an undefined value +`(::Chunk {text::string})` + - node with record declaring `text` to be a `string` on Nodes labeled as `Chunk` -`(a:Thing {x@"meta"})` - - record with metadata about property `x` - - metadata is extra information +`(c:Chunk {seq@42})` + - record defining metadata property `seq` as an integer value + - metadata property names are in a separate namespace + - typically used for implementation detail rather than intrinsic qualities ### Relationships ![Relationship starts with a node, then an arrow followed by a Path](../../../assets/images/railroad/relationship.svg) -*Relationship starts with a Node, then an Arrow followed by a Path* +*Relationship: starts with a Node, then an Arrow followed by a Path* + ![Arrow looks 
like an arrow](../../../assets/images/railroad/arrow.svg) -*Arrow looks like an arrow* +*Arrow: delimited by a head and tail, with square-bracketed content* + ![Path is either a node or a Relationship (which starts with a Node)](../../../assets/images/railroad/path.svg) -*Path is either a node or a Relationship (which starts with a Node)* +*Path: either a Node or a Relationship (which starts with a Node)* -Relationships pair two nodes, a 'from' and a 'to' node. +Relationships are ordered pairs of nodes. -`(a)-[:KNOWS]->(b)` - - a relationship from `a` to `b` labeled as `:KNOWS` +`(a)-[:SIMILAR]->(b)` + - a relationship from `a` to `b` labeled as `SIMILAR` -`(a)-[:KNOWS {confidence:0.8}]` +`(a)<-[:SIMILAR]-(c)` + - a relationship from `c` to `a` labeled as `SIMILAR` + +`(a)-[:SIMILAR {score:0.8}]->(b)` - a relationship with a record +`(a)=[::SIMILAR {score::0.0..1.0}]=>(b)` + - a relationship declaration, stating that the `score` value of `SIMILAR` relationships should be a decimal in the range 0.0 to 1.0 + ### Subjects ![Subject is surrounded by square brackets](../../../assets/images/railroad/subject.svg) *Subject is delimted by square brackets* -Subjects compose multiple graph elements. +Subjects associate multiple elements. They're a general purpose described pattern. + +#### Subjects with no association + +Subjects with no association are equivalent to Nodes and can have similar attributes. `[a]` - - subject about itself - - equivalent: `(a)` + - Subject `a` about no other elements + +`[c:Chunk]` + - Subject labeled as a `Chunk` + +`[s:Summary:Synthesized]` + - Subject with two labels + +`[c:Chunk {text:"something something"}]` + - labeled Subject with a record defining property `text` with a string value + +#### Subject with Set Membership + +Subjects can be associated with a set of members by using a separator +called an "associator" followed by a Pattern which +may use identifier references or inline element definitions. 
+ +The basic association is set membership, introduced with a `|` associator. + +`|` pipe associator indicates set membership + - the subject is "about" the associated members + - no implied, direct relationship among members `[a | i]` - - subject about another element, an "annotation" - - equivalent: `(a)-->(i)` + - subject `a` about another element `i` + - known as an "annotation" + - example: ```[:Validation {approved:true} | (c:Chunk)]``` `[a | i,j,k]` - - a subject about 3 other elements, like set-builder notation - - equivalent: `(a)-->(i), (a)-->(j), (a)-->(k)` + - subject about 3 other elements `[a |:R| i,j,k]` - - a subject about 3 other elements, like set-builder notation - - equivalent: `(a)-[:R]->(i), (a)-[:R]->(j), (a)-[:R]->(k)` + - subject with labeled association of members + - provides explicit + - example: ```[d:Document |:SPLIT_INTO| c1, c2, c3]``` + + +#### Subject with Sequence Composition + +Composition takes pairs of elements, applying to each pair +within the association. 
+ +`->` arrows for composition + - Subject connects each of the member pairs + - requires at least 2 members `[a:R -> i,j]` - - a subject composing 2 elements into a sequence, a relationship - - equivalent: `(i)-[:a:R]->(j)` + - Subject composing 2 elements into a pair + - equivalent to a Relationship when `i` and `j` are both Nodes + - equivalent to a Path when `i` and `j` are composable Relationships + (a Relationship where the right-side of `i` is the left side of `j`) + - example: ```[:Similar {score:0.87} -> (a:Entity), (b:Entity)]``` + +`[:R -> i,j,k]` + - Subject composing 3 elements into a sequence + - equivalent to a sequence of Relationships, when members are all Nodes + - example: ```[:NEXT -> (c1:Chunk), (c2:Chunk), (c3:Chunk)]``` -`[a:R {k:"v"} -> i,j,k]` - - a subject composing 3 elements into a sequence, a described path - - equivalent: `(i)-[a:R {k:"v"}]->(j)-[a:R {k:"v"}]->(k)` +`[a:R {k:"v"} -> i,j,i,k]` + - Subject composing 3 elements (one of which is repeated) into a sequence + - example: ```[:NEXT {cyclic:false} -> (c1:Chunk), (c2:Chunk), c1, (c3:Chunk)]``` -`[:R {seq:1...} --> i,j,i,k]` - - a described path with an applied range value - - equivalent: `(i)-[:R {seq:1}]->(j)-[:R {seq:2}]->(i)-[:R {seq:3}]->(k)` - -`[a:R {k:"v"} |-> i,j,k]` - - a subject about 3 elements, which are in a sequence - - equivalent: `(:a:R {k:"v"}), (a)-->(i), (a)-->(j), (a)-->(k), (i)-[a:R]->(j)-[a:R]->(k)` ## Further reading -- See [gram-data/nearley-gram](https://github.com/gram-data/nearley-gram/) for full EBNF +- See [gram-data/nearley-gram](https://github.com/gram-data/nearley-gram/) for EBNF - See [gram-data/tree-sitter-gram](https://github.com/gram-data/tree-sitter-gram/) for a parser diff --git a/src/content/docs/concepts/template.md b/src/content/docs/concepts/template.md index 7cb96e9..9e78190 100644 --- a/src/content/docs/concepts/template.md +++ b/src/content/docs/concepts/template.md @@ -37,11 +37,11 @@ Explanations motivate why things are the way they 
are. . -**Further reading**: +## Further reading **Instructions**: 1. Provide links to external sources -2. Don't keep the link below, it's for your reference as the author +2. Don't keep the link below or these instructions, they're for your reference as the author - Read [about explanations](https://diataxis.fr/explanation/) in the Diátaxis framework diff --git a/src/content/docs/guides/template.md b/src/content/docs/guides/template.md index 47394be..cbe45fe 100644 --- a/src/content/docs/guides/template.md +++ b/src/content/docs/guides/template.md @@ -33,11 +33,11 @@ How-to guides describe how to do accomplish some goal. 1. Add extra sections with step-by-step directions 2. Avoid implementation specific details (programming language or framework -specific) -**Further reading**: +## Further reading **Instructions**: 1. Provide links to external sources -2. Don't keep the link below, it's for your reference as the author +2. Don't keep the link below or these instructions, they're for your reference as the author - Read [about guides](https://diataxis.fr/how-to-guides/) in the Diátaxis framework diff --git a/src/content/docs/reference/template.md b/src/content/docs/reference/template.md index aa03a84..e7580d4 100644 --- a/src/content/docs/reference/template.md +++ b/src/content/docs/reference/template.md @@ -32,11 +32,12 @@ Reference material describes what something is. 1. Add extra sections with step-by-step directions 2. Avoid implementation specific details (programming language or framework -specific) -**Further reading**: + +## Further reading **Instructions**: 1. Provide links to external sources -2. Don't keep the link below, it's for your reference as the author +2. 
Don't keep the link below or these instructions, they're for your reference as the author - Read [about reference](https://diataxis.fr/reference/) in the Diátaxis framework diff --git a/src/content/docs/tutorials/template.md b/src/content/docs/tutorials/template.md index e393893..16bd1ef 100644 --- a/src/content/docs/tutorials/template.md +++ b/src/content/docs/tutorials/template.md @@ -33,12 +33,11 @@ A tutorial in other words is a lesson. They apply how-tos into a practical solut 1. Add extra sections that outline a complete solution -**Further reading**: +## Further reading **Instructions**: 1. Provide links to external sources -2. Link to relevant how-tos and reference material used in the tutorial -3. Don't keep the link below, it's for your reference as the author +2. Don't keep the link below or these instructions, they're for your reference as the author - Read [about tutorials](https://diataxis.fr/tutorials/) in the Diátaxis framework