Grim brings the labeled property graph (LPG) data structure to the Nim language. This storage model is used in the Neo4j database and consists of labeled Nodes and Edges. The data itself is stored as key/value pairs on these entities.
News: See the Changelog.
- [2021-01-03] v0.3.1 released!
- [2021-01-01] v0.3.0 released!
- [2020-02-07] v0.2.0 released!
Using grim | Documentation and API | Running the tests | Built with | Contributing | Authors | License
Grim is provided with the Nimble package manager.
Install the Nim compiler; see nim-lang for instructions.
Use nimble to install grim on your local machine:
nimble install grim
Use grim in a project by adding requires "grim" to its .nimble file.
Basic | Iteration | Loading/saving | Graph-building DSL | Neo4j | Paths | Navigating paths
The grim API is quite user-friendly. Check out usage examples in the tests/ folder. The Northwind tutorial shows how to translate a relational SQL model (of the sales in a small company) to a labeled property graph.
The grim documentation is hosted on GitHub.
Create a new graph:
import grim
var g = newGraph("my graph")
echo g.name
# => "my graph"
doAssert g.numberOfNodes == 0 and g.numberOfEdges == 0
Add nodes to the graph:
let
  c1 = g.addNode("Country", %(name: "Sweden", GDP_per_capita: 53208.0))
  c2 = g.addNode("Country", %(name: "Norway", GDP_per_capita: 74536.0))
  c3 = g.addNode("Country", %(name: "Germany", GDP_per_capita: 52559.0))
  o1 = g.addNode("Organization", %(name: "NATO", founded: 1949, active: true))
  o2 = g.addNode("Organization", %(name: "EU", founded: 1993, active: true))
  o3 = g.addNode("Organization", %(name: "Nordic Council", founded: 1952))
Note:
- The % operator converts tuples to key/value properties, which are stored as heterogeneous data tables on the LPG.
- addNode and addEdge return the oid for the entity (node or edge). The oid is a unique (string) identifier that can be set explicitly with the oid argument to addNode and addEdge, or is auto-generated if the oid argument is omitted.
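To make the idea behind the % operator concrete, here is a minimal sketch in plain Nim that turns a named tuple into a key/value table. It is not grim's implementation: the proc name toPropsSketch is hypothetical, and every value is converted to a string for simplicity, whereas grim keeps the values heterogeneous.

```nim
import std/tables

proc toPropsSketch[T: tuple](t: T): Table[string, string] =
  ## Hypothetical helper (not part of grim): map each tuple field name
  ## to the string form of its value.
  for key, value in t.fieldPairs:
    result[key] = $value

let props = toPropsSketch((name: "Sweden", founded: 1949, active: true))
echo props["name"]     # Sweden
echo props["founded"]  # 1949
```

The loop over fieldPairs is resolved at compile time, so the field names of the tuple become the keys of the table without any runtime reflection.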
Here is how data is extracted from the graph:
doAssert g.node(c1).label == "Country"
doAssert g.node(c1)["name"] == "Sweden"
Note:
- The oids are used to identify the node.
Add edges to the graph:
let
  e1 = g.addEdge(c1, o2, "MEMBER_OF", %(since: 1995))
  e2 = g.addEdge(c1, o3, "MEMBER_OF", %(since: 1952))
  e3 = g.addEdge(c2, o3, "MEMBER_OF", %(since: 1952, membership: "full"))
Note:
- The oids are used to identify the nodes and create the new edges (labeled "MEMBER_OF") with key/value properties.
- Since no oid argument is given to addEdge, the oid is auto-generated (and returned in the e1, e2, and e3 variables).
All nodes:
for node in g.nodes:
  echo node.label, ": ", node
Nodes with a certain label:
for node in g.nodes("Country"):
  echo node.label, ": ", node
Nodes with filter (node is a special variable available when filtering nodes):
for node in g.nodes("Person", node.name == "Alice" and node.age < 22):
  echo node.label, ": ", node
All edges:
for edge in g.edges:
  echo edge.label, ": ", edge
All edges with a certain label:
for edge in g.edges("MEMBER_OF"):
  echo edge.label, ": ", edge
Edges with filter (edge is a special variable available when filtering edges):
for edge in g.edges("MEMBER_OF", edge.since > 2010):
  echo edge.label, ": ", edge
All edges between two nodes:
for edge in g.edgesBetween(c1, o3):
  echo edge.label, ": ", edge
Over neighbor nodes:
for node in g.neighbors(c1):
  echo node.label, ": ", node
Note:
- The graph is directional, so addEdge(A, B, "LABEL") adds an edge with label "LABEL" pointing from A to B.
- All iterators take a direction argument that specifies whether to include outgoing (A->B), incoming (A<-B) or both (A<->B) edges/neighbors.
- The direction is specified with the enum values Direction.Out, Direction.In, and Direction.OutIn.
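The meaning of the three directions can be illustrated without grim. The following is a minimal, self-contained sketch over a plain edge list; SketchDirection, SketchEdge, and neighborsSketch are hypothetical names that only mirror the semantics described above, and defaulting to the outgoing direction is an assumption of the sketch.

```nim
type
  SketchDirection = enum
    Out, In, OutIn
  SketchEdge = tuple[src, dst: string]

proc neighborsSketch(edges: seq[SketchEdge], oid: string,
                     direction = SketchDirection.Out): seq[string] =
  ## Collect neighbors of `oid`, following edges in the requested direction.
  for e in edges:
    # Outgoing edge: oid -> neighbor
    if direction in {SketchDirection.Out, SketchDirection.OutIn} and e.src == oid:
      result.add e.dst
    # Incoming edge: neighbor -> oid
    if direction in {SketchDirection.In, SketchDirection.OutIn} and e.dst == oid:
      result.add e.src

let memberships: seq[SketchEdge] = @[
  (src: "Sweden", dst: "EU"),
  (src: "Sweden", dst: "Nordic Council"),
  (src: "Norway", dst: "Nordic Council")]

echo neighborsSketch(memberships, "Sweden")
# @["EU", "Nordic Council"]
echo neighborsSketch(memberships, "Nordic Council", SketchDirection.In)
# @["Sweden", "Norway"]
```

With the outgoing direction, "Nordic Council" has no neighbors in this toy data, because all edges point towards it.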
Graph structures can be loaded and saved in YAML format with the NimYAML library. The procs loadYaml and saveYaml can be used (there are examples in the tests/ folder).
import grim
var g = loadYaml("example.yaml") # Load graph from YAML file
g.saveYaml("example2.yaml") # Save a copy of the file
A small DSL is provided to reduce boilerplate when building graphs. A toy example:
import grim/dsl
graph g "Some people":
  nodes:
    Person:
      "a nice guy":
        name: "Santa Claus"
        age: 108
      "a smart girl":
        name: "Jane Austen"
        wealth: 10304.3
  edges:
    "a nice guy" -> "a smart girl":
      DELIVERS:
        category: "writing material"
        value: 204
This will expose the graph in the remainder of the code as the mutable variable g. This example shows how to access graph data:
let
  p1 = g.node("a nice guy")
  p2 = g.node("a smart girl")
# Both nodes were declared under the Person label in the DSL block
doAssert p1.label == "Person" and p2.label == "Person"
doAssert p1["name"].getStr == "Santa Claus"
doAssert p1["age"].getInt == 108
doAssert p2["name"].getStr == "Jane Austen"
doAssert p2["wealth"].getFloat == 10304.3
for e in g.edgesBetween("a nice guy", "a smart girl"):
  doAssert e.label == "DELIVERS"
  doAssert e["category"].getStr == "writing material"
  doAssert e["value"].getInt == 204
The neo4j submodule is used to communicate with a Neo4j database. Data is transferred via Neo4j's HTTP REST API since the Bolt protocol is not supported at present.
import grim/[neo4j, utils]
The utils module provides the getEnvOrRaise proc, which reads an environment variable or raises a runtime error when the variable is not defined.
let
  username = getEnvOrRaise("NEO4J_USERNAME")
  password = getEnvOrRaise("NEO4J_PASSWORD")
  hostname = getEnvOrRaise("NEO4J_HOSTNAME")
The contents of NEO4J_USERNAME and NEO4J_PASSWORD are self-explanatory, and NEO4J_HOSTNAME contains the address of the database in the form mydatabase.com (or simply localhost if you are running a local instance).
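The behaviour described for getEnvOrRaise can be approximated with the standard library alone. The sketch below is an illustration under assumptions, not grim's source: the name getEnvOrRaiseSketch, the ValueError exception type, and the NEO4J_HOSTNAME_SKETCH variable are all hypothetical.

```nim
import std/os

proc getEnvOrRaiseSketch(key: string): string =
  ## Hypothetical stand-in for grim's getEnvOrRaise. The exception type
  ## raised by the real proc is an assumption here.
  if not existsEnv(key):
    raise newException(ValueError, "Environment variable is not defined: " & key)
  getEnv(key)

# Set a throwaway variable so the example runs anywhere.
putEnv("NEO4J_HOSTNAME_SKETCH", "localhost")
echo getEnvOrRaiseSketch("NEO4J_HOSTNAME_SKETCH")  # localhost
```

Failing early on a missing variable keeps a misconfigured environment from surfacing later as an obscure connection error.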
Start the client and dump the database as a grim LPG:
var
  client = initNeo4jClient(hostname, auth = (username, password))
  g = client.dump("my graph")
echo g.describe
A path in the graph is defined by a sequence of continuous edges (members), which link together a number of nodes. The path can be walked (or traversed) by iterating from the beginning of the path. The path starts at an anchor node.
var p = newPath(myNode)
The variable p is now an empty path starting at myNode. We can now build the path by repeatedly adding members to it.
p = p.add(myFirstEdge).add(mySecondEdge).add(myThirdEdge)
The add proc returns the path and can therefore be chained. Note that paths and members are ref objects, so to create a copy of a path we need to use the explicit copy function:
var p2 = p.copy
Walk the path by iterating:
for edge in p:
  echo edge
# myFirstEdge, mySecondEdge, myThirdEdge
Get the first, last, and nth member with convenience functions:
echo p.first # myFirstEdge
echo p.last # myThirdEdge
echo p.nth(1) # mySecondEdge (zero-indexed)
The first and last procs are O(1) operations, but nth is O(n) and therefore slower on long paths.
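These costs are what one would expect from a linked list that keeps pointers to both ends. The sketch below shows such a layout in plain Nim; it is an assumption about one plausible design, not a description of grim's internals, and the SketchPath and SketchMember types are hypothetical.

```nim
type
  SketchMember = ref object
    edge: string
    next: SketchMember
  SketchPath = ref object
    head, tail: SketchMember

proc add(p: SketchPath, edge: string): SketchPath =
  ## Append a member and return the path so that calls can be chained.
  let m = SketchMember(edge: edge)
  if p.head.isNil:
    p.head = m
  else:
    p.tail.next = m
  p.tail = m
  p

# O(1): both ends are stored directly (the path must be non-empty).
proc first(p: SketchPath): string = p.head.edge
proc last(p: SketchPath): string = p.tail.edge

proc nth(p: SketchPath, i: int): string =
  ## O(n): walk i links from the head (i must be within bounds).
  var m = p.head
  for _ in 0 ..< i:
    m = m.next
  m.edge

let p = SketchPath().add("myFirstEdge").add("mySecondEdge").add("myThirdEdge")
echo p.first, ", ", p.nth(1), ", ", p.last
# myFirstEdge, mySecondEdge, myThirdEdge
```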
The real power of paths emerges when navigating paths via patterns. This is an efficient method for simple traversal of similar paths in the graph. The paths can be scanned and modified in a single sweep.
Let's start our example by building a graph:
import grim/dsl
graph g "People":
  nodes:
    Person:
      "alice":
        name: "Alice"
      "bob":
        name: "Bob"
      "charlie":
        name: "Charlie"
  edges:
    "alice" -> "bob":
      KNOWS
    "bob" -> "charlie":
      KNOWS
    "bob" -> "charlie":
      KNOWS
The graph is Alice -KNOWS-> Bob =KNOWS=> Charlie, where -> and => denote single and double edges, respectively. Let's say we want to navigate this graph by the pattern Person-KNOWS-Person. There are three such paths of length 1 (containing one member): one is Alice-Bob (a single edge) and the other two are Bob-Charlie (a double edge).
We start navigating with the graph's navigate proc:
var pc = g.navigate("Person") # PathCollection
The navigation starts at an anchor node ("Person"). The result is a PathCollection, which is exactly what it sounds like: a collection of paths matching the given pattern. In other words, navigate constructs a PathCollection with empty paths anchored at nodes matching the label Person. We can iterate over the matched paths in the PathCollection:
for path in pc:
  echo path, path.anchor["name"]
# ("Empty Path", "Alice"), ("Empty Path", "Bob"), ("Empty Path", "Charlie")
# The matching order is not guaranteed.
Let's now expand the matched pattern by taking a step:
pc = pc.step("KNOWS", "Person")
With the help of the anchors, we have now matched all paths fulfilling the pattern Person-KNOWS-Person. Each step taken when navigating returns a modified copy of the PathCollection, encouraging sequential steps to be chained:
var pc = g
  .navigate("Person")
  .step("KNOWS", "Person")
  .step("KNOWS", "Person")
In fact, this pattern is so common that there is a convenience function for repeating a number of identical steps:
var pc = g
  .navigate("Person")
  .steps("KNOWS", "Person", 2)
This navigation will search the graph for motifs of the kind Person-KNOWS->Person-KNOWS->Person, i.e., friends of friends. Note that the match is exhaustive, i.e.,
var pc = g
  .navigate("Person")
  .steps("KNOWS", "Person", 3)
will return no matches (i.e., a path collection of empty paths anchored at "Person" labels).
There is one more important path navigation function, namely follow. This will match all patterns of variable length until there are no more matching paths. In our example,
var pc = g
  .navigate("Person")
  .follow("KNOWS", "Person")
will match Person-KNOWS-Person, Person-KNOWS-Person-KNOWS-Person, etc. In our graph, we expect matches:
- One 1-step path Alice-KNOWS-Bob.
- Two 1-step paths Bob-KNOWS-Charlie (because of two edges).
- Two 2-step paths Alice-KNOWS-Bob-KNOWS-Charlie (because of two edges between Bob and Charlie).
After matching patterns, we can simply iterate over the paths in the collection:
for path in pc:
  echo path, path.anchor
# Three 1-step paths, two 2-step paths
# Anchors: Alice (1-step path), Bob (1-step path), Bob (1-step path), Alice (2-step path), Alice (2-step path).
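The expected counts can be checked by hand with a small brute-force enumeration over the same three KNOWS edges. This sketch does not use grim's navigation API; SketchEdge, countFrom, and countPaths are hypothetical names, and the edge list is a plain-Nim stand-in for the example graph.

```nim
type SketchEdge = tuple[src, dst: string]

proc countFrom(edges: seq[SketchEdge], last, remaining: int): int =
  ## Number of ways to append `remaining` more consecutive edges after edge `last`.
  if remaining == 0:
    return 1
  for i, e in edges:
    if e.src == edges[last].dst:
      result += countFrom(edges, i, remaining - 1)

proc countPaths(edges: seq[SketchEdge], length: int): int =
  ## Number of directed paths made of exactly `length` consecutive edges.
  doAssert length >= 1
  for i in 0 ..< edges.len:
    result += countFrom(edges, i, length - 1)

let knows: seq[SketchEdge] = @[
  (src: "alice", dst: "bob"),
  (src: "bob", dst: "charlie"),
  (src: "bob", dst: "charlie")]  # the double edge

echo countPaths(knows, 1)  # 3
echo countPaths(knows, 2)  # 2
echo countPaths(knows, 3)  # 0
```

The zero for length 3 corresponds to the exhaustive-match case above, where three consecutive KNOWS steps find no matching path.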
The unit tests can be run with nimble. They test basic usage of the package, such as creating and modifying graphs, the DSL, and loading/saving graphs as YAML.
nimble test
I'll be happy to accept and review PRs and to discuss other things submitted to Issues.
- Erik G. Brandt - Original author - ebran
This project is licensed under the MIT License.