load-csv (#105)
* load-csv

* fix typos

---------

Co-authored-by: Guy Korland <[email protected]>
swilly22 and gkorland authored Dec 31, 2024
1 parent 555b205 commit 7d27c25
Showing 8 changed files with 126 additions and 7 deletions.
6 changes: 6 additions & 0 deletions .wordlist.txt
Original file line number Diff line number Diff line change
@@ -12,6 +12,7 @@ CLI
CMD
CSC
CSV
CSVs
Cailliau
Centos
ColumnType
@@ -82,6 +83,7 @@ RedisGraph
RedisInsight
Roi
Rueian
Saldana
SETUSER
SDK
SDKs
@@ -95,10 +97,12 @@ Subcommands
TF
TTY
UNIQIE
URIs
UUID
VKEY
ValueType
Vijay
Vin
acos
acyclic
algo
@@ -112,6 +116,7 @@ args
ascii
asin
atan
birthyear
boolean
ceil
cli
@@ -124,6 +129,7 @@ costProp
cpp
createNodeIndex
crflynn
csv
dbms
dedup
dockerfile
2 changes: 1 addition & 1 deletion cypher/algorithms.md
@@ -1,6 +1,6 @@
---
title: "Algorithms"
nav_order: 19
nav_order: 20
description: >
FalkorDB supported algorithms like BFS.
parent: "Cypher Language"
2 changes: 1 addition & 1 deletion cypher/cypher_support.md
@@ -1,6 +1,6 @@
---
title: "Cypher coverage"
nav_order: 21
nav_order: 22
description: >
FalkorDB implements a subset of the Cypher language, which is growing as development continues.
parent: "Cypher Language"
4 changes: 2 additions & 2 deletions cypher/functions.md
@@ -1,6 +1,6 @@
---
title: "Functions"
nav_order: 17
nav_order: 18
description: >
Information on all supported functions from the Cypher query language
parent: "Cypher Language"
@@ -364,4 +364,4 @@ The format for a relationship object in JSON is:
"start": src_node(node),
"end": dest_node(node)
}
```
```
2 changes: 1 addition & 1 deletion cypher/indexing.md
@@ -1,6 +1,6 @@
---
title: "Indexing"
nav_order: 20
nav_order: 21
description: >
FalkorDB supports single-property indexes for node labels and for relationship type. String, numeric, and geospatial data types can be indexed.
parent: "Cypher Language"
113 changes: 113 additions & 0 deletions cypher/load_csv.md
@@ -0,0 +1,113 @@
---
title: "LOAD CSV"
nav_order: 17
description: >
LOAD CSV allows a query to access data within a CSV file
parent: "Cypher Language"
---

# LOAD CSV

```sh
LOAD CSV FROM 'file://actors.csv' AS row
MERGE (a:Actor {name: row[0]})
```

`LOAD CSV FROM` accepts a string containing the path to a CSV file. The file is
parsed line by line, and the current line is accessible through the variable
specified by `AS`. Each parsed value is treated as a `string`; use the
appropriate conversion function, e.g. `toInteger`, to cast a value to its
intended type.

Additional clauses can follow and access the `row` variable.

## Importing data

### Importing local files

FalkorDB defines a data directory ([see configuration](../configuration))
under which local CSV files should be stored. All `file://` URIs are resolved
relative to that directory.

In the following example we'll load the `actors.csv` file into FalkorDB.

### actors.csv

| Lee Pace | 1979 |
| ---------------|-----------|
| Vin Diesel | 1967 |
| Chris Pratt | 1979 |
| Zoe Saldana | 1978 |

```sh
LOAD CSV FROM 'file://actors.csv'
AS row
MERGE (a:Actor {name: row[0], birth_year: toInteger(row[1])})
RETURN a.name, a.birth_year
```

Note that we've used indices, e.g. `row[0]`, to access the value in the
corresponding column.
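
As a rough analogy outside the database, the same array semantics can be sketched with Python's standard `csv` module. This is not FalkorDB's implementation, only an illustration of the behaviour described above: every field arrives as a string and must be converted explicitly. The inline `ACTORS_CSV` data and the `load_rows` helper are hypothetical names introduced for this sketch.

```python
import csv
import io

# Hypothetical in-memory stand-in for the headerless actors.csv shown above.
ACTORS_CSV = "Lee Pace,1979\nVin Diesel,1967\nChris Pratt,1979\nZoe Saldana,1978\n"


def load_rows(text):
    """Parse CSV text line by line; each row is a list of strings."""
    return list(csv.reader(io.StringIO(text)))


rows = load_rows(ACTORS_CSV)

# Mirrors row[0] and toInteger(row[1]) in the query:
# index by position, then cast the string explicitly.
actors = [{"name": row[0], "birth_year": int(row[1])} for row in rows]
```

Here `rows[0][1]` is the string `"1979"`, while `actors[0]["birth_year"]` is the integer produced by the explicit conversion.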

If the CSV contains a header row, e.g.:

### actors.csv

| name | birthyear |
| ---------------|-----------|
| Lee Pace | 1979 |
| Vin Diesel | 1967 |
| Chris Pratt | 1979 |
| Zoe Saldana | 1978 |

Then we should use the `WITH HEADERS` variation of the `LOAD CSV` clause:

```sh
LOAD CSV WITH HEADERS FROM 'file://actors.csv'
AS row
MERGE (a:Actor {name: row['name'], birth_year: toInteger(row['birthyear'])})
RETURN a.name, a.birth_year
```

Note that when a header row exists and `WITH HEADERS` is specified, the `row`
variable is no longer an `array` but rather a `map`. Individual elements are
accessed by their column name.
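
The switch from positional to named access can be sketched, again only as a Python analogy and not as FalkorDB's implementation, with `csv.DictReader`, which uses the first line as keys. The inline `ACTORS_CSV` data and the `load_rows_with_headers` helper are hypothetical names introduced for this sketch.

```python
import csv
import io

# Hypothetical in-memory stand-in for actors.csv with a header row.
ACTORS_CSV = (
    "name,birthyear\n"
    "Lee Pace,1979\n"
    "Vin Diesel,1967\n"
    "Chris Pratt,1979\n"
    "Zoe Saldana,1978\n"
)


def load_rows_with_headers(text):
    """Use the first line as column names; each later row becomes a mapping."""
    return list(csv.DictReader(io.StringIO(text)))


rows = load_rows_with_headers(ACTORS_CSV)

# Mirrors row['name'] and toInteger(row['birthyear']) in the query:
# look values up by column name, then cast explicitly.
actors = [{"name": r["name"], "birth_year": int(r["birthyear"])} for r in rows]
```

The header line is consumed as column names, so only the four data rows are returned.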


### Importing data from multiple CSVs

Building on our previous example, we'll introduce a second CSV file, `acted_in.csv`,
which ties actors to the movies they've acted in.


### acted_in.csv

| actor | movie |
| ---------------|----------------|
| Lee Pace | The Fall |
| Vin Diesel | Fast & Furious |
| Chris Pratt | Passengers |
| Zoe Saldana | Avatar |


We'll create a new graph connecting actors to the movies they've acted in.

Load actors:

```sh
LOAD CSV WITH HEADERS FROM 'file://actors.csv'
AS row
MERGE (a:Actor {name: row['name']})
```

Load movies and create `ACTED_IN` relations:

```sh
LOAD CSV WITH HEADERS FROM 'file://acted_in.csv'
AS row

MATCH (a:Actor {name: row['actor']})
MERGE (m:Movie {title: row['movie']})
MERGE (a)-[:ACTED_IN]->(m)
```
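
The two-step load above can be modelled in plain Python to show what each clause contributes. This is a simplified sketch under stated assumptions, not how FalkorDB executes the queries: sets stand in for the graph, `MERGE` is modelled as "add if absent", and a `MATCH` that finds no actor is modelled as skipping the rest of the query for that row. The inline CSV data and the `build_graph` helper are hypothetical.

```python
import csv
import io

# Hypothetical in-memory stand-ins for the two CSV files above.
ACTORS_CSV = (
    "name,birthyear\n"
    "Lee Pace,1979\n"
    "Vin Diesel,1967\n"
    "Chris Pratt,1979\n"
    "Zoe Saldana,1978\n"
)
ACTED_IN_CSV = (
    "actor,movie\n"
    "Lee Pace,The Fall\n"
    "Vin Diesel,Fast & Furious\n"
    "Chris Pratt,Passengers\n"
    "Zoe Saldana,Avatar\n"
)


def build_graph(actors_text, acted_in_text):
    """Return (actors, movies, edges) as sets that model the resulting graph."""
    actors, movies, edges = set(), set(), set()

    # Step 1: MERGE one Actor per distinct name (a set ignores duplicates).
    for row in csv.DictReader(io.StringIO(actors_text)):
        actors.add(row["name"])

    # Step 2: MATCH the actor, then MERGE the movie and the ACTED_IN edge.
    for row in csv.DictReader(io.StringIO(acted_in_text)):
        if row["actor"] not in actors:
            # MATCH produced no result, so nothing is created for this row.
            continue
        movies.add(row["movie"])
        edges.add((row["actor"], "ACTED_IN", row["movie"]))

    return actors, movies, edges


actors, movies, edges = build_graph(ACTORS_CSV, ACTED_IN_CSV)
```

Because the second step starts with `MATCH`, a row in `acted_in.csv` that names an actor missing from `actors.csv` creates neither a movie nor a relationship in this model.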

2 changes: 1 addition & 1 deletion cypher/procedures.md
@@ -1,6 +1,6 @@
---
title: "Procedures"
nav_order: 18
nav_order: 19
description: >
Calling procedures with CALL and YIELD.
parent: "Cypher Language"
2 changes: 1 addition & 1 deletion cypher/remove.md
@@ -1,6 +1,6 @@
---
title: "REMOVE"
nav_order: 22
nav_order: 23
description: >
REMOVE is used to remove attributes from nodes and relationships, in addition to
removing labels from nodes.