neo4j-contrib · vga91 · Nov 28, 2024
diff --git a/docs/asciidoc/modules/ROOT/nav.adoc b/docs/asciidoc/modules/ROOT/nav.adoc
@@ -28,9 +28,13 @@ include::partial$generated-documentation/nav.adoc[]
     ** xref::import/html.adoc[]
     ** xref::import/parquet.adoc[]
     ** xref::import/gexf.adoc[]
+    ** xref::import/arrow.adoc[]
+
 
 * xref:export/index.adoc[]
     ** xref::export/xls.adoc[]
+    ** xref::export/arrow.adoc[]
+
 
 * xref:database-integration/index.adoc[]
     ** xref::database-integration/load-jdbc.adoc[]

diff --git a/docs/asciidoc/modules/ROOT/pages/export/arrow.adoc b/docs/asciidoc/modules/ROOT/pages/export/arrow.adoc
@@ -0,0 +1,104 @@
+[[export-arrow]]
+= Export to Apache Arrow
+:description: This section describes procedures that can be used to export data in Apache Arrow format.
+
+The Apache Arrow export procedures export data in the Apache Arrow format, which is used by many Apache and non-Apache tools.
+
+
+[[export-arrow-available-procedures]]
+== Available Procedures
+
+The table below describes the available procedures:
+
+[separator=¦,opts=header,cols="5,1m"]
+|===
+¦Qualified Name¦Type
+¦xref::overview/apoc.export/apoc.export.arrow.all.adoc[apoc.export.arrow.all icon:book[]] +
+`apoc.export.arrow.all(file STRING, config MAP<STRING, ANY>)` - exports the full database as an arrow file.
+¦label:procedure[]
+
+¦xref::overview/apoc.export/apoc.export.arrow.graph.adoc[apoc.export.arrow.graph icon:book[]] +
+`apoc.export.arrow.graph(file STRING, graph ANY, config MAP<STRING, ANY>)` - exports the given graph as an arrow file.
+¦label:procedure[]
+
+¦xref::overview/apoc.export/apoc.export.arrow.query.adoc[apoc.export.arrow.query icon:book[]] +
+`apoc.export.arrow.query(file STRING, query STRING, config MAP<STRING, ANY>)` - exports the results of the given Cypher query as an arrow file.
+¦label:procedure[]
+
+¦xref::overview/apoc.export/apoc.export.arrow.stream.all.adoc[apoc.export.arrow.stream.all icon:book[]] +
+`apoc.export.arrow.stream.all(config MAP<STRING, ANY>)` - exports the full database as an arrow byte array.
+¦label:procedure[]
+
+¦xref::overview/apoc.export/apoc.export.arrow.stream.graph.adoc[apoc.export.arrow.stream.graph icon:book[]] +
+`apoc.export.arrow.stream.graph(graph ANY, config MAP<STRING, ANY>)` - exports the given graph as an arrow byte array.
+¦label:procedure[]
+
+¦xref::overview/apoc.export/apoc.export.arrow.stream.query.adoc[apoc.export.arrow.stream.query icon:book[]] +
+`apoc.export.arrow.stream.query(query ANY, config MAP<STRING, ANY>)` - exports the given Cypher query as an arrow byte array.
+¦label:procedure[]
+|===
+
+
+[[export-arrow-file-export]]
+== Exporting to a file
+
+include::partial$enableFileExport.adoc[]
+
+
+[[export-arrow-examples]]
+== Examples
+
+
+[[export-cypher-query-arrow]]
+=== Export results of Cypher query to Apache Arrow file
+
+[source,cypher]
+----
+CALL apoc.export.arrow.query('query_test.arrow', 
+    "RETURN 1 AS intData, 'a' AS stringData,
+        true AS boolData,
+        [1, 2, 3] AS intArray,
+        [1.1, 2.2, 3.3] AS doubleArray,
+        [true, false, true] AS boolArray,
+        [1, '2', true, null] AS mixedArray,
+        {foo: 'bar'} AS mapData,
+        localdatetime('2015-05-18T19:32:24') as dateData,
+        [[0]] AS arrayArray,
+        1.1 AS doubleData"
+) YIELD file
+----
+
+.Results
+[opts="header"]
+|===
+| file         | source                        | format | nodes | relationships | properties | time | rows | batchSize | batches | done | data
+| "query_test.arrow" | "statement: cols(11)" | "arrow" | 0     | 0             | 11         | 468  | 11   | 2000      | 1       | true | <null>
+|===
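+
+
+[[export-all-arrow]]
+=== Export the full database to an Apache Arrow file
+
+The following is a minimal sketch rather than a verified run. It assumes file export is enabled as described above, and `all_test.arrow` is a hypothetical file name. The yielded columns are taken from the `apoc.export.arrow.all` signature.
+
+[source,cypher]
+----
+// Export every node and relationship in the database to a single Arrow file.
+// 'all_test.arrow' is an illustrative name; the empty map leaves the config at its defaults.
+CALL apoc.export.arrow.all('all_test.arrow', {})
+YIELD file, nodes, relationships, properties, done
+RETURN file, nodes, relationships, properties, done
+----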
+
+
+
+[[export-cypher-query-arrow-stream]]
+=== Export results of Cypher query to Apache Arrow binary output
+
+[source,cypher]
+----
+CALL apoc.export.arrow.stream.query(
+    "RETURN 1 AS intData, 'a' AS stringData,
+        true AS boolData,
+        [1, 2, 3] AS intArray,
+        [1.1, 2.2, 3.3] AS doubleArray,
+        [true, false, true] AS boolArray,
+        [1, '2', true, null] AS mixedArray,
+        {foo: 'bar'} AS mapData,
+        localdatetime('2015-05-18T19:32:24') as dateData,
+        [[0]] AS arrayArray,
+        1.1 AS doubleData"
+) YIELD value
+----
+
+.Results
+[opts="header"]
+|===
+| value
+| <binary Apache Arrow output>
+|===
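+
+
+[[export-graph-arrow]]
+=== Export a graph to an Apache Arrow file
+
+The `apoc.export.arrow.graph` procedure takes a graph as its second argument. As a sketch under stated assumptions, the query below builds that graph with `apoc.graph.fromDB` (assumed to be installed and to yield a `graph` column) and writes it to a hypothetical `graph_test.arrow` file. File export must be enabled.
+
+[source,cypher]
+----
+// Build a virtual graph from the whole database ('exported' is an arbitrary graph name),
+// then pass it to the Arrow graph export.
+CALL apoc.graph.fromDB('exported', {}) YIELD graph
+CALL apoc.export.arrow.graph('graph_test.arrow', graph, {})
+YIELD file, nodes, relationships, done
+RETURN file, nodes, relationships, done
+----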
diff --git a/docs/asciidoc/modules/ROOT/pages/export/index.adoc b/docs/asciidoc/modules/ROOT/pages/export/index.adoc
@@ -15,3 +15,5 @@ For more information on how to use these procedures, see:
 
 * xref::export/xls.adoc[]
 * xref::export/parquet.adoc[]
+* xref::export/arrow.adoc[]
+
diff --git a/docs/asciidoc/modules/ROOT/pages/import/arrow.adoc b/docs/asciidoc/modules/ROOT/pages/import/arrow.adoc
@@ -0,0 +1,123 @@
+[[load-arrow]]
+= Load Arrow
+:description: This section describes procedures that can be used to import Apache Arrow data from web APIs or files.
+
+
+
+The following procedures read Apache Arrow files exported via the xref::export/arrow.adoc[apoc.export.arrow.* procedures].
+They may also be able to read Arrow files that were not created by the export procedures.
+
+[[load-arrow-available-procedures]]
+== Procedure and Function Overview
+
+The table below describes the available procedures and functions:
+
+[separator=¦,opts=header,cols="5,1m"]
+|===
+¦Qualified Name¦Type
+¦xref::overview/apoc.load/apoc.load.arrow.adoc[apoc.load.arrow icon:book[]] +
+`apoc.load.arrow(file STRING, config MAP<STRING, ANY>)` - imports `NODE` and `RELATIONSHIP` values from the provided arrow file.
+¦label:procedure[]
+
+¦xref::overview/apoc.load/apoc.load.arrow.stream.adoc[apoc.load.arrow.stream icon:book[]] +
+`apoc.load.arrow.stream(source LIST<INTEGER>, config MAP<STRING, ANY>)` - imports `NODE` and `RELATIONSHIP` values from the provided arrow byte array.
+¦label:procedure[]
+
+¦xref::overview/apoc.import/apoc.import.arrow.adoc[apoc.import.arrow icon:book[]] +
+`apoc.import.arrow(urlOrBinaryFile ANY, config MAP<STRING, ANY>)` - imports a graph from the provided arrow file.
+¦label:procedure[]
+|===
+
+
+NOTE: The import procedure, `apoc.import.arrow`, is currently located in the link:https://neo4j.com/labs/apoc/5/[APOC Extended library].
+
+
+[[load-arrow-available-procedures-apoc.load.arrow]]
+=== `apoc.load.arrow`
+
+
+This procedure takes a file or HTTP URL and parses the Apache Arrow data into a map data structure.
+
+[separator=¦,opts=header,cols="1m"]
+|===
+¦signature
+¦apoc.load.arrow(file :: STRING, config = {} :: MAP) :: (value :: MAP)
+|===
+
+
+Currently, this procedure does not support any config parameters.
+
+include::includes/enableFileImport.adoc[]
+
+[[load-arrow-available-procedures-apoc.load.arrow.stream]]
+=== `apoc.load.arrow.stream`
+
+
+This procedure takes a byte array source and parses the Apache Arrow data into a map data structure.
+
+[separator=¦,opts=header,cols="1m"]
+|===
+¦signature
+¦apoc.load.arrow.stream(source :: LIST<INTEGER>, config = {} :: MAP) :: (value :: MAP)
+|===
+
+Currently, this procedure does not support any config parameters.
+
+
+[[load-arrow-examples]]
+== Examples
+
+The following section contains examples showing how to import data from various Apache Arrow sources.
+
+[[load-arrow-examples-local-file]]
+=== Import from local file
+
+Taking the output xref::export/arrow.adoc#export-cypher-query-arrow[of this case]:
+
+.The following query processes the `query_test.arrow` file and returns the content as Cypher data structures
+[source,cypher]
+----
+CALL apoc.load.arrow('query_test.arrow')
+YIELD value
+RETURN value
+----
+
+.Results
+[options="header"]
+|===
+| value
+| {arrayArray -> ["[0]"], dateData -> 2015-05-18T19:32:24Z, boolArray -> [true,false,true], intArray -> [1,2,3], mapData -> "{"foo":"bar"}", boolData -> true, intData -> 1, mixedArray -> ["1","2","true",<null>], doubleArray -> [1.1,2.2,3.3], doubleData -> 1.1, stringData -> "a"}
+|===
+
+
+[[load-arrow-examples-binary-source]]
+=== Import from binary source
+
+Taking the output xref::export/arrow.adoc#export-cypher-query-arrow-stream[of this case]:
+
+.The following query processes an Arrow byte array and returns the content as Cypher data structures
+[source,cypher]
+----
+CALL apoc.load.arrow.stream('<binary arrow file>')
+YIELD value
+RETURN value
+----
+
+.Results
+[options="header"]
+|===
+| value
+| {arrayArray -> ["[0]"], dateData -> 2015-05-18T19:32:24Z, boolArray -> [true,false,true], intArray -> [1,2,3], mapData -> "{"foo":"bar"}", boolData -> true, intData -> 1, mixedArray -> ["1","2","true",<null>], doubleArray -> [1.1,2.2,3.3], doubleData -> 1.1, stringData -> "a"}
+|===
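+
+The `<binary arrow file>` placeholder above stands for an Arrow byte array. As a minimal sketch, assuming the `value` column yielded by `apoc.export.arrow.stream.query` can be passed directly as `source`, the export and the load can be chained in one query without writing a file. The alias `bytes` is only illustrative.
+
+[source,cypher]
+----
+// Export a small result set to an in-memory Arrow byte array...
+CALL apoc.export.arrow.stream.query("RETURN 1 AS intData, 'a' AS stringData")
+YIELD value AS bytes
+// ...then read the same bytes back as one map per row.
+CALL apoc.load.arrow.stream(bytes)
+YIELD value
+RETURN value
+----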
+
+
+[[import-arrow]]
+=== Import Arrow file created by Export Arrow procedures
+
+The `apoc.import.arrow` procedure can be used to import Arrow files created by the `apoc.export.arrow.*` procedures.
+
+This procedure should not be confused with the `apoc.load.arrow*` procedures,
+which only load the values of the Arrow file and do not create entities in the database.
+
+See xref::overview/apoc.import/apoc.import.arrow.adoc[this page] for more info.
+
diff --git a/docs/asciidoc/modules/ROOT/pages/overview/apoc.export/apoc.export.arrow.all.adoc b/docs/asciidoc/modules/ROOT/pages/overview/apoc.export/apoc.export.arrow.all.adoc
@@ -0,0 +1,56 @@
+= apoc.export.arrow.all
+:description: This section contains reference documentation for the apoc.export.arrow.all procedure.
+
+label:procedure[]
+
+[.emphasis]
+`apoc.export.arrow.all(file STRING, config MAP<STRING, ANY>)` - exports the full database as an arrow file.
+
+[NOTE]
+====
+This procedure is not considered safe to run from multiple threads.
+It is therefore not supported by the parallel runtime (introduced in Neo4j 5.13).
+For more information, see the link:{neo4j-docs-base-uri}/cypher-manual/{page-version}/planning-and-tuning/runtimes/concepts#runtimes-parallel-runtime[Cypher Manual -> Parallel runtime].
+====
+
+== Signature
+
+[source]
+----
+apoc.export.arrow.all(file :: STRING, config = {} :: MAP) :: (file :: STRING, source :: STRING, format :: STRING, nodes :: INTEGER, relationships :: INTEGER, properties :: INTEGER, time :: INTEGER, rows :: INTEGER, batchSize :: INTEGER, batches :: INTEGER, done :: BOOLEAN, data :: ANY)
+----
+
+== Input parameters
+[.procedures, opts=header]
+|===
+| Name | Type | Default
+|file|STRING|null
+|config|MAP|{}
+|===
+
+== Output parameters
+[.procedures, opts=header]
+|===
+| Name | Type
+|file|STRING
+|source|STRING
+|format|STRING
+|nodes|INTEGER
+|relationships|INTEGER
+|properties|INTEGER
+|time|INTEGER
+|rows|INTEGER
+|batchSize|INTEGER
+|batches|INTEGER
+|done|BOOLEAN
+|data|ANY
+|===
+
+== Config parameters
+include::partial$usage/config/apoc.export.arrow.all.adoc[]
+
+[[usage-apoc.export.arrow.all]]
+== Usage Examples
+include::partial$usage/apoc.export.arrow.all.adoc[]
+
+xref::export/arrow.adoc[More documentation of apoc.export.arrow.all,role=more information]
diff --git a/docs/asciidoc/modules/ROOT/pages/overview/apoc.export/apoc.export.arrow.graph.adoc b/docs/asciidoc/modules/ROOT/pages/overview/apoc.export/apoc.export.arrow.graph.adoc
@@ -0,0 +1,57 @@
+= apoc.export.arrow.graph
+:description: This section contains reference documentation for the apoc.export.arrow.graph procedure.
+
+label:procedure[]
+
+[.emphasis]
+`apoc.export.arrow.graph(file STRING, graph ANY, config MAP<STRING, ANY>)` - exports the given graph as an arrow file.
+
+[NOTE]
+====
+This procedure is not considered safe to run from multiple threads.
+It is therefore not supported by the parallel runtime (introduced in Neo4j 5.13).
+For more information, see the link:{neo4j-docs-base-uri}/cypher-manual/{page-version}/planning-and-tuning/runtimes/concepts#runtimes-parallel-runtime[Cypher Manual -> Parallel runtime].
+====
+
+== Signature
+
+[source]
+----
+apoc.export.arrow.graph(file :: STRING, graph :: ANY, config = {} :: MAP) :: (file :: STRING, source :: STRING, format :: STRING, nodes :: INTEGER, relationships :: INTEGER, properties :: INTEGER, time :: INTEGER, rows :: INTEGER, batchSize :: INTEGER, batches :: INTEGER, done :: BOOLEAN, data :: ANY)
+----
+
+== Input parameters
+[.procedures, opts=header]
+|===
+| Name | Type | Default
+|file|STRING|null
+|graph|ANY|null
+|config|MAP|{}
+|===
+
+== Output parameters
+[.procedures, opts=header]
+|===
+| Name | Type
+|file|STRING
+|source|STRING
+|format|STRING
+|nodes|INTEGER
+|relationships|INTEGER
+|properties|INTEGER
+|time|INTEGER
+|rows|INTEGER
+|batchSize|INTEGER
+|batches|INTEGER
+|done|BOOLEAN
+|data|ANY
+|===
+
+== Config parameters
+include::partial$usage/config/apoc.export.arrow.graph.adoc[]
+
+[[usage-apoc.export.arrow.graph]]
+== Usage Examples
+include::partial$usage/apoc.export.arrow.graph.adoc[]
+
+xref::export/arrow.adoc[More documentation of apoc.export.arrow.graph,role=more information]