Fixes #4253: Missing apache arrow/jsonParams documentation #4255

Draft: wants to merge 1 commit into base: dev
4 changes: 4 additions & 0 deletions docs/asciidoc/modules/ROOT/nav.adoc
@@ -28,9 +28,13 @@ include::partial$generated-documentation/nav.adoc[]
** xref::import/html.adoc[]
** xref::import/parquet.adoc[]
** xref::import/gexf.adoc[]
** xref::import/arrow.adoc[]


* xref:export/index.adoc[]
** xref::export/xls.adoc[]
** xref::export/arrow.adoc[]


* xref:database-integration/index.adoc[]
** xref::database-integration/load-jdbc.adoc[]
104 changes: 104 additions & 0 deletions docs/asciidoc/modules/ROOT/pages/export/arrow.adoc
@@ -0,0 +1,104 @@
[[export-arrow]]
= Export to Apache Arrow
:description: This section describes procedures that can be used to export data in Apache Arrow format.

The Apache Arrow export procedures export data into a format that is used by many Apache and non-Apache tools.


[[export-arrow-available-procedures]]
== Available Procedures

The table below describes the available procedures:

[separator=¦,opts=header,cols="5,1m"]
|===
¦Qualified Name¦Type
¦xref::overview/apoc.export/apoc.export.arrow.all.adoc[apoc.export.arrow.all icon:book[]] +
`apoc.export.arrow.all(file STRING, config MAP<STRING, ANY>)` - exports the full database as an arrow file.
¦label:procedure[]

¦xref::overview/apoc.export/apoc.export.arrow.graph.adoc[apoc.export.arrow.graph icon:book[]] +
`apoc.export.arrow.graph(file STRING, graph ANY, config MAP<STRING, ANY>)` - exports the given graph as an arrow file.
¦label:procedure[]

¦xref::overview/apoc.export/apoc.export.arrow.query.adoc[apoc.export.arrow.query icon:book[]] +
`apoc.export.arrow.query(file STRING, query STRING, config MAP<STRING, ANY>)` - exports the results of the given Cypher query as an arrow file.
¦label:procedure[]

¦xref::overview/apoc.export/apoc.export.arrow.stream.all.adoc[apoc.export.arrow.stream.all icon:book[]] +
`apoc.export.arrow.stream.all(config MAP<STRING, ANY>)` - exports the full database as an arrow byte array.
¦label:procedure[]

¦xref::overview/apoc.export/apoc.export.arrow.stream.graph.adoc[apoc.export.arrow.stream.graph icon:book[]] +
`apoc.export.arrow.stream.graph(graph ANY, config MAP<STRING, ANY>)` - exports the given graph as an arrow byte array.
¦label:procedure[]

¦xref::overview/apoc.export/apoc.export.arrow.stream.query.adoc[apoc.export.arrow.stream.query icon:book[]] +
`apoc.export.arrow.stream.query(query ANY, config MAP<STRING, ANY>)` - exports the given Cypher query as an arrow byte array.
¦label:procedure[]
|===


[[export-arrow-file-export]]
== Exporting to a file

include::partial$enableFileExport.adoc[]


[[export-arrow-examples]]
== Examples


[[export-cypher-query-arrow]]
=== Export results of Cypher query to Apache Arrow file

[source,cypher]
----
CALL apoc.export.arrow.query('query_test.arrow',
"RETURN 1 AS intData, 'a' AS stringData,
true AS boolData,
[1, 2, 3] AS intArray,
[1.1, 2.2, 3.3] AS doubleArray,
[true, false, true] AS boolArray,
[1, '2', true, null] AS mixedArray,
{foo: 'bar'} AS mapData,
localdatetime('2015-05-18T19:32:24') as dateData,
[[0]] AS arrayArray,
1.1 AS doubleData"
) YIELD file
----

.Results
[opts="header"]
|===
| file | source | format | nodes | relationships | properties | time | rows | batchSize | batches | done | data
| "query_test.arrow" | "statement: cols(11)" | "arrow" | 0 | 0 | 11 | 468 | 11 | 2000 | 1 | true | <null>
|===



[[export-cypher-query-arrow-stream]]
=== Export results of Cypher query to Apache Arrow binary output

[source,cypher]
----
CALL apoc.export.arrow.stream.query(
"RETURN 1 AS intData, 'a' AS stringData,
true AS boolData,
[1, 2, 3] AS intArray,
[1.1, 2.2, 3.3] AS doubleArray,
[true, false, true] AS boolArray,
[1, '2', true, null] AS mixedArray,
{foo: 'bar'} AS mapData,
localdatetime('2015-05-18T19:32:24') as dateData,
[[0]] AS arrayArray,
1.1 AS doubleData"
) YIELD value
----

.Results
[opts="header"]
|===
| value
| <binary Apache Arrow output>
|===
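
The file-based procedure for the whole database follows the same call pattern. The query below is a minimal sketch rather than a tested example: it assumes file export has been enabled as described above, and `all_test.arrow` is a hypothetical file name. The yielded columns are taken from the procedure signature.

[source,cypher]
----
// Export every node and relationship to a single Arrow file.
// 'all_test.arrow' is a hypothetical file name; the empty map passes no config.
CALL apoc.export.arrow.all('all_test.arrow', {})
YIELD file, nodes, relationships, properties, done
RETURN file, nodes, relationships, properties, done
----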
2 changes: 2 additions & 0 deletions docs/asciidoc/modules/ROOT/pages/export/index.adoc
@@ -15,3 +15,5 @@ For more information on how to use these procedures, see:

* xref::export/xls.adoc[]
* xref::export/parquet.adoc[]
* xref::export/arrow.adoc[]

123 changes: 123 additions & 0 deletions docs/asciidoc/modules/ROOT/pages/import/arrow.adoc
@@ -0,0 +1,123 @@
[[load-arrow]]
= Load Arrow
:description: This section describes procedures that can be used to import Apache Arrow data from web APIs or files.



The following procedures allow you to read an Apache Arrow file exported via the xref::export/arrow.adoc[apoc.export.arrow.* procedures].
They may also be able to read Arrow files that were not created by the export procedures.

[[load-arrow-available-procedures]]
== Procedure Overview

The table below describes the available procedures:

[separator=¦,opts=header,cols="5,1m"]
|===
¦Qualified Name¦Type
¦xref::overview/apoc.load/apoc.load.arrow.adoc[apoc.load.arrow icon:book[]] +
`apoc.load.arrow(file STRING, config MAP<STRING, ANY>)` - imports `NODE` and `RELATIONSHIP` values from the provided arrow file.
¦label:procedure[]

¦xref::overview/apoc.load/apoc.load.arrow.stream.adoc[apoc.load.arrow.stream icon:book[]] +
`apoc.load.arrow.stream(source LIST<INTEGER>, config MAP<STRING, ANY>)` - imports `NODE` and `RELATIONSHIP` values from the provided arrow byte array.
¦label:procedure[]

¦xref::overview/apoc.import/apoc.import.arrow.adoc[apoc.import.arrow icon:book[]] +
`apoc.import.arrow(input ANY, config MAP<STRING, ANY>)` - imports entities from the provided Arrow file or byte array.
¦label:procedure[]
|===


NOTE: The import procedure, i.e. `apoc.import.arrow`, is currently located in the link:https://neo4j.com/labs/apoc/5/[APOC Extended library].


[[load-arrow-available-procedures-apoc.load.arrow]]
=== `apoc.load.arrow`


This procedure takes a file or HTTP URL and parses the Apache Arrow data into a map data structure.

[separator=¦,opts=header,cols="1m"]
|===
¦signature
¦apoc.load.arrow(file :: STRING, config = {} :: MAP) :: (value :: MAP)
|===


Currently, this procedure does not support any config parameters.

include::includes/enableFileImport.adoc[]

[[load-arrow-available-procedures-apoc.load.arrow.stream]]
=== `apoc.load.arrow.stream`


This procedure takes a byte array source and parses the Apache Arrow data into a map data structure.

[separator=¦,opts=header,cols="1m"]
|===
¦signature
¦apoc.load.arrow.stream(source :: LIST<INTEGER>, config = {} :: MAP) :: (value :: MAP)
|===

Currently, this procedure does not support any config parameters.


[[load-arrow-examples]]
== Examples

The following section contains examples showing how to import data from various Apache Arrow sources.

[[load-arrow-examples-local-file]]
=== Import from local file

Taking the file produced by xref::export/arrow.adoc#export-cypher-query-arrow[this export example]:

.The following query processes the `query_test.arrow` file and returns the content as Cypher data structures
[source,cypher]
----
CALL apoc.load.arrow('query_test.arrow')
YIELD value
RETURN value
----

.Results
[options="header"]
|===
| value
| {arrayArray -> ["[0]"], dateData -> 2015-05-18T19:32:24Z, boolArray -> [true,false,true], intArray -> [1,2,3], mapData -> "{"foo":"bar"}", boolData -> true, intData -> 1, mixedArray -> ["1","2","true",<null>], doubleArray -> [1.1,2.2,3.3], doubleData -> 1.1, stringData -> "a"}
|===


[[load-arrow-examples-binary-source]]
=== Import from binary source

Taking the byte array produced by xref::export/arrow.adoc#export-cypher-query-arrow-stream[this export example]:

.The following query processes an Arrow byte array and returns the content as Cypher data structures
[source,cypher]
----
CALL apoc.load.arrow.stream('<binary arrow file>')
YIELD value
RETURN value
----

.Results
[options="header"]
|===
| value
| {arrayArray -> ["[0]"], dateData -> 2015-05-18T19:32:24Z, boolArray -> [true,false,true], intArray -> [1,2,3], mapData -> "{"foo":"bar"}", boolData -> true, intData -> 1, mixedArray -> ["1","2","true",<null>], doubleArray -> [1.1,2.2,3.3], doubleData -> 1.1, stringData -> "a"}
|===
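
Because `apoc.export.arrow.stream.query` yields its byte array as `value`, the export and load procedures can also be chained in a single statement without writing a file. The query below is a minimal sketch, assuming both procedures are available in the same installation; the alias `byteArray` and the shortened `RETURN` query are chosen here for illustration only.

[source,cypher]
----
// Export a query result as an Arrow byte array, then read it straight back.
CALL apoc.export.arrow.stream.query("RETURN 1 AS intData, 'a' AS stringData", {})
YIELD value AS byteArray
// Each row of the Arrow data comes back as a map in `value`.
CALL apoc.load.arrow.stream(byteArray, {})
YIELD value
RETURN value
----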


[[import-arrow]]
=== Import Arrow file created by Export Arrow procedures

The `apoc.import.arrow` procedure can be used to import Arrow files created by the `apoc.export.arrow.*` procedures.

This procedure should not be confused with the `apoc.load.arrow*` procedures,
which only load the values of the Arrow file and do not create entities in the database.

See xref::overview/apoc.import/apoc.import.arrow.adoc[this page] for more info.

@@ -0,0 +1,56 @@
= apoc.export.arrow.all
:description: This section contains reference documentation for the apoc.export.arrow.all procedure.

label:procedure[]

[.emphasis]
`apoc.export.arrow.all(file STRING, config MAP<STRING, ANY>)` - exports the full database as an arrow file.

[NOTE]
====
This procedure is not considered safe to run from multiple threads.
It is therefore not supported by the parallel runtime (introduced in Neo4j 5.13).
For more information, see the link:{neo4j-docs-base-uri}/cypher-manual/{page-version}/planning-and-tuning/runtimes/concepts#runtimes-parallel-runtime[Cypher Manual -> Parallel runtime].
====

== Signature

[source]
----
apoc.export.arrow.all(file :: STRING, config = {} :: MAP) :: (file :: STRING, source :: STRING, format :: STRING, nodes :: INTEGER, relationships :: INTEGER, properties :: INTEGER, time :: INTEGER, rows :: INTEGER, batchSize :: INTEGER, batches :: INTEGER, done :: BOOLEAN, data :: ANY)
----

== Input parameters
[.procedures, opts=header]
|===
| Name | Type | Default
|file|STRING|null
|config|MAP|{}
|===

== Output parameters
[.procedures, opts=header]
|===
| Name | Type
|file|STRING
|source|STRING
|format|STRING
|nodes|INTEGER
|relationships|INTEGER
|properties|INTEGER
|time|INTEGER
|rows|INTEGER
|batchSize|INTEGER
|batches|INTEGER
|done|BOOLEAN
|data|ANY
|===

== Config parameters
include::partial$usage/config/apoc.export.arrow.all.adoc[]

[[usage-apoc.export.arrow.all]]
== Usage Examples
include::partial$usage/apoc.export.arrow.all.adoc[]

xref::export/arrow.adoc[More documentation of apoc.export.arrow.all,role=more information]
@@ -0,0 +1,57 @@
= apoc.export.arrow.graph
:description: This section contains reference documentation for the apoc.export.arrow.graph procedure.

label:procedure[]

[.emphasis]
`apoc.export.arrow.graph(file STRING, graph ANY, config MAP<STRING, ANY>)` - exports the given graph as an arrow file.

[NOTE]
====
This procedure is not considered safe to run from multiple threads.
It is therefore not supported by the parallel runtime (introduced in Neo4j 5.13).
For more information, see the link:{neo4j-docs-base-uri}/cypher-manual/{page-version}/planning-and-tuning/runtimes/concepts#runtimes-parallel-runtime[Cypher Manual -> Parallel runtime].
====

== Signature

[source]
----
apoc.export.arrow.graph(file :: STRING, graph :: ANY, config = {} :: MAP) :: (file :: STRING, source :: STRING, format :: STRING, nodes :: INTEGER, relationships :: INTEGER, properties :: INTEGER, time :: INTEGER, rows :: INTEGER, batchSize :: INTEGER, batches :: INTEGER, done :: BOOLEAN, data :: ANY)
----

== Input parameters
[.procedures, opts=header]
|===
| Name | Type | Default
|file|STRING|null
|graph|ANY|null
|config|MAP|{}
|===

== Output parameters
[.procedures, opts=header]
|===
| Name | Type
|file|STRING
|source|STRING
|format|STRING
|nodes|INTEGER
|relationships|INTEGER
|properties|INTEGER
|time|INTEGER
|rows|INTEGER
|batchSize|INTEGER
|batches|INTEGER
|done|BOOLEAN
|data|ANY
|===

== Config parameters
include::partial$usage/config/apoc.export.arrow.graph.adoc[]

[[usage-apoc.export.arrow.graph]]
== Usage Examples
include::partial$usage/apoc.export.arrow.graph.adoc[]

xref::export/arrow.adoc[More documentation of apoc.export.arrow.graph,role=more information]