Commit 3326484

Update README.md
Author: Ondrej Zizka
1 parent 62c0de4 commit 3326484

1 file changed, +32 -9 lines changed

README.md (+32 -9)
@@ -106,12 +106,14 @@ Each import config starts with `-in`, each export with `-out`.
 Both need a filesystem path to read from, resp. write to, and have further options.
 Some import options may also be taken from defaults, which are configured after `-all`.

+```shell
 ./crunch [<global options...>]
 -in <file.csv> [-as <alias>] [--format=JSON|CSV] [-indexed column1,column2,...] [other options...]
 -in <file.json> [-as ...] [-itemsAt /path/in/json/tree] [other options...]
 -out <resultFile.csv> [-sql <SQL query>] [--format=JSON|CSV] [other options...]
 -out ...
 -all [<default or global options>]
+```

 ### Options

@@ -122,7 +124,8 @@ Leave me a comment in the respective GitHub issues if per-import/export configur
 * Input paths, comma and/or space separated.
 * Can be CSV files, JSON files (if ending `.json`), or directories with such files.
 * Multiple files may be imported to the same table, see `--combineInputs`.
-* The columns may get indexed to speed up the `JOIN`, `WHERE` and `GROUP BY` clauses. See `-indexed ...`
+* `-indexed ...` The columns may get indexed to speed up the `JOIN`, `WHERE` and `GROUP BY` clauses.
+* `-as` The name of the table this import will be loaded to.

 * `-out`
 * Output path. If ends with `.json`, the output is JSON.
@@ -134,7 +137,7 @@ Leave me a comment in the respective GitHub issues if per-import/export configur
 * See [HSQLDB documentation](http://hsqldb.org/doc/2.0/guide/guide.html#sqlgeneral-chapt) for the vast SQL operations at hand.
 * The SELECT is subject to certain tweaks necessary to deliver some convenience of usage.
 They may, however, result in an unpredicted behavior. Please consult the logs if you hit some cryptic errors from HSQLDB.
-
+
 * `-db <pathToDatabaseDirectory>`
 * Determines where the files of the underlying database will be stored. Defaults to `hsqldb/cruncher`.

@@ -185,7 +188,6 @@ Read the logs or use `-sql SELECT ... FROM INFORMATION_SCHEMA.*` to study the sc
 * `entries` (default) will create a JSON entry per line, representing the original rows.
 * `array` will create a file with a JSON array (`[...,...]`).

-
 This README may be slightly obsolete; For a full list of options, check the
 [`Options`](https://github.com/OndraZizka/csv-cruncher/blob/master/src/main/java/cz/dynawest/csvcruncher/Options.java) class.

@@ -255,15 +257,15 @@ Notice the `.json` suffix, which tells CsvCruncher to produce JSON. `--json=entr
 Project status
 ==============

-I develop this project ocassionally, when I need it. Which has been surprisingly often in the last 10 years,
+I develop this project occasionally, when I need it. Which has been surprisingly often in the last 10 years,
 for various reasons:
 * It's faster than importing to a real DB server.
 * It's the only tool I have found which can convert any generic JSON to tabular data without any prior metadata.
 * NoSQL databases do not support joins so exporting parts of them to JSON and querying using CsvCruncher is often my only OLAP option.
 * Lack of other lightweight ETL tools.

-That, however, makes it susceptible to being developed in isolated streaks, and lack on features I do not need.
-I try to avoid bugs by covering the promised features with tests but it's far from complete coverage.
+That, however, makes it susceptible to being developed in isolated streaks, and lack of features I do not need.
+I try to avoid bugs by covering the promised features with tests, but it's far from complete coverage.

 Where can you help (as a developer)?
 --------------
@@ -278,8 +280,11 @@ What's new
 ---------
 <details><summary>What's new</summary>

-* `2023-12-01` Release 2.7.1 - Various fixes of annoying UX bugs. #151 #152 #153
-* `2023-11-xx` Rebased branch with reading from spreadsheets.
+* `2024-12-02` Release 2.10.1 - Added SQL functions to process JSON.
+* `2024-12-01` Release 2.9.0 - Added file type detection.
+* `2024-12-01` Release 2.8.0 - UX improvements - less garbage on the stderr.
+* `2024-12-01` Release 2.7.1 - Various fixes of annoying UX bugs. #151 #152 #153. HSQLDB 2.7.4.
+* `2024-11-xx` Rebased branch with reading from spreadsheets.
 * `2023-09-03` Release 2.7.0 - Allow output to STDOUT.
 * `2023-06-29` Release 2.6.0 - Allow setting table names (`-as`) for input files.
 * `2022-11-27` ~~Preparing 2.5.x - reading from spreadsheets (Excel/XLS, LibreOffice/ODS, etc.)~~ Still in progress.
@@ -333,7 +338,25 @@ In case you use this in your project, then beware:
 4. Consider donating to [HSQLDB "SupportWare"](http://hsqldb.org/web/supportware.html).


-*Easter Egg: The original text I sent to JBoss mailing list when introducing the tool in 2011 :)*
+
+## What didn't fit elsewhere..
+
+#### Custom SQL functions
+
+CsvCruncher adds a couple of SQL functions to HSQLDB.
+
+* `jsonSubtree(path, json)` - Returns a JSON subtree (as JSON) at a given slash-separated path (`foo/bar`). Arrays not supported, but could be added.
+
+* `jsonLeaf(path, json)` - Like above, but expects the node to be a scalar, and returns the raw value rather than JSON serialization of it.
+
+* `jsonLeaves(pathToArray STRING, leavesSubpath STRING, json STRING, nullOnNonArray BOOLEAN)` - Returns the leaves from an array, extracted from the given subpath (of each item in that array). Returns it serialized to JSON - due to limitations of HSQLDB. Expects the leaves to be scalar.
+
+* ~~`jsonSubtrees(pathToArray, subpath, json)`~~ - Not implemented. It would do the same as `jsonLeaves()`, except it would put the sub-nodes (rather than only scalars) to an array of subtrees. Let me know if you need it. (The reason why `jsonSubtrees()` is missing is that originally, `jsonLeaves()` was supposed to return a SQL type `ARRAY`, but that is not supported by HSQLDB.)
+
+
+#### Memories
+
+The original text I sent to JBoss mailing list when introducing the tool in 2011 :)

 > Hi,
 >
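
The custom SQL functions documented in this commit could be tried out roughly as sketched below. This is an illustration under assumptions, not a tested recipe: the file `events.csv`, the alias `events`, the `payload` column and the output file `names.csv` are all hypothetical, and it assumes that JSON stored in a CSV column reaches the functions as a plain string. Only the `-in`, `-as`, `-out` and `-sql` options and the `jsonLeaf(path, json)` / `jsonSubtree(path, json)` signatures are taken from the README text.

```shell
# Hypothetical sample input: a CSV whose "payload" column holds one JSON document per row.
cat > events.csv <<'EOF'
id,payload
1,"{""user"":{""name"":""Ann"",""address"":{""city"":""Brno""}}}"
2,"{""user"":{""name"":""Bob"",""address"":{""city"":""Praha""}}}"
EOF

# jsonLeaf() is described as returning the raw scalar value,
# jsonSubtree() as returning the node serialized as JSON.
QUERY="SELECT id, jsonLeaf('user/name', payload) AS userName, jsonSubtree('user/address', payload) AS address FROM events"

# Run only where the crunch launcher is present in the current directory;
# otherwise just print the command that would be run.
if [ -x ./crunch ]; then
  ./crunch -in events.csv -as events -out names.csv -sql "$QUERY"
else
  echo "crunch not found; would run: ./crunch -in events.csv -as events -out names.csv -sql \"$QUERY\""
fi
```

Paths are slash-separated and, per the README, array nodes are not supported by `jsonSubtree()`, so the sketch only addresses object members.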
