Change Log

This project follows Semantic Versioning.
Large XLSX cell values being truncated caused a panic when the truncation fell inside a Unicode character.
Large XLSX cell values being truncated caused a panic when multithreading.
Upgrade deps; low_memory option for the API.
Upgrade deps; better build times due to latest DuckDB.
arrays_as_table option added to convert all arrays to their own tables.
Errors get raised for PostgreSQL conversion.
Allow multiple files when downloading from S3.
Stop detecting floats where precision is too low.
Fixed CSV output to S3 being broken in some cases.
Stop a csv directory being made when using S3.
JSON input sources from STDIN, HTTP and S3; all inputs can be gzipped if they have a .gz ending.
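The .gz convention can be sketched with the Python stdlib. This is only an illustration of the behaviour described; flatterer's actual input handling is in Rust, and the file name here is invented.

```python
import gzip
import json
import os
import tempfile

def read_json_input(path):
    """Open a JSON input, transparently decompressing when it ends in .gz."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt", encoding="utf-8") as f:
        return json.load(f)

# Round-trip a gzipped sample file.
tmp = os.path.join(tempfile.mkdtemp(), "sample.json.gz")
with gzip.open(tmp, "wt", encoding="utf-8") as f:
    json.dump([{"id": 1}], f)

data = read_json_input(tmp)
print(data)  # → [{'id': 1}]
```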
Command line now accepts multiple files from any source.
Better type guessing for database inserts.
no_link option that removes _link fields from the output.
Truncate cells that are larger than XLSX allows.
Allow more rows in XLSX in non-threaded mode.
WebAssembly version of libflatterer, available at https://lite.flatterer.dev/.
Upgrade to Vue 3 and Vite for the web frontend.
Ignore blank lines in JSON Lines files.
Better errors when too many files are open.
Support Python 3.11.
Fixed error when writing larger XLSX files.
CORS for the web API.
Local web interface for exploring flatterer features: flatterer --web.
evolve option for SQLite and Postgres: can add data to existing tables and will alter tables if new fields are needed.
drop option for SQLite.
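The evolve behaviour can be illustrated with the stdlib sqlite3 module. This is a sketch of the idea (insert into an existing table, ALTER it when a new field appears), not flatterer's implementation; table and field names are invented.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE main (id INTEGER, name TEXT)")
con.execute("INSERT INTO main VALUES (1, 'first')")

def evolve_insert(con, table, row):
    """Insert a row, adding any columns the table does not yet have."""
    existing = {r[1] for r in con.execute(f"PRAGMA table_info({table})")}
    for field in row:
        if field not in existing:
            con.execute(f"ALTER TABLE {table} ADD COLUMN {field} TEXT")
    cols = ", ".join(row)
    marks = ", ".join("?" for _ in row)
    con.execute(f"INSERT INTO {table} ({cols}) VALUES ({marks})", list(row.values()))

# New data carries an extra "status" field: the table is altered, not rejected.
evolve_insert(con, "main", {"id": 2, "name": "second", "status": "new"})
rows = con.execute("SELECT id, name, status FROM main ORDER BY id").fetchall()
print(rows)  # → [(1, 'first', None), (2, 'second', 'new')]
```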
Postgres connection from an environment variable.
sql_script option to export SQLite and Postgres scripts that make the output backward compatible with earlier versions.
pushdown option: copy data from top-level objects down to child (one-to-many) tables. This is useful if the data has its own keys (such as id fields) that you want to exist in the related tables, and for denormalizing the data so that querying on a common field requires less joining.
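A minimal sketch of what pushdown does, in plain Python (illustrative only; the record shape and field names are made up): a named top-level field is copied onto every row of the one-to-many child table.

```python
def pushdown(record, child_key, fields):
    """Copy selected top-level fields down onto each child row."""
    copied = {f: record[f] for f in fields}
    return [{**copied, **child} for child in record.get(child_key, [])]

record = {"id": "A1", "title": "example", "items": [{"sku": 1}, {"sku": 2}]}
items = pushdown(record, "items", ["id"])
print(items)  # → [{'id': 'A1', 'sku': 1}, {'id': 'A1', 'sku': 2}]
```

With the parent id present on every child row, queries on the child table no longer need a join back to the main table.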
postgres option: export to a Postgres database by supplying a connection string.
files option, so multiple files can be supplied at once.
Threads option can now output XLSX.
Threads option, so flatterer can run on all cores. Works best with ndjson input.
Parquet export option.
BREAKING: json-lines option renamed to ndjson.
New json-stream option that works the same way as the old json-lines option and accepts concatenated JSON.
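Concatenated JSON (values written back to back, with or without whitespace between them) can be parsed with the stdlib, which illustrates the kind of input the json-stream option accepts. This is a sketch, not flatterer's parser.

```python
import json

def parse_concatenated(text):
    """Yield each top-level JSON value from a concatenated stream."""
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(text):
        obj, end = decoder.raw_decode(text, pos)
        yield obj
        # Skip any whitespace between values before the next decode.
        pos = end
        while pos < len(text) and text[pos].isspace():
            pos += 1

stream = '{"a": 1}{"a": 2} {"a": 3}'
values = list(parse_concatenated(stream))
print(values)  # → [{'a': 1}, {'a': 2}, {'a': 3}]
```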
Datapackage output uses the correct date type.
Lists of strings are now escaped the same way as optionally quoted CSVs.
Clearer errors when an error happens in Rust. BREAKING CHANGE: if catching certain error types in Python, these may have changed.
Datapackage output now has foreign keys.
Python Decimal converted to float, not string.
SQLite export uses less memory.
SQLite export has indexes and foreign key constraints.
main_table_name being a number caused an exception.
Fixed supplying a list of JSON strings to flatten.
datapackage.json named correctly.
flatten Python function now accepts an iterator.
Docs for flatten.
Tests for flatten.
iterator_flatten is now deprecated as it is just a subset of `flatten`.
More lenient if the tmp directory cannot be deleted.
Preview option in the Python CLI and library.
SQLite export option.
Support a top-level object: all lists of objects are streamed and the top-level object's data is saved in the main table.
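A sketch of that split in plain Python (field names invented for illustration): the list of objects becomes the streamed rows, while the remaining top-level fields go to the main table.

```python
def split_top_level(obj, list_key):
    """Separate a top-level object's own fields from the list to stream."""
    meta = {k: v for k, v in obj.items() if k != list_key}
    return meta, obj.get(list_key, [])

doc = {"publisher": "acme", "version": "1.1", "releases": [{"id": 1}, {"id": 2}]}
meta, releases = split_top_level(doc, "releases")
print(meta)      # → {'publisher': 'acme', 'version': '1.1'}
print(releases)  # → [{'id': 1}, {'id': 2}]
```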
New yajlish parser for both JSON streams and arrays.
Library has a schema_guess function to tell if the data is a JSON stream or has an array of objects.
Empty objects do not produce a line in the output.
Ctrl-C support added.
Logging output improved.
Traceback not shown for CLI use.
Fixed occurrences where the output folder was not being deleted.
Tables File option: a tables.csv input to control tab names.
Beginning to use logging.
Better handling of long Excel sheet names. See https://github.com/kindly/flatterer/issues/12
field_type no longer required in fields.csv.
More human readable error messages.
Bad characters in XLSX are stripped and a warning raised.
Check limits on XLSX files and raise an error if exceeded.
Removed unwrap on channel send, to avoid a possible panic.
Table ordering of the output follows JSON input order, making xlsx and fields.csv table order reflect the input data.
Lib has a new FlatFiles::new_with_defaults() to make using the library less verbose.
Use insta for more tests.
Lib has a preview option, meaning CSV output will optionally show only a specified number of lines.
Paths to data in SQLite and Postgres start at the root of the output.
Clippy for linting and insta for tests.
Do less work when just exporting Metadata.
Minor speedup due to not using format! so much.
Change to PyPI metadata.
Tests run in GitHub Actions.
Fixed speed regression caused by the new error handling.
New error handling using anyhow, giving errors more context.
Schema option to supply a JSON Schema, making the field order the same as the schema.
Table prefix option to namespace exported tables.
PostgreSQL and SQLite scripts to load CSV data into databases.
Wheel builds for Windows and macOS, automatically published using GitHub Actions.
Inline One-to-One option, meaning that if an array only ever has one item across all the data, it is treated as a sub-object.
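The Inline One-to-One rule can be sketched in plain Python (illustrative, not flatterer's code): an array field can be inlined as a sub-object only when it holds at most one item in every record.

```python
def can_inline(records, field):
    """True when `field` is an array with at most one item in every record."""
    return all(len(r.get(field, [])) <= 1 for r in records)

records = [
    {"id": 1, "contact": [{"email": "a@example.com"}]},
    {"id": 2, "contact": []},
]
inlineable = can_inline(records, "contact")        # treat as sub-object

records.append({"id": 3, "contact": [{"x": 1}, {"x": 2}]})
inlineable_after = can_inline(records, "contact")  # keep a one-to-many table
print(inlineable, inlineable_after)  # → True False
```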