This repository has been archived by the owner on Jun 30, 2022. It is now read-only.
Version 0.2.2
The 0.2.2 release includes the following changes:
- Improved memory footprint for DirectPipelineRunner.
- Multiple bug fixes (BigQuerySink schema handling for record field types, clearer error messages for missing files, etc.).
- Several performance improvements (Cythonized some files, reduced debug logging, etc.).
- New example using more complex BigQuery schemas.
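
For readers unfamiliar with record field types in BigQuery schemas, here is a minimal sketch of what a nested schema looks like. It uses only the Python standard library and the JSON shape of a BigQuery table schema (`fields`, `name`, `type`, `mode`). It does not call this SDK or show the BigQuerySink API. The field names and the `leaf_paths` helper are hypothetical, chosen for illustration only.

```python
import json

# Hypothetical table schema in the JSON shape BigQuery uses for table schemas.
# All field names below are illustrative, not taken from the SDK examples.
schema = {
    "fields": [
        {"name": "user", "type": "STRING", "mode": "REQUIRED"},
        {
            # A single nested record.
            "name": "address",
            "type": "RECORD",
            "mode": "NULLABLE",
            "fields": [
                {"name": "city", "type": "STRING", "mode": "NULLABLE"},
                {"name": "zip", "type": "STRING", "mode": "NULLABLE"},
            ],
        },
        {
            # A repeated record, i.e. a list of nested rows.
            "name": "orders",
            "type": "RECORD",
            "mode": "REPEATED",
            "fields": [
                {"name": "id", "type": "INTEGER", "mode": "REQUIRED"},
                {"name": "total", "type": "FLOAT", "mode": "NULLABLE"},
            ],
        },
    ]
}


def leaf_paths(fields, prefix=""):
    """Return dotted paths of all non-RECORD fields, depth first."""
    paths = []
    for field in fields:
        path = prefix + field["name"]
        if field["type"] == "RECORD":
            # Recurse into the nested field list of a record.
            paths.extend(leaf_paths(field.get("fields", []), path + "."))
        else:
            paths.append(path)
    return paths


if __name__ == "__main__":
    print(json.dumps(leaf_paths(schema["fields"])))
```

Walking the schema recursively like this is one simple way to check that every record field declares its nested fields before the schema is handed to a sink.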
This release supports only batch execution. Streaming processing is not available yet.
Batch execution can run locally (for development and testing) or on Google Cloud using the Cloud Dataflow service. Running on Google Cloud requires whitelisting using this form.