master merge for 0.4.6 release #1054

Merged
merged 28 commits into from
Mar 6, 2024
rudolfix (Collaborator) commented Mar 5, 2024

Description

devel to master merge for dlt release

Bl3f and others added 26 commits February 26, 2024 15:30
* is_fork env is back

* add skipifgithubfork to pdf-weaviate example
add missing extra for athena iceberg test workflow
* feat(airflow): expose the run method

* add docstrings
* Refactoring of pipe thread pool and reduce polling

* Decorator

* Tests for parallel

* Include items module

* Separate modules for pipe and pipe_iterator

* Overload arg order

* Keep parenthesis

* Use assert for sanity check

* Parallelize resource method, handle transformers

* Handle non-iterator transformers

* Rename WorkerPool -> FuturesPool, small cleanups

* Fix transformer, test bare generator

* Source parallelize skip invalid resources, docstring

* Handle wrapped generator function

* Don't test exec order, only check that multiple threads are used

* Poll futures with timeout, no check sources_count, test gen.close()

* Always block when submitting futures, remove redundant submit future

* adds additional sleep when futures pool is empty

* Update docs and snippets

* logs daemon signals message instead of printing

---------

Co-authored-by: Marcin Rudolf <[email protected]>
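
The "Poll futures with timeout" item above can be illustrated with a minimal sketch. It uses only the standard library and is not the dlt implementation: `slow_square`, `run_with_polling`, and the `poll_timeout` value are hypothetical names chosen for illustration.

```python
import time
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait


def slow_square(n: int) -> int:
    """Hypothetical unit of work that blocks briefly, like an I/O call."""
    time.sleep(0.01)
    return n * n


def run_with_polling(values, max_workers=4, poll_timeout=0.05):
    """Submit work to a thread pool and poll for results with a timeout.

    Waiting with a timeout (instead of spinning in a tight loop) keeps the
    consumer responsive while avoiding busy polling.
    """
    results = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        pending = {pool.submit(slow_square, v) for v in values}
        while pending:
            # Returns as soon as one future completes or the timeout elapses.
            done, pending = wait(
                pending, timeout=poll_timeout, return_when=FIRST_COMPLETED
            )
            for future in done:
                results.append(future.result())
    # Completion order is not deterministic, so sort before returning.
    return sorted(results)
```

Sorting the results mirrors the "Don't test exec order" item: completion order across threads is not guaranteed, so callers should not depend on it.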
* iso datetime parser keeps naive datetimes

* warns when naive datetimes are used in incremental

* makes sure row values are always UTC in json incremental transform

* ensures that arrow scalar <-> python is always pendulum UTC

* overrides resource schema from pyarrow schema in extractor, logs warnings

* adds timezone end to end test, now disabled

* fixes dict merge, leave dst values flag
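
As a rough illustration of the naive-datetime and UTC items above, the sketch below uses only the standard library. The helper `ensure_utc` is hypothetical, and treating a naive value as UTC is an assumption made for this example, not confirmed dlt behavior.

```python
import warnings
from datetime import datetime, timedelta, timezone


def ensure_utc(value: datetime) -> datetime:
    """Return `value` as a timezone-aware UTC datetime.

    Assumption for this sketch: a naive datetime is interpreted as UTC,
    and a warning is emitted so the caller notices the ambiguity.
    """
    if value.tzinfo is None:
        warnings.warn("naive datetime received; assuming UTC", stacklevel=2)
        return value.replace(tzinfo=timezone.utc)
    # Aware datetimes are converted, preserving the instant in time.
    return value.astimezone(timezone.utc)


# An aware value at UTC+2 is shifted to the same instant in UTC.
example = ensure_utc(
    datetime(2024, 3, 5, 12, 0, tzinfo=timezone(timedelta(hours=2)))
)
```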
Edit arrow-pandas.md and fix a typo
Update "create destination" docs with new file layouts
* Docs update on how to set query limits.

* Update

* Updated

---------

Co-authored-by: Dave <[email protected]>
* fix missing arrow compute for incrementals on arrow loads

* fix numpy and pandas imports
* allow creation / initialization of loadpackage even if it exists already

* clear load package from normalizer stage before new load and add proper test

* moves test to normalize

* prefers storage schema to package schema in normalizer

* replaces schema content with linking

* warns when package schema is different

* loads storage schema in normalizer so import schema is found

---------

Co-authored-by: Marcin Rudolf <[email protected]>
* fix(core): validation error with TTableHintTemplate

* add a callable field test
* Updated for slack alerts.

* Updated

* Updated with production example

* updated

* Updated chess production

* Updated
* fix add_limit behavior in edge cases

* update docs
* helper to chunk iterators

* wraps add_limit without evaluating gen

* adds row order to incremental

* adds method to close pipe early

* auto updates docs

* merges typing and items in extract

* raises when pipe cannot be closed

* uses original pipe when re-binding internal incremental, fixes late binding issue with parallelize

* cleans up parallelize wrapper

* updates incremental docs

* updates incremental docs
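
The "helper to chunk iterators" and "wraps add_limit without evaluating gen" items can be sketched with the standard library. This is an illustration under stated assumptions rather than the dlt code: `chunks` and `limited` are hypothetical names, and both stay lazy by never consuming the source until the result is iterated.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def chunks(items: Iterable[T], size: int) -> Iterator[List[T]]:
    """Lazily yield lists of up to `size` items from any iterable."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(items)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def limited(items: Iterable[T], max_items: int) -> Iterator[T]:
    """Cap an iterable at `max_items` without evaluating it up front."""
    return islice(items, max_items)
```

For example, `list(chunks(range(5), 2))` produces three lists, the last one holding the single leftover item.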
…1051)

* Quick fix to serialize load metrics as list instead of a dictionary

* Revert data type change for metrics

* Add load id to metrics

* Enrich metrics with load_id in StepInfo.asdict

* Check last_trace schema hashes

* Extend test suite for pipeline load info schema

* Add faker to generate test data

* Describe test scenario

* Adjust test description docstring

* Adjust test docstrings

* Revert airflow change

* Remove faker random seed

* Refactor test flow

* Remove faker

* Move tests to pipeline tests

* Move test data and resources inside the test

* fixes schema replace content hash tracking

* fixes trace shape test

---------

Co-authored-by: Marcin Rudolf <[email protected]>
* make sure import schema is respected if no other is present

* add some previously missing tests

* fix tests

* fix bug in loading import schema

* switch to dummy destination for tests

* runs import tests on ci

* load schema for live schemas will not commit live schema

* commits and rolls back live schemas in with_schema_sync, saves imported schema when no exception. uses load_schema to import schema on extract

* moves import schema tests

* fix bug in stored schema comparison

* adds comment on restoring schema names and default schema name on exception

* adds more tests and fixes one bug

* align resource and pure data source schema with regular schema

* fix linting and revert changes in basic resource schema resolution

* fix one test

* fix 2 review comments

---------

Co-authored-by: Marcin Rudolf <[email protected]>

netlify bot commented Mar 5, 2024

Deploy Preview for dlt-hub-docs ready!

Name Link
🔨 Latest commit 3761335
🔍 Latest deploy log https://app.netlify.com/sites/dlt-hub-docs/deploys/65e81a4a9d78e80008aab0a3
😎 Deploy Preview https://deploy-preview-1054--dlt-hub-docs.netlify.app/docs/examples

@rudolfix rudolfix merged commit 1957384 into master Mar 6, 2024
49 of 53 checks passed