Releases: dlt-hub/dlt
0.3.1
What's Changed
- add computed exhausted property by @sh-rp in #380
- removes the unpicklable lambdas from destination caps and updates tests by @rudolfix in #404
- add secrets format option to dlt deploy by @sh-rp in #401
- Feat: Use compression to maximize network and disk space efficiency by @z3z1ma in #415
- 379 round robin pipe iterator by @sh-rp in #421
Docs
- adding article by @TongHere in #411
- GPT Training fix link by @TongHere in #417
- Docs: deploy airflow by @AstrakhantsevaAA in #410
- restructured docs: new Getting Started and dlt Ecosystem by @rahuljo in #398 and @adrianbr in #408
- Added Jira Docs by @dat-a-man in #425
- add structured data lake, fix titles by @adrianbr in #419
- adds duckdb->bigquery walkthrough by @rudolfix in #392
- Added sql_database pipeline by @dat-a-man in #396
- Added stripe setup guide by @dat-a-man in #394
- Added Workable pipeline docs by @dat-a-man in #395
- Added salesforce docs by @dat-a-man in #413
- Added Notion Docs by @dat-a-man in #409
- Added Mux docs by @dat-a-man in #412
Full Changelog: 0.3.0...0.3.1
0.3.0
Core Library
- renames Pipelines to Verified Sources by @rudolfix in #382
- adds tests to build containers, removes psutil by @rudolfix in #373
- finalizes where the resource state is stored in pipeline state by @rudolfix in #374
- accepts explicit values for unions if type of value is one of types by @rudolfix in #377
- add quotes to missing dependency exception output by @sh-rp in #387
- Feat/Add transaction management for filesystem operations using fsspec by @z3z1ma in #384
Minor Version Changes
- The source name is now the key in pipeline state under which all source and resource state is stored. Previously the source section (the name of the Python module where the source was defined) was used. This change affects already deployed pipelines whose source name differs from the module name: they will not see the previously stored state and may, for example, load some data twice. The only verified source affected by this is zendesk. A hedged illustration of the affected case follows below.
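A minimal sketch, assuming a hypothetical module `zendesk_module.py` whose source is explicitly named `zendesk`: state is now keyed by the source name rather than the module name.

```python
import dlt

# file: zendesk_module.py (hypothetical)
# Before 0.3.0 state was stored under the module name "zendesk_module";
# from 0.3.0 on it is stored under the explicit source name "zendesk".
@dlt.source(name="zendesk")
def zendesk_support(api_token: str = dlt.secrets.value):
    @dlt.resource(write_disposition="append")
    def tickets():
        yield [{"id": 1, "subject": "hello"}]

    return tickets
```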
Docs
- rewrites the sections on source, resource and pipeline state by @rudolfix in #376
- minor changes to schema evolution doc by @rahuljo in #372
- pushing experiment 4 blog by @rahuljo in #371
- update docusaurus and fix gtag by @sh-rp in #385
- add section landing pages to docusaurus by @sh-rp in #386
Full Changelog: 0.2.9...0.3.0
0.2.9
Core Library
- dlt source decomposition into Airflow DAG by @rudolfix in #352
- airflow dlt wrapper to run dlt pipelines as DAGs by @rudolfix in #357 (see the sketch after this list)
- dlt deploy airflow-composer by @AstrakhantsevaAA in #356
- new destination: filesystem/bucket with fsspec by @steinitzu in #342
- Update deprecated GitHub action by @tungbq in #345
- A base class for vault config providers with two implementations: a Google Secrets config provider and an Airflow config provider
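A hedged sketch of how the Airflow wrapper and DAG decomposition can be used (the DAG, source and destination here are hypothetical, and the helper's exact arguments may differ between versions):

```python
import dlt
import pendulum
from airflow.decorators import dag
from dlt.helpers.airflow_helper import PipelineTasksGroup

# hypothetical toy source used only for illustration
@dlt.source
def my_source():
    @dlt.resource
    def items():
        yield [{"id": 1}]

    return items

@dag(schedule_interval="@daily", start_date=pendulum.datetime(2023, 7, 1), catchup=False)
def load_items():
    # group of Airflow tasks that runs the dlt pipeline on the worker's local storage
    tasks = PipelineTasksGroup("items_pipeline", use_data_folder=False, wipe_local_data=True)
    pipeline = dlt.pipeline(pipeline_name="items_pipeline", destination="duckdb", dataset_name="items_data")
    # decompose the source into tasks that run one after another
    tasks.add_run(pipeline, my_source(), decompose="serialize", trigger_rule="all_done", retries=0)

load_items()
```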
Docs
- pushing experiment 3 blog post by @rahuljo in #361
- structured data lakes post by @adrianbr in #362
- Several fixes and improvements by @tungbq
New Contributors
- @AstrakhantsevaAA made their first contribution in #356
Full Changelog: 0.2.8...0.2.9
0.2.8
Core Library
- fixes various airflow deployment issues by @rudolfix in #334, including non-atomic renames on buckets mapped with FUSE
- bumps duckdb dependency to include 0.8.0
- Fix/incremental with timezone naive datetime by @steinitzu in #330
- splits schema migration script to fit in max query length by @rudolfix in #339
- `resource_state` got its final interface and is now exposed in `dlt.current.resource_state` #350
- adds a transformer overload that may be used when creating transformers dynamically to pass the decorated function
- `source.with_resources` creates a clone of the resource and selects in the clone; previously the source was modified in place
- you can write back the secrets and configuration using the `dlt.config` and `dlt.secrets` indexers (see the sketch after this list)
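A minimal sketch of the resource state interface and the writable `dlt.config` / `dlt.secrets` indexers (the keys and the resource used here are hypothetical):

```python
import dlt

# write configuration and secrets back through the indexers
dlt.config["runtime.log_level"] = "WARNING"
dlt.secrets["sources.my_source.api_key"] = "some-api-key"

@dlt.resource
def letters():
    # resource scoped state is persisted with the pipeline state between runs
    state = dlt.current.resource_state()
    seen = state.setdefault("seen", [])
    for letter in "abc":
        if letter not in seen:
            seen.append(letter)
            yield {"letter": letter}

print(dlt.pipeline(pipeline_name="state_demo", destination="duckdb").run(letters))
```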
Docs
- improve spaces on code samples by @TyDunn in #325
- Incremental loading image by @adrianbr in #318
- Adding matomo docs by @AmanGuptAnalytics in #331
- Adding asana dlt setup guide by @AmanGuptAnalytics in #319
- adding experiment 2 blog by @rahuljo in #336
- Fix Broken image in Docs > Pipelines > Google Analytics by @tungbq in #328
- Adding shopify docs by @AmanGuptAnalytics in #335
- Fixed the broken image link on zendesk page by @MirrorCraze in #337
- Fixed capitalization in docs by @burnash in #341
- Fix typo in docs/pipelines/asana.md by @tungbq in #344
- Fix analytics-engineer.md broken links by @tungbq in #349
- Correct typo in run pipeline guide of shopify.md by @tungbq in #347
New Contributors
- @tungbq made their first contribution in #328
- @MirrorCraze made their first contribution in #337
Full Changelog: 0.2.6...0.2.8
0.2.6
Core Library
- An experimental google secrets config provider #292 (actively used on our CI, goes to GA after adding more tests)
- Several bug fixes for the `dlt init` and `dlt pipeline` CLI commands
- We are shifting from pre-releases to patch versions, with post-releases for bugfixes and quick iterations, to allow upgrades with `pip install -U dlt`
Building Blocks
- add `dlt.sources.credentials` module with reusable credentials by @rudolfix in #315 (google service account, oauth2, database connection string and their base classes, available to be used in pipelines; see the sketch below)
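A rough sketch of using one of the reusable credential specs (the source name and connection layout are hypothetical); the argument is resolved from `secrets.toml` or environment variables and parsed into the typed spec:

```python
import dlt
from dlt.sources.credentials import ConnectionStringCredentials

@dlt.source(name="sql_db")
def sql_db(credentials: ConnectionStringCredentials = dlt.secrets.value):
    @dlt.resource
    def tables():
        # the spec parses and exposes the parts of the connection string
        yield {"drivername": credentials.drivername, "database": credentials.database}

    return tables
```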
Docs
- fix redshift docs by @TyDunn in #313
- User guides by @adrianbr in #301
- Updating landing page code snippet by @rahuljo in #314
- Updated README in `pipelines` repo with more building blocks examples and a guide on sharing community pipelines (https://github.com/dlt-hub/pipelines/blob/master/README.md#read-the-docs-on-building-blocks)
Full Changelog: 0.2.6a1...0.2.6
0.2.6a1
Core library
- Feat/pipeline drop command by @steinitzu in #285
- collectors and progress bars by @rudolfix in #302
Customizations
- Feat/new `add_limit` method for resources by @z3z1ma in #298
- The same method was added to sources. Overall, you can now quickly sample large sources to e.g. create example data sets or test your transformations, without the need to load everything (see the sketch after this list)
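A minimal sketch of sampling with `add_limit` (the resource and limit are hypothetical):

```python
import dlt

@dlt.resource
def big_table():
    # imagine this yields many pages of records
    for page in range(1_000_000):
        yield {"page": page}

pipeline = dlt.pipeline(pipeline_name="sampling_demo", destination="duckdb")
# take only the first 10 items yielded by the resource instead of loading everything
print(pipeline.run(big_table().add_limit(10)))
```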
Docs
- explains how to set logging level and format by @rudolfix in #297
- ga4 internal dashboard demo blog post by @TyDunn in #299
- Added google_analytics docs by @AmanGuptAnalytics in #305
- Update README, add contributor's guide by @burnash in #311
- progress bars docs by @rudolfix in #312
New Contributors
- @z3z1ma made their first contribution in #298
- @ashish-weblianz made their first contribution in #306
Full Changelog: 0.2.6a0...0.2.6a1
0.2.6a0
New package name and pip install command
💡 We changed the package name to dlt!
pip install dlt
Core library
- PyPI package name: migrate to `dlt` by @burnash in #264
- adds anonymous id to telemetry by @rudolfix in #284
- makes the duckdb database follow the current working directory by @rudolfix in #291
- you can disable unique checks in incremental loading by passing an empty tuple as `primary_key` to `dlt.sources.incremental` (see the sketch after this list)
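A minimal sketch of disabling the unique checks (the resource and cursor column are hypothetical):

```python
import dlt

@dlt.resource
def events(
    # an empty tuple as primary_key disables deduplication on the incremental cursor
    created_at=dlt.sources.incremental("created_at", initial_value="2023-01-01", primary_key=()),
):
    yield [{"id": 1, "created_at": "2023-06-01"}]

print(dlt.pipeline(pipeline_name="incremental_demo", destination="duckdb").run(events))
```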
Helpers
The first of a series of Airflow helpers and features: store `secrets.toml` in an Airflow Variable and have your credentials injected automatically. The same code works locally and in an Airflow DAG.
Building blocks
When building pipelines you can now use specs that wrap Google credentials. We support service credentials and OAuth2 credentials, detect default credentials, provide authorization methods, etc. More info on the credentials below will soon be added to our docs and to some example pipelines.
```python
from dlt.common.configuration.specs import GcpClientCredentials, GcpClientCredentialsWithDefault, GcpOAuthCredentials, GcpOAuthCredentialsWithDefault
```
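A rough usage sketch of these specs (the source name is hypothetical; credentials resolve from secrets, or from default Google credentials for the `WithDefault` variants):

```python
import dlt
from dlt.common.configuration.specs import GcpClientCredentials

@dlt.source(name="gcp_demo")
def gcp_demo(credentials: GcpClientCredentials = dlt.secrets.value):
    @dlt.resource
    def projects():
        # the resolved spec exposes fields such as the project id
        yield {"project_id": credentials.project_id}

    return projects
```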
Docs
- update doc with new alert capability by @adrianbr in #275
- updating the documentation for the section 'Transforming the data' by @rahuljo in #277
- first version of `understanding the tables` content by @TyDunn in #258
- Rename PyPI package to `dlt` in the docs by @burnash in #282
- pushing the new colab demo by @rahuljo in #288
- updates explore/transform the data in Python by @rudolfix in #289
- update a typo in create-a-pipeline.md by @Anggi-Permana-Harianja in #290
New Contributors
- @Anggi-Permana-Harianja made their first contribution in #290
- @redicane made their first contribution in #254
Full Changelog: 0.2.0a32...0.2.6a0
0.2.0a32
What's Changed in Docs
- moving to new docs structure by @TyDunn in #245
- adds Algolia DocSearch to the dlt docs 🚀 by @TyDunn in #248
- Zendesk pipeline docs by @AmanGuptAnalytics in #222
- Added Hubspot setup guide by @AmanGuptAnalytics in #250
- moving `create a pipeline` to use weatherapi and duckdb by @TyDunn in #255
- first version of `exploring the data` docs page by @TyDunn in #257
- adds schema general usage and schema adjusting walkthrough to docs by @rudolfix in #243
- filling in deploying section by @TyDunn in #262
- Examples for customisations by @adrianbr in #247
What's Changed
- Typed pipeline state by @steinitzu in #239
- allows `incremental` to be passed to the `resource.apply_hints()` method (see the sketch after this list)
- adds a `state` property to sources and resources to get the actual value of source and resource scoped state
- Fix failing tests for Redshift and PostgreSQL by @burnash in #270
- add resource name to table schema by @steinitzu in #265
- resets the resource scoped state when doing a replace on a resource
- you can add `Incremental` as a transform step, instead of injecting it
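A minimal sketch of attaching `incremental` via `apply_hints` instead of declaring it in the resource signature (names and values are hypothetical):

```python
import dlt

@dlt.resource(name="events", write_disposition="append")
def events():
    # a toy payload; a real resource would filter on the incremental's last_value
    yield [{"id": 1, "created_at": "2023-06-01T00:00:00Z"}]

# add the Incremental as a transform step after the resource is defined
events.apply_hints(
    incremental=dlt.sources.incremental("created_at", initial_value="2023-01-01T00:00:00Z")
)

print(dlt.pipeline(pipeline_name="hints_demo", destination="duckdb").run(events))
```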
Full Changelog: 0.2.0a30...0.2.0a32
0.2.0a30
What's Changed
This release includes two important features:
- `merge` write disposition: load data incrementally by merging with merge keys and/or deduplicate/upsert with primary keys
- incremental loading with last value and `dlt` state available when declaring resources

We consider those features still in alpha. Try them out and report bugs! Preliminary documentation is here: https://dlthub.com/docs/customization/incremental-loading. A minimal sketch combining both features follows below.
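A minimal sketch (table, columns and values are hypothetical), assuming a duckdb destination:

```python
import dlt

@dlt.resource(primary_key="id", write_disposition="merge")
def issues(
    updated_at=dlt.sources.incremental("updated_at", initial_value="2023-01-01T00:00:00Z"),
):
    # a real resource would request only records newer than updated_at.last_value
    yield [
        {"id": 1, "title": "first", "updated_at": "2023-02-01T00:00:00Z"},
        {"id": 2, "title": "second", "updated_at": "2023-03-01T00:00:00Z"},
    ]

pipeline = dlt.pipeline(pipeline_name="merge_demo", destination="duckdb", dataset_name="issues_data")
print(pipeline.run(issues))
```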
This release also includes improved support for resources that use dynamic hints to dispatch data to several database tables, and other bug fixes.
What's Changed in docs
- Strapi setup guide by @TyDunn in #212
- add `edit this page` button on all docs pages by @TyDunn in #226
- adding alerting content from workshop by @TyDunn in #233
- adding monitoring content from workshop by @TyDunn in #229
- adding the chess pipeline documentation by @rahuljo in #237
- adds deduplication of staging dataset during merge by @rudolfix in #240
Full Changelog: 0.2.0a29...0.2.0a30
0.2.0a29
What's Changed
- Allow changing `write_disposition` in the resource without dropping the dataset by @burnash in #205
- Add a suffix to the default dataset name by @burnash in #207
- improves and adds several `dlt pipeline` commands: `info`, `trace`, `load-package`, `failed-jobs` and `sync` (https://dlthub.com/docs/command-line-interface#dlt-pipeline)
- extends `LoadInfo` to include the schema changes applied to the destination and a list of loaded package infos (https://dlthub.com/docs/running-in-production/running#inspect-save-and-alert-on-schema-changes)
- extends load info with `raise_on_failed_jobs` and `has_failed_jobs` to make handling failed jobs easier
- `LoadInfo` and `pipeline.last_trace` can be directly loaded into the destination to store more metadata on each load (https://dlthub.com/docs/running-in-production/running#inspect-and-save-the-load-info-and-trace) (see the sketch after this list)
- adds a retry strategy for `tenacity` to retry the `load` pipeline step (or any other, per request) (https://dlthub.com/docs/running-in-production/running#handle-exceptions-failed-jobs-and-retry-the-pipeline)
- `raise_on_failed_jobs` config option aborts the load package on the first failed job (https://dlthub.com/docs/running-in-production/running#failed-jobs)
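A hedged sketch of saving load metadata back to the destination (pipeline, data and table names are hypothetical):

```python
import dlt

pipeline = dlt.pipeline(pipeline_name="demo", destination="duckdb", dataset_name="demo_data")
load_info = pipeline.run([{"id": 1}], table_name="items")

# abort if any job in the load package failed
load_info.raise_on_failed_jobs()

# store the load info and the last trace in the destination for later inspection
pipeline.run([load_info], table_name="_load_info")
pipeline.run([pipeline.last_trace], table_name="_trace")
```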
What's Changed in docs
- Fix typos and wording in docs/concepts/state by @burnash in #200
- Fix a broken link in README.md by @burnash in #203
- replacing team@ with community@ by @TyDunn in #211
- GitHub and Google Sheets setup guides by @AmanGuptAnalytics in #195
- "run a pipeline" troubleshooting & walkthrough https://dlthub.com/docs/walkthroughs/run-a-pipeline
- "run a pipeline in production": https://dlthub.com/docs/running-in-production/running
- `dlt pipeline` command: https://dlthub.com/docs/command-line-interface#dlt-pipeline
Full Changelog: 0.2.0a28...0.2.0a29