
master merge for 0.4.6 release #1054

Merged 28 commits on Mar 6, 2024

Commits
c78be61 Edit arrow-pandas.md and fix a typo (Bl3f, Feb 26, 2024)
804459e add missing extra for athena iceberg test workflow (sh-rp, Feb 27, 2024)
329c572 Fix Spelling (#1019) (Pipboyguy, Feb 28, 2024)
63694c8 Fix/from fork (#1018) (AstrakhantsevaAA, Feb 28, 2024)
002be1f Merge pull request #1021 from dlt-hub/d#/fix_athena_test_workflow (sh-rp, Feb 28, 2024)
4056ff2 add pyairbyte blog (#1024) (adrianbr, Feb 28, 2024)
a7eb133 Update "create destination" docs with new file layouts (steinitzu, Feb 29, 2024)
b47edba feat(airflow): expose the run method (#1014) (IlyaFaer, Feb 29, 2024)
b622e17 removes sql alchemy dependency and port parts of URL class (#1028) (rudolfix, Mar 1, 2024)
4a3d6ab Parallelize decorator (#965) (steinitzu, Mar 1, 2024)
7e1163b fixes naive datetime bug in incremental (#1020) (rudolfix, Mar 1, 2024)
c64cd87 Merge pull request #1001 from Bl3f/patch-1 (sh-rp, Mar 1, 2024)
ffd653c Merge pull request #1032 from dlt-hub/sthor/destination-docs (sh-rp, Mar 1, 2024)
1119539 Docs update on how to set query limits. (#973) (dat-a-man, Mar 1, 2024)
fc34dd0 Import missing pyarrow compute for transforms on arrowitems (#1010) (sh-rp, Mar 2, 2024)
c6f0ee1 clear normalized package in case it already existed (#1012) (sh-rp, Mar 4, 2024)
442a2cc Add __main__ entry point to support calling dlt as python module (#1023) (sultaniman, Mar 4, 2024)
483d4ae fix(core): validation error with TTableHintTemplate (#1039) (IlyaFaer, Mar 4, 2024)
cfb1f91 Docs/Updated for slack alerts. (#1042) (dat-a-man, Mar 5, 2024)
9410bc4 change timing in round robin test (#1049) (sh-rp, Mar 5, 2024)
d9bae5a adds test case where payload data contains PUA unicode characters (#1… (willi-mueller, Mar 5, 2024)
4b9446c fix add_limit behavior in edge cases (#1052) (sh-rp, Mar 5, 2024)
41614a6 adds row_order to incremental (#1041) (rudolfix, Mar 5, 2024)
23d5222 Quick fix to serialize load metrics as list instead of a dictionary (… (sultaniman, Mar 5, 2024)
adb6aa4 fix import schema yaml (#1013) (sh-rp, Mar 5, 2024)
f608131 bumps dlt version to 0.4.6 (rudolfix, Mar 5, 2024)
2ffcd97 Merge branch 'master' into devel (rudolfix, Mar 5, 2024)
3761335 fixes wrong link in alerting docs (rudolfix, Mar 6, 2024)
4 changes: 2 additions & 2 deletions .github/workflows/test_common.yml

@@ -83,11 +83,11 @@ jobs:
         run: poetry install --no-interaction -E duckdb --with sentry-sdk

       - run: |
-          poetry run pytest tests/pipeline/test_pipeline.py
+          poetry run pytest tests/pipeline/test_pipeline.py tests/pipeline/test_import_export_schema.py
         if: runner.os != 'Windows'
         name: Run pipeline smoke tests with minimum deps Linux/MAC
       - run: |
-          poetry run pytest tests/pipeline/test_pipeline.py
+          poetry run pytest tests/pipeline/test_pipeline.py tests/pipeline/test_import_export_schema.py
         if: runner.os == 'Windows'
         name: Run smoke tests with minimum deps Windows
         shell: cmd
2 changes: 1 addition & 1 deletion .github/workflows/test_destination_athena_iceberg.yml

@@ -65,7 +65,7 @@ jobs:

       - name: Install dependencies
         # if: steps.cached-poetry-dependencies.outputs.cache-hit != 'true'
-        run: poetry install --no-interaction -E --with sentry-sdk --with pipeline
+        run: poetry install --no-interaction -E athena --with sentry-sdk --with pipeline

       - name: create secrets.toml
         run: pwd && echo "$DLT_SECRETS_TOML" > tests/.dlt/secrets.toml
3 changes: 2 additions & 1 deletion .github/workflows/test_doc_snippets.yml

@@ -17,7 +17,8 @@ env:

   # Slack hook for chess in production example
   RUNTIME__SLACK_INCOMING_HOOK: ${{ secrets.RUNTIME__SLACK_INCOMING_HOOK }}
-
+  # detect if the workflow is executed in a repo fork
+  IS_FORK: ${{ github.event.pull_request.head.repo.fork }}
 jobs:

   run_lint:
2 changes: 1 addition & 1 deletion Makefile

@@ -24,7 +24,7 @@ help:
 	@echo "		test"
 	@echo "			tests all the components including destinations"
 	@echo "		test-load-local"
-	@echo "			tests all components unsing local destinations: duckdb and postgres"
+	@echo "			tests all components using local destinations: duckdb and postgres"
 	@echo "		test-common"
 	@echo "			tests common components"
 	@echo "		test-and-lint-snippets"
4 changes: 4 additions & 0 deletions dlt/__main__.py

@@ -0,0 +1,4 @@
+from dlt.cli._dlt import main
+
+if __name__ == "__main__":
+    main()
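The new `dlt/__main__.py` is what makes `python -m dlt` work. A minimal, self-contained sketch of the same entry-point convention; the `main` function below is a hypothetical stand-in, not dlt's actual CLI:

```python
import sys


def main(argv=None):
    """Hypothetical stand-in for a CLI entry point; returns an exit code."""
    # A real CLI would dispatch on subcommands here; this stub just echoes them.
    args = sys.argv[1:] if argv is None else argv
    print(f"cli invoked with: {args}")
    return 0


# When a package ships a __main__.py, `python -m <package>` executes it with
# __name__ set to "__main__", so this guard fires on module invocation.
if __name__ == "__main__":
    sys.exit(main())
```

Invoking `python -m dlt` then routes into `dlt.cli._dlt.main` exactly as a direct `dlt` console-script call would.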
(file name missing from this view)

@@ -1,5 +1,5 @@
 from typing import Any, ClassVar, Dict, List, Optional
-from sqlalchemy.engine import URL, make_url
+from dlt.common.libs.sql_alchemy import URL, make_url
 from dlt.common.configuration.specs.exceptions import InvalidConnectionString

 from dlt.common.typing import TSecretValue

@@ -26,6 +26,7 @@ def parse_native_representation(self, native_value: Any) -> None:
         # update only values that are not None
         self.update({k: v for k, v in url._asdict().items() if v is not None})
         if self.query is not None:
+            # query may be immutable so make it mutable
             self.query = dict(self.query)
     except Exception:
         raise InvalidConnectionString(self.__class__, native_value, self.drivername)
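The `self.query = dict(self.query)` line above copies the URL's query mapping because it may be read-only. A small stdlib sketch of why the copy is needed; `MappingProxyType` here is a stand-in for an immutable query mapping, not the actual URL implementation:

```python
from types import MappingProxyType

# An immutable mapping standing in for a URL's read-only `query` attribute.
query = MappingProxyType({"sslmode": "require"})

# Writing through the proxy would raise TypeError, so a mutable copy is taken
# first, mirroring the `self.query = dict(self.query)` line in the diff above.
mutable_query = dict(query)
mutable_query["connect_timeout"] = "30"
```

Without the copy, any later credential update that touches the query would fail on an item-assignment error.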
1 change: 0 additions & 1 deletion dlt/common/data_types/type_helpers.py

@@ -13,7 +13,6 @@
 from dlt.common.data_types.typing import TDataType
 from dlt.common.time import (
     ensure_pendulum_datetime,
-    parse_iso_like_datetime,
     ensure_pendulum_date,
     ensure_pendulum_time,
 )
6 changes: 6 additions & 0 deletions dlt/common/libs/numpy.py

@@ -0,0 +1,6 @@
+from dlt.common.exceptions import MissingDependencyException
+
+try:
+    import numpy
+except ModuleNotFoundError:
+    raise MissingDependencyException("DLT Numpy Helpers", ["numpy"])
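The new `dlt/common/libs/numpy.py` follows dlt's optional-dependency guard pattern: import lazily, and convert `ModuleNotFoundError` into an error that names the missing extra. A self-contained sketch of the pattern; the exception class and `import_optional` helper below are stand-ins, the real `MissingDependencyException` lives in `dlt.common.exceptions`:

```python
import importlib


class MissingDependencyException(Exception):
    """Stand-in for dlt's exception class in dlt.common.exceptions."""

    def __init__(self, caller, dependencies):
        self.caller = caller
        self.dependencies = dependencies
        msg = f"{caller} requires missing packages: {', '.join(dependencies)}"
        super().__init__(msg)


def import_optional(module_name, caller):
    # Import at call time so the failure message can name what to install,
    # instead of crashing with a bare ImportError at package import time.
    try:
        return importlib.import_module(module_name)
    except ModuleNotFoundError:
        raise MissingDependencyException(caller, [module_name])
```

The guard keeps `dlt` importable without numpy installed; only code paths that actually need the library surface the actionable error.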
10 changes: 6 additions & 4 deletions dlt/common/libs/pyarrow.py

@@ -1,4 +1,5 @@
 from datetime import datetime, date  # noqa: I251
+from pendulum.tz import UTC
 from typing import Any, Tuple, Optional, Union, Callable, Iterable, Iterator, Sequence, Tuple

 from dlt import version

@@ -14,6 +15,7 @@
 try:
     import pyarrow
     import pyarrow.parquet
+    import pyarrow.compute
 except ModuleNotFoundError:
     raise MissingDependencyException(
         "dlt parquet Helpers", [f"{version.DLT_PKG_NAME}[parquet]"], "dlt Helpers for for parquet."

@@ -314,21 +316,21 @@ def is_arrow_item(item: Any) -> bool:
     return isinstance(item, (pyarrow.Table, pyarrow.RecordBatch))


-def to_arrow_compute_input(value: Any, arrow_type: pyarrow.DataType) -> Any:
+def to_arrow_scalar(value: Any, arrow_type: pyarrow.DataType) -> Any:
     """Converts python value to an arrow compute friendly version"""
     return pyarrow.scalar(value, type=arrow_type)


-def from_arrow_compute_output(arrow_value: pyarrow.Scalar) -> Any:
-    """Converts arrow scalar into Python type. Currently adds "UTC" to naive date times."""
+def from_arrow_scalar(arrow_value: pyarrow.Scalar) -> Any:
+    """Converts arrow scalar into Python type. Currently adds "UTC" to naive date times and converts all others to UTC"""
     row_value = arrow_value.as_py()
     # dates are not represented as datetimes but I see connector-x represents
     # datetimes as dates and keeping the exact time inside. probably a bug
     # but can be corrected this way
     if isinstance(row_value, date) and not isinstance(row_value, datetime):
         row_value = pendulum.from_timestamp(arrow_value.cast(pyarrow.int64()).as_py() / 1000)
     elif isinstance(row_value, datetime):
-        row_value = pendulum.instance(row_value)
+        row_value = pendulum.instance(row_value).in_tz("UTC")
     return row_value
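The `from_arrow_scalar` change above pins every returned datetime to UTC: naive values are assumed UTC, aware values are converted. A stdlib-only sketch of that normalization rule, using a hypothetical `normalize_to_utc` in place of dlt's pendulum-based code:

```python
from datetime import datetime, timezone


def normalize_to_utc(dt: datetime) -> datetime:
    # Hypothetical equivalent of `pendulum.instance(row_value).in_tz("UTC")`:
    # a naive datetime is stamped as UTC without shifting the wall-clock time;
    # an aware datetime is converted so the instant in time is preserved.
    if dt.tzinfo is None:
        return dt.replace(tzinfo=timezone.utc)
    return dt.astimezone(timezone.utc)
```

Normalizing at this boundary means downstream comparisons between incremental cursor values never mix naive and aware datetimes, which is the class of bug the `naive datetime bug in incremental (#1020)` commit addresses.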