Sync with upstream #23

Open
wants to merge 579 commits into base: master

Conversation

@drdee (Collaborator) commented Jun 25, 2024

Please follow the Pull Request Process while creating your Pull Request. You can use the below template to help you structure your thoughts.

Why

What is changing

Checklist

How does this PR handle security?

Screenshot

Next steps

nolar and others added 30 commits October 4, 2023 18:23
Unhide the implemented methods for ordering & repring the arith-texts
Fix the missing fields in database classes
Call parent's init before field initialisation, not after
Signed-off-by: Sarad Mohanan <[email protected]>
Co-authored-by: Dan Lawin <[email protected]>
…iya/data-diff into bigquery-dbt-impersonation
* duckdb joindiff works!

* fix example

* add motherduck example

* show a toml example with motherduck

* show that it's an example

* remove sensitive token from logs

* add token tests

* scrub token passing tests

* add duckdb to joindiff tests

---------

Co-authored-by: Sung Won Chung <[email protected]>
AbstractMixin_MD5->Mixin_MD5 to override pg version
The type annotations are incomplete and sometimes simply wrong. But at least they are annotated _somehow_, partially. More importantly, we have numerous simple classes & `attrs` classes, which could be used for type-checking properly without inferring them as `Any` — e.g., in structural pattern matching, such as `match col: case NumericType(precision=0): …`.

Proper type annotations and strict-mode MyPy checking will arrive later, eventually (it is a work in progress, just not a priority).
Mark data-diff as type-annotated
dlawin and others added 30 commits January 9, 2024 16:34
convert sets to list, don't include _ignored_columns_lock: threading.Lock in events
* update supported dbs

* remove submods

* fix import

* remove submods again

* update lock

---------

Co-authored-by: Sung Won Chung <[email protected]>
…m-xdb-int-and-float-type-mapping

LAB-271 Redshift Spectrum type mapping
Detect duplicate rows on each side
Improve error reporting for PK type mismatch
* closing connection once data diff is executed

Signed-off-by: Sarad Mohanan <[email protected]>

* typo fix

Signed-off-by: Sarad Mohanan <[email protected]>

* moving database to with block

Signed-off-by: Sarad Mohanan <[email protected]>

* optimizing _data_diff function

Signed-off-by: Sarad Mohanan <[email protected]>

* minor

Signed-off-by: Sarad Mohanan <[email protected]>

* minor

Signed-off-by: Sarad Mohanan <[email protected]>

* bug fix

Signed-off-by: Sarad Mohanan <[email protected]>

* linter fixes

Signed-off-by: Sarad Mohanan <[email protected]>

* formatting code

Signed-off-by: Sarad Mohanan <[email protected]>

* update CONTRIBUTING.md and cleanup ansi escape sequences

Signed-off-by: Sarad Mohanan <[email protected]>

* read postgres db config from common.CONN_STRINGS

Signed-off-by: Sarad Mohanan <[email protected]>

* defaulting postgres port

Signed-off-by: Sarad Mohanan <[email protected]>

* test

Signed-off-by: Sarad Mohanan <[email protected]>

* test for __main__

Signed-off-by: Sarad Mohanan <[email protected]>

* adding close db connection test

Signed-off-by: Sarad Mohanan <[email protected]>

* update test case

Signed-off-by: Sarad Mohanan <[email protected]>

* do not use shared connection

Signed-off-by: Sarad Mohanan <[email protected]>

* no connection sharing

Signed-off-by: Sarad Mohanan <[email protected]>

* no connection sharing

Signed-off-by: Sarad Mohanan <[email protected]>

* avoid list typing

Signed-off-by: Sarad Mohanan <[email protected]>

* Update tests/test_main.py

Co-authored-by: Sung Won Chung <[email protected]>

* Update tests/test_main.py

Co-authored-by: Sung Won Chung <[email protected]>

* Apply suggestions from code review

Co-authored-by: Sung Won Chung <[email protected]>

* Update tests/test_database.py

Co-authored-by: Sung Won Chung <[email protected]>

* minor

Signed-off-by: Sarad Mohanan <[email protected]>

* minor

Signed-off-by: Sarad Mohanan <[email protected]>

* minor

Signed-off-by: Sarad Mohanan <[email protected]>

* minor

Signed-off-by: Sarad Mohanan <[email protected]>

* merging 784

Signed-off-by: Sarad Mohanan <[email protected]>

* remove redundant variable

Signed-off-by: Sarad Mohanan <[email protected]>

---------

Signed-off-by: Sarad Mohanan <[email protected]>
Co-authored-by: Sung Won Chung <[email protected]>
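The commits above close the database connection once the diff has run, move the database into a `with` block, and stop sharing connections. A minimal sketch of that pattern, assuming SQLite from the standard library as a stand-in database and a hypothetical `row_count` helper (neither is taken from the actual change):

```python
import sqlite3
from contextlib import closing


def row_count(db_path: str, table: str) -> int:
    """Hypothetical helper: open a private connection, query, always close."""
    # closing() calls conn.close() when the block exits, even on error.
    # Each call opens its own connection, so nothing is shared between sides.
    with closing(sqlite3.connect(db_path)) as conn:
        # Create the table if missing so the sketch also runs on an empty database.
        conn.execute(f'CREATE TABLE IF NOT EXISTS "{table}" (id INTEGER PRIMARY KEY)')
        (count,) = conn.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()
    return count
```

Note that using a `sqlite3` connection directly as a context manager only commits or rolls back the transaction and does not close the connection, which is why `contextlib.closing` is used here.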
* edits

* style fixes by ruff

* Update README.md

Co-authored-by: Kira Furuichi <[email protected]>

* Update README.md

Co-authored-by: Kira Furuichi <[email protected]>

* kira's edit

Co-authored-by: Kira Furuichi <[email protected]>

---------

Co-authored-by: elliotgunn <[email protected]>
Co-authored-by: Kira Furuichi <[email protected]>
* quick debug logs

* fix the override method

* remove submods

* remove prints

* remove submods

* revert change

* Refactor dynamic database clause in DuckDB.py

* draft tests

* Add validation for input path in select_table_schema method

* style fixes by ruff

* Revert "Add validation for input path in select_table_schema method"

This reverts commit c09f9cf.

* Remove unnecessary code in test_duckdb.py

---------

Co-authored-by: Sung Won Chung <[email protected]>
Co-authored-by: sungchun12 <[email protected]>
* first draft

* style fixes by ruff

* past tense consistency

* working draft of new table

* style fixes by ruff

* dbt diffs work, cloud broken for now

* remove cached git repos

* efficient naming

* add type changed count

* reorder for priority on prod changes

* tabulate value diffs

* style fixes by ruff

* less horizontal space needed

* leo's feedback

* center align values

* consistent formatting

* shorter name same meaning

* row counts and diff values working

* deps impacts works now

* default val

* more readable

* add primary key used

* add model specific CI configs

* consistency

* conditional headers

* style fixes by ruff

* cleaner implementation

* more cleaning

* consistent format

* fix unchanged calc

* remove prints

* default value

* draft up tests

* a couple more tests

* new version

* passing tests

* style fixes by ruff

* util unit test

* add unit tests

* test the templates

* fix type hints

* real test no mocking

* update tests with all the new outputs

* add more validations for mock

* fix json bug

---------

Co-authored-by: Sung Won Chung <[email protected]>
Co-authored-by: sungchun12 <[email protected]>
Co-authored-by: Dan Lawin <[email protected]>
This is a library, so it must allow newer versions of its dependencies in the application's virtualenv rather than restricting them to specific versions.
Loosen the dependency version restrictions: >= instead of ^ and =
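To illustrate the difference between the two operators, here is a rough sketch of how a caret constraint and a `>=` constraint treat a new major release. The helper names are hypothetical, versions are assumed to be plain dotted integers, and the caret rule is simplified to the case where the major version is 1 or higher (caret constraints behave differently for `0.x` versions).

```python
def parse(version: str) -> tuple[int, ...]:
    """Turn '1.2.3' into (1, 2, 3); assumes purely numeric components."""
    return tuple(int(part) for part in version.split("."))


def allows_caret(base: str, candidate: str) -> bool:
    """Simplified ^base rule for major >= 1: base <= candidate < next major."""
    b, c = parse(base), parse(candidate)
    # (b[0] + 1,) sorts before every version of the next major release.
    return b <= c < (b[0] + 1,)


def allows_at_least(base: str, candidate: str) -> bool:
    """>=base rule: any version at or above base is accepted."""
    return parse(candidate) >= parse(base)
```

With a base of `1.2.3`, the caret rule rejects `2.0.0` while the `>=` rule accepts it, so an application that already depends on the newer major version can still install the library.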
* Update README.md

Remove databases that are not planned from README.

* style fixes by ruff

---------

Co-authored-by: leoebfolsom <[email protected]>
Sunsetting open source data-diff