Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Migrate tests for Splink 4 (ComparisonLevelCreator and ComparisonCreator and related changes) #1714

Merged
merged 118 commits into from
Jan 22, 2024

Conversation

ADBond
Copy link
Contributor

@ADBond ADBond commented Nov 8, 2023

This is a branch to translate our existing (Splink 3) tests to work on the new ComparisonLevelCreator class (see #1689, #1703). This will be updated as and when new features get implemented in Splink 4, specifically the various missing comparison levels, and the library and template comparisons)

The changes to splink/ itself should all be relatively minimal, and mostly pertain to allowing the tests to run - anything sufficiently substantial will instead get its own PR to splink4_dev. That said, it is probably good to check the changes there particularly carefully in case I accidentally do anything silly.

The rest of the PR is large, but most of the changes should be fairly straightforward - mostly renaming + the small changes needed with updated syntax. I will try and highlight anything particularly noteworthy so that it doesn't get lost in the noise.

Progress:

Feature Implemented Added to this branch
Core levels (exact, null, else) #1689
LevenshteinLevel #1689
DatediffLevel #1705 (+#1837)
JaroWinklerLevel #1717
ColumnsReversedLevel #1726
JaroLevel #1728
JaccardLevel #1728
DamerauLevenshteinLevel #1728
DistanceInKMLevel #1727
ArrayIntersectLevel #1753
PercentageDifferenceLevel #1758
CustomLevel #1768
DistanceFunctionLevel #1860
or_ #1777 ✅ (#1783)
and_ #1777 ✅ (#1783)
not_ #1777 ✅ (#1783)
ExactMatch #1721
LevenshteinAtThresholds #1721
CustomComparison #1768
DamerauLevenshteinAtThresholds #1770
JaccardAtThresholds #1770
JaroAtThresholds #1770
JaroWinklerAtThresholds #1770
DistanceInKMAtThresholds #1815 ✅ (#1816)
DatediffAtThresholds #1782
ArrayIntersectAtSizes #1770
DistanceFunctionAtThresholds #1860
EmailComparison #1840
DateComparison #1856 ✅ (#1857)
NameComparison #1856 ✅ (#1857)
PostcodeComparison #1856 ✅ (#1857)
ForenameSurnameComparison #1856 ✅ (#1857)
InputExpression (for regex_extract, lower, etc) #1782
comparison tf adjustments #1812 ✅ (#1813)
NullLevel regex validation #1819 (+#1834)
Paremeter validation (e.g. -ve array sizes) #1861

Tests now actually run [although some have been temporarily disabled], so we can start to get semi-useful feedback.

Number of failing tests [postgres, sqlite respectively in parentheses where included]:

110
96
93
85
64
55
52
57 😭
56
48
46
44
40
39
35
33 (5, 3)
33 (15, 12)
33 (5, 2)
31
32 😭
25 (5, 0)
24 (3, 0)
8 (4, 2)
8 (4, 0)
0 (0, 0) 🥳 🎉 🎊 ✅ 😎

as this is already backend-agnostic we don't really need this to live on the `test_helpers`, but this will lubricate the migration
@ADBond ADBond changed the title [WIPMigrate tests to use ComparisonLevelCreator [WIP] Migrate tests to use ComparisonLevelCreator Nov 8, 2023
@RobinL
Copy link
Member

RobinL commented Nov 13, 2023

@ADBond to add datediff and the other levels I've done, shall i just do a PR pointing at this branch?

@ADBond
Copy link
Contributor Author

ADBond commented Nov 13, 2023

@ADBond to add datediff and the other levels I've done, shall i just do a PR pointing at this branch?

It's okay, I have merged in splink4_dev + started changing all the relevant test modules

@RobinL
Copy link
Member

RobinL commented Nov 13, 2023

thanks!

@ADBond ADBond mentioned this pull request Jan 18, 2024
@ADBond ADBond marked this pull request as ready for review January 22, 2024 16:00
@ADBond ADBond changed the title [WIP] Migrate tests for Splink 4 - ComparisonLevelCreator and ComparisonCreator Migrate tests for Splink 4 (ComparisonLevelCreator and ComparisonCreator and related changes) Jan 22, 2024
@ADBond ADBond requested a review from RobinL January 22, 2024 16:01
Copy link
Member

@RobinL RobinL left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Incredible work!

image

@ADBond ADBond merged commit 42b593d into splink4_dev Jan 22, 2024
6 checks passed
@ADBond ADBond deleted the migrate-tests branch January 22, 2024 16:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants