Releases: moj-analytical-services/splink
v3.9.9
What's Changed
- fix non-quoted db&catalog issues by @ThomasHepworth in #1558
- Update sqlglot to >=13.0.0 by @ThomasHepworth in #1642
- Migrate splink_vis_utils.js changes to upstream repo by @RobinL in #1639
- Fix issue 1651 - comparison viewer bars sorted improperly by @RobinL in #1652
- Fix bug with labelling tool where it didn't work offline by @RobinL in #1646
- Update binder to point to splink repo by @RobinL in #1655
- [MAINT] Refactor and clean our settings validation logs by @ThomasHepworth in #1636
- improve the settings validation documentation by @ThomasHepworth in #1674
- fixed null level issue for composing comparison levels by @aymonwuolanne in #1672
- typo by @sama-ds in #1682
- Lambda default warning by @RossKen in #1653
- Fix typos by @ADBond in #1691
- Added a Changelog by @ADBond in #1690
- fix docstrings to use .to_dict() instead of .spec by @aymonwuolanne in #1694
- Explicitly cast postgres function return values by @sluhn-harrisr in #1693
- Refactor block_using_rules_sql to follow normal pattern and avoid confusion by @RobinL in #1695
- Update missingness.py by @samnlindsay in #1662
- Fix spark fixture by @ADBond in #1698
- Corrected docstring to match connected components algorithm by @zslade in #1702
- BlockingRule: Clarify name of sql property by @RobinL in #1700
- fix duplicate doc files by @RossKen in #1659
- Settings val updates by @ThomasHepworth in #1710
- add settings validation docs by @ThomasHepworth in #1648
- 856 profile null column by @sama-ds in #1339
- Cluster metrics by @zslade in #1677
- Check input frames have same columns - missingness by @ADBond in #1611
- Fix InputColumn quoting for spark and improve code quality by @RobinL in #1719
- fix: respect boto3_session when checking table existence from AthenaLinker by @finalgrrrl in #1733
- Convert all InputColumn methods that take no arguments to properties by @RobinL in #1730
- 3.9.9 by @RossKen in #1735
New Contributors
- @sluhn-harrisr made their first contribution in #1693
- @finalgrrrl made their first contribution in #1733
Full Changelog: v3.9.8...v3.9.9
v3.9.8
What's Changed
- Add Spellchecker by @zslade in #1588
- Stopped repeat installs if already in docs-venv by @zslade in #1590
- Installation docs tweak by @ThomasHepworth in #1617
- Add table deletion functionality to spark delta tables by @ardila1108 in #1526
- Perform spellcheck by @zslade in #1620
- Add roc chart to gallery by @RossKen in #1599
- update ctl docs by @afua-moj in #1624
- fix problem with csv overwriting in spark by @ThomasHepworth in #1635
- Update scala jars and adjust dependencies by @ThomasHepworth in #1622
New Contributors
- @ardila1108 made their first contribution in #1526
Full Changelog: v3.9.7...v3.9.8
v3.9.7
What's Changed
- Confusion matrix chart by @samnlindsay in #1528
- assign comparator score dataframe by @RossKen in #1520
- Update examples to use block on by @ThomasHepworth in #1540
- Add unlinkables chart to gallery by @RossKen in #1547
- Reduce SQLite example notebook size by @RossKen in #1545
- exact_match_rule -> block_on by @ThomasHepworth in #1553
- charts gallery to use block_on by @RossKen in #1546
- lock poetry to v1.5.1 by @ThomasHepworth in #1557
- add troubleshooting section to splink datasets by @RossKen in #1566
- add waterfall chart to gallery by @RossKen in #1550
- Chart Gallery - accuracy chart by @samnlindsay in #1563
- Allowed profile_columns to take 0 arguments by @sama-ds in #1516
- Add summary statistics for blocks[issue 1106] by @sama-ds in #1321
- [REFACTOR] Settings validation refactor by @ThomasHepworth in #1523
- 1535 doc make em speed up parameters more visible in the docs by @RossKen in #1544
- fix cumulative_rows count by @RossKen in #1577
- Cluster studio fixes by @ADBond in #1463
- Comparison viewer - filter with chart labels + render on empty subset by @ADBond in #1462
- Comparison checker - bug fixes and code cleaning by @ThomasHepworth in #1560
- Add model definition charts to gallery by @RossKen in #1551
- Initial comparison level validation by @ThomasHepworth in #1522
- group -> cluster by @ThomasHepworth in #1585
- Comparison dialect validation by @ThomasHepworth in #1579
- run the linter by @ThomasHepworth in #1589
- add charts dev guide by @RossKen in #1586
- SQlite example notebook fix by @ADBond in #1592
- add initial block on docs by @ThomasHepworth in #1591
- match weight tweaks by @RossKen in #1578
- Comparison viewer - handle comparisons with spacey names + fix waterfall tooltip by @ADBond in #1596
- Comparison viewer colour gamma grid on match weight by @ADBond in #1470
- Drop support for py 3.7 by @ThomasHepworth in #1600
- add path arguments to automated tests by @ThomasHepworth in #1595
- Backend installs by @ThomasHepworth in #1554
- 3.9.7 by @RossKen in #1610
Full Changelog: v3.9.6...v3.9.7
v3.9.6
What's Changed
- adjust the output from parse_duration by @ThomasHepworth in #1498
- Reduce EM training warnings by @ADBond in #1491
- 1266 invalid dates to null level by @RossKen in #1267
- fix broken settings editor link by @RossKen in #1506
- add docs pointing to tf chart by @RossKen in #1507
- remove set clause by @ThomasHepworth in #1504
- String comparator charts by @samnlindsay in #1408
- Add further reading to tutorial by @RossKen in #1449
- fix sqlite table registration by @ADBond in #1485
- New link accuracy chart by @samnlindsay in #1478
- adjusted default xlim in missingness chart json by @aliceoleary0 in #1511
- small tweak to also adjust heatmap scale axis limits by @aliceoleary0 in #1521
- change deprecation warnings to SplinkDeprecated warnings by @ThomasHepworth in #1519
- Make brs json serialisable by @ThomasHepworth in #1530
- change u default in parameters comparison chart by @RossKen in #1532
- refactor: perf: dedupe logic, inf checks in predict.py by @NickCrews in #1495
- Temp br erg push by @ThomasHepworth in #1536
- Add Charts gallery by @RossKen in #1517
- Br ergonomics fix json serialisable err by @ThomasHepworth in #1539
- Add Blocking chart to gallery by @RossKen in #1524
- Blocking rule library ergonomics changes by @ThomasHepworth in #1534
- V3.9.6 by @RossKen in #1541
Full Changelog: v3.9.5...v3.9.6
v3.9.5
What's Changed
- Splink dataframe docs by @ADBond in #1457
- Fix issue 1414 by @RobinL in #1416
- Fix find_matches test by @ADBond in #1459
- Regex docs by @zslade in #1296
- Rename join conditions method for clarity by @RobinL in #1439
- fix broken links by @RossKen in #1476
- Add Splink Blog to the docs by @RossKen in #1451
- Fix estimate lambda as zero by @ADBond in #1477
- Add all tutorial and example datasets to splink.datasets by @RossKen in #1466
- Demos migration by @RossKen in #1431
- add charts plugin by @RossKen in #1490
- Accessibility improvements by @zslade in #1489
Full Changelog: v3.9.4...v3.9.5
v3.9.4
What's Changed
- Add dataset table generation script to docs workflow by @ADBond in #1399
- ccl table fix by @RossKen in #1400
- Bump minimum duckdb version to 0.8.0 by @ADBond in #1405
- Postgres docs by @ADBond in #1404
- SL docs edits by @samnlindsay in #1402
- fix docs links to point to master by @ThomasHepworth in #1419
- [FEAT] Detect equi-join conditions in a blocking rule to count the number of comparisons without needing to perform the join by @RobinL in #1388
- fix else_level examples - no parameter needed by @ADBond in #1423
- remove survey in banner by @RossKen in #1432
- FIX: add parens to blocking rules by @NickCrews in #1422
- run actions on _dev branches by @ThomasHepworth in #1433
- [FEAT] Blocking Rule helper functions by @ThomasHepworth in #1370
- Update splink demos by @RossKen in #1407
- Contributing guide by @RossKen in #1394
- ref: Remove pre-check for path when loading file by @NickCrews in #1438
- add blocking rule library to existing functions by @ThomasHepworth in #1436
- Blocking Topic Guides by @RossKen in #1389
- Remove_pkg_resources by @NickCrews in #1425
- String comparisons doc text formatting by @samnlindsay in #1445
- V3.9.4 by @RossKen in #1458
Full Changelog: v3.9.3...v3.9.4
v3.9.3
What's Changed
- Fellegi sunter topic guide by @RossKen in #1318
- [MAINT] Backend agnostic comparison composition tests by @ThomasHepworth in #1341
- 1109 athena datediff by @RossKen in #1338
- Extend CacheDictWithLogging so that it also stores all tables materialises by Splink, not just the named ones (Issue 1059) by @RobinL in #1061
- Issue 1225 - Poor performance of estimate u in a link_only job by @RobinL in #1359
- Add a timer into debug mode by @ThomasHepworth in #1367
- lint for print() statements by @ThomasHepworth in #1374
- Expectation maximisation speedup option by @aymonwuolanne in #1369
- record linkage topic guides by @RossKen in #1297
- add icons to docs and generated tables by @RossKen in #1353
- [FEAT] Splink Labelling tool beta by @RobinL in #1208
- Docs navigation improvements by @RossKen in #1381
- Postgres bug fixes by @ADBond in #1335
- Txt replacement bash script by @ThomasHepworth in #1378
- Basic settings validator by @ThomasHepworth in #1252
- Add summary of each backend to docs by @RossKen in #1385
- [BUG] fix how nulls are registered in pyspark when loading a pandas df by @ThomasHepworth in #1373
- Tweak readme by @RobinL in #1393
- Splink dummy data by @ADBond in #1358
- Release v3.9.3 by @RossKen in #1398
New Contributors
- @aymonwuolanne made their first contribution in #1369
Full Changelog: v3.9.2...v3.9.3
v3.9.2
What's Changed
- Postgres Linker by @hanslemm in #1191
- Fix altair dependency - redo by @RossKen in #1308
- Add Google analytics to docs by @RossKen in #1313
- Add docs on udfs in sqlite and duckdb by @RobinL in #1317
- satisfy the linter by @ThomasHepworth in #1322
- Adjust import paths to remove backend prefixes by @ThomasHepworth in #1320
- Initial commit for email comparison level feature. by @sama-ds in #1277
- migrate duckdbless action to release by @ThomasHepworth in #1323
- fix symlinks action by @ThomasHepworth in #1324
- make datediff tests backend agnostic by @ThomasHepworth in #1294
- Postgres backend by @ADBond in #1251
- Fix calculation of link-only sample size for u-training by @ADBond in #1312
- Sqlite - fix default connect and levenshtein by @ADBond in #1336
- Altair 5: All Splink charts become alt.Chart() objects rather than custom VegaLiteNoValidate by @RobinL in #1315
- Update actions by @zslade in #1342
- Remove redundant headers of PR template by @RossKen in #1347
- add banner pointing to google form by @RossKen in #1349
- updating splink version by @aliceoleary0 in #1351
New Contributors
Full Changelog: v3.9.1...v3.9.2
v3.9.1
What's Changed
- Update releases.md by @zslade in #1273
- Readme formatting by @RobinL in #1274
- Use tmp_path in deterministic link test by @ADBond in #1275
- allow lowercase postcodes by @RossKen in #1263
- update linting bash script by @ThomasHepworth in #1290
- clean datediff code by @ThomasHepworth in #1291
- save_settings_to_json -> save_model_to_json by @RossKen in #1283
- Add PR template by @RossKen in #1253
- Settings Topic Guide by @RossKen in #1292
- Comparison pseudo symlinks by @ThomasHepworth in #1279
- Update parameter_estimate_comparisons.json by @samnlindsay in #1301
Full Changelog: v3.9.0...v3.9.1
v3.9.0
What's Changed
- Docs upgrades by @RossKen in #1222
- Adjust table registration by @ThomasHepworth in #1219
- Add regex extract functionality to comparisons by @zslade in #1203
- Issue 1227 - Allow materialisation of df_representatives with no _ suffix by @RobinL in #1228
- 1189 tf topic guide by @RossKen in #1214
- Write splinkdf to csv parquet by @ThomasHepworth in #1194
- pretty print erroneous sql by @ThomasHepworth in #1238
- Cleaned up comparison levels documentation to be a multi-line code bl… by @mastratton3 in #1236
- Postcode comparison template by @zslade in #1230
- Forename Surname ctl by @RossKen in #1174
- 430 Term frequency adjustment chart by @samnlindsay in #1226
- 1111 add damerau levenshtein by @RossKen in #1181
- Duckdbless splink by @ThomasHepworth in #1244
- term_frequency_adjustments_names -> term_frequency_adjustments by @ThomasHepworth in #1254
- 1175 deterministric clusters by @RossKen in #1213
- Update citations by @RossKen in #1255
- update benchmarking action to run on PR merge by @ThomasHepworth in #1262
- Backend-agnostic testing by @ADBond in #1205
- Adjust cl imports by @ThomasHepworth in #1248
- Add Topic guide for choosing comparisons & thresholds by @RossKen in #1198
- tweak duckdbless action by @ThomasHepworth in #1270
- fix duckdbless reqs url by @ThomasHepworth in #1271
- New release v3.9.0 by @zslade in #1272
New Contributors
- @mastratton3 made their first contribution in #1236
Full Changelog: v3.8.1...v3.9.0