v3.2.0
What's Changed
There are two minor breaking changes:
(1). settings must now always be provided to instantiate the linker object. The most minimal settings object is {"link_type": your_link_type}.
(2). By default, EM training sessions no longer estimate probability_two_random_records_match. This can be enabled by passing an argument explicitly.
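As a rough illustration of breaking change (1), the sketch below builds the smallest settings object the notes describe. The helper name minimal_settings is hypothetical, and the set of accepted link_type values and the commented-out DuckDB import path are assumptions based on the Splink 3 layout; these notes do not confirm them.

```python
# Link types assumed to be accepted by Splink 3 settings
# (an assumption; these release notes do not list them).
VALID_LINK_TYPES = {"dedupe_only", "link_only", "link_and_dedupe"}


def minimal_settings(link_type: str) -> dict:
    """Hypothetical helper: return the most minimal settings object."""
    if link_type not in VALID_LINK_TYPES:
        raise ValueError(f"Unknown link_type: {link_type!r}")
    # The notes give {"link_type": your_link_type} as the minimal form.
    return {"link_type": link_type}


settings = minimal_settings("dedupe_only")
print(settings)

# Assumed usage with the DuckDB backend (import path as in Splink 3,
# not confirmed by these notes); df is a placeholder input DataFrame:
# from splink.duckdb.duckdb_linker import DuckDBLinker
# linker = DuckDBLinker(df, settings)
```

Code that previously constructed a linker without settings would need to pass at least a dictionary of this shape.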
Features
- [FEAT] Add support for pairwise format of clusters by @ThomasHepworth in #707
- [FEAT] Haversine comparison level by @ThomasHepworth in #721
- [FEAT] Databricks tweaks pr by @rjc89 in #715
- [FEAT] Direct estimation probability two random records match by @RobinL in #734
Other
- [DOCS] Update main readme to include clustering by @RobinL in #696
- add version tag by @ThomasHepworth in #695
- add a custom translation for cast(<val> as double) -> <val>D by @ThomasHepworth in #697
- add duckdb helper functions to a separate script by @ThomasHepworth in #700
- [docs] fix minor typo in docs by @Thomas-Hirsch in #708
- [MAINT] Log SQL statements before, not after, they are executed in Spark by @RobinL in #714
- Adjust input col sql logic by @ThomasHepworth in #725
- [Docs] Update dev guide to sqlglot and transpilation by @RobinL in #729
- [MAINT] Don't return html by default, it crashes jupyter by @RobinL in #735
- Update sqlglot v5 by @ThomasHepworth in #736
- Athena fixes by @ThomasHepworth in #738
- [DOCS] Add developers guide to building docs locally by @RobinL in #740
- [MAINT] Improve implementation of InputColumn and remove transpile by @RobinL in #727
- Document save_offline_chart and ensure it works if passed a vega lite chart by @RobinL in #742
New Contributors
- @Thomas-Hirsch made their first contribution in #708
- @rjc89 made their first contribution in #715
Full Changelog: v3.1.0...v3.2.0.dev01