Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[MISC] Cleanup HLL #128

Merged
merged 1 commit into from
Oct 12, 2023
Merged

[MISC] Cleanup HLL #128

merged 1 commit into from
Oct 12, 2023

Conversation

eseiler
Copy link
Member

@eseiler eseiler commented Oct 11, 2023

Blocked by #127

This also makes the toolbox_test::rearrange_bins test fail, which probably shouldn't have succeeded to begin with.

Before: merge_and_estimate_SIMD uses floats, while estimate uses double.

hibf/src/sketch/toolbox.cpp

Lines 218 to 220 in 9f4bd53

double const estimate_ij = temp_hll.merge_and_estimate_SIMD(clustering[j].hll);
// Jaccard distance estimate
double const distance = 2 - (estimates[i] + estimates[j]) / estimate_ij;

The distance is now 0.0, as it should be, because the sketches are the same.
Before, distance was 0.000000064... and apparently triggered some computations.

Changing the referenced line to float const estimate_ij = ... will trigger the old behaviour.

@vercel
Copy link

vercel bot commented Oct 11, 2023

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Comments Updated (UTC)
hibf ✅ Ready (Inspect) Visit Preview 💬 Add feedback Oct 12, 2023 11:57am

@seqan-actions seqan-actions added lint [INTERNAL] used for linting and removed lint [INTERNAL] used for linting labels Oct 11, 2023
@codecov
Copy link

codecov bot commented Oct 11, 2023

Codecov Report

All modified lines are covered by tests ✅

Comparison is base (e5c95f6) 99.71% compared to head (3c9ad7b) 99.62%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #128      +/-   ##
==========================================
- Coverage   99.71%   99.62%   -0.09%     
==========================================
  Files          39       39              
  Lines        1408     1348      -60     
==========================================
- Hits         1404     1343      -61     
- Misses          4        5       +1     
Files Coverage Δ
include/hibf/sketch/hyperloglog.hpp 100.00% <100.00%> (ø)
src/sketch/hyperloglog.cpp 100.00% <100.00%> (ø)
src/sketch/toolbox.cpp 99.19% <100.00%> (-0.81%) ⬇️

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@seqan-actions seqan-actions added lint [INTERNAL] used for linting and removed lint [INTERNAL] used for linting labels Oct 11, 2023
@seqan-actions seqan-actions added lint [INTERNAL] used for linting and removed lint [INTERNAL] used for linting labels Oct 11, 2023
@eseiler eseiler force-pushed the misc/hll branch 2 times, most recently from e2d2d74 to fd031d9 Compare October 11, 2023 21:07
@seqan-actions seqan-actions added lint [INTERNAL] used for linting and removed lint [INTERNAL] used for linting labels Oct 11, 2023
@seqan-actions seqan-actions added lint [INTERNAL] used for linting and removed lint [INTERNAL] used for linting labels Oct 11, 2023
@eseiler eseiler requested a review from smehringer October 12, 2023 10:18
@eseiler eseiler marked this pull request as ready for review October 12, 2023 10:21
@seqan-actions seqan-actions added lint [INTERNAL] used for linting and removed lint [INTERNAL] used for linting labels Oct 12, 2023
include/hibf/sketch/hyperloglog.hpp Outdated Show resolved Hide resolved
@eseiler eseiler enabled auto-merge October 12, 2023 11:57
@seqan-actions seqan-actions added lint [INTERNAL] used for linting and removed lint [INTERNAL] used for linting labels Oct 12, 2023
@eseiler eseiler merged commit 83e6bd1 into seqan:main Oct 12, 2023
@eseiler eseiler deleted the misc/hll branch October 12, 2023 12:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants