Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FEATURE] SIMD count #174

Merged
merged 5 commits into from
Dec 1, 2023
Merged

[FEATURE] SIMD count #174

merged 5 commits into from
Dec 1, 2023

Conversation

eseiler
Copy link
Member

@eseiler eseiler commented Nov 22, 2023

  • Naming
  • Code-deduplication
  • Ensure capacity?!
Bins Before After Speedup
64 30.3 98.07 3.2
128 17.7 83.36 4.7
256 10.7 63.42 5.9
512 6.9 41.7 6.0
1024 4.2 19.85 4.7

Speed in million elements per second

On AVX512 machine

AVX512 support: true
False positive rate: 0.050000
HIBF_HAS_AVX512: true
IBF size in bytes: 16777216
Number of elements: 16984998
Number of hash functions: 2

Copy link

vercel bot commented Nov 22, 2023

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Comments Updated (UTC)
hibf ✅ Ready (Inspect) Visit Preview 💬 Add feedback Dec 1, 2023 11:32am

@eseiler eseiler marked this pull request as draft November 22, 2023 19:42
@seqan-actions seqan-actions added lint [INTERNAL] used for linting and removed lint [INTERNAL] used for linting labels Nov 22, 2023
Copy link

codecov bot commented Nov 22, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (b068f7e) 99.94% compared to head (1cebcfc) 99.94%.

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #174   +/-   ##
=======================================
  Coverage   99.94%   99.94%           
=======================================
  Files          45       46    +1     
  Lines        1820     1835   +15     
=======================================
+ Hits         1819     1834   +15     
  Misses          1        1           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@seqan-actions seqan-actions added lint [INTERNAL] used for linting and removed lint [INTERNAL] used for linting labels Nov 24, 2023
@seqan-actions seqan-actions added lint [INTERNAL] used for linting and removed lint [INTERNAL] used for linting labels Nov 24, 2023
@seqan-actions seqan-actions added lint [INTERNAL] used for linting and removed lint [INTERNAL] used for linting labels Nov 24, 2023
@seqan-actions seqan-actions added lint [INTERNAL] used for linting and removed lint [INTERNAL] used for linting labels Nov 27, 2023
@eseiler eseiler marked this pull request as ready for review November 28, 2023 10:02
@seqan-actions seqan-actions added lint [INTERNAL] used for linting and removed lint [INTERNAL] used for linting labels Nov 28, 2023
@seqan-actions seqan-actions added lint [INTERNAL] used for linting and removed lint [INTERNAL] used for linting labels Nov 28, 2023
@seqan-actions seqan-actions added lint [INTERNAL] used for linting and removed lint [INTERNAL] used for linting labels Nov 28, 2023
@seqan-actions seqan-actions added the lint [INTERNAL] used for linting label Nov 28, 2023
@seqan-actions seqan-actions removed the lint [INTERNAL] used for linting label Nov 28, 2023
@seqan-actions seqan-actions added lint [INTERNAL] used for linting and removed lint [INTERNAL] used for linting labels Nov 28, 2023
@seqan-actions seqan-actions added lint [INTERNAL] used for linting and removed lint [INTERNAL] used for linting labels Nov 28, 2023
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hm Do you think /misc/ is a fitting location?

It's not really miscellaneous stuff but very essential?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, we should think about adding a general directory or something afterwards.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please open an issue for this, if it isn't fixed here

Comment on lines 544 to 546
result_buffer(next_multiple_of_64(ibf.bin_count()))
{
result_buffer.resize(ibf.bin_count());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
result_buffer(next_multiple_of_64(ibf.bin_count()))
{
result_buffer.resize(ibf.bin_count());
result_buffer(next_multiple_of_64(ibf.bin_count())) // ensure large enough capacity
{
result_buffer.resize(ibf.bin_count()); // resize to actual size

resize to smaller sizes does never shrink the capacity, correct?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, you would need to call shrink_to_fit for that

Comment on lines 52 to 55

template <typename derived_t, std::integral integral_t>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a short brief for the purpose of this class (same for simd_mapping)

{
// get 64 bits starting at position `bit_pos`
uint64_t bit_sequence = bitvector_raw[batch];
uint64_t current_word = bv_ptr[iteration];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bv_ptr is not a nice variable name if you don't concentrate and read from the beginning.

Comment on lines -316 to +321
seqan::hibf::interleaved_bloom_filter ibf{seqan::hibf::bin_count{128u},
seqan::hibf::interleaved_bloom_filter ibf{seqan::hibf::bin_count{127u},
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why this change?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To test that the counting_agent does return a counting_vector with the correct capacity.

Comment on lines +5 to +7
#define HIBF_HAS_AVX512 1

#include "counting_vector_test.cpp"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are you sure that for the normal counting_vector_test.cpp, HIBF_HAS_AVX512 is false? Otherwise, if AVX512 is available, we are testing only the avx512 case in both files.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the CI, it works, because there is no AVX512 on the CI. But you are correct that AVX512=0 isn't tested on an AVX512 machine. I changed it such that the base-cpp sets HIBF_HAS_AVX512=0, if it hasn't already been defined.

@eseiler eseiler requested a review from smehringer November 30, 2023 12:05
@eseiler
Copy link
Member Author

eseiler commented Nov 30, 2023

Will rebase after approval

@seqan-actions seqan-actions added lint [INTERNAL] used for linting and removed lint [INTERNAL] used for linting labels Nov 30, 2023
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please open an issue for this, if it isn't fixed here

And AVX512 tests. Uses intrinsics if available, simde's simulation otherwise
@eseiler eseiler force-pushed the feature/simd_count branch from 1f21647 to 1cebcfc Compare December 1, 2023 11:31
@seqan-actions seqan-actions added lint [INTERNAL] used for linting and removed lint [INTERNAL] used for linting labels Dec 1, 2023
@eseiler eseiler enabled auto-merge December 1, 2023 11:36
@eseiler eseiler merged commit 51eaf61 into seqan:main Dec 1, 2023
31 checks passed
@eseiler eseiler deleted the feature/simd_count branch December 1, 2023 11:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants