bloom filter bulk inserts and queries #723


Open

dcoutts wants to merge 6 commits into main from dcoutts/bloomfilter-blocked

Conversation

dcoutts (Collaborator) commented May 15, 2025

Two changes, both for bulk operations on Bloom filters.

For Bloom filter inserts in run accumulation: instead of inserting keys into the bloom filter one by one as they are added to the run accumulator, save them up and add them all in one go when the page is being finalised. This lets us use a bloom filter bulk insert, which in turn lets us use memory prefetching.
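
In code, the shape of the change is roughly the following. This is a minimal sketch with placeholder types and made-up names (`MBloom`, `insertMany`, `prefetchInsert`, the record fields); it is not the library's actual API, only the accumulate-then-bulk-insert pattern:

```haskell
{-# LANGUAGE BangPatterns #-}
module RunAccSketch where

import Control.Monad (forM_, when)
import Control.Monad.ST (ST)
import Data.STRef (STRef, modifySTRef', readSTRef, writeSTRef)
import qualified Data.Vector as V

data MBloom s k = MBloom              -- stand-in for a mutable Bloom filter
data SerialisedKey = SerialisedKey    -- stand-in for the real key type

prefetchInsert :: MBloom s k -> k -> ST s ()  -- touch the words the key hashes to
prefetchInsert _ _ = pure ()

insertKey :: MBloom s k -> k -> ST s ()       -- ordinary single-key insert
insertKey _ _ = pure ()

-- Bulk insert: issue the prefetch for key (i + d) while inserting key i, so
-- the memory latency of later keys overlaps with the hashing of earlier ones.
insertMany :: MBloom s k -> V.Vector k -> ST s ()
insertMany b keys = do
    let !n = V.length keys
        !d = 4                        -- prefetch distance, a tuning parameter
    forM_ [0 .. n - 1] $ \i -> do
      when (i + d < n) $ prefetchInsert b (keys V.! (i + d))
      insertKey b (keys V.! i)

-- Accumulator side: keys are only buffered while a page is being built, and
-- the filter is updated with one bulk insert when the page is finalised.
data RunAcc s = RunAcc
  { raBloom       :: MBloom s SerialisedKey
  , raPendingKeys :: STRef s [SerialisedKey]
  }

addKey :: RunAcc s -> SerialisedKey -> ST s ()
addKey acc k = modifySTRef' (raPendingKeys acc) (k :)

finalisePage :: RunAcc s -> ST s ()
finalisePage acc = do
    ks <- readSTRef (raPendingKeys acc)
    insertMany (raBloom acc) (V.fromList (reverse ks))
    writeSTRef (raPendingKeys acc) []
```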

For Bloom filter queries in key lookups, update the existing bulk query code to properly take advantage of the new API and use prefetching. We can now also simplify and use a single high performance implementation, rather than needing two (a more compatible one and a faster one that relied on fancier features available in later GHC versions).

Results for the WP8 benchmark (100M elements, 10 bits per key, full caching, 10k batches of 256 keys):

  • baseline: 88,871 ops/sec
  • with bulk inserts: 97,152 ops/sec, ~9% improvement
  • with bulk queries: 103,005 ops/sec, ~6% additional, ~16% cumulative improvement

So overall about a 16% improvement in ops/sec on the primary WP8 benchmark, and as a bonus, getting over the magic 100k ops/sec threshold (on my laptop).

jorisdral (Collaborator) left a comment

Some initial comments. I have yet to look at the bulk insert code.

jorisdral (Collaborator) left a comment

LGTM! If we fix the test failure that I noted in my previous review, then we're good to go.

dcoutts (Collaborator, Author) commented May 15, 2025

CI failure due to flaky test fixed in #724.

dcoutts force-pushed the dcoutts/bloomfilter-blocked branch 2 times, most recently from 0c341bb to 09afe55 on May 16, 2025 00:04
dcoutts and others added 6 commits on May 16, 2025 01:05
Instead of inserting keys into the bloom filter one by one as they are
added to the run accumulator, save them up and add them all in one go
when the page is being finalised.

This then lets us use a bloom filter bulk insert, which lets us use
memory prefetching.

The result should be faster.

Fetch into the caches in the least intrusive way, "0" levels, rather
than "3" levels. This does not appear to slow down inserts, and should
evict fewer things from the caches.

And document what level "0" means and why we use it.
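
For reference, GHC's prefetch primops come in four locality levels (0–3), following the convention of C's `__builtin_prefetch`. A minimal illustration of the difference between levels 0 and 3, with made-up wrapper names (the actual code may call the primops or a C prefetch directly):

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}
module PrefetchSketch (prefetchLow, prefetchHigh) where

import GHC.Exts (ByteArray#, Int#, prefetchByteArray0#, prefetchByteArray3#)
import GHC.IO (IO (IO))

-- Locality level 0: fetch the cache line, but hint that it will not be reused
-- for long, so it can be evicted early and displaces less of what is already
-- cached.  This is the "least intrusive" level referred to above.
prefetchLow :: ByteArray# -> Int# -> IO ()
prefetchLow ba i = IO (\s -> (# prefetchByteArray0# ba i s, () #))

-- Locality level 3: fetch into all cache levels and try to keep the line
-- resident; more intrusive, better suited to data reused repeatedly.
prefetchHigh :: ByteArray# -> Int# -> IO ()
prefetchHigh ba i = IO (\s -> (# prefetchByteArray3# ba i s, () #))
```
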
`bloomQueriesModel` was not really a proper model, because the model itself was
using actual bloom filters. The model is now instead a `Set`, and
`prop_bloomQueriesModel` is updated because the model will now only return true
positives and negatives.
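
Schematically, the Set-based model and the relation the updated property can check look like this (the signature of `bloomQueriesModel` here is made up; the real definitions live in the test suite):

```haskell
module BloomModelSketch where

import qualified Data.Set as Set

-- The model: each run's filter is modelled by the plain Set of keys inserted
-- into it, so a model query yields only true positives and true negatives.
bloomQueriesModel :: Ord k => [Set.Set k] -> [k] -> Set.Set (Int, Int)
bloomQueriesModel filters keys =
  Set.fromList
    [ (ki, fi)
    | (ki, k) <- zip [0 ..] keys
    , (fi, f) <- zip [0 ..] filters
    , k `Set.member` f
    ]

-- Real Bloom filters may report false positives but never false negatives,
-- so the property can only require that the model's hits are contained in
-- the real query's hits, e.g.
--
--   bloomQueriesModel sets keys `Set.isSubsetOf` realHits
```
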
Previously, for the Classic Bloom filter implementation we had two
different implementations of bloomQueries: one that was relatively
simple and didn't rely on anything fancy, and one that went all out to
maximise performance.

The high performance one had to be disabled when we added the
block-structured bloom filter implementation, since it was tightly
coupled to that implementation.

With the new bloom filter API and implementation, we can now implement a
single high performance version of bulk query. We no longer need
separate higher and lower performance versions, since we no longer need
to rely on fancy features like unlifted boxed arrays.

So strip out the bloom-query-fast cabal flag and the BloomFilterQuery2
module.

The updated BloomFilterQuery1 does a simple nested loop, but also does
prefetching. The prefetch distance is the number of runs, which is
proportional to the number of levels, and is typically modest.
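
A schematic version of that loop, with placeholder names (`Bloom`, `elemQ`, `prefetchElem`) rather than the library's actual API: while the filters are probed for key i, the probe locations for key i+1 are prefetched, which is what makes the effective prefetch distance equal to the number of runs:

```haskell
module BulkQuerySketch where

import Control.Monad (forM, when)
import qualified Data.Vector as V

data Bloom k = Bloom                    -- stand-in for an immutable Bloom filter

elemQ :: Bloom k -> k -> Bool           -- probe the filter (placeholder)
elemQ _ _ = True

prefetchElem :: Bloom k -> k -> IO ()   -- prefetch the words the probe will read
prefetchElem _ _ = pure ()

-- Nested loop over keys and run filters.  The prefetches for key (i + 1) are
-- issued while key i is being probed, so the prefetch distance, measured in
-- inner-loop iterations, equals the number of runs.
bloomQueries :: V.Vector (Bloom k) -> V.Vector k -> IO [(Int, Int)]
bloomQueries filters keys = do
    let nk = V.length keys
    fmap concat $ forM [0 .. nk - 1] $ \i -> do
      when (i + 1 < nk) $
        V.forM_ filters (\f -> prefetchElem f (keys V.! (i + 1)))
      pure [ (i, j)
           | (j, f) <- zip [0 ..] (V.toList filters)
           , elemQ f (keys V.! i)
           ]
```
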
The query module used to be big, and there used to be two of them. Now
there's only one and it's a lot smaller. So it makes sense to keep it
all together in one module.

We were using a mix of importing the bloom filter module qualified as `BF` and as `Bloom`.
dcoutts force-pushed the dcoutts/bloomfilter-blocked branch from 09afe55 to 49f8071 on May 16, 2025 00:06
dcoutts changed the title from "Bulk insert for bloom filters in RunAcc" to "bloom filter bulk inserts and queries" on May 16, 2025