execution_profile: whitelist/blacklist filtering #291

muzarski · 2025-04-29T13:21:47Z

Fixes: #243

This PR introduces whitelist/blacklist host filtering to the driver.

Filtering rules

There are currently 4 filtering rules (we could introduce 2 more in the future - rack filtering):

ip whitelist
ip blacklist
dc whitelist
dc blacklist

The logic for checking if a host is accepted is following (in the specified order):

if ip whitelist is non-empty, and if it does not contain host's ip -> reject
if ip blacklist is non-empty, and if it contains host's ip -> reject
if dc whitelist is non-empty, and if it does not contain host's dc -> reject
if dc blacklist is non-empty, and if it contains host's dc -> reject

HostFilter

Since all of the execution profiles are defined before session creation (in other words: user cannot define new profiles after session was created and connected), cpp-driver (and so, cpp-rust-driver as well) can see which hosts are disabled by all execution profiles.

If there is a host which is rejected by all execution profiles (including the default one), the driver does not open the connection to such host. This is exactly what we do in this PR - after collecting all of the filtering rules of each profile, we compute the filtering rules for our custom HostFilter implementor - namely CassHostFilter. This is done by computing the unions of whitelists and intersection of blacklists.

Vec or HashSet

For now, I decided to use Vec to represent the set of items in the whitelists/blacklists. I'd say that we do not expect them to contain a lot of items, so I think Vec is more efficient than HashSet in this specific case. One could argue that this code is not critical (and I agree) - this is a configuration phase. I'm open to suggestions.

Integration tests

ExecutionProfile suite

I have not enabled any yet - they require cass_future_coordinator, which is used to check the ip of host the request was routed to. TBH, I think this is a blocker for merging - I prefer to wait for the cass_future_coordinator, and enable the corresponding tests (and maybe implement some additional tests on my own). I still believe that this PR is ready for review, though.

HostFilter suite

This is the suite I implemented myself - currently there are two tests that reject all nodes using execution profiles - in result, session does not have any connections to route the requests to.

DisconnectedNullStringApiArgs suite

Enabled remaining 4 test cases from this suite - they check whether driver behaves as expected when null string is provided to filtering config methods.

Pre-review checklist

I have split my patch into logically separate commits.
All commit messages clearly explain what they change and why.
PR description sums up the changes and reasons why they should be introduced.
I have implemented Rust unit tests for the features/changes introduced.
I have enabled appropriate tests in .github/workflows/build.yml in gtest_filter.
I have enabled appropriate tests in .github/workflows/cassandra.yml in gtest_filter.

scylla-rust-wrapper/src/load_balancing.rs

muzarski · 2025-05-05T05:58:12Z

I have already found a bug in my HostFilter filtering rules computation. Consider following scenario:
We have two nodes: "ip1" and "ip2". Let's say we have two execution profiles, one with the whitelist being "ip1", the other with empty (disabled) whitelist. In result, the HostFilter whitelist will be a union of these -> "ip1". In result, we filter out the "ip2" node - even though it is accepted by at least one execution profile (the one with empty whitelist).

I'll fix it - the nonempty_union should return None, if at least one of the provided sets is empty (i.e. one of the whitelist rules is disabled). This proves that we need the tests before merging this.

Note: the nonempty_intersection looks good in such scenario, because if at least one of the blacklists is empty, the resulting HostFilter blacklist will be empty (disabled) as well.

muzarski · 2025-05-05T06:04:00Z

Rebased on master.

muzarski · 2025-05-05T07:19:41Z

v2:

addressed @Lorak-mmk comments (haven't transitioned to HashSet yet - still to be discussed)
fixed the bug I mentioned above
introduced HostFilterTest suite and implemented two test cases where we disable all nodes (driver has no connections to route the queries).

muzarski · 2025-05-05T08:16:47Z

v2.1: Previously I implemented new tests, but forgot to enable them..

Lorak-mmk

Looks good to me. I don't really care if we use Vec or HashSet, so I'm approving now. If you decide to change the container, then I'll re-approve.

muzarski · 2025-05-05T14:17:48Z

v2.2:

Rebased on master
Implemented set_[whitelist/blacklist]_filtering methods using update_comma_delimited_list utility method introduced here. I removed the set_filtering helper function which was in my first version of this PR - it's the same as update_comma_delimited_list.
Enabled remaining DisconnectedNullStringApiArgs tests

wprzytula · 2025-05-08T11:02:21Z

❓ You haven't marked:

I have enabled appropriate tests in .github/workflows/build.yml in gtest_filter.
I have enabled appropriate tests in .github/workflows/cassandra.yml in gtest_filter.

Did you just overlook them?

muzarski · 2025-05-08T11:17:27Z

❓ You haven't marked:

I have enabled appropriate tests in .github/workflows/build.yml in gtest_filter.
I have enabled appropriate tests in .github/workflows/cassandra.yml in gtest_filter.

Did you just overlook them?

The ExecutionProfile tests that are testing filtering methods require support for exposing/enforcing coordinator. I'm waiting for these to be implemented. Then, we can enable the corresponding tests and merge this PR (I don't want to merge it without tests enabled).

muzarski · 2025-05-09T10:38:42Z

v2.2: Addressed @wprzytula comments (some of them still need an answer)

muzarski · 2025-05-09T11:08:58Z

v2.3: Next iteration of addressing @wprzytula comments. As a bonus (after discussion on Slack), I added casts to fn(&str) -> T for closures with empty env.

muzarski · 2025-05-09T11:13:11Z

Thanks for the approves - I'm not merging yet. I'll wait for cass_future_coordinator support to enable the corresponding integration tests.

Lorak-mmk · 2025-05-09T11:58:00Z

As a bonus (after discussion on Slack), I added casts to fn(&str) -> T for closures with empty env.

Why? Did you check that it really improves something? In other words, are multiple copies of the function generated (in release mode) if we remove those casts, as compared to the current version?
We never did such casts for a gazzilion methods that accept a closure (e.g. in every iterator usage).

muzarski · 2025-05-09T12:18:58Z

As a bonus (after discussion on Slack), I added casts to fn(&str) -> T for closures with empty env.

Why? Did you check that it really improves something? In other words, are multiple copies of the function generated (in release mode) if we remove those casts, as compared to the current version? We never did such casts for a gazzilion methods that accept a closure (e.g. in every iterator usage).

Without casts:

nm -an target/release/libscylla_cpp_driver.so | c++filt | grep comma_delimi

0000000000141730 t scylla_cpp_driver::cluster::update_comma_delimited_list::h625b382240391297
0000000000141970 t scylla_cpp_driver::cluster::update_comma_delimited_list::ha5386510b8545111
0000000000141b70 t scylla_cpp_driver::cluster::update_comma_delimited_list::hb5560b92aa22d880
0000000000141d70 t scylla_cpp_driver::cluster::update_comma_delimited_list::hea0076943954fdf8
0000000000466eb8 d scylla_cpp_driver::cluster::update_comma_delimited_list::__CALLSITE::META::hf3d04d6f74274d08
00000000004ce2e8 d scylla_cpp_driver::cluster::update_comma_delimited_list::__CALLSITE::h1a7572fa57806091

With casts:

nm -an target/release/libscylla_cpp_driver.so | c++filt | grep comma_delimi

00000000001427b0 t scylla_cpp_driver::cluster::update_comma_delimited_list::ha48524fffb4b98c2
00000000001429f0 t scylla_cpp_driver::cluster::update_comma_delimited_list::hf1fb12ec9c9e5165
0000000000465fe8 d scylla_cpp_driver::cluster::update_comma_delimited_list::__CALLSITE::META::hf3d04d6f74274d08
00000000004cd2e8 d scylla_cpp_driver::cluster::update_comma_delimited_list::__CALLSITE::h1a7572fa57806091

Lorak-mmk · 2025-05-09T12:22:35Z

So probably it is a tradeoff of inlining vs less copies, right? In other words, with casts we get 2 less copies, but the functions won't be inlined. Do we have reasons to believe this is the right tradeoff to do here?

muzarski · 2025-05-09T12:24:33Z

Do we have reasons to believe this is the right tradeoff to do here?

I don't have any. Maybe @wprzytula has some.

wprzytula · 2025-05-09T12:33:33Z

So probably it is a tradeoff of inlining vs less copies, right? In other words, with casts we get 2 less copies, but the functions won't be inlined. Do we have reasons to believe this is the right tradeoff to do here?

By telling the compiler that this is the same function, we leave the decision up to it whether to inline or not.
By not saying it, the compiler may still inline (and then the code ends up equivalent) or not. If not, this is the worst case: there are multiple copies of the same function, each called from another place in code. With our hint, all calls can target the same function.

Tl;dr
@muzarski's experiment proves that the compiler can do better if hinted in a way @muzarski did.

muzarski · 2025-05-09T12:39:21Z

So probably it is a tradeoff of inlining vs less copies, right? In other words, with casts we get 2 less copies, but the functions won't be inlined. Do we have reasons to believe this is the right tradeoff to do here?

By telling the compiler that this is the same function, we leave the decision up to it whether to inline or not. By not saying it, the compiler may still inline (and then the code ends up equivalent) or not. If not, this is the worst case: there are multiple copies of the same function, each called from another place in code. With our hint, all calls can target the same function.

I misunderstood @Lorak-mmk at first as well - he meant inlining of the closure provided to update_comma_delimited_list, and not the update_comma_delimited_list itself.

The semantics are exactly the same as in `cass_cluster_set_contact_points`. This is why we can reuse the `update_comma_delimited_list` method. Note: Cluster methods for some reason do not return `CassError`, while execution profile methods do. This is why the error ignored in cluster case. At first I thought that the logic differs between these two, but no - execution profile methods always return CASS_OK (because cpp-driver does not do any pointer validation...).

Nothing special here, just introducing a new struct with a simple HostFilter trait implementation. More interesting thing is how we construct the filtering rules for the HostFilter - this will be handled later in this PR. This commit is introduced early to reduce the noise during review later.

Motivation is going to be further explained in the later commit, where I explain how cpp-driver decides to which hosts it opens the connections. In short: if **all** execution profiles ignore the host, the driver does not open the connection to it. Knowing how filtering rules are applied, I noticed that we can implement the filtering rules for host filter by computing the unions and intersections of whitelists and blacklists respectively. Note: Empty list means that this filtering rule is disabled.

In cpp-driver, the following rule is upheld: if a host is rejected by **all** policies, the connection to that host is not opened at all. We can achieve this by: - taking the union of all whitelists (per hosts and per dcs) - taking the intersection of all blacklists (per hosts and per dcs) Now, if a host is not in the union of whitelists, it is rejected. If a host is in the intersection of blacklists, it is rejected. Note: if the execution profile does not have a base LBP defined (one of: round-robin, dc-aware, rack-aware), all of the extensions are ignored - including the filtering rules. This is achieved by filtering out such profiles in CassCluster::build_host_filter.

The filtering is now implemented. Unfortunately, we cannot enable any integration tests yet. They require cass_future_coordinator.

As I mentioned before - this error is returned when we for some reason cannot route the request to one of the hosts. I forgot to address this earlier (there was no test case for empty plan). I'll introduce one later in this PR.

Added two test cases where we disable all nodes using execution profile filtering. In result, the driver should not open any connections the requests can be routed to.

They can now be enabled since filtering config is implemented.

muzarski · 2025-05-09T13:47:53Z

v2.4: Removed the casts for closures (after a thorough discussion on Slack)

muzarski changed the title ~~Execution profile filtering~~ execution_profile: whitelist/blacklist filtering Apr 29, 2025

muzarski self-assigned this Apr 29, 2025

muzarski added the P1 P1 priority item - very important label Apr 29, 2025

muzarski added this to the 0.5 milestone Apr 29, 2025

muzarski force-pushed the execution-profile-filtering branch from 7608c10 to 5e8b9a7 Compare April 29, 2025 14:16

muzarski marked this pull request as draft April 29, 2025 16:04

muzarski force-pushed the execution-profile-filtering branch 5 times, most recently from 8f3ced7 to d0c2e66 Compare April 30, 2025 11:45

muzarski marked this pull request as ready for review April 30, 2025 11:47

muzarski requested review from wprzytula and Lorak-mmk April 30, 2025 11:47

Lorak-mmk requested changes May 4, 2025

View reviewed changes

scylla-rust-wrapper/src/load_balancing.rs Outdated Show resolved Hide resolved

scylla-rust-wrapper/src/load_balancing.rs Outdated Show resolved Hide resolved

scylla-rust-wrapper/src/load_balancing.rs Outdated Show resolved Hide resolved

muzarski force-pushed the execution-profile-filtering branch from d0c2e66 to 5caf82b Compare May 5, 2025 06:03

muzarski force-pushed the execution-profile-filtering branch from 5caf82b to 3ac0535 Compare May 5, 2025 07:17

muzarski requested a review from Lorak-mmk May 5, 2025 07:19

muzarski force-pushed the execution-profile-filtering branch from 3ac0535 to f5a7f9d Compare May 5, 2025 08:16

Lorak-mmk approved these changes May 5, 2025

View reviewed changes

muzarski mentioned this pull request May 5, 2025

Null pointer provided to cass_cluster_set_contact_points should clear the list #293

Merged

5 tasks

muzarski force-pushed the execution-profile-filtering branch from f5a7f9d to 5aced6c Compare May 5, 2025 14:08

Lorak-mmk approved these changes May 5, 2025

View reviewed changes

muzarski added enhancement New feature or request area/testing Related to unit/integration testing labels May 6, 2025

muzarski force-pushed the execution-profile-filtering branch from e5215ea to 99b4fc9 Compare May 9, 2025 10:36

muzarski requested a review from wprzytula May 9, 2025 10:38

muzarski force-pushed the execution-profile-filtering branch from 99b4fc9 to 4dfd768 Compare May 9, 2025 11:07

wprzytula approved these changes May 9, 2025

View reviewed changes

muzarski added 11 commits May 9, 2025 15:47

cluster/exec_profile: add unit test for filtering config

c989bfb

lbp: introduce and use FilteringLoadBalancingPolicy

563db1d

testing: remove filtering methods from unimplemented

9971198

testing: uncomment the filtering code

cf03424

The filtering is now implemented. Unfortunately, we cannot enable any integration tests yet. They require cass_future_coordinator.

errors: make EmptyPlan return LIB_NO_HOSTS_AVAILABLE

22e1c29

As I mentioned before - this error is returned when we for some reason cannot route the request to one of the hosts. I forgot to address this earlier (there was no test case for empty plan). I'll introduce one later in this PR.

integration: introduce HostFilterTest suite

ba11578

Added two test cases where we disable all nodes using execution profile filtering. In result, the driver should not open any connections the requests can be routed to.

ci: enable remaining DisconnectedNullStringApiArgs tests

b7bc3fa

They can now be enabled since filtering config is implemented.

muzarski force-pushed the execution-profile-filtering branch from 4dfd768 to b7bc3fa Compare May 9, 2025 13:47

Lorak-mmk approved these changes May 9, 2025

View reviewed changes

wprzytula approved these changes May 10, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

execution_profile: whitelist/blacklist filtering #291

execution_profile: whitelist/blacklist filtering #291

muzarski commented Apr 29, 2025 •

edited

Loading

muzarski commented May 5, 2025 •

edited

Loading

muzarski commented May 5, 2025

muzarski commented May 5, 2025

muzarski commented May 5, 2025

Lorak-mmk left a comment

muzarski commented May 5, 2025 •

edited

Loading

wprzytula commented May 8, 2025 •

edited

Loading

muzarski commented May 8, 2025 •

edited

Loading

muzarski commented May 9, 2025

muzarski commented May 9, 2025

muzarski commented May 9, 2025

Lorak-mmk commented May 9, 2025 •

edited

Loading

muzarski commented May 9, 2025

Lorak-mmk commented May 9, 2025

muzarski commented May 9, 2025

wprzytula commented May 9, 2025 •

edited

Loading

muzarski commented May 9, 2025

muzarski commented May 9, 2025

execution_profile: whitelist/blacklist filtering #291

Are you sure you want to change the base?

execution_profile: whitelist/blacklist filtering #291

Conversation

muzarski commented Apr 29, 2025 • edited Loading

Filtering rules

HostFilter

Vec or HashSet

Integration tests

ExecutionProfile suite

HostFilter suite

DisconnectedNullStringApiArgs suite

Pre-review checklist

muzarski commented May 5, 2025 • edited Loading

muzarski commented May 5, 2025

muzarski commented May 5, 2025

muzarski commented May 5, 2025

Lorak-mmk left a comment

Choose a reason for hiding this comment

muzarski commented May 5, 2025 • edited Loading

wprzytula commented May 8, 2025 • edited Loading

muzarski commented May 8, 2025 • edited Loading

muzarski commented May 9, 2025

muzarski commented May 9, 2025

muzarski commented May 9, 2025

Lorak-mmk commented May 9, 2025 • edited Loading

muzarski commented May 9, 2025

Lorak-mmk commented May 9, 2025

muzarski commented May 9, 2025

wprzytula commented May 9, 2025 • edited Loading

muzarski commented May 9, 2025

muzarski commented May 9, 2025

muzarski commented Apr 29, 2025 •

edited

Loading

muzarski commented May 5, 2025 •

edited

Loading

muzarski commented May 5, 2025 •

edited

Loading

wprzytula commented May 8, 2025 •

edited

Loading

muzarski commented May 8, 2025 •

edited

Loading

Lorak-mmk commented May 9, 2025 •

edited

Loading

wprzytula commented May 9, 2025 •

edited

Loading