execution_profile: whitelist/blacklist filtering #291
7608c10 to 5e8b9a7
8f3ced7 to d0c2e66
I have already found a bug in my I'll fix it - the Note: the |
d0c2e66 to 5caf82b
Rebased on master.
5caf82b to 3ac0535
v2:
3ac0535 to f5a7f9d
v2.1: Previously I implemented new tests, but forgot to enable them.
Looks good to me. I don't really care if we use Vec or HashSet, so I'm approving now. If you decide to change the container, then I'll re-approve.
f5a7f9d to 5aced6c
v2.2:
❓ You haven't marked:
Did you just overlook them? |
The |
e5215ea to 99b4fc9
v2.2: Addressed @wprzytula comments (some of them still need an answer)
99b4fc9 to 4dfd768
v2.3: Next iteration of addressing @wprzytula comments. As a bonus (after discussion on Slack), I added casts to |
Thanks for the approves - I'm not merging yet. I'll wait for |
Why? Did you check that it really improves something? In other words, are multiple copies of the function generated (in release mode) if we remove those casts, as compared to the current version?
Without casts:
With casts:
So it is probably a tradeoff between inlining and fewer copies, right? In other words, with casts we get 2 fewer copies, but the functions won't be inlined. Do we have reasons to believe this is the right tradeoff to make here?
I don't have any. Maybe @wprzytula has some.
By telling the compiler that this is the same function, we leave the decision up to it whether to inline or not. Tl;dr |
I misunderstood @Lorak-mmk at first as well - he meant inlining of the closure provided to |
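The tradeoff discussed in this thread can be illustrated with a small standalone sketch. It is not code from this PR: `apply_update`, `UpdateFn` and `compare_instantiations` are hypothetical names. Each closure has its own anonymous type, so a generic function is instantiated once per closure, while closures cast to the same `fn` pointer type share one instantiation (at the cost of an indirect call that the compiler may or may not inline).

```rust
use std::any::TypeId;

/// Hypothetical generic helper: the compiler generates one copy per distinct `F`.
/// It returns the `TypeId` of `F` so that callers can observe which
/// instantiation was used.
pub fn apply_update<F>(list: &mut Vec<String>, value: &str, update: F) -> TypeId
where
    F: Fn(&mut Vec<String>, &str) + 'static,
{
    update(list, value);
    TypeId::of::<F>()
}

/// Function pointer type that non-capturing closures can be cast to.
pub type UpdateFn = fn(&mut Vec<String>, &str);

/// Returns `(closures_share_type, pointers_share_type)`.
pub fn compare_instantiations() -> (bool, bool) {
    let mut list = Vec::new();

    // Two textually identical closures still have two distinct types,
    // so `apply_update` is instantiated twice.
    let c1 = apply_update(&mut list, "a", |l: &mut Vec<String>, v: &str| {
        l.push(v.to_owned())
    });
    let c2 = apply_update(&mut list, "b", |l: &mut Vec<String>, v: &str| {
        l.push(v.to_owned())
    });

    // After the cast both arguments have the type `UpdateFn`,
    // so a single instantiation serves both calls.
    let p1 = apply_update(
        &mut list,
        "c",
        (|l: &mut Vec<String>, v: &str| l.push(v.to_owned())) as UpdateFn,
    );
    let p2 = apply_update(
        &mut list,
        "d",
        (|l: &mut Vec<String>, _v: &str| l.clear()) as UpdateFn,
    );

    (c1 == c2, p1 == p2)
}
```

Whether the reduced number of copies outweighs the lost inlining opportunity depends on the call sites; the sketch only shows why the number of instantiations differs.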
The semantics are exactly the same as in `cass_cluster_set_contact_points`. This is why we can reuse the `update_comma_delimited_list` method. Note: Cluster methods for some reason do not return `CassError`, while execution profile methods do. This is why the error is ignored in the cluster case. At first I thought that the logic differs between these two, but no - execution profile methods always return CASS_OK (because cpp-driver does not do any pointer validation...).
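For readers unfamiliar with these semantics, here is a minimal sketch of what such a helper might do. The signature below is an assumption made for illustration and is not the driver's actual `update_comma_delimited_list`. It assumes the behaviour cpp-driver documents for contact points: an empty string clears the list, and any other input appends the comma-separated items with surrounding whitespace stripped.

```rust
/// Hypothetical stand-in for the helper named in the commit message.
/// Assumed semantics: an empty input clears the list; otherwise every
/// non-empty, trimmed item is appended to the existing entries.
pub fn update_comma_delimited_list(list: &mut Vec<String>, input: &str) {
    if input.is_empty() {
        // Empty string: clear the list, which disables the rule.
        list.clear();
        return;
    }
    list.extend(
        input
            .split(',')
            .map(str::trim)
            .filter(|item| !item.is_empty())
            .map(str::to_owned),
    );
}
```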
Nothing special here, just introducing a new struct with a simple HostFilter trait implementation. The more interesting part is how we construct the filtering rules for the HostFilter - this will be handled later in this PR. This commit is introduced early to reduce the noise during review later.
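A rough sketch of such a struct, following the rule order given in the PR description (ip whitelist, ip blacklist, dc whitelist, dc blacklist). The trait `AcceptsHost` is a local stand-in rather than the Rust driver's `HostFilter` trait, and `FilteringRules` with its field names is hypothetical. Treating a host with an unknown dc as "not contained in any dc list" is also an assumption.

```rust
use std::net::IpAddr;

/// Local stand-in trait (assumption): the real `HostFilter` trait differs.
pub trait AcceptsHost {
    fn accept(&self, ip: IpAddr, dc: Option<&str>) -> bool;
}

/// Hypothetical rule set. An empty list means the rule is disabled.
#[derive(Debug, Default, Clone, PartialEq)]
pub struct FilteringRules {
    pub whitelist_hosts: Vec<IpAddr>,
    pub blacklist_hosts: Vec<IpAddr>,
    pub whitelist_dc: Vec<String>,
    pub blacklist_dc: Vec<String>,
}

/// A host with an unknown dc is treated as absent from every dc list.
fn contains_dc(list: &[String], dc: Option<&str>) -> bool {
    match dc {
        Some(dc) => list.iter().any(|item| item == dc),
        None => false,
    }
}

impl AcceptsHost for FilteringRules {
    fn accept(&self, ip: IpAddr, dc: Option<&str>) -> bool {
        // 1. ip whitelist: non-empty and missing the ip -> reject
        if !self.whitelist_hosts.is_empty() && !self.whitelist_hosts.contains(&ip) {
            return false;
        }
        // 2. ip blacklist: contains the ip -> reject
        if self.blacklist_hosts.contains(&ip) {
            return false;
        }
        // 3. dc whitelist: non-empty and missing the dc -> reject
        if !self.whitelist_dc.is_empty() && !contains_dc(&self.whitelist_dc, dc) {
            return false;
        }
        // 4. dc blacklist: contains the dc -> reject
        if contains_dc(&self.blacklist_dc, dc) {
            return false;
        }
        true
    }
}
```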
The motivation is explained further in a later commit, where I describe how cpp-driver decides which hosts it opens connections to. In short: if **all** execution profiles ignore the host, the driver does not open the connection to it. Knowing how filtering rules are applied, I noticed that we can implement the filtering rules for the host filter by computing the unions of whitelists and the intersections of blacklists. Note: An empty list means that this filtering rule is disabled.
In cpp-driver, the following rule is upheld: if a host is rejected by **all** policies, the connection to that host is not opened at all. We can achieve this by:
- taking the union of all whitelists (per hosts and per dcs)
- taking the intersection of all blacklists (per hosts and per dcs)

Now, if a host is not in the union of whitelists, it is rejected. If a host is in the intersection of blacklists, it is rejected.

Note: if the execution profile does not have a base LBP defined (one of: round-robin, dc-aware, rack-aware), all of the extensions are ignored - including the filtering rules. This is achieved by filtering out such profiles in CassCluster::build_host_filter.
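The union/intersection step can be sketched as follows. This is an illustration under stated assumptions, not the code in `CassCluster::build_host_filter`: the function names are hypothetical, and lists are modelled as `Vec`s where an empty list means "rule disabled". Because a disabled whitelist accepts every host, the merged whitelist has to be disabled as soon as one profile has it disabled.

```rust
/// Union of per-profile whitelists (hypothetical helper).
/// If any profile has an empty (disabled) whitelist, that profile accepts
/// every host for this rule, so the merged whitelist is disabled as well.
pub fn union_of_whitelists<T: Clone + PartialEq>(lists: &[Vec<T>]) -> Vec<T> {
    if lists.iter().any(|list| list.is_empty()) {
        return Vec::new();
    }
    let mut merged = Vec::new();
    for item in lists.iter().flatten() {
        if !merged.contains(item) {
            merged.push(item.clone());
        }
    }
    merged
}

/// Intersection of per-profile blacklists (hypothetical helper).
/// An item stays blacklisted only if every profile blacklists it; an empty
/// (disabled) blacklist in any profile yields an empty result.
pub fn intersection_of_blacklists<T: Clone + PartialEq>(lists: &[Vec<T>]) -> Vec<T> {
    let (first, rest) = match lists.split_first() {
        Some(parts) => parts,
        None => return Vec::new(),
    };
    let mut merged = Vec::new();
    for item in first {
        if rest.iter().all(|list| list.contains(item)) && !merged.contains(item) {
            merged.push(item.clone());
        }
    }
    merged
}
```

With this construction the merged whitelist is a superset of every enabled whitelist and the merged blacklist is a subset of every blacklist, so the merged rules accept any host that at least one profile accepts.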
The filtering is now implemented. Unfortunately, we cannot enable any integration tests yet. They require cass_future_coordinator.
As I mentioned before - this error is returned when we for some reason cannot route the request to one of the hosts. I forgot to address this earlier (there was no test case for empty plan). I'll introduce one later in this PR.
Added two test cases where we disable all nodes using execution profile filtering. As a result, the driver should not open any connections that requests could be routed to.
They can now be enabled since filtering config is implemented.
4dfd768 to b7bc3fa
v2.4: Removed the casts for closures (after a thorough discussion on Slack)
Fixes: #243
This PR introduces whitelist/blacklist host filtering to the driver.
Filtering rules
There are currently 4 filtering rules (we could introduce 2 more in the future - rack filtering):
The logic for checking whether a host is accepted is as follows (in the specified order):
1. If `ip whitelist` is non-empty and does not contain the host's ip -> reject
2. If `ip blacklist` is non-empty and contains the host's ip -> reject
3. If `dc whitelist` is non-empty and does not contain the host's dc -> reject
4. If `dc blacklist` is non-empty and contains the host's dc -> reject

HostFilter
Since all of the execution profiles are defined before session creation (in other words: user cannot define new profiles after session was created and connected), cpp-driver (and so, cpp-rust-driver as well) can see which hosts are disabled by all execution profiles.
If there is a host which is rejected by all execution profiles (including the default one), the driver does not open a connection to that host. This is exactly what we do in this PR - after collecting the filtering rules of each profile, we compute the filtering rules for our custom `HostFilter` implementor, namely `CassHostFilter`. This is done by computing the unions of whitelists and the intersections of blacklists.

Vec or HashSet
For now, I decided to use `Vec` to represent the set of items in the whitelists/blacklists. I'd say that we do not expect them to contain a lot of items, so I think `Vec` is more efficient than `HashSet` in this specific case. One could argue that this code is not critical (and I agree) - this is a configuration phase. I'm open to suggestions.

Integration tests
ExecutionProfile suite
I have not enabled any yet - they require `cass_future_coordinator`, which is used to check the ip of the host the request was routed to. TBH, I think this is a blocker for merging - I prefer to wait for `cass_future_coordinator` and enable the corresponding tests (and maybe implement some additional tests on my own). I still believe that this PR is ready for review, though.

HostFilter suite
This is the suite I implemented myself - currently there are two tests that reject all nodes using execution profiles. As a result, the session does not have any connections to route the requests to.
DisconnectedNullStringApiArgs suite
Enabled the remaining 4 test cases from this suite - they check whether the driver behaves as expected when a null string is provided to the filtering config methods.
Pre-review checklist
`.github/workflows/build.yml` in `gtest_filter`.
`.github/workflows/cassandra.yml` in `gtest_filter`.