Implement exposing/enforcing coordinator for request #299

Open
wants to merge 12 commits into
base: master

muzarski
Collaborator

@muzarski muzarski commented May 12, 2025

Fixes: #249, Fixes: #242
Ref: #132

Implemented methods:

  • cass_statement_set_host
  • cass_statement_set_host_n
  • cass_statement_set_host_inet
  • cass_statement_set_node
  • cass_future_coordinator

Problems with implementing cass_future_coordinator

cass_future_coordinator returns a CassNode* which should be borrowed from the CassFuture and live as long as the underlying CassResult lives.

The problem is that CassResult is stored under a Mutex. This is why the obtained reference to the coordinator has a very short lifetime: it is the lifetime of the acquired MutexGuard. But we need to return something with a longer lifetime, so our safe FFI API (with the borrow checker) starts to complain.

In the commit where I implement cass_future_coordinator, I unsafely extend the lifetime of the coordinator. We can do that because we are guaranteed that the returned pointer will be valid as long as CassFutureResult is valid (see the SAFETY comment in the code).

However, there is a way to omit this unsafe code and represent the CassFutureResult immutability guarantees (after the future is resolved) on the type level. The last commit changes CassFuture a bit, so that the CassFutureResult is stored inside a OnceLock and not inside a Mutex. Only then can we remove the unsafe code responsible for extending the lifetime in cass_future_coordinator. This commit serves as a suggestion of one of (probably many) solutions to this problem. If we decide to do that, I'll probably split this commit into multiple commits before merging.
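To illustrate the difference between the two storage strategies, here is a minimal sketch using only the standard library. The types `Coordinator`, `ResultData`, `MutexFuture` and `OnceFuture` are hypothetical stand-ins, not the actual types from this PR. With a `Mutex`, a borrowed coordinator is only usable while the guard is held, whereas `OnceLock::get` hands out a reference tied to `&self`.

```rust
use std::sync::{Mutex, OnceLock};

// Hypothetical stand-ins for the driver's types; all names are illustrative.
struct Coordinator {
    address: String,
}

struct ResultData {
    coordinator: Option<Coordinator>,
}

// Variant 1: the result lives behind a Mutex. A reference into the guarded
// data cannot outlive the MutexGuard, so it can only be used inside a closure.
struct MutexFuture {
    result: Mutex<Option<ResultData>>,
}

impl MutexFuture {
    fn with_coordinator<R>(&self, f: impl FnOnce(Option<&Coordinator>) -> R) -> R {
        let guard = self.result.lock().unwrap();
        f(guard.as_ref().and_then(|r| r.coordinator.as_ref()))
    }
}

// Variant 2: the result is written at most once into a OnceLock.
struct OnceFuture {
    result: OnceLock<ResultData>,
}

impl OnceFuture {
    // The returned reference borrows from `&self`; no guard and no unsafe.
    fn coordinator(&self) -> Option<&Coordinator> {
        self.result.get()?.coordinator.as_ref()
    }
}

fn main() {
    let locked = MutexFuture {
        result: Mutex::new(Some(ResultData {
            coordinator: Some(Coordinator { address: "127.0.0.1".to_string() }),
        })),
    };
    // Only an owned copy can leave the closure.
    let copied = locked.with_coordinator(|c| c.map(|c| c.address.clone()));
    println!("copied out of the mutex: {copied:?}");

    let once = OnceFuture { result: OnceLock::new() };
    let stored = once.result.set(ResultData {
        coordinator: Some(Coordinator { address: "127.0.0.1".to_string() }),
    });
    assert!(stored.is_ok());
    // The borrow stays valid for as long as `once` is alive.
    let coordinator = once.coordinator().unwrap();
    println!("borrowed from the OnceLock: {}", coordinator.address);
}
```

The trade-off in this sketch is that `OnceLock` allows a single write only, which matches the "resolved once, then immutable" guarantee described above.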

Integration tests

Enabled 3 test suites: StatementTests, StatementNoClusterTests and ServerSideFailureThreeNodeClusterTests - 11 tests in total! The logic of each test is explained in commit messages.

Pre-review checklist

  • I have split my patch into logically separate commits.
  • All commit messages clearly explain what they change and why.
  • PR description sums up the changes and reasons why they should be introduced.
  • I have implemented Rust unit tests for the features/changes introduced.
  • I have enabled appropriate tests in .github/workflows/build.yml in gtest_filter.
  • I have enabled appropriate tests in .github/workflows/cassandra.yml in gtest_filter.

@muzarski muzarski marked this pull request as draft May 12, 2025 14:21
@muzarski muzarski self-assigned this May 12, 2025
@muzarski muzarski added this to the 0.5 milestone May 12, 2025
@muzarski muzarski added the enhancement (New feature or request), area/testing (Related to unit/integration testing) and P1 (P1 priority item - very important) labels May 12, 2025
@muzarski muzarski force-pushed the enforce-get-coordinator branch 2 times, most recently from 3925fa0 to afed812 Compare May 12, 2025 15:23
@muzarski muzarski requested review from wprzytula and Lorak-mmk and removed request for wprzytula May 12, 2025 15:52
@muzarski muzarski marked this pull request as ready for review May 12, 2025 15:52
@wprzytula
Collaborator

@muzarski Please fill the pre-review checklist.

Collaborator

@Lorak-mmk Lorak-mmk left a comment

I like the change in the last commit.

Comment on lines 70 to 71

    #[unsafe(no_mangle)]
    pub unsafe extern "C" fn testing_future_get_host(
        future_raw: CassBorrowedSharedPtr<CassFuture, CConst>,
        host: *mut *mut c_char,
        host_length: *mut size_t,
    ) {
        let Some(future) = ArcFFI::as_ref(future_raw) else {

Collaborator

Side note: In this function we allocate a string, leak it (using into_raw) and assign it to user-provided pointer.
The user is now responsible for freeing this string, but since this is a normal string there is no cpp-driver API for this (think cass_something_free), so C-provided free should be used.

This is not the first appearance of this pattern.

I have two concerns:

  • Is it even legal to free Rust-allocated stuff using free from C? It very well might be, but I'm not sure.
  • I think there won't be any issues caused by that pattern when developing custom allocator API from cpp-driver, but again I'm not sure. We'll need to be careful.

Collaborator

Oh wait, this is just the testing API, I didn't notice. I'm sure this pattern appears in some public APIs, so the comment should still be valid.

Collaborator Author

> Side note: In this function we allocate a string, leak it (using into_raw) and assign it to user-provided pointer. The user is now responsible for freeing this string, but since this is a normal string there is no cpp-driver API for this (think cass_something_free), so C-provided free should be used.

I introduced testing_free_host which should be called on a pointer obtained from testing_future_get_host. User does not need to call C free. Or am I understanding your concerns wrong?

Note: In the next commit testing_free_host and testing_free_contact_points are replaced with testing_free_cstring which can be used instead of them. These two functions did the same, so I introduced a common function (with a more suitable name).
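The ownership hand-off discussed in this thread can be sketched as follows. This is an illustration under stated assumptions, not the code from this PR: `export_host` and `free_cstring` are hypothetical names. The standard library documentation for `CString::into_raw` states that the pointer must be returned to Rust via `CString::from_raw` and must not be released with C `free`, which is why a dedicated free function is paired with the getter.

```rust
use std::ffi::{CStr, CString};
use std::os::raw::c_char;

// Hypothetical getter: allocates a C string and transfers ownership to the
// caller. Returns null if the input contains an interior NUL byte.
fn export_host(host: &str) -> *mut c_char {
    match CString::new(host) {
        Ok(s) => s.into_raw(),
        Err(_) => std::ptr::null_mut(),
    }
}

/// Hypothetical counterpart that gives the allocation back to Rust.
///
/// # Safety
/// `ptr` must be null or a pointer obtained from `export_host` that has not
/// been freed yet.
unsafe fn free_cstring(ptr: *mut c_char) {
    if !ptr.is_null() {
        // Rebuild the CString so that Rust's allocator releases the memory.
        drop(unsafe { CString::from_raw(ptr) });
    }
}

fn main() {
    let ptr = export_host("127.0.0.1");
    assert!(!ptr.is_null());

    // The caller (C code in practice) reads the string through the pointer.
    let seen = unsafe { CStr::from_ptr(ptr) }.to_str().unwrap().to_owned();
    println!("host seen by the caller: {seen}");

    // The caller releases it through the paired function, never via `free`.
    unsafe { free_cstring(ptr) };
}
```

A single free function for all exported C strings keeps the allocator pairing in one place, which is the design the note above describes for testing_free_cstring.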

Collaborator

> Or am I understanding your concerns wrong?

In the testing API there is your testing_free_host, but we do have places in the public API where the string is filled by us and freed by the user with free, right?

Collaborator Author

Good question. I'll need to check that, because in the examples of this pattern that I can recall off the top of my head, the string is already allocated and owned by some other struct (which frees the string when dropped/freed). I'll do it tomorrow.

muzarski and others added 12 commits May 15, 2025 14:17
To get the support for enforcing/getting coordinator.
There is also `cass_statement_set_node` but it cooperates with
`cass_future_coordinator` which will be implemented later.

I decided to store coordinator as Option. This is because we have to somehow
mock it in unit tests - rust-driver does not expose any way to mock the coordinator.

There is one tricky part that we need to handle: the coordinator is held
by CassResult, which is stored under Mutex in CassFuture. As a result, the
reference we obtain in the `CassFuture::with_waited_result` closure has the lifetime
of the mutex guard (a temporary lifetime during the function call). We need to
extend the lifetime of the returned coordinator - further reasoning is explained in
the comment in code.

The suite contains 4 tests:
- SetHostWithValidHostString -> calls cass_statement_set_host with valid ip addresses and port.
  Expects CASS_OK to be returned.
- SetHostWithInvalidHostString -> calls cass_statement_set_host with invalid ip addresses ("inavlid", "", NULL).
  Expects LIB_BAD_PARAMS to be returned.
- SetHostWithValidHostInet -> calls cass_statement_set_host_inet with valid ip addresses.
  Expects CASS_OK.
- SetHostWithInvalidHostInet -> calls cass_statement_set_host_inet with invalid CassInet struct.
  Expects LIB_BAD_PARAMS.
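The host-string cases above can be modelled with a small sketch. It assumes that validation amounts to parsing the string as an IP address, which is an assumption about the behaviour the tests check and not the code from this PR; `parse_host` and `SetHostError` are hypothetical names, and `None` stands in for a NULL pointer.

```rust
use std::net::IpAddr;

// Hypothetical error type standing in for the LIB_BAD_PARAMS error code.
#[derive(Debug, PartialEq)]
enum SetHostError {
    BadParams,
}

// Hypothetical helper: `None` models a NULL host pointer coming from C.
fn parse_host(host: Option<&str>, port: u16) -> Result<(IpAddr, u16), SetHostError> {
    let host = host.ok_or(SetHostError::BadParams)?;
    // Anything that is not a valid IPv4 or IPv6 literal is rejected.
    let ip: IpAddr = host.parse().map_err(|_| SetHostError::BadParams)?;
    Ok((ip, port))
}

fn main() {
    for host in [Some("127.0.0.1"), Some("::1"), Some("inavlid"), Some(""), None] {
        println!("{host:?} -> {:?}", parse_host(host, 9042));
    }
}
```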

This suite contains 5 tests:
- SetHost -> it enforces "127.0.0.1:9042" host using cass_statement_set_host.
  Then, it fetches the rpc_address from system.local and checks whether it matches
  enforced address (twice).
- SetHostInet -> the same as above, but using cass_statement_set_host_inet (CassInet instead of String).
- SetNode -> executes "SELECT rpc_address from system.local" on random node. Then it gets the
  coordinator of this request (using cass_future_coordinator) and enforces this coordinator (cass_statement_set_node)
  on the same statement. Checks whether addresses match.
- SetHostWithInvalidPort -> tries to enforce host with unknown port (8888). Expects LIB_NO_HOST_AVAILABLE.
- SetHostWhereHostIsDown -> stops the node, and then tries to enforce it as a coordinator for some request.
  Expects LIB_NO_HOST_AVAILABLE.

It contains two tests. Both of the tests use a 3-node cluster (single dc) and RF=3.
Both of them try to enforce the "127.0.0.1" host for some read/write request with cl=LOCAL_QUORUM.

- ErrorReadWriteTimeout -> It **pauses** two remaining nodes and expects server-side
  READ/WRITE_TIMEOUT
- ErrorUnavailable -> It **stops** two remaining nodes and expects UNAVAILABLE server-side error.
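The arithmetic behind these two tests can be sketched as below. A quorum is floor(RF / 2) + 1 replicas, so RF=3 needs 2. The rest is a deliberately simplified model and an assumption on my part: a stopped node is known to be down by the coordinator, while a paused node is still believed alive but never answers. `expected_outcome` and `Outcome` are hypothetical names.

```rust
// Number of replicas that must acknowledge a quorum request.
fn quorum(replication_factor: u32) -> u32 {
    replication_factor / 2 + 1
}

#[derive(Debug, PartialEq)]
enum Outcome {
    Success,
    Unavailable,
    Timeout,
}

// Simplified model: `believed_alive` is what the coordinator thinks is up,
// `responding` is how many replicas actually answer in time.
fn expected_outcome(replication_factor: u32, believed_alive: u32, responding: u32) -> Outcome {
    let required = quorum(replication_factor);
    if believed_alive < required {
        // Rejected up front: not enough live replicas are known.
        Outcome::Unavailable
    } else if responding < required {
        // Request was sent, but too few replicas answered.
        Outcome::Timeout
    } else {
        Outcome::Success
    }
}

fn main() {
    // ErrorUnavailable: two nodes stopped, only the coordinator is alive.
    println!("stopped: {:?}", expected_outcome(3, 1, 1));
    // ErrorReadWriteTimeout: two nodes paused, still believed alive.
    println!("paused:  {:?}", expected_outcome(3, 3, 1));
}
```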

Added Rust utilities to obtain a stringified ip address of request coordinator
from future. Implemented `get_host_from_future` on top of them.

This utility method is used in integration tests. We cannot enable any yet - they
require other features (e.g. filtering config methods).

Replaced `testing_free_contact_points` and `testing_free_host` with one
common method `testing_free_cstring`.

The result is going to be initialized only once, thus we do not need to
store it behind a mutex. We can use OnceLock instead. Thanks to that,
we can remove the unsafe logic which extends the lifetime of the `coordinator`
reference in cass_future_coordinator. We now guarantee that
the result will be immutable once the future is resolved - the guarantee
is provided on the type level.

@muzarski muzarski force-pushed the enforce-get-coordinator branch from afed812 to 665266c Compare May 15, 2025 12:21
@muzarski
Collaborator Author

muzarski commented May 15, 2025

Rebased on master and added cass_statement_free to unit test.

Successfully merging this pull request may close these issues.

Implement cass_future_coordinator Implement enforcing coordinator for the request execution.
3 participants