Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Server-side implementation of incremental subscription changes #2030

Open
wants to merge 13 commits into
base: master
Choose a base branch
from

Conversation

jsdt
Copy link
Contributor

@jsdt jsdt commented Dec 2, 2024

Description of Changes

This adds the new message types for being able to subscribe and unsubscribe to individual queries without affecting other subscriptions. Much of this was taken from #1997, but this version does not remove the existing way of subscribing to a set of queries. This renames that to legacy_ in many places, and we eventually want to get rid of it, but by making this an additive change we don't need to update all the clients at the same time.

This biggest changes are in the SubscriptionManager code. Previously we just stored two mappings:

  1. Query hash to the set of clients subscribing, and
  2. Table id to the set of related queries

Now we also store a mapping from client to the set of subscriptions for that client (which is useful for unsubscribing), and we keep track of both legacy subscriptions (which have a set of subscriptions per client), and the new subscriptions (which are identified by a (ClientId, RequestId) pair).

API and ABI breaking changes

This is additive, and only adds new enum types to existing messages, so this is not breaking. In the future we will want to make a breaking change to remove the legacy versions.

Expected complexity level and risk

  1. This will add some complexity to the client, but this is mostly just tweaking the existing logic.

Testing

This passes existing tests (which exercise the legacy versions), and there are a few basic tests for the new version.

More unit test coverage for incremental subscription changes would be good, but I will probably add that when I start adding client support for it.

Follow-up work

We still need:

  1. Client support for this.
  2. More testing
  3. Handling of module hot swapping. If a module is republished, and a change to indexes breaks any queries, we need to send errors to clients for those queries.

@jsdt jsdt marked this pull request as ready for review December 2, 2024 20:29
@jsdt jsdt requested review from Centril and gefjon as code owners December 2, 2024 20:29
crates/client-api-messages/src/websocket.rs Outdated Show resolved Hide resolved
crates/client-api-messages/src/websocket.rs Outdated Show resolved Hide resolved
crates/client-api-messages/src/websocket.rs Outdated Show resolved Hide resolved
crates/client-api-messages/src/websocket.rs Outdated Show resolved Hide resolved
crates/client-api-messages/src/websocket.rs Outdated Show resolved Hide resolved
crates/core/src/error.rs Outdated Show resolved Hide resolved
@gefjon gefjon assigned jsdt and unassigned gefjon Dec 3, 2024
/// The ID included in the `SubscribeApplied` and `Unsubscribe` messages.
pub query_id: QueryId,
/// The matching rows for this query.
/// Note, this makes unsubscribing potentially very expensive.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you make an issue to track this?

Comment on lines +84 to +87
WORKER_METRICS
.request_round_trip
.with_label_values(&WorkloadType::Subscribe, &address, "")
.observe(timer.elapsed().as_secs_f64());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This bit is repeated 5x modulo the workload and reducer string. I'd extract this into a closure taking the workload and the string.

pub struct SubscriptionRows {
pub table_id: TableId,
pub table_name: Box<str>,
pub table_rows: FormatSwitch<ws::TableUpdate<BsatnFormat>, ws::TableUpdate<JsonFormat>>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm sad we cannot do type Switched<T> = FormatSwitch<T<BsatnFormat>, T<JsonFormat>>; :( My kingdom for higher kinded types...

Comment on lines +374 to +384
ws::UnsubscribeApplied {
total_host_execution_duration_micros,
request_id,
query_id,
rows: ws::SubscribeRows {
table_id: result.table_id,
table_name: result.table_name,
table_rows,
},
}
.into(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And my other kingdom for generic closures...

let query = match subscriptions.remove_subscription((sender.id.identity, sender.id.address), request.query_id) {
Ok(query) => query,
Err(error) => {
// Apparently we ignore errors sending messages.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you do something differently?

Comment on lines +288 to +292
let client_info = self.clients.get(client);
if client_info.is_none() {
return;
}
let client_info = client_info.unwrap();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could use let Some(...) = ... else { ... }?

Comment on lines +297 to +301
if query_state.is_none() {
tracing::warn!("Query state not found for query hash: {:?}", query_hash);
return;
}
};

self.clients.remove(client);
self.subscribers.retain(|hash, ids| {
ids.remove(client);
if ids.is_empty() {
if let Some(query) = self.queries.remove(hash) {
remove_table_query(query.return_table(), hash);
remove_table_query(query.filter_table(), hash);
}
let query_state = query_state.unwrap();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here too... using let...else ? or are there some borrowing issues?

Comment on lines +306 to +307
SubscriptionManager::remove_table_query(&mut self.tables, query_state.query.return_table(), query_hash);
SubscriptionManager::remove_table_query(&mut self.tables, query_state.query.filter_table(), query_hash);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make into a helper taking &mut self.tables, &query_state, query_hash? Seems to occur 3x.

Comment on lines +652 to +661
let db = TestDB::durable()?;

create_table(&db, "T")?;
let sql = "select * from T";
let plan = compile_plan(&db, sql)?;

let client = Arc::new(client(0));

let query_id: ClientQueryId = QueryId::new(1);
let mut subscriptions = SubscriptionManager::default();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems copy-pasted a few times -- for the sake of readability, I'd like to see these moved to a helper / higher order function.

}

#[test]
fn test_unsubscribe_from_the_only_subscription() -> ResultTest<()> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would like to see a test where we add two different queries for the same client and remove one of them and assert that only one of them remain.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Subscriptions: New WebSocket API and server side handling
3 participants