Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add background task to collect Cockroach zone node IDs #5857

Merged
merged 20 commits into from
Jun 13, 2024

Conversation

jgallagher
Copy link
Contributor

As discussed in yesterday's update huddle - this adds an endpoint to the cockroach-admin server which internally uses the undocumented crdb_internal.node_id() function to get the node ID of its local running cockroach, and adds a background task to Nexus to collect zone ID <-> node ID mappings for every cockroach zone in the current target blueprint.

This is a prerequisite for decommissioning, but currently has no consumers.

Copy link
Contributor

@andrewjstone andrewjstone left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great.

clients/cockroach-admin-client/src/lib.rs Outdated Show resolved Hide resolved
// told this is "unlikely to change for some time".
// https://github.com/cockroachdb/cockroach/issues/124988 requests that
// this be documented (or an alternative be documented / provided).
let row = client
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm slightly worried about this command being routed to another node, which can happen in some cases when a request is forwarded to the raft leader for the quorum containing the data.

While it would be super weird for that to happen in this case, I worry about accidental breakage. For extra assurance, maybe we should assert that this node id matches the current address we are contacting. I'm not sure if we can do that outside the status cli though.

Am I being too paranoid here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if you're being too paranoid, but I think this is easy enough to check for. I took a stab at it in 70c7558

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you! This looks excellent.

cockroach-admin/tests/integration_tests/commands.rs Outdated Show resolved Hide resolved
// License, v. 2.0. If a copy of the MPL was not distributed with this
// file, You can obtain one at https://mozilla.org/MPL/2.0/.

//! Background task for collecting the Cockroach Node ID for running CRDB zones
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great comment!

nexus/src/app/background/crdb_node_id_collector.rs Outdated Show resolved Hide resolved
nexus/src/app/background/crdb_node_id_collector.rs Outdated Show resolved Hide resolved
nexus/src/app/background/crdb_node_id_collector.rs Outdated Show resolved Hide resolved
@jgallagher jgallagher merged commit 468de5c into main Jun 13, 2024
22 checks passed
@jgallagher jgallagher deleted the john/crdb-node-id-rpw branch June 13, 2024 14:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants