Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: CON-1346 CON-1407 Don't use rsync to copy the state during recoveries #3853

Merged
merged 21 commits into from
Feb 17, 2025

Conversation

eichhorl
Copy link
Contributor

@eichhorl eichhorl commented Feb 7, 2025

#2948 introduced a change which avoids using rsync during the "upload state" step of recoveries performed on the replica node directly. This PR does the same for the "download state" step by using cp instead.

Additionally, we add a parameter --wait-for-cup-node which is used to specify the IP of the node to be used for polling the current recovery CUP. This is necessary if no IP was specified in either the "download" or the "upload" step.

Futhermore, we introduce a new function confirm_exec_cmd, which will only execute the given command after explicit user confirmation (similar to how it is already done for rsync and ssh commands). This allows for easier sanity checks to improve the confidence in the new process.

Finally, we add a system test for local node recoveries, which performs a recovery by sending a single ic-recovery to a subnet node via ssh. Node that, for now, we need to skip the download & merge certification steps, as these require ssh agent forwarding, which seems difficult to do in the test environment.

The new system test exercises the recovery variant without version upgrade, failover nodes, or chain key resharing.

@eichhorl eichhorl changed the title recovery local cp feat: CON-1346 CON-1407 Don't use rsync to copy state on the node during recoveries Feb 10, 2025
@github-actions github-actions bot added the feat label Feb 10, 2025
@eichhorl eichhorl changed the title feat: CON-1346 CON-1407 Don't use rsync to copy state on the node during recoveries feat: CON-1346 CON-1407 Don't use rsync to copy the state during recoveries Feb 10, 2025
@eichhorl eichhorl marked this pull request as ready for review February 10, 2025 12:59
@eichhorl eichhorl requested a review from a team as a code owner February 10, 2025 12:59
@eichhorl eichhorl added the CI_ALL_BAZEL_TARGETS Runs all bazel targets and uploads them to S3 label Feb 17, 2025
@eichhorl eichhorl enabled auto-merge February 17, 2025 09:39
@eichhorl eichhorl added this pull request to the merge queue Feb 17, 2025
Merged via the queue into master with commit 7dda5ff Feb 17, 2025
24 checks passed
@eichhorl eichhorl deleted the eichhorl/recover-local-cp branch February 17, 2025 10:11
aterga pushed a commit that referenced this pull request Feb 17, 2025
…coveries (#3853)

#2948 introduced a change which avoids
using `rsync` during the "upload state" step of recoveries performed on
the replica node directly. This PR does the same for the "download
state" step by using `cp` instead.

Additionally, we add a parameter `--wait-for-cup-node` which is used to
specify the IP of the node to be used for polling the current recovery
CUP. This is necessary if no IP was specified in either the "download"
or the "upload" step.

Futhermore, we introduce a new function `confirm_exec_cmd`, which will
only execute the given command after explicit user confirmation (similar
to how it is already done for `rsync` and `ssh` commands). This allows
for easier sanity checks to improve the confidence in the new process.

Finally, we add a system test for local node recoveries, which performs
a recovery by sending a single `ic-recovery` to a subnet node via `ssh`.
Node that, for now, we need to skip the download & merge certification
steps, as these require ssh agent forwarding, which seems difficult to
do in the test environment.

The new system test exercises the recovery variant without version
upgrade, failover nodes, or chain key resharing.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CI_ALL_BAZEL_TARGETS Runs all bazel targets and uploads them to S3 @consensus feat
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants