
restore-cluster restoring instances from the incorrect location #766

Open
chrisjmiller1 opened this issue May 22, 2024 · 2 comments
chrisjmiller1 commented May 22, 2024


Hi,

I'm currently testing Medusa with OCI (Oracle Cloud Infrastructure) and have noticed that when I run a restore-cluster command, the source path for node1 is used for the subsequent nodes, i.e. node2 and node3.

Command: medusa restore-cluster --backup-name 20240521-cluster-1

The output from the initial command looks good:
[2024-05-22 14:42:20,019] INFO: About to restore on node1 using {'source': ['node1'], 'seed': True} as backup source
[2024-05-22 14:42:20,019] INFO: About to restore on node2 using {'source': ['node2'], 'seed': True} as backup source
[2024-05-22 14:42:20,019] INFO: About to restore on node3 using {'source': ['node3'], 'seed': True} as backup source
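For clarity, the per-node mapping implied by that output would look something like the sketch below. This is only an illustration mirroring the logged {'source': [...], 'seed': ...} payloads, not Medusa's internal data structure; the bucket name is a placeholder.

```python
# Minimal illustration of the expected node -> source mapping, mirroring the
# payloads logged above. Bucket name is a placeholder; this is not Medusa code.
BACKUP = "20240521-cluster-1"
BUCKET = "bucketname"

expected_sources = {
    "node1": {"source": ["node1"], "seed": True},
    "node2": {"source": ["node2"], "seed": True},
    "node3": {"source": ["node3"], "seed": True},
}

for node, payload in expected_sources.items():
    # Each node should pull its metadata from its own prefix.
    print(f"{node} -> s3://{BUCKET}/{payload['source'][0]}/{BACKUP}/meta/tokenmap.json")
```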

But when I look in the medusa.log for node2 and node3, I see the following, which shows that the incorrect source path is being used.
Node 2:
[2024-05-22 14:46:40,709] DEBUG: aws --endpoint-url https://idxxxxxxxxxxx.compat.objectstorage.us-ashburn-1.oraclecloud.com:443 s3 cp s3://bucketname/node1/20240521-cluster-1/meta/tokenmap.json /tmp/medusa-restore-3ea01bff-216f-4536-8d6d-5809dae267de
Node 3:
[2024-05-22 14:49:47,420] DEBUG: https://idxxxxxxxxxxx.compat.objectstorage.us-ashburn-1.oraclecloud.com:443 "HEAD /bucketname/node1/20240521-cluster-1/meta/tokenmap.json HTTP/1.1" 200 0
[2024-05-22 14:49:47,421] DEBUG: aws --endpoint-url https://idxxxxxxxxxxx.compat.objectstorage.us-ashburn-1.oraclecloud.com:443 s3 cp s3://bucketname/node1/20240521-cluster-1/meta/tokenmap.json /tmp/medusa-restore-3f95200e-ed82-46f0-acfb-59c7d6d9a7cb
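To confirm which source prefix a node's restore actually used, a quick check like the following can be run on each node. This is a rough diagnostic sketch only; the log path is an assumption and may differ in your setup.

```python
# Rough diagnostic sketch: report which backup prefixes the "s3 cp" commands
# in this node's medusa.log actually referenced. The log path below is an
# assumption; adjust it to wherever Medusa writes its log on your nodes.
import re
from pathlib import Path

LOG = Path("/var/log/medusa/medusa.log")  # assumed location
S3_CP = re.compile(r"s3 cp s3://[^/]+/([^/]+)/")

prefixes = set()
for line in LOG.read_text().splitlines():
    match = S3_CP.search(line)
    if match:
        prefixes.add(match.group(1))

# On node2 this should print only {'node2'}; seeing 'node1' reproduces the bug.
print("source prefixes used by this node:", prefixes)
```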

This then results in the following error being reported (as expected) on the other nodes:
ERROR [main] 2024-05-22 12:59:50,843 CassandraDaemon.java:897 - Cannot start node if snitch's rack (1b) differs from previous rack (1a). Please fix the snitch configuration, decommission and rebootstrap this node or use the flag -Dcassandra.ignore_rack=true.
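To double-check that each node's backup metadata does exist under its own prefix (i.e. that only the restore mapping is at fault), something like the sketch below could be used. It assumes boto3 and the S3-compatible endpoint from the logs; the bucket name and credentials are placeholders.

```python
# Hedged sketch: verify each node's tokenmap.json exists under its own prefix.
# Endpoint and bucket are taken from the log excerpts above and are placeholders;
# this is not part of Medusa itself.
import boto3

s3 = boto3.client(
    "s3",
    endpoint_url="https://idxxxxxxxxxxx.compat.objectstorage.us-ashburn-1.oraclecloud.com:443",
)

BUCKET = "bucketname"
BACKUP = "20240521-cluster-1"

for node in ["node1", "node2", "node3"]:
    key = f"{node}/{BACKUP}/meta/tokenmap.json"
    resp = s3.list_objects_v2(Bucket=BUCKET, Prefix=key)
    status = "present" if resp.get("KeyCount", 0) > 0 else "missing"
    print(f"s3://{BUCKET}/{key}: {status}")
```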

I'm seeing this behavior on 0.15.0 and 0.21.0.

Thanks,

Chris.


@rzvoncek (Contributor)

Hello and thanks for the report. This indeed looks like a bug, something similar to what we've seen in #676.

I'll try to find some time to fix this, but I can't promise anything.

rzvoncek moved this to Help Wanted in K8ssandra on May 24, 2024
adejanovski added the help-wanted label on May 24, 2024
rzvoncek added the bug label and removed the help-wanted label on May 24, 2024
adejanovski added the help-wanted label on Aug 14, 2024
@chrisjmiller1 (Author)

Hi @rzvoncek, @adejanovski, just checking in to see how this issue is progressing.

Thanks,

Chris.
