Retry NodeConfig mounts #1559

Open
tnozicka opened this issue Nov 10, 2023 · 2 comments · May be fixed by #2253
Assignees: rzetelskik
Labels:

  • kind/feature: Categorizes issue or PR as related to a new feature.
  • priority/important-longterm: Important over the long term, but may not be staffed and/or may need multiple releases to complete.
  • triage/accepted: Indicates an issue or PR is ready to be actively worked on.

Milestone: v1.16.0

Comments

Member

tnozicka commented Nov 10, 2023

What should the feature do?

When a mount fails, for example because the underlying filesystem needs a repair, nothing tries it again. systemd does not retry a failed mount unit on its own :(

Say an admin runs the notorious xfs_repair to fix the filesystem: they have no idea there is a unit 'mnt-persistent\x2dvolumes.mount' that needs to be restarted afterwards. With some luck they will reboot the node...

What is the use case behind this feature?

Stability

Anything else we need to know?

root@ubuntu-2204:~# systemctl status 'mnt-persistent\x2dvolumes.mount'
× mnt-persistent\x2dvolumes.mount - Managed mount by Scylla Operator
     Loaded: loaded (/etc/systemd/system/mnt-persistent\x2dvolumes.mount; enabled; vendor preset: enabled)
     Active: failed (Result: exit-code) since Fri 2023-11-10 18:50:29 UTC; 10min ago
      Where: /mnt/persistent-volumes
       What: /dev/md127
        CPU: 5ms

Nov 10 18:50:29 ubuntu-2204 systemd[1]: Mounting Managed mount by Scylla Operator...
Nov 10 18:50:29 ubuntu-2204 systemd[1]: mnt-persistent\x2dvolumes.mount: Mount process exited, code=exited, status=32/n/a
Nov 10 18:50:30 ubuntu-2204 mount[3469]: mount: /mnt/persistent-volumes: mount(2) system call failed: Structure needs cleaning.
Nov 10 18:50:29 ubuntu-2204 systemd[1]: mnt-persistent\x2dvolumes.mount: Failed with result 'exit-code'.
Nov 10 18:50:29 ubuntu-2204 systemd[1]: Failed to mount Managed mount by Scylla Operator.
root@ubuntu-2204:~# systemctl restart 'mnt-persistent\x2dvolumes.mount'
root@ubuntu-2204:~# systemctl status 'mnt-persistent\x2dvolumes.mount'
● mnt-persistent\x2dvolumes.mount - Managed mount by Scylla Operator
     Loaded: loaded (/etc/systemd/system/mnt-persistent\x2dvolumes.mount; enabled; vendor preset: enabled)
     Active: active (mounted) since Fri 2023-11-10 19:01:32 UTC; 1s ago
      Where: /mnt/persistent-volumes
       What: /dev/md127
      Tasks: 0 (limit: 4535)
     Memory: 12.0K
        CPU: 8ms
     CGroup: /system.slice/mnt-persistent\x2dvolumes.mount

Nov 10 19:01:32 ubuntu-2204 systemd[1]: Mounting Managed mount by Scylla Operator...
Nov 10 19:01:32 ubuntu-2204 systemd[1]: Mounted Managed mount by Scylla Operator.
root@ubuntu-2204:~# 
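
The manual recovery in the session above (check the unit, then `systemctl restart` it) could be automated with a retry loop along these lines. This is only a minimal sketch of the idea, not how Scylla Operator implements or plans to implement it: the function names, the attempt count, and the fixed delay are all assumptions made for illustration. It relies only on `systemctl is-active --quiet` (exit code 0 when the unit is active) and `systemctl restart`.

```python
import subprocess
import time
from typing import Callable, Sequence

# A runner takes systemctl arguments and returns the exit code.
# It is injectable so the loop can be exercised without touching systemd.
Runner = Callable[[Sequence[str]], int]


def run_systemctl(args: Sequence[str]) -> int:
    """Run `systemctl <args>` and return its exit code."""
    return subprocess.run(["systemctl", *args], check=False).returncode


def retry_failed_mount(
    unit: str,
    run: Runner = run_systemctl,
    attempts: int = 5,             # hypothetical default
    delay_seconds: float = 10.0,   # hypothetical default
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Try to bring a mount unit up, retrying a bounded number of times.

    Returns True once the unit is active, False if every attempt failed.
    """
    for attempt in range(attempts):
        # Exit code 0 means the unit is already active (mounted).
        if run(["is-active", "--quiet", unit]) == 0:
            return True
        # Same command the admin ran by hand in the session above.
        if run(["restart", unit]) == 0:
            return True
        # Wait before the next attempt, but not after the last one.
        if attempt < attempts - 1:
            sleep(delay_seconds)
    return False
```

For the unit from this report, the call would be `retry_failed_mount(r"mnt-persistent\x2dvolumes.mount")`. A fixed delay keeps the sketch short; a real controller would more likely requeue with backoff and report the failure in the NodeConfig status.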
@tnozicka tnozicka added the kind/feature Categorizes issue or PR as related to a new feature. label Nov 10, 2023
@scylla-operator-bot scylla-operator-bot bot added the needs-priority Indicates a PR lacks a `priority/foo` label and requires one. label Nov 10, 2023
@tnozicka tnozicka added the priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. label Nov 10, 2023
@scylla-operator-bot scylla-operator-bot bot removed the needs-priority Indicates a PR lacks a `priority/foo` label and requires one. label Nov 10, 2023
Contributor

The Scylla Operator project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 30d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out

/lifecycle stale

@scylla-operator-bot scylla-operator-bot bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jun 30, 2024
Member Author

tnozicka commented Jul 1, 2024

/remove-lifecycle stale
/triage accepted

@scylla-operator-bot scylla-operator-bot bot added triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Jul 1, 2024
@tnozicka tnozicka assigned rzetelskik and unassigned zimnx Oct 21, 2024
@rzetelskik rzetelskik added this to the v1.16.0 milestone Dec 18, 2024
3 participants