sambacc: avoid logging an error if cluster is being torn down #131

Merged

phlogistonjohn
Collaborator

Saw this in a ceph teuthology run:

2024-08-20 20:39:57,289: DEBUG: Creating RADOS connection
2024-08-20 20:39:57,333: INFO: cluster meta content changed
2024-08-20 20:39:57,333: DEBUG: cluster meta: previous={'nodes': [{'pnn': 0, 'identity': 'smb.adctdb1.0.0.ceph0.kdlxgn', 'node': '192.168.76.200', 'state': 'ready'}, {'pnn': 1, 'identity': 'smb.adctdb1.1.0.ceph1.ngbqkk', 'node': '192.168.76.201', 'state': 'ready'}, {'pnn': 2, 'identity': 'smb.adctdb1.2.0.ceph2.rhmqnu', 'node': '192.168.76.202', 'state': 'ready'}], '_source': 'cephadm'} current={}
2024-08-20 20:39:57,333: ERROR: error during ctdb_monitor_nodes: max() arg is an empty sequence, count=0
Traceback (most recent call last):
  File "/usr/lib/python3.9/site-packages/sambacc/commands/ctdb.py", line 479, in catch
    yield
  File "/usr/lib/python3.9/site-packages/sambacc/commands/ctdb.py", line 360, in ctdb_monitor_nodes
    ctdb.monitor_cluster_meta_changes(
  File "/usr/lib/python3.9/site-packages/sambacc/ctdb.py", line 561, in monitor_cluster_meta_changes
    expected_nodes = _cluster_meta_to_ctdb_nodes(
  File "/usr/lib/python3.9/site-packages/sambacc/ctdb.py", line 506, in _cluster_meta_to_ctdb_nodes
    pnn_max = max(n["pnn"] for n in nodes) + 1  # pnn is zero indexed
ValueError: max() arg is an empty sequence

I could see from the ceph logs that the smb cluster was being removed right around this time. If we had nodes and they suddenly vanish, we're likely in the process of getting removed, and we raced a tad with cephadm removing services while the smb mgr module was removing the contents of the .smb pool.
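
To illustrate the condition described above, here is a minimal sketch, not the code from this change. The helper names `pnn_count` and `looks_like_teardown` are hypothetical and do not come from sambacc; only the shape of the cluster meta dict (`nodes`, `pnn`, `_source`) is taken from the log. It shows how `max()` fails on an empty node list and one way a caller could recognize the "had nodes, now none" case before logging an error.

```python
from typing import Any


def pnn_count(nodes: list[dict[str, Any]]) -> int:
    """Return one more than the highest pnn, or 0 for an empty node list.

    Hypothetical helper: pnn is zero indexed, so default=-1 makes the
    empty case yield 0 instead of raising ValueError.
    """
    return max((n["pnn"] for n in nodes), default=-1) + 1


def looks_like_teardown(
    previous: dict[str, Any], current: dict[str, Any]
) -> bool:
    """Return True when nodes were known before and none are listed now.

    Hypothetical helper: treats a missing or empty 'nodes' key in the
    current cluster meta as "no nodes".
    """
    return bool(previous.get("nodes")) and not current.get("nodes")


# Cluster meta shaped like the DEBUG line in the log (fields trimmed).
previous = {
    "nodes": [{"pnn": 0}, {"pnn": 1}, {"pnn": 2}],
    "_source": "cephadm",
}
current: dict[str, Any] = {}

if looks_like_teardown(previous, current):
    # Assumption: a lower-severity message is enough in this case.
    print("cluster meta emptied; cluster is likely being removed")
```

Whether to skip the update, log at a lower level, or exit cleanly in that case is a design choice this sketch leaves open.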

@phlogistonjohn phlogistonjohn marked this pull request as ready for review August 22, 2024 12:52
Collaborator

@synarete synarete left a comment

lgtm

@phlogistonjohn
Collaborator Author

@Mergifyio rebase

Signed-off-by: John Mulligan <[email protected]>
mergify bot commented Aug 22, 2024

rebase

✅ Branch has been successfully rebased

@mergify mergify bot merged commit fb3a130 into samba-in-kubernetes:master Aug 22, 2024
9 checks passed
@phlogistonjohn phlogistonjohn deleted the jjm-fix-remove-err branch August 28, 2024 15:15