sambacc: avoid logging an error if cluster is being torn down
Saw this in a ceph teuthology run:
```
  2024-08-20 20:39:57,289: DEBUG: Creating RADOS connection
 2024-08-20 20:39:57,333: INFO: cluster meta content changed
 2024-08-20 20:39:57,333: DEBUG: cluster meta: previous={'nodes':
[{'pnn': 0, 'identity': 'smb.adctdb1.0.0.ceph0.kdlxgn', 'node':
'192.168.76.200', 'state': 'ready'}, {'pnn': 1, 'identity':
'smb.adctdb1.1.0.ceph1.ngbqkk', 'node': '192.168.76.201', 'state':
'ready'}, {'pnn': 2, 'identity': 'smb.adctdb1.2.0.ceph2.rhmqnu', 'node':
'192.168.76.202', 'state': 'ready'}], '_source': 'cephadm'} current={}
 2024-08-20 20:39:57,333: ERROR: error during ctdb_monitor_nodes: max()
arg is an empty sequence, count=0
 Traceback (most recent call last):
  File "/usr/lib/python3.9/site-packages/sambacc/commands/ctdb.py", line
479, in catch
    yield
  File "/usr/lib/python3.9/site-packages/sambacc/commands/ctdb.py", line
360, in ctdb_monitor_nodes
    ctdb.monitor_cluster_meta_changes(
  File "/usr/lib/python3.9/site-packages/sambacc/ctdb.py", line 561, in
monitor_cluster_meta_changes
    expected_nodes = _cluster_meta_to_ctdb_nodes(
  File "/usr/lib/python3.9/site-packages/sambacc/ctdb.py", line 506, in
_cluster_meta_to_ctdb_nodes
    pnn_max = max(n["pnn"] for n in nodes) + 1  # pnn is zero indexed
 ValueError: max() arg is an empty sequence
```
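
The failing expression from the traceback can be reproduced in isolation. This is a minimal sketch, assuming the empty `current={}` meta has become an empty node list by the time it reaches `_cluster_meta_to_ctdb_nodes`. `pnn_max_of` is a hypothetical helper for illustration, not a sambacc function:

```python
def pnn_max_of(nodes):
    """Hypothetical helper mirroring the expression shown in the traceback."""
    return max(n["pnn"] for n in nodes) + 1  # pnn is zero indexed


# With nodes present, the expression works as intended.
populated = [{"pnn": 0}, {"pnn": 1}, {"pnn": 2}]
populated_result = pnn_max_of(populated)

# With no nodes at all, max() has nothing to compare and raises ValueError.
# The exact message text differs between Python versions.
raised = False
try:
    pnn_max_of([])
except ValueError:
    raised = True
```
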

I could see from the ceph logs that the smb cluster was being removed
right around this time. If we had nodes and they suddenly vanish, we are
likely in the process of being removed: we raced a bit with cephadm
removing services while the smb mgr module was removing the contents
of the .smb pool.

Signed-off-by: John Mulligan <[email protected]>
phlogistonjohn committed Aug 22, 2024
1 parent 2cdbd79 commit 410e571
Showing 1 changed file with 7 additions and 0 deletions.
7 changes: 7 additions & 0 deletions sambacc/ctdb.py
@@ -551,6 +551,13 @@ def monitor_cluster_meta_changes(
         if curr_meta == prev_meta:
             _logger.debug("cluster meta content unchanged: %r", curr_meta)
             continue
+        if len(prev_meta) > 0 and len(curr_meta) == 0:
+            # cluster is possibly (probably?) being destroyed.
+            # Return from this loop and let the command-level loop decide if
+            # this function needs to be restarted or not. There's a chance this
+            # process will be terminated very soon anyway.
+            _logger.warning("no current nodes available")
+            return
         _logger.info("cluster meta content changed")
         _logger.debug(
             "cluster meta: previous=%r current=%r", prev_meta, curr_meta
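
The effect of the new guard can be sketched outside of sambacc. This is only an illustration under stated assumptions: `watch_meta` is a hypothetical stand-in for the monitoring loop, it takes successive meta snapshots from a plain iterable instead of polling a real cluster meta object, and the `prev_meta = curr_meta` bookkeeping is assumed rather than taken from the diff.

```python
import logging

_logger = logging.getLogger("guard_sketch")


def watch_meta(snapshots):
    """Hypothetical stand-in for the monitoring loop.

    Consumes successive cluster-meta dicts and returns the snapshots that
    were treated as real changes, plus the reason the loop ended.
    """
    prev_meta = {}
    handled = []
    for curr_meta in snapshots:
        if curr_meta == prev_meta:
            # Nothing changed since the last look.
            continue
        if len(prev_meta) > 0 and len(curr_meta) == 0:
            # We had nodes and now have none: the cluster is probably being
            # destroyed, so stop instead of computing expected nodes.
            _logger.warning("no current nodes available")
            return handled, "stopped"
        handled.append(curr_meta)
        prev_meta = curr_meta  # assumed bookkeeping, not shown in the diff
    return handled, "exhausted"


ready = {"nodes": [{"pnn": 0, "state": "ready"}]}
later = {"nodes": [{"pnn": 1, "state": "ready"}]}
handled, reason = watch_meta([ready, ready, {}, later])
```

In this sketch the loop stops at the empty snapshot and never looks at `later`, which matches the intent stated in the code comment: return and let the caller decide whether to restart.
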
