Handle Panda disconnect exceptions more elegantly #66

Open
evalott100 opened this issue Nov 30, 2023 · 0 comments

evalott100 commented Nov 30, 2023

During a late-night run of I22 on @gilesknap's container, the following error was repeated many times:

ERROR:PandA did not respond to GetChanges within 1.0 seconds. Setting all records to major alarm state.
callbackRequest: ERROR cbLow ring buffer full
callbackRequest: ERROR cbLow ring buffer full
WARNING:socket.send() raised exception.
ERROR:Task exception was never retrieved
future: <Task finished name='Task-68034730' coro=<StreamWriter.drain() done, defined at /usr/lib/python3.10/asyncio/streams.py:348> exception=BrokenPipeError(32, 'Broken pipe')>
Traceback (most recent call last):
  File "/usr/lib/python3.10/asyncio/streams.py", line 359, in drain
    raise exc
  File "/usr/lib/python3.10/asyncio/streams.py", line 359, in drain
    raise exc
  File "/usr/lib/python3.10/asyncio/streams.py", line 359, in drain
    raise exc
  [Previous line repeated 33623 more times]
  File "/venv/lib/python3.10/site-packages/pandablocks/asyncio.py", line 103, in _ctrl_read_forever
    received = await reader.read(4096)
  File "/usr/lib/python3.10/asyncio/streams.py", line 650, in read
    raise self._exception
  File "/usr/lib/python3.10/asyncio/streams.py", line 359, in drain
    raise exc
  File "/usr/lib/python3.10/asyncio/streams.py", line 359, in drain
    raise exc
  File "/usr/lib/python3.10/asyncio/selector_events.py", line 924, in write
    n = self._sock.send(data)
BrokenPipeError: [Errno 32] Broken pipe

We should improve this section of code, which currently catches every exception and only logs it:

    except Exception:
        logging.exception(f"Error handling '{received.decode()}'")

to something like:

    except BrokenPipeError:
        logging.exception(f"Error handling '{received.decode()}'")
        await asyncio.sleep(<wait more time before trying again>)
    ...
    # Catch other errors that the PandA connection should be able to recover from
    ...
    except Exception:
        # Anything else is unexpected, so let it propagate
        raise
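
For illustration only, here is a minimal sketch of the "wait, then try again" idea as a standalone helper. It is not the pandablocks code: `run_with_reconnect`, `session`, `retry_delay` and `max_retries` are hypothetical names, and `session` is assumed to be a coroutine function that opens a fresh connection and runs the read loop. Retrying on the same stream would likely not help, since the stream keeps re-raising the stored exception, as the traceback above suggests.

```python
import asyncio
import logging


async def run_with_reconnect(session, retry_delay=5.0, max_retries=3):
    """Run session(); on BrokenPipeError wait and start a new session.

    `session` is a hypothetical zero-argument coroutine function that opens
    a connection and runs until it finishes or the connection breaks.
    Any exception other than BrokenPipeError propagates unchanged.
    """
    for attempt in range(1, max_retries + 1):
        try:
            return await session()
        except BrokenPipeError:
            logging.exception(
                "Connection lost (attempt %d of %d)", attempt, max_retries
            )
            if attempt == max_retries:
                raise  # give up and let the caller decide what to do
            await asyncio.sleep(retry_delay)
```

The delay and retry count here are placeholders; suitable values would depend on how long a PandA typically takes to come back.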

@coretl Thoughts?

Update

We agreed in a meeting that it's probably a good idea to shut down pandablocks-ioc completely on such a failure, then let the Kubernetes liveness.sh handle restarting it.
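
A rough sketch of what that fail-fast behaviour could look like, under stated assumptions: `main` and `session` are hypothetical names rather than the pandablocks-ioc entry point, the event loop is assumed to be driven by `asyncio.run` in the main thread, and it is assumed that a non-zero process exit is enough for the container to be restarted. How liveness.sh detects the failure is not covered here.

```python
import asyncio
import logging


def main(session) -> int:
    """Run session() to completion and return a process exit code.

    `session` is a hypothetical zero-argument coroutine function that
    talks to the PandA. BrokenPipeError is a subclass of ConnectionError,
    so it is covered by the handler below.
    """
    try:
        asyncio.run(session())
    except ConnectionError:
        logging.exception("Lost connection to the PandA; shutting down")
        return 1  # non-zero so a supervisor can see the failure
    return 0
```

An entry point could then call `sys.exit(main(session))`, so the process stops instead of logging the same error repeatedly.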
