As one of the collaborators on the merged PR for ROS2Connector, I would like to leave this here for any users who may run into this issue.
I have been working closely with ROS2Connector for several months and have found a cryptic failure when instantiating ROS2Connector on a network where ROS2 nodes from different distributions are running. In my case, I am running ROS2 Humble in a container based on dustynv/nano_llm:humble-r36.3.0, and other robots on the same network are running ROS2 Jazzy.
This was the stack trace when trying to instantiate ROS2Connector while Jazzy nodes were on the network.
LLVM ERROR: out of memory
Fatal Python error: Aborted
Thread 0x0000fffdfffff100 (most recent call first):
File "/usr/lib/python3.10/ssl.py", line 1161 in read
File "/usr/lib/python3.10/ssl.py", line 1288 in recv
File "/usr/local/lib/python3.10/dist-packages/websockets/sync/connection.py", line 538 in recv_events
File "/usr/local/lib/python3.10/dist-packages/websockets/sync/server.py", line 171 in recv_events
File "/usr/lib/python3.10/threading.py", line 953 in run
File "/usr/lib/python3.10/threading.py", line 1016 in _bootstrap_inner
File "/usr/lib/python3.10/threading.py", line 973 in _bootstrap
Thread 0x0000fffdff7ef100 (most recent call first):
File "/usr/lib/python3.10/posixpath.py", line 431 in _joinrealpath
File "/usr/lib/python3.10/posixpath.py", line 397 in realpath
File "/usr/lib/python3.10/inspect.py", line 878 in getmodule
File "/usr/lib/python3.10/inspect.py", line 952 in findsource
File "/usr/lib/python3.10/inspect.py", line 1624 in getframeinfo
File "/opt/ros/humble/local/lib/python3.10/dist-packages/rclpy/impl/rcutils_logger.py", line 47 in _find_caller
File "/opt/ros/humble/local/lib/python3.10/dist-packages/rclpy/impl/rcutils_logger.py", line 59 in __new__
File "/opt/ros/humble/local/lib/python3.10/dist-packages/rclpy/impl/rcutils_logger.py", line 287 in log
File "/opt/ros/humble/local/lib/python3.10/dist-packages/rclpy/impl/rcutils_logger.py", line 329 in info
File "/opt/NanoLLM/nano_llm/plugins/robotics/ros_connector.py", line 89 in __init__
File "/opt/NanoLLM/nano_llm/plugins/dynamic_plugin.py", line 35 in __new__
File "/opt/NanoLLM/nano_llm/agents/dynamic_agent.py", line 67 in add_plugin
File "/usr/lib/python3.10/threading.py", line 953 in run
File "/opt/NanoLLM/nano_llm/agents/dynamic_agent.py", line 60 in add_plugin
File "/opt/NanoLLM/nano_llm/agents/dynamic_agent.py", line 423 in invoke_handler
File "/opt/NanoLLM/nano_llm/agents/dynamic_agent.py", line 441 in on_message
File "/opt/NanoLLM/nano_llm/agents/dynamic_agent.py", line 451 in on_websocket
File "/opt/NanoLLM/nano_llm/web/server.py", line 193 in on_message
File "/opt/NanoLLM/nano_llm/web/server.py", line 393 in websocket_listener
File "/opt/NanoLLM/nano_llm/web/server.py", line 314 in on_websocket
File "/usr/local/lib/python3.10/dist-packages/websockets/sync/server.py", line 499 in conn_handler
File "/usr/lib/python3.10/threading.py", line 953 in run
File "/usr/lib/python3.10/threading.py", line 1016 in _bootstrap_inner
File "/usr/lib/python3.10/threading.py", line 973 in _bootstrap
Thread 0x0000fffe06a8f100 (most recent call first):
File "/usr/lib/python3.10/selectors.py", line 416 in select
File "/usr/lib/python3.10/socketserver.py", line 232 in serve_forever
File "/usr/local/lib/python3.10/dist-packages/werkzeug/serving.py", line 810 in serve_forever
File "/usr/local/lib/python3.10/dist-packages/werkzeug/serving.py", line 1116 in run_simple
File "/usr/local/lib/python3.10/dist-packages/flask/app.py", line 625 in run
File "/opt/NanoLLM/nano_llm/web/server.py", line 120 in <lambda>
File "/usr/lib/python3.10/threading.py", line 953 in run
File "/usr/lib/python3.10/threading.py", line 1016 in _bootstrap_inner
File "/usr/lib/python3.10/threading.py", line 973 in _bootstrap
Thread 0x0000fffe0729f100 (most recent call first):
File "/usr/lib/python3.10/selectors.py", line 469 in select
File "/usr/local/lib/python3.10/dist-packages/websockets/sync/server.py", line 227 in serve_forever
File "/opt/NanoLLM/nano_llm/web/server.py", line 119 in <lambda>
File "/usr/lib/python3.10/threading.py", line 953 in run
File "/usr/lib/python3.10/threading.py", line 1016 in _bootstrap_inner
File "/usr/lib/python3.10/threading.py", line 973 in _bootstrap
Thread 0x0000fffe07aaf100 (most recent call first):
File "/usr/local/lib/python3.10/dist-packages/psutil/__init__.py", line 1814 in cpu_percent
File "/opt/NanoLLM/nano_llm/plugins/tegrastats.py", line 58 in read
File "/opt/NanoLLM/nano_llm/plugins/tegrastats.py", line 96 in run
File "/usr/lib/python3.10/threading.py", line 1016 in _bootstrap_inner
File "/usr/lib/python3.10/threading.py", line 973 in _bootstrap
Thread 0x0000fffe99fbf100 (most recent call first):
File "/usr/lib/python3.10/threading.py", line 324 in wait
File "/usr/lib/python3.10/threading.py", line 607 in wait
File "/usr/local/lib/python3.10/dist-packages/tqdm/_monitor.py", line 60 in run
File "/usr/lib/python3.10/threading.py", line 1016 in _bootstrap_inner
File "/usr/lib/python3.10/threading.py", line 973 in _bootstrap
Thread 0x0000ffffa2158020 (most recent call first):
File "/usr/lib/python3.10/threading.py", line 1116 in _wait_for_tstate_lock
File "/usr/lib/python3.10/threading.py", line 1096 in join
File "/opt/NanoLLM/nano_llm/agents/dynamic_agent.py", line 504 in run
File "/opt/NanoLLM/nano_llm/studio.py", line 17 in <module>
File "/usr/lib/python3.10/runpy.py", line 86 in _run_code
File "/usr/lib/python3.10/runpy.py", line 196 in _run_module_as_main
We do not know the exact cause of this issue. We have struggled with network interference between different ROS2 distributions before, and in one case it caused the machine to run out of RAM. We are unsure why the failure surfaces as an LLVM out-of-memory error in this case. Our initial guess is that node discovery across distributions triggers some form of memory leak. If anyone has run into a similar issue and found the root cause, please let us know.
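If the discovery guess above is right, one way to test it is to put the Humble container on its own ROS domain so that it no longer discovers the Jazzy nodes. The sketch below is only that, a way to test the hypothesis, and it has not been verified as a fix for this crash. `ROS_DOMAIN_ID` is the standard ROS 2 environment variable; the helper name `isolate_ros_domain` and the domain ID 42 are hypothetical choices made for illustration. The variable has to be set before `rclpy.init()` is called, which in this setup means before the ROS2Connector plugin is added.

```python
import os

# The ROS 2 documentation describes domain IDs 0-101 as the range that is
# safe on all platforms, so this sketch restricts itself to that range.
MAX_PORTABLE_DOMAIN_ID = 101


def isolate_ros_domain(domain_id, env=None):
    """Set ROS_DOMAIN_ID in `env` (defaults to os.environ) and return `env`.

    Hypothetical helper for illustration; it is not part of NanoLLM or rclpy.
    """
    if env is None:
        env = os.environ
    is_valid_int = isinstance(domain_id, int) and not isinstance(domain_id, bool)
    if not is_valid_int or not 0 <= domain_id <= MAX_PORTABLE_DOMAIN_ID:
        raise ValueError(
            f"domain_id must be an integer in 0..{MAX_PORTABLE_DOMAIN_ID}, "
            f"got {domain_id!r}"
        )
    # Environment variables are strings, so store the ID in string form.
    env["ROS_DOMAIN_ID"] = str(domain_id)
    return env


# Usage (assumption: run this before anything calls rclpy.init()):
#   isolate_ros_domain(42)
#   import rclpy
#   rclpy.init()
```

Every node that should still talk to the Humble container needs the same domain ID, for example by passing `-e ROS_DOMAIN_ID=42` to `docker run`. If the crash disappears with the domains separated, that would support the discovery hypothesis without explaining the LLVM error itself.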