Fix for HTX hung issue while setting Host and Peer setup. #2875
This change fixes an HTX hang caused by killing the HTXD daemon during setup; because of this, the HTX setup was exiting with a "cannot connect to peer" error.
Signed-off-by: Shaik Abdulla <[email protected]>
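The failure described here is a client command reaching for a daemon that is no longer listening: the log in this PR shows `htxcmdline` getting "Connection refused" on port 3492 right after `pkill -f htxd`. A minimal sketch of a pre-flight check follows, assuming the daemon accepts plain TCP connections on localhost at that port. `port_is_open` is a hypothetical helper, not part of this patch or of HTX.

```python
import socket


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) succeeds.

    Hypothetical helper for illustration; the host and port to probe
    (for example localhost:3492, taken from the error log) are assumptions.
    """
    try:
        # create_connection raises OSError (e.g. ConnectionRefusedError)
        # when nothing is listening or the attempt times out.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

A test could call `port_is_open("127.0.0.1", 3492)` before issuing `htxcmdline -run -mdt net.mdt` and fail early with a clearer message if the daemon is not up.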
80c00e5 to 039f994
With the latest openssh-9.8p1-3.el10.ppc64le and updated glibc, gcc and make packages:
[root@ltczep10-lp1 net]# avocado run htx_nic_devices.py -m htx_nic_devices.py.data/htx_nic_devices.yaml --max-parallel-tasks=1
With the older openssh-8.7p1-43.el9.ppc64le and glibc, gcc and make packages:
[root@ltcden7-lp1-new net]# avocado run htx_nic_devices.py -m htx_nic_devices.py.data/htx_nic_devices.yaml --max-parallel-tasks=1
(1/3) htx_nic_devices.py:HtxNicTest.test_start;run-5181: PASS (125.15 s)
@FarooqAbdulla02 please make sure it works on SLES. @shirishaganta does it look good to you?
Yes, LGTM.
if hxe_pid:
    self.log.info("HXE is already running with PID: %s. Killing it.", hxe_pid)
    process.run("hcl -shutdown", ignore_status=True)
    time.sleep(20)
process.run takes a timeout value. Can you move the sleep into the timeout, or is the sleep needed?
Also, please check how this works for a cfg run. One quick run would be good.
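On the sleep-versus-timeout question above: as far as I understand it, the `timeout` argument of `process.run` only bounds how long the command itself may run, so it would not cover the case where `hcl -shutdown` returns quickly while HXE is still winding down. One alternative to a fixed `time.sleep(20)` is to poll for the condition with an upper bound. A minimal stdlib-only sketch is below; `wait_until` and its parameters are hypothetical names, not code from this patch.

```python
import time


def wait_until(predicate, timeout=20.0, step=0.5):
    """Poll predicate() until it is truthy or `timeout` seconds elapse.

    Returns True as soon as predicate() succeeds, False on timeout.
    Hypothetical helper for illustration only.
    """
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(step)
```

With a predicate that reports whether the HXE process is gone, the test would wait only as long as needed, up to the same 20 seconds. Avocado also ships a similar helper, `avocado.utils.wait.wait_for`, which may be the more natural fit inside a test.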
LGTM, merging this as it blocks CR.
This change fixes an HTX hang caused by killing the HTXD daemon during setup; because of this, the HTX setup was exiting with a "cannot connect to peer" error.
Actual error on HTX Host and Peer setup:
++++++++++++++++++++++++++++++++++++
[stdlog] 2024-08-26 17:11:50,863 avocado.utils.process process L0658 INFO | Running 'pingum'
[stdlog] 2024-08-26 17:11:50,885 avocado.utils.process process L0472 DEBUG| [stdout] Class B n/w configured, thisid=194.229, mylastnib=229
[stdlog] 2024-08-26 17:11:50,885 avocado.utils.process process L0472 DEBUG| [stdout] Ping Com 9.40.194.229---->
[stdlog] 2024-08-26 17:11:50,886 avocado.utils.process process L0472 DEBUG| [stdout] OK
[stdlog] 2024-08-26 17:11:50,886 avocado.utils.process process L0472 DEBUG| [stdout] Ping Com 9.40.194.245---->
[stdlog] 2024-08-26 17:11:50,888 avocado.utils.process process L0472 DEBUG| [stdout] OK
[stdlog] 2024-08-26 17:11:50,888 avocado.utils.process process L0472 DEBUG| [stdout] Ping Test 101net194.245---->
[stdlog] 2024-08-26 17:11:50,890 avocado.utils.process process L0472 DEBUG| [stdout] OK
[stdlog] 2024-08-26 17:11:50,890 avocado.utils.process process L0472 DEBUG| [stdout] All networks ping Ok
[stdlog] 2024-08-26 17:11:50,891 avocado.utils.process process L0715 INFO | Command 'pingum' finished with 0 after 0.026501673s
[stdlog] 2024-08-26 17:11:50,891 avocado.test htx_nic_devices L0288 INFO | Running the HTX for net.mdt on Host
[stdlog] 2024-08-26 17:11:50,905 avocado.test htx_nic_devices L0292 INFO | HTXD is already running with PID: 3526. Killing it.
[stdlog] 2024-08-26 17:11:50,906 avocado.utils.process process L0658 INFO | Running 'pkill -f htxd'
[stdlog] 2024-08-26 17:11:50,928 avocado.utils.process process L0715 INFO | Command 'pkill -f htxd' finished with 0 after 0.008529288s
[stdlog] 2024-08-26 17:12:00,938 avocado.utils.process process L0658 INFO | Running 'htxcmdline -run -mdt net.mdt'
[stdlog] 2024-08-26 17:12:00,943 avocado.utils.process process L0472 DEBUG| [stderr] ERROR: while connecting hostname and port <3492>. Exiting...: Connection refused
[stdlog] 2024-08-26 17:12:00,943 avocado.utils.process process L0715 INFO | Command 'htxcmdline -run -mdt net.mdt' finished with 1 after 0.002794073s
[stdlog] 2024-08-26 17:12:00,943 avocado.test stacktrace L0041 ERROR|
[stdlog] 2024-08-26 17:12:00,943 avocado.test stacktrace L0043 ERROR| Reproduced traceback from: /usr/local/lib/python3.9/site-packages/avocado_framework-106.0-py3.9.egg/avocado/core/test.py:607
[stdlog] 2024-08-26 17:12:00,949 avocado.test stacktrace L0050 ERROR| Traceback (most recent call last):
[stdlog] 2024-08-26 17:12:00,949 avocado.test stacktrace L0050 ERROR| File "htx_nic_devices.py", line 245, in test_start
[stdlog] 2024-08-26 17:12:00,949 avocado.test stacktrace L0050 ERROR| self.run_htx()
[stdlog] 2024-08-26 17:12:00,949 avocado.test stacktrace L0050 ERROR| File "htx_nic_devices.py", line 285, in run_htx
[stdlog] 2024-08-26 17:12:00,949 avocado.test stacktrace L0050 ERROR| self.start_htx_run()
[stdlog] 2024-08-26 17:12:00,949 avocado.test stacktrace L0050 ERROR| File "htx_nic_devices.py", line 296, in start_htx_run
[stdlog] 2024-08-26 17:12:00,950 avocado.test stacktrace L0050 ERROR| process.run(cmd, shell=True, sudo=True)
[stdlog] 2024-08-26 17:12:00,950 avocado.test stacktrace L0050 ERROR| File "/usr/local/lib/python3.9/site-packages/avocado_framework-106.0-py3.9.egg/avocado/utils/process.py", line 1013, in run
[stdlog] 2024-08-26 17:12:00,950 avocado.test stacktrace L0050 ERROR| raise CmdError(cmd, sp.result)
[stdlog] 2024-08-26 17:12:00,950 avocado.test stacktrace L0050 ERROR| avocado.utils.process.CmdError: Command 'htxcmdline -run -mdt net.mdt' failed.
[stdlog] 2024-08-26 17:12:00,950 avocado.test stacktrace L0050 ERROR| stdout: b''
[stdlog] 2024-08-26 17:12:00,950 avocado.test stacktrace L0050 ERROR| stderr: b'ERROR: while connecting hostname and port <3492>. Exiting...: Connection refused\n'
[stdlog] 2024-08-26 17:12:00,950 avocado.test stacktrace L0050 ERROR| additional_info: None
[stdlog] 2024-08-26 17:12:00,950 avocado.test stacktrace L0051 ERROR|
[stdlog] 2024-08-26 17:12:00,950 avocado.test test L0611 DEBUG| Local variables:
[stdlog] 2024-08-26 17:12:00,982 avocado.test test L0614 DEBUG| -> self <class 'htx_nic_devices.HtxNicTest'>: 1-htx_nic_devices.py:HtxNicTest.test_start;run-012a
[stdlog] 2024-08-26 17:12:00,983 avocado.test test L0688 ERROR| Traceback (most recent call last):
[stdlog] 2024-08-26 17:12:00,983 avocado.test test L0688 ERROR| File "/usr/local/lib/python3.9/site-packages/avocado_framework-106.0-py3.9.egg/avocado/core/test.py", line 615, in _run_test
[stdlog] raise details
[stdlog] 2024-08-26 17:12:00,983 avocado.test test L0688 ERROR| File "/usr/local/lib/python3.9/site-packages/avocado_framework-106.0-py3.9.egg/avocado/core/test.py", line 602, in _run_test
[stdlog] testMethod()
[stdlog] 2024-08-26 17:12:00,983 avocado.test test L0688 ERROR| File "htx_nic_devices.py", line 245, in test_start
[stdlog] self.run_htx()
[stdlog] 2024-08-26 17:12:00,983 avocado.test test L0688 ERROR| File "htx_nic_devices.py", line 285, in run_htx
[stdlog] self.start_htx_run()
[stdlog] 2024-08-26 17:12:00,983 avocado.test test L0688 ERROR| File "htx_nic_devices.py", line 296, in start_htx_run
[stdlog] process.run(cmd, shell=True, sudo=True)
[stdlog] 2024-08-26 17:12:00,983 avocado.test test L0688 ERROR| File "/usr/local/lib/python3.9/site-packages/avocado_framework-106.0-py3.9.egg/avocado/utils/process.py", line 1013, in run
[stdlog] raise CmdError(cmd, sp.result)