e2e: add back agent-flood #534

Closed
dougsland opened this issue Sep 3, 2024 · 5 comments · Fixed by #580

@dougsland
Collaborator

agent-flood is not working yet. I disabled it while we fixed the other CI/CD issues; we need to investigate what's going on.
Reference: 2dd3e4e

My first suggestion: when a failure happens, we need more debug information. For example, bluechi-tester-1 is not active, but why? Use podman to get inside the container and print some debug output, such as the bluechi configuration, not only "bluechi-tester-1 is not active". (Of course, always be careful about sensitive data, even in a GitHub Actions environment that is test/debug only.)

[ INFO  ] setup_test_containers_in_qm(): prepare quadlet files for bluechi-tester-1.container
[ INFO  ] setup_test_containers_in_qm(): updating AllowedNodeNames in /etc/bluechi/controller.conf
[ INFO  ] setup_test_containers_in_qm(): bluechi-controller reload & restart
[ INFO  ] setup_test_containers_in_qm(): prepare quadlet files for bluechi-tester-2.container
[ INFO  ] setup_test_containers_in_qm(): updating AllowedNodeNames in /etc/bluechi/controller.conf
[ INFO  ] setup_test_containers_in_qm(): bluechi-controller reload & restart
[ INFO  ] run_test_containers(): bluechi-tester-1 reload & restart
[ INFO  ] run_test_containers(): bluechi-tester-2 reload & restart
[ INFO  ] test() bluechi-tester-1 is not active

< ADD HERE the BLUECHI CONFIGURATION for easy debug when the node is NOT active >
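
A minimal sketch of such a debug dump, assuming the controller configuration lives at /etc/bluechi/controller.conf (the path shown in the log above) and that the helper runs somewhere `systemctl` and `journalctl` can see the unit. The function name `dump_node_debug` and the `BLUECHI_CONF` variable are hypothetical and not part of the existing test suite.

```shell
# Hypothetical helper: print debug context when a node is not active.
# BLUECHI_CONF defaults to the controller config path seen in the test log.
BLUECHI_CONF="${BLUECHI_CONF:-/etc/bluechi/controller.conf}"

dump_node_debug() {
    local node="$1"

    echo "==== debug for ${node} ===="

    echo "---- systemctl status ${node} ----"
    # --no-pager keeps the output non-interactive in CI.
    # "|| true" ensures a failing command never aborts the dump.
    systemctl status --no-pager "${node}" 2>&1 || true

    echo "---- last journal lines for ${node} ----"
    journalctl --no-pager -n 50 -u "${node}" 2>&1 || true

    echo "---- ${BLUECHI_CONF} ----"
    if [ -r "${BLUECHI_CONF}" ]; then
        cat "${BLUECHI_CONF}"
    else
        echo "(not readable: ${BLUECHI_CONF})"
    fi
}
```

The test could call `dump_node_debug bluechi-tester-1` right before reporting "is not active". The configuration should be reviewed for sensitive values before it is printed into public CI logs.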

dougsland added the jira label Sep 3, 2024
pengshanyu self-assigned this Sep 12, 2024
@pengshanyu
Collaborator

pengshanyu commented Sep 19, 2024

I tried to start the bluechi agent bluechi-tester-1 manually, and ran into an issue:


bash-5.1# systemctl status bluechi-tester-1
× bluechi-tester-1.service - bluechi-tester-1
     Loaded: loaded (/etc/containers/systemd/bluechi-tester-1.container; generated)
     Active: failed (Result: exit-code) since Thu 2024-09-19 02:55:36 UTC; 3s ago
   Duration: 200ms
    Process: 365 ExecStart=/usr/bin/podman run --name=systemd-bluechi-tester-1 --cidfile=/run/bluechi-tester-1.cid --replace --rm --cgroups=split --network=host --sdnotify=conmon -d dir:/var/lib/containers/registry/tools-ffi:latest /root/tests/FFI/bin/bluechi-tester --url=tcp:host=10.0.2.15,port=842 --nodename=bluechi-tester-1 --numbersignals=11111111 --signal=JobDone (code=exited, status=1/FAILURE)
    Process: 384 ExecStopPost=/usr/bin/podman rm -v -f -i --cidfile=/run/bluechi-tester-1.cid (code=exited, status=0/SUCCESS)
   Main PID: 365 (code=exited, status=1/FAILURE)
        CPU: 639ms

Sep 19 02:55:35 cfd08667ece1 systemd[1]: Started bluechi-tester-1.
Sep 19 02:55:35 cfd08667ece1 bluechi-tester-1[349]: 2ba8e622d531c5889df5320e2470e27910667c363890e5aab8fedda1a24138d8
Sep 19 02:55:35 cfd08667ece1 systemd-bluechi-tester-1[365]: Traceback (most recent call last):
Sep 19 02:55:35 cfd08667ece1 systemd-bluechi-tester-1[365]:   File "/root/tests/FFI/bin/bluechi-tester", line 10, in <module>
Sep 19 02:55:35 cfd08667ece1 systemd-bluechi-tester-1[365]:     from dasbus.connection import AddressedMessageBus, InterfaceProxy, ObjectProxy
Sep 19 02:55:35 cfd08667ece1 systemd-bluechi-tester-1[365]: ModuleNotFoundError: No module named 'dasbus'
Sep 19 02:55:36 cfd08667ece1 podman[369]: 2024-09-19 02:55:36.090419367 +0000 UTC m=+0.071204242 container died 2ba8e622d531c5889df5320e2470e27910667c363890e5aab8fedda1a24138d8 (image=fef24121ddbde6d1beb55bb67a676e8d13ea76fc1e2d6faa890148ed4e73266d, name=systemd-bluechi-tester-1, PODMAN_SYSTEMD_UNIT=bluechi-tester-1.service, io.buildah.version=1.34.0, org.label-schema.build-date=20240917, org.label-schema.license=GPLv2, org.label-schema.name=CentOS Stream 9 Base Image, org.label-schema.schema-version=1.0, org.label-schema.vendor=CentOS)
Sep 19 02:55:36 cfd08667ece1 podman[369]: 2024-09-19 02:55:36.116336369 +0000 UTC m=+0.097121503 container remove 2ba8e622d531c5889df5320e2470e27910667c363890e5aab8fedda1a24138d8 (image=fef24121ddbde6d1beb55bb67a676e8d13ea76fc1e2d6faa890148ed4e73266d, name=systemd-bluechi-tester-1, io.buildah.version=1.34.0, org.label-schema.build-date=20240917, org.label-schema.license=GPLv2, org.label-schema.name=CentOS Stream 9 Base Image, org.label-schema.schema-version=1.0, org.label-schema.vendor=CentOS, PODMAN_SYSTEMD_UNIT=bluechi-tester-1.service)
Sep 19 02:55:36 cfd08667ece1 systemd[1]: bluechi-tester-1.service: Main process exited, code=exited, status=1/FAILURE
Sep 19 02:55:36 cfd08667ece1 systemd[1]: bluechi-tester-1.service: Failed with result 'exit-code'.

And the quadlet file for bluechi-tester-1 is:

# cat bluechi-tester-1.container 
[Unit]
Description=bluechi-tester-1
After=local-fs.target

[Container]
Image=dir:/var/lib/containers/registry/tools-ffi:latest
Exec=/root/tests/FFI/bin/bluechi-tester --url="tcp:host=10.0.2.15,port=842"      --nodename=bluechi-tester-1      --numbersignals=11111111      --signal="JobDone"
Network=host
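
The traceback in the status output above shows the container exiting because `dasbus` cannot be imported. A pre-flight check along these lines could surface that before the unit is started. This is only a sketch: the helper name `require_python_module` is hypothetical, and it would have to run inside the tools-ffi image (for example through podman), which is not shown here.

```shell
# Hypothetical pre-flight check: fail early if a Python module is missing.
require_python_module() {
    local module="$1"
    # "python3 -c 'import X'" exits non-zero when the module cannot be imported.
    if python3 -c "import ${module}" 2>/dev/null; then
        echo "[ INFO  ] python module '${module}' is importable"
    else
        echo "[ ERROR ] python module '${module}' is missing (or python3 is not installed)" >&2
        return 1
    fi
}
```

Calling `require_python_module dasbus || exit 1` before the reload and restart step would turn the silent unit failure into an explicit error message.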

@pengshanyu
Collaborator

Hi @engelmi, could you please help take a look at eclipse-bluechi/bluechi/main/tests/tools/FFI/bluechi-tester?
Thank you.

@pengshanyu
Collaborator

Thanks @engelmi for the help.
Hi @Yarboa, bluechi-tester needs python3-dasbus, and in ffi-tools python3-dasbus was moved to the "$shortbuild" = "" section.
Can we move it back to the shortbuild=yes section?
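
To illustrate the idea only: the actual ffi-tools build script is not shown in this thread, so the variable names, the `select_packages` function, and the placeholder package below are all hypothetical. The sketch assumes one reading of the request, namely that python3-dasbus should be installed in short builds as well as full ones.

```shell
# Hypothetical package selection; the real ffi-tools build script may differ.
# Packages needed by every build, including short builds (shortbuild=yes).
BASE_PACKAGES="python3-dasbus"
# Packages installed only for full builds (shortbuild is empty). Placeholder name.
FULL_ONLY_PACKAGES="example-full-build-only-package"

select_packages() {
    local shortbuild="$1"
    if [ "${shortbuild}" = "" ]; then
        # Full build: base packages plus the full-build-only extras.
        echo "${BASE_PACKAGES} ${FULL_ONLY_PACKAGES}"
    else
        # Short build: base packages only, which still include python3-dasbus.
        echo "${BASE_PACKAGES}"
    fi
}
```

With this layout, `select_packages yes` still yields python3-dasbus, so bluechi-tester can import `dasbus` in short builds.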

@pengshanyu
Collaborator

Submitted a PR.
