What happened:
While running the node-cpu-hog chaos experiment, some helper pods occasionally fail to come up when I specify multiple TARGET_NODES in comma-separated format. In my case I have 4 nodes, and if I specify all 4, it brings up 2 helper pods and then fails to bring up the other 2. I see the error below inside the node-cpu-xxxx-xxx pod:
CPU hog failed, err: unable to create the helper pod, err: Post "https://10.96.0.1:443/api/v1/namespaces/default/pods": read tcp 192.168.230.167:50174->10.96.0.1:443: read: connection reset by peer
time="2023-03-09T15:43:36Z" level=info msg="Experiment Name: node-cpu-hog"
time="2023-03-09T15:43:36Z" level=info msg="[PreReq]: Getting the ENV for the node-cpu-hog experiment"
time="2023-03-09T15:43:38Z" level=info msg="[PreReq]: Updating the chaos result of node-cpu-hog experiment (SOT)"
time="2023-03-09T15:43:42Z" level=info msg="The application information is as follows" Node Label= Chaos Duration=60 Target Nodes="node-10-120-127-170,node-10-120-127-171,node-10-120-127-172,node-10-120-127-173" Node CPU Cores=1
time="2023-03-09T15:43:42Z" level=info msg="[Status]: Verify that the AUT (Application Under Test) is running (pre-chaos)"
time="2023-03-09T15:43:42Z" level=info msg="[Status]: No appLabels provided, skipping the application status checks"
time="2023-03-09T15:43:42Z" level=info msg="[Status]: Getting the status of target nodes"
time="2023-03-09T15:43:42Z" level=info msg="The Node status are as follows" Ready=true Node=node-10-120-127-170
time="2023-03-09T15:43:42Z" level=info msg="The Node status are as follows" Node=node-10-120-127-171 Ready=true
time="2023-03-09T15:43:42Z" level=info msg="The Node status are as follows" Node=node-10-120-127-172 Ready=true
time="2023-03-09T15:43:42Z" level=info msg="The Node status are as follows" Ready=true Node=node-10-120-127-173
time="2023-03-09T15:43:44Z" level=info msg="[Info]: The chaos tunables are:" Sequence=parallel Node CPU Cores=1 CPU Load=0 Node Affce Perc=0
time="2023-03-09T15:43:44Z" level=info msg="[Info]: Details of Nodes under chaos injection" No. Of Nodes=4 Node Names="[node-10-120-127-170 node-10-120-127-171 node-10-120-127-172 node-10-120-127-173]"
time="2023-03-09T15:43:44Z" level=info msg="[Info]: Details of Node under chaos injection" NodeName=node-10-120-127-170 NodeCPUcores=1
time="2023-03-09T15:43:44Z" level=info msg="[Info]: Details of Node under chaos injection" NodeName=node-10-120-127-171 NodeCPUcores=1
time="2023-03-09T15:43:44Z" level=info msg="[Info]: Details of Node under chaos injection" NodeName=node-10-120-127-172 NodeCPUcores=1
time="2023-03-09T15:43:45Z" level=error msg="[Error]: CPU hog failed, err: unable to create the helper pod, err: Post \"https://10.96.0.1:443/api/v1/namespaces/default/pods\": read tcp 192.168.230.167:50174->10.96.0.1:443: read: connection reset by peer"
kubectl get nodes
NAME                  STATUS   ROLES       AGE   VERSION
node-10-120-127-170   Ready    edge,node   8d    v1.22.17
node-10-120-127-171   Ready    edge,node   8d    v1.22.17
node-10-120-127-172   Ready    node        8d    v1.22.17
node-10-120-127-173   Ready    node        8d    v1.22.17
node-cpu-hog-engine YAML File:
cat node-cpu-hog-engine.yaml
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: nginx-chaos
  namespace: default
spec:
  # It can be active/stop
  engineState: 'active'
  #ex. values: ns1:name=percona,ns2:run=nginx
  auxiliaryAppInfo: ''
  chaosServiceAccount: node-cpu-hog-sa
  experiments:
    - name: node-cpu-hog
      spec:
        components:
          env:
            # set chaos duration (in sec) as desired
            - name: TOTAL_CHAOS_DURATION
              value: '60'
            ## ENTER THE NUMBER OF CORES OF CPU FOR CPU HOGGING
            ## OPTIONAL VALUE IN CASE OF EMPTY VALUE IT WILL TAKE NODE CPU CAPACITY
            - name: NODE_CPU_CORE
              value: '1'
            ## LOAD CPU WITH GIVEN PERCENT LOADING FOR THE CPU STRESS WORKERS.
            ## 0 IS EFFECTIVELY A SLEEP (NO LOAD) AND 100 IS FULL LOADING
            - name: CPU_LOAD
              value: '0'
            ## percentage of total nodes to target
            - name: NODES_AFFECTED_PERC
              value: ''
            # provide the comma separated target node names
            - name: TARGET_NODES
              value: 'node-10-120-127-170,node-10-120-127-171,node-10-120-127-172,node-10-120-127-173'
BUG REPORT
Because of the helper pod creation error, the experiment fails at the end.
What you expected to happen:
I expect all helper pods to come up and reach the Running state, and the experiment to succeed.
How to reproduce it (as minimally and precisely as possible):
Anything else we need to know?:
Environment: