Describe the bug
We are simulating network failures to test our HA performance and found that once one of the iSCSI portals becomes unavailable, the CSI driver can no longer mount any volume.
In the log below, the network link to 10.0.20.10 was blocked using iptables rules that drop all packets to and from that IP.
seagate-exos-x-csi-node-server-kmm5x seagate-exos-x-csi-node DEBUG: 2024/12/10 17:11:26 iscsi.go:160: waitForPathToExistImpl (/dev/disk/by-path/ip-10.0.20.10:3260-iscsi-iqn.1988-11.com.dell:01.array.bc305b6893fb-lun-4)
seagate-exos-x-csi-node-server-kmm5x seagate-exos-x-csi-node DEBUG: 2024/12/10 17:11:26 iscsi.go:170: [0] os stat device: exist false device /dev/disk/by-path/ip-10.0.20.10:3260-iscsi-iqn.1988-11.com.dell:01.array.bc305b6893fb-lun-4
seagate-exos-x-csi-node-server-kmm5x seagate-exos-x-csi-node DEBUG: 2024/12/10 17:11:26 iscsi.go:175: Device not found for: /dev/disk/by-path/ip-10.0.20.10:3260-iscsi-iqn.1988-11.com.dell:01.array.bc305b6893fb-lun-4
seagate-exos-x-csi-node-server-kmm5x seagate-exos-x-csi-node DEBUG: 2024/12/10 17:11:27 iscsi.go:170: [1] os stat device: exist false device /dev/disk/by-path/ip-10.0.20.10:3260-iscsi-iqn.1988-11.com.dell:01.array.bc305b6893fb-lun-4
seagate-exos-x-csi-node-server-kmm5x seagate-exos-x-csi-node DEBUG: 2024/12/10 17:11:27 iscsi.go:175: Device not found for: /dev/disk/by-path/ip-10.0.20.10:3260-iscsi-iqn.1988-11.com.dell:01.array.bc305b6893fb-lun-4
seagate-exos-x-csi-node-server-kmm5x seagate-exos-x-csi-node DEBUG: 2024/12/10 17:11:28 iscsi.go:170: [2] os stat device: exist false device /dev/disk/by-path/ip-10.0.20.10:3260-iscsi-iqn.1988-11.com.dell:01.array.bc305b6893fb-lun-4
seagate-exos-x-csi-node-server-kmm5x seagate-exos-x-csi-node DEBUG: 2024/12/10 17:11:28 iscsi.go:175: Device not found for: /dev/disk/by-path/ip-10.0.20.10:3260-iscsi-iqn.1988-11.com.dell:01.array.bc305b6893fb-lun-4
seagate-exos-x-csi-node-server-kmm5x seagate-exos-x-csi-node I1210 17:11:29.194677 1 node.go:96] >>> /csi.v1.Identity/Probe
seagate-exos-x-csi-node-server-kmm5x seagate-exos-x-csi-node DEBUG: 2024/12/10 17:11:29 iscsi.go:170: [3] os stat device: exist false device /dev/disk/by-path/ip-10.0.20.10:3260-iscsi-iqn.1988-11.com.dell:01.array.bc305b6893fb-lun-4
seagate-exos-x-csi-node-server-kmm5x seagate-exos-x-csi-node DEBUG: 2024/12/10 17:11:29 iscsi.go:175: Device not found for: /dev/disk/by-path/ip-10.0.20.10:3260-iscsi-iqn.1988-11.com.dell:01.array.bc305b6893fb-lun-4
seagate-exos-x-csi-node-server-kmm5x seagate-exos-x-csi-node DEBUG: 2024/12/10 17:11:30 iscsi.go:170: [4] os stat device: exist false device /dev/disk/by-path/ip-10.0.20.10:3260-iscsi-iqn.1988-11.com.dell:01.array.bc305b6893fb-lun-4
seagate-exos-x-csi-node-server-kmm5x seagate-exos-x-csi-node DEBUG: 2024/12/10 17:11:30 iscsi.go:175: Device not found for: /dev/disk/by-path/ip-10.0.20.10:3260-iscsi-iqn.1988-11.com.dell:01.array.bc305b6893fb-lun-4
seagate-exos-x-csi-node-server-kmm5x seagate-exos-x-csi-node I1210 17:11:31.036371 1 node.go:96] >>> /csi.v1.Node/NodeGetCapabilities
seagate-exos-x-csi-node-server-kmm5x seagate-exos-x-csi-node DEBUG: 2024/12/10 17:11:31 iscsi.go:170: [5] os stat device: exist false device /dev/disk/by-path/ip-10.0.20.10:3260-iscsi-iqn.1988-11.com.dell:01.array.bc305b6893fb-lun-4
seagate-exos-x-csi-node-server-kmm5x seagate-exos-x-csi-node DEBUG: 2024/12/10 17:11:31 iscsi.go:175: Device not found for: /dev/disk/by-path/ip-10.0.20.10:3260-iscsi-iqn.1988-11.com.dell:01.array.bc305b6893fb-lun-4
seagate-exos-x-csi-node-server-kmm5x seagate-exos-x-csi-node DEBUG: 2024/12/10 17:11:32 iscsi.go:170: [6] os stat device: exist false device /dev/disk/by-path/ip-10.0.20.10:3260-iscsi-iqn.1988-11.com.dell:01.array.bc305b6893fb-lun-4
seagate-exos-x-csi-node-server-kmm5x seagate-exos-x-csi-node DEBUG: 2024/12/10 17:11:32 iscsi.go:175: Device not found for: /dev/disk/by-path/ip-10.0.20.10:3260-iscsi-iqn.1988-11.com.dell:01.array.bc305b6893fb-lun-4
seagate-exos-x-csi-node-server-kmm5x seagate-exos-x-csi-node DEBUG: 2024/12/10 17:11:33 iscsi.go:170: [7] os stat device: exist false device /dev/disk/by-path/ip-10.0.20.10:3260-iscsi-iqn.1988-11.com.dell:01.array.bc305b6893fb-lun-4
seagate-exos-x-csi-node-server-kmm5x seagate-exos-x-csi-node DEBUG: 2024/12/10 17:11:33 iscsi.go:175: Device not found for: /dev/disk/by-path/ip-10.0.20.10:3260-iscsi-iqn.1988-11.com.dell:01.array.bc305b6893fb-lun-4
seagate-exos-x-csi-node-server-kmm5x seagate-exos-x-csi-node DEBUG: 2024/12/10 17:11:34 iscsi.go:170: [8] os stat device: exist false device /dev/disk/by-path/ip-10.0.20.10:3260-iscsi-iqn.1988-11.com.dell:01.array.bc305b6893fb-lun-4
seagate-exos-x-csi-node-server-kmm5x seagate-exos-x-csi-node DEBUG: 2024/12/10 17:11:34 iscsi.go:175: Device not found for: /dev/disk/by-path/ip-10.0.20.10:3260-iscsi-iqn.1988-11.com.dell:01.array.bc305b6893fb-lun-4
seagate-exos-x-csi-node-server-kmm5x seagate-exos-x-csi-node DEBUG: 2024/12/10 17:11:35 iscsi.go:170: [9] os stat device: exist false device /dev/disk/by-path/ip-10.0.20.10:3260-iscsi-iqn.1988-11.com.dell:01.array.bc305b6893fb-lun-4
seagate-exos-x-csi-node-server-kmm5x seagate-exos-x-csi-node DEBUG: 2024/12/10 17:11:35 iscsi.go:175: Device not found for: /dev/disk/by-path/ip-10.0.20.10:3260-iscsi-iqn.1988-11.com.dell:01.array.bc305b6893fb-lun-4
seagate-exos-x-csi-node-server-kmm5x seagate-exos-x-csi-node DEBUG: 2024/12/10 17:11:36 iscsi.go:170: [10] os stat device: exist false device /dev/disk/by-path/ip-10.0.20.10:3260-iscsi-iqn.1988-11.com.dell:01.array.bc305b6893fb-lun-4
seagate-exos-x-csi-node-server-kmm5x seagate-exos-x-csi-node DEBUG: 2024/12/10 17:11:36 iscsi.go:175: Device not found for: /dev/disk/by-path/ip-10.0.20.10:3260-iscsi-iqn.1988-11.com.dell:01.array.bc305b6893fb-lun-4
seagate-exos-x-csi-node-server-kmm5x seagate-exos-x-csi-node DEBUG: 2024/12/10 17:11:37 iscsi.go:170: [11] os stat device: exist false device /dev/disk/by-path/ip-10.0.20.10:3260-iscsi-iqn.1988-11.com.dell:01.array.bc305b6893fb-lun-4
seagate-exos-x-csi-node-server-kmm5x seagate-exos-x-csi-node DEBUG: 2024/12/10 17:11:37 iscsi.go:175: Device not found for: /dev/disk/by-path/ip-10.0.20.10:3260-iscsi-iqn.1988-11.com.dell:01.array.bc305b6893fb-lun-4
seagate-exos-x-csi-node-server-kmm5x seagate-exos-x-csi-node DEBUG: 2024/12/10 17:11:38 iscsi.go:170: [12] os stat device: exist false device /dev/disk/by-path/ip-10.0.20.10:3260-iscsi-iqn.1988-11.com.dell:01.array.bc305b6893fb-lun-4
seagate-exos-x-csi-node-server-kmm5x seagate-exos-x-csi-node DEBUG: 2024/12/10 17:11:38 iscsi.go:175: Device not found for: /dev/disk/by-path/ip-10.0.20.10:3260-iscsi-iqn.1988-11.com.dell:01.array.bc305b6893fb-lun-4
seagate-exos-x-csi-node-server-kmm5x seagate-exos-x-csi-node DEBUG: 2024/12/10 17:11:39 iscsi.go:170: [13] os stat device: exist false device /dev/disk/by-path/ip-10.0.20.10:3260-iscsi-iqn.1988-11.com.dell:01.array.bc305b6893fb-lun-4
seagate-exos-x-csi-node-server-kmm5x seagate-exos-x-csi-node DEBUG: 2024/12/10 17:11:39 iscsi.go:175: Device not found for: /dev/disk/by-path/ip-10.0.20.10:3260-iscsi-iqn.1988-11.com.dell:01.array.bc305b6893fb-lun-4
seagate-exos-x-csi-node-server-kmm5x seagate-exos-x-csi-node DEBUG: 2024/12/10 17:11:40 iscsi.go:170: [14] os stat device: exist false device /dev/disk/by-path/ip-10.0.20.10:3260-iscsi-iqn.1988-11.com.dell:01.array.bc305b6893fb-lun-4
seagate-exos-x-csi-node-server-kmm5x seagate-exos-x-csi-node DEBUG: 2024/12/10 17:11:40 iscsi.go:175: Device not found for: /dev/disk/by-path/ip-10.0.20.10:3260-iscsi-iqn.1988-11.com.dell:01.array.bc305b6893fb-lun-4
seagate-exos-x-csi-node-server-kmm5x seagate-exos-x-csi-node DEBUG: 2024/12/10 17:11:41 iscsi.go:170: [15] os stat device: exist false device /dev/disk/by-path/ip-10.0.20.10:3260-iscsi-iqn.1988-11.com.dell:01.array.bc305b6893fb-lun-4
seagate-exos-x-csi-node-server-kmm5x seagate-exos-x-csi-node DEBUG: 2024/12/10 17:11:41 iscsi.go:175: Device not found for: /dev/disk/by-path/ip-10.0.20.10:3260-iscsi-iqn.1988-11.com.dell:01.array.bc305b6893fb-lun-4
seagate-exos-x-csi-node-server-kmm5x seagate-exos-x-csi-node DEBUG: 2024/12/10 17:11:42 iscsi.go:170: [16] os stat device: exist false device /dev/disk/by-path/ip-10.0.20.10:3260-iscsi-iqn.1988-11.com.dell:01.array.bc305b6893fb-lun-4
seagate-exos-x-csi-node-server-kmm5x seagate-exos-x-csi-node DEBUG: 2024/12/10 17:11:42 iscsi.go:175: Device not found for: /dev/disk/by-path/ip-10.0.20.10:3260-iscsi-iqn.1988-11.com.dell:01.array.bc305b6893fb-lun-4
seagate-exos-x-csi-node-server-kmm5x seagate-exos-x-csi-node DEBUG: 2024/12/10 17:11:43 iscsi.go:170: [17] os stat device: exist false device /dev/disk/by-path/ip-10.0.20.10:3260-iscsi-iqn.1988-11.com.dell:01.array.bc305b6893fb-lun-4
seagate-exos-x-csi-node-server-kmm5x seagate-exos-x-csi-node DEBUG: 2024/12/10 17:11:43 iscsi.go:175: Device not found for: /dev/disk/by-path/ip-10.0.20.10:3260-iscsi-iqn.1988-11.com.dell:01.array.bc305b6893fb-lun-4
seagate-exos-x-csi-node-server-kmm5x seagate-exos-x-csi-node DEBUG: 2024/12/10 17:11:44 iscsi.go:170: [18] os stat device: exist false device /dev/disk/by-path/ip-10.0.20.10:3260-iscsi-iqn.1988-11.com.dell:01.array.bc305b6893fb-lun-4
seagate-exos-x-csi-node-server-kmm5x seagate-exos-x-csi-node DEBUG: 2024/12/10 17:11:44 iscsi.go:175: Device not found for: /dev/disk/by-path/ip-10.0.20.10:3260-iscsi-iqn.1988-11.com.dell:01.array.bc305b6893fb-lun-4
seagate-exos-x-csi-node-server-kmm5x seagate-exos-x-csi-node DEBUG: 2024/12/10 17:11:45 iscsi.go:170: [19] os stat device: exist false device /dev/disk/by-path/ip-10.0.20.10:3260-iscsi-iqn.1988-11.com.dell:01.array.bc305b6893fb-lun-4
seagate-exos-x-csi-node-server-kmm5x seagate-exos-x-csi-node DEBUG: 2024/12/10 17:11:45 iscsi.go:175: Device not found for: /dev/disk/by-path/ip-10.0.20.10:3260-iscsi-iqn.1988-11.com.dell:01.array.bc305b6893fb-lun-4
seagate-exos-x-csi-node-server-kmm5x seagate-exos-x-csi-node DEBUG: 2024/12/10 17:11:45 iscsi.go:199: device does NOT exist [20*1s] (/dev/disk/by-path/ip-10.0.20.10:3260-iscsi-iqn.1988-11.com.dell:01.array.bc305b6893fb-lun-4)
seagate-exos-x-csi-node-server-kmm5x seagate-exos-x-csi-node DEBUG: 2024/12/10 17:11:45 iscsi.go:316: waitForPathToExist: exists=false err=stat /dev/disk/by-path/ip-10.0.20.10:3260-iscsi-iqn.1988-11.com.dell:01.array.bc305b6893fb-lun-4: no such file or directory
We think the problem is the error handling inside the waitForPathToExistImpl function. When a path is unavailable, the physical device is not created and this function returns a non-nil err value, which fails the entire mount procedure. We think this function shouldn't propagate existence errors.
To Reproduce
Start a pod with a volume. This step is important because it ensures the session to the portal is logged in.
Block access to one of the paths by running:
iptables -A INPUT -s 10.0.20.10 -j DROP
iptables -A OUTPUT -d 10.0.20.10 -j DROP
Restart the pod.
Expected behavior
The pod is restarted and enters the "running" state.
Screenshots
None
Storage System (please complete the following information):
Vendor: DELL
Model: ME5012
Firmware Version: ME5.1.2.1.1
Environment:
Kubernetes version: v1.30.5+rke2r1
Host OS: Ubuntu 24.04.1 LTS
Additional context
We have found that the following two issues might affect reproducibility:
We previously had a different issue in a similar scenario, where a volume mount hit a gRPC timeout if some of the portals were unavailable. To resolve this, we lowered the discovery timeout and max retries in iscsid.conf by adding -
If the session on the unavailable portal is not logged in, this issue will not reproduce, because the login fails and the driver continues to the next portal as expected. This shows up in the logs as:
seagate-exos-x-csi-node-server-4bh6w seagate-exos-x-csi-node DEBUG: 2024/12/10 19:28:02 iscsiadm.go:68: Output of iscsiadm command: {output: iscsiadm: connect to 10.0.20.10 timed out\niscsiadm: connect to 10.0.20.10 timed out\niscsiadm: connection login retries (reopen_max) 1 exceeded\niscsiadm: Could not perform SendTargets discovery: iSCSI PDU timed out\n}