Hi Team,
One of the customers using our solution runs a custom image based on Linux kernel 4.14.51 and CentOS 7.8, and is facing random traffic loss in production on netvsc interfaces (non-accelerated networking). For context, the deployment size is ~600 VM instances.
(1) ethtool stats from one affected instance:
-bash-4.2# ethtool -S eth2
NIC statistics:
tx_scattered: 0
tx_no_memory: 0
tx_no_space: 0
tx_too_big: 0
tx_busy: 0
tx_send_full: 75776 <<<<<<<<<<<<<
rx_comp_busy: 1 <<<<<<<<<<<<<<<<
vf_rx_packets: 0
vf_rx_bytes: 0
vf_tx_packets: 0
vf_tx_bytes: 0
vf_tx_dropped: 0
tx_queue_0_packets: 48323650
tx_queue_0_bytes: 9856533412
rx_queue_0_packets: 70704892
rx_queue_0_bytes: 6523868834
tx_queue_1_packets: 44242587
tx_queue_1_bytes: 9561505139
rx_queue_1_packets: 67683390
rx_queue_1_bytes: 6248204528
tx_queue_2_packets: 45780035
tx_queue_2_bytes: 10119440310
rx_queue_2_packets: 69738233
rx_queue_2_bytes: 6443619208
tx_queue_3_packets: 44413637
tx_queue_3_bytes: 9640385380
rx_queue_3_packets: 69258427
rx_queue_3_bytes: 6396199857
tx_queue_4_packets: 96161043
tx_queue_4_bytes: 43152567515
rx_queue_4_packets: 68506662
rx_queue_4_bytes: 6329763902
tx_queue_5_packets: 42685859
tx_queue_5_bytes: 9232930840
rx_queue_5_packets: 68869195
rx_queue_5_bytes: 6360734718
tx_queue_6_packets: 44105935
tx_queue_6_bytes: 9641517238
rx_queue_6_packets: 71297219
rx_queue_6_bytes: 6568436535
tx_queue_7_packets: 44680296
tx_queue_7_bytes: 9764630663
rx_queue_7_packets: 70747471
rx_queue_7_bytes: 6525418289
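A crude way to distinguish an instance that is currently stuck from one that merely accumulated these counters in the past is to sample the counter deltas over a short interval. A minimal sketch, assuming the interface is eth2 as above (the counter names come straight from the ethtool output; ethtool -i eth2 should report driver hv_netvsc on these interfaces):

# Sample tx_send_full and rx_comp_busy twice, 10 seconds apart,
# and print the deltas; sustained non-zero deltas suggest the
# instance is currently in the bad state.
IFACE=eth2
get() { ethtool -S "$IFACE" | awk -v k="$1:" '$1 == k {print $2}'; }
t1=$(get tx_send_full); r1=$(get rx_comp_busy)
sleep 10
t2=$(get tx_send_full); r2=$(get rx_comp_busy)
echo "tx_send_full delta: $((t2 - t1))"
echo "rx_comp_busy delta: $((r2 - r1))"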
(2) We rebuilt the kernel with the two patches below, since the symptom (NAPI gets disabled while the ring is temporarily busy) is similar to the issue described in #36:
- hv_netvsc: Fix napi reschedule while receive completion is busy
- hv_netvsc: fix race that may miss tx queue wakeup
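In case it helps anyone reproducing this: the presence of both backports in a rebuilt tree can be checked by commit subject (a minimal sketch, assuming the kernel source is a git checkout; the grep patterns are just the patch subjects above):

# In the kernel source tree used for the rebuild, search for the two
# backported commits by subject; each command should print one commit.
git log --oneline --grep='Fix napi reschedule while receive completion is busy'
git log --oneline --grep='fix race that may miss tx queue wakeup'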
(3) With these patches there is some improvement, in that fewer instances get into this state, but the issue still persists (~5 out of ~200 instances). On these bad instances, ethtool stats show very high 'rx_comp_busy' and 'tx_send_full' counters, as shown below. I think the very high 'rx_comp_busy' is expected after these patches, since the driver now keeps retrying the receive completions (each busy attempt bumps the counter) instead of leaving NAPI disabled:
-bash-4.2# ethtool -S eth2
NIC statistics:
tx_scattered: 0
tx_no_memory: 0
tx_no_space: 0
tx_too_big: 0
tx_busy: 0
tx_send_full: 417979 <<<<<<<<<<<<<<<<<<<
rx_comp_busy: 36978379935 <<<<<<<<<<<<<< incrementing rapidly
vf_rx_packets: 0
vf_rx_bytes: 0
vf_tx_packets: 0
vf_tx_bytes: 0
vf_tx_dropped: 0
tx_queue_0_packets: 22487545
tx_queue_0_bytes: 4594218563
rx_queue_0_packets: 33816104
rx_queue_0_bytes: 3148800004
tx_queue_1_packets: 23095847
tx_queue_1_bytes: 4629433827
rx_queue_1_packets: 34169457
rx_queue_1_bytes: 3198473995
tx_queue_2_packets: 22235899
tx_queue_2_bytes: 4554101089
rx_queue_2_packets: 35447873
rx_queue_2_bytes: 3306351633
tx_queue_3_packets: 22655564
tx_queue_3_bytes: 4658776077
rx_queue_3_packets: 34320559
rx_queue_3_bytes: 3200636386
tx_queue_4_packets: 43152346
tx_queue_4_bytes: 17461777045
rx_queue_4_packets: 34941411
rx_queue_4_bytes: 3240195702
tx_queue_5_packets: 22992696
tx_queue_5_bytes: 4613837166
rx_queue_5_packets: 32975505
rx_queue_5_bytes: 3079512739
tx_queue_6_packets: 22535083
tx_queue_6_bytes: 4672503110
rx_queue_6_packets: 33796904
rx_queue_6_bytes: 3159691807
tx_queue_7_packets: 22452840
tx_queue_7_bytes: 4584966389
rx_queue_7_packets: 33860772
rx_queue_7_bytes: 3155304090
I would request the Azure team to provide a list of patches that we can try on top of the 4.14.51 kernel, as the LIS option is not applicable to us.
Please let me know if I can provide any additional details.
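In the meantime, this is the kind of information we can pull from any affected instance on request (a short sketch; eth2 is the interface from the dumps above):

# Basic environment and driver details from an affected instance.
uname -r                 # running kernel (custom 4.14.51 build)
cat /etc/centos-release  # CentOS 7.8
ethtool -i eth2          # driver should be hv_netvsc, plus version info
ethtool -S eth2          # full per-queue stats, as pasted above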