CDC ECM packet pool drop #120
Comments
ux_slave_class_cdc_ecm_bulkout_thread keeps polling for CDC-ECM Ethernet packets on the USB bulk OUT endpoint. When packets are received, they are passed to NX for handling (and are released in NX). From your description, it seems there is real Ethernet input over USB, but too many packets arrive for NX to process and release in time. You could consider increasing the pool size so that more packets can be buffered for processing.
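As a rough aid for sizing the pool, the sketch below estimates how many packets fit in a given memory area. It assumes the usual NetX rule of thumb that each packet costs its payload size plus the packet header, and it ignores alignment padding. The helper names are hypothetical, and `header_size` stands in for `sizeof(NX_PACKET)`, which depends on the build configuration.

```c
#include <stddef.h>

/* Estimate how many packets fit in a pool memory area.
 * header_size is a stand-in for sizeof(NX_PACKET) (assumption:
 * the real value depends on the NetX build options). */
static size_t packets_in_pool(size_t memory_size,
                              size_t payload_size,
                              size_t header_size)
{
    size_t per_packet = payload_size + header_size;
    return per_packet ? memory_size / per_packet : 0;
}

/* Inverse: bytes to reserve for a desired number of packets. */
static size_t pool_bytes_for(size_t packet_count,
                             size_t payload_size,
                             size_t header_size)
{
    return packet_count * (payload_size + header_size);
}
```

For example, with a 1536-byte payload and an assumed 64-byte header, a 16 KiB area holds 10 packets, so doubling the buffered packets means roughly doubling the memory handed to the pool.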
Thanks for the answer.
On the USB side, packets are received and passed to the upper layer (possibly the application), and the upper layer takes ownership of them and is responsible for freeing them. So I think the application may need to optimize its Ethernet packet handling, while we check whether something can be done on the USB side. BTW, CDC-ECM seems to be recognized only by Linux. If you can, please share how you got CDC-ECM recognized on Windows for WSL, so that it is easier for us to reproduce the issue.
Thanks!
Thanks for sharing. When I say "application", I mean your application or anything higher up the stack. Packets allocated and filled in USBX are passed to the upper layers, and ownership of the packets passes with them, so the upper layers must process and free the packets in time.
A possible improvement to Ethernet packet handling in USBX concerns ux_device_class_cdc_ecm_bulkout_thread.c: currently, if no free packet is available, host bulk OUT transfers are NAKed. Blocking the host's bulk OUT transfers in this way may cause the host to reset the device (just a guess, based on your observation of a deactivate followed by an activate; this is host-specific behavior). You could try allocating the NX packet after the USB bulk OUT transfer completes and, if no packet is free, dropping the frame by discarding the data. That way the host does not reset the device, but network packets are dropped until a free packet becomes available. Note that this change does not help with packet handling and releasing. The upper layers still need to be checked to find the real issue (why packets are not handled and released).
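The "receive first, then allocate, drop if the pool is empty" idea can be sketched in isolation as follows. This is a minimal mock, not USBX or NetX code: `mock_pool`, `mock_allocate`, `mock_release` and `deliver_or_drop` are hypothetical names, and the scratch buffer stands in for the data of a bulk OUT transfer that has already completed. In the real thread, the non-blocking allocation would presumably be an `nx_packet_allocate` call with `NX_NO_WAIT`.

```c
#include <stddef.h>
#include <string.h>

#define POOL_PACKETS 4      /* hypothetical pool depth */
#define FRAME_MAX    1514   /* largest Ethernet frame without FCS */

typedef struct {
    unsigned char payload[FRAME_MAX];
    size_t length;
    int in_use;
} mock_packet;

typedef struct {
    mock_packet packets[POOL_PACKETS];
    unsigned long dropped;  /* frames discarded because the pool was empty */
} mock_pool;

/* Non-blocking allocate: returns NULL instead of waiting. */
static mock_packet *mock_allocate(mock_pool *pool)
{
    for (size_t i = 0; i < POOL_PACKETS; i++) {
        if (!pool->packets[i].in_use) {
            pool->packets[i].in_use = 1;
            return &pool->packets[i];
        }
    }
    return NULL;
}

/* The upper layer calls this once it has finished with a packet. */
static void mock_release(mock_packet *packet)
{
    packet->in_use = 0;
}

/* Called after the transfer has already landed in 'scratch', so the
 * host is never stalled. Returns the packet handed to the upper layer,
 * or NULL if the frame was dropped. */
static mock_packet *deliver_or_drop(mock_pool *pool,
                                    const unsigned char *scratch,
                                    size_t length)
{
    mock_packet *packet;

    if (length > FRAME_MAX) {
        pool->dropped++;            /* oversized frame: discard */
        return NULL;
    }
    packet = mock_allocate(pool);
    if (packet == NULL) {
        pool->dropped++;            /* pool empty: discard, do not block */
        return NULL;
    }
    memcpy(packet->payload, scratch, length);
    packet->length = length;
    return packet;
}
```

The `dropped` counter makes the trade-off visible: the USB link stays responsive, but every increment is a lost frame that TCP has to retransmit, so a steadily growing count still points at packets not being released upstream.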
For TCP packets, some are indeed queued by the TCP control block.
What target device are you using?
LPC55S69
Which version of Azure RTOS?
6.1
What toolchain and environment?
arm-none-eabi-* + WSL
Hi,
I am developing an embedded application on an MCU that runs Azure RTOS with the NetX Duo/USBX layers to manage TCP/IP sockets and packets.
This MCU communicates with a laptop over an Ethernet-over-USB protocol (USB CDC-ECM). The laptop runs Ubuntu, opens TCP ports with the netcat command, and sends a range of commands.
Everything runs fine most of the time, but we sometimes experience a random problem where all the open TCP ports freeze.
When this happens, it is no longer possible to open sockets on these ports, or even to send commands on ports that have already been opened.
I suspected a packet pool leak in NetX/USBX, so I added a debug log trace to _nx_packet_allocate() and _nx_packet_release() to find out which thread takes a packet from which packet pool and how many packets are left.
Thanks to this, I can see that, when the problem happens, the "ux_slave_class_cdc_ecm_bulkout_thread" thread requests one packet per second without ever releasing it, as you can see in the attached screenshot. As soon as the packet pool drops to 0, my application is stuck (no TCP port available, everything seems to be frozen).
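For readers who want to reproduce this kind of trace, here is a minimal, self-contained sketch of the bookkeeping. It is not the instrumentation used above and does not call NetX: `pool_trace`, `traced_packet`, `trace_allocate`, `trace_release` and `outstanding_for` are hypothetical names, and thread names are passed in as plain strings. Because a packet is usually released by a different thread than the one that allocated it, each packet remembers its allocator so that the outstanding count is charged to the right thread.

```c
#include <stdio.h>
#include <string.h>

#define MAX_OWNERS 8   /* hypothetical limit on distinct thread names */

typedef struct {
    const char *name;      /* allocating thread */
    long outstanding;      /* allocated but not yet released */
} owner_stat;

typedef struct {
    owner_stat owners[MAX_OWNERS];
    int owner_count;
    long available;        /* packets left in the pool */
} pool_trace;

typedef struct {
    const char *allocated_by;  /* set on allocate, cleared on release */
} traced_packet;

/* Look up (or create) the statistics slot for a thread name. */
static owner_stat *find_owner(pool_trace *trace, const char *name)
{
    int i;
    for (i = 0; i < trace->owner_count; i++) {
        if (strcmp(trace->owners[i].name, name) == 0)
            return &trace->owners[i];
    }
    if (trace->owner_count == MAX_OWNERS)
        return NULL;
    trace->owners[trace->owner_count].name = name;
    trace->owners[trace->owner_count].outstanding = 0;
    return &trace->owners[trace->owner_count++];
}

static void trace_allocate(pool_trace *trace, traced_packet *packet,
                           const char *thread)
{
    owner_stat *owner = find_owner(trace, thread);
    packet->allocated_by = thread;
    trace->available--;
    if (owner != NULL)
        owner->outstanding++;
    printf("[alloc]   by %s, left=%ld\n", thread, trace->available);
}

static void trace_release(pool_trace *trace, traced_packet *packet,
                          const char *thread)
{
    const char *allocator =
        packet->allocated_by ? packet->allocated_by : "unknown";
    owner_stat *owner = find_owner(trace, allocator);
    trace->available++;
    if (owner != NULL)
        owner->outstanding--;
    printf("[release] by %s (allocated by %s), left=%ld\n",
           thread, allocator, trace->available);
    packet->allocated_by = NULL;
}

/* Packets a given thread allocated that nobody has released yet. */
static long outstanding_for(pool_trace *trace, const char *thread)
{
    owner_stat *owner = find_owner(trace, thread);
    return owner ? owner->outstanding : 0;
}
```

A per-allocator count that only ever grows, while `available` falls toward zero, matches the pattern described above and narrows the search to whichever layer received those packets and did not release them.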
After several minutes (5/10/15?), all packets are released at once, but I can no longer communicate with the MCU. It's as if the USB link had been removed. I have to reboot my MCU to make it work again.
Such behavior is not acceptable, as it will not be possible to unplug/replug in the final product.
Do you have any idea of the cause and how to solve this bug?
Thanks in advance,
Best regards,
Antoine
teraterm3.log