Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GreatDancer backend hang on bad responses #55

Open
SoMuchHat opened this issue Feb 22, 2021 · 4 comments
Open

GreatDancer backend hang on bad responses #55

SoMuchHat opened this issue Feb 22, 2021 · 4 comments
Assignees
Labels
needs validation described problem or solution requires further validation technical support request for technical support

Comments

@SoMuchHat
Copy link

I'm encountering a hang in situations when I send bad data to a host with a GreatFET configured as a USB device. I seem to be able to reliably reproduce this issue when the host issues a standard GET_DESCRIPTOR and I generate a random response of request length. As can be seen in the output below, the code will appear to send the response and the host will typically either reset the bus or reissue the request again. Eventually after a few responses like this, greatdancer will hang forever within read_from_endpoint when attempting to check if _transfer_is_complete returned successfully until I finally interrupt it.

If I understand the spec correctly (rev2 8.5.2), it appears that on an error to a IN request, the host doesn't necessarily have to ACK or even NAK. I believe the issue is that the greatdancer backend always expects an ACK and will wait forever attempting to receive one.

This may be an issue within the greatdancer firmware? I would expect that the GET_ENDPTCOMPLETE status should have been set as long as the response was actually sent. The exact wording in the LPC4330 datasheet is:

"ENDPTCOMPLETE
ETCE: Endpoint transmit complete event for physical IN endpoints. This bit is set to 1 by hardware when a transmit event (IN/INTERRUPT) occurred."

Does a "transmit complete" event only occur once the host acks?

Below is an example case and a backtrace when I interrupt the hang. If I can get some input on the issue I can open a pull request with a proposed fix.

INFO    | greatdancer    | Connecting to host.
DEBUG   | selector_events| Using selector: EpollSelector
DEBUG   | greatdancer    | Host issued bus reset.
INFO    | device         | Host issued a bus reset; resetting our connection.
DEBUG   | greatdancer    | Host issued bus reset.
INFO    | device         | Host issued a bus reset; resetting our connection.
DEBUG   | device         | generic device received request: IN STANDARD request GET_DESCRIPTOR (0x06) to DEVICE [value=0x0100, index=0x0000, length=64]
TRACE   | greatdancer    | EP0/IN: <- b'p\x0f\x1fW\xd6\x1b\xea]\xdf%\xc4\xae\x07%\x9a\x17q\xcaAb:\xf2\x94W\xfa\x00O0`\x00\xf1C\x16\xa8\xcc\xfd\x9a&\x89\x9f#\x8aZ\xbey\xf6o\xa3x\xb7\x13s\x19\x17#4\xc0X,)\xa4JM3'
DEBUG   | device         | generic device received request: IN STANDARD request GET_DESCRIPTOR (0x06) to DEVICE [value=0x0100, index=0x0000, length=64]
TRACE   | greatdancer    | EP0/IN: <- b'%\xd7\xb3\x9d\xe9\xc1f\xdf\xca\xb6S\xf6_\xad\xcf,t\x12\xed\x04\xde\xb82\xe2\x15\x16f3igV\xc5\x94\xfe\xa4\xd78L\xde\x97\x11\x8c\x08^\xfaM\x93.\rT\x9d\x90\xfcL\t\xfe\x81\x95\x83\xc5w\xed_\xa3'
DEBUG   | device         | generic device received request: IN STANDARD request GET_DESCRIPTOR (0x06) to DEVICE [value=0x0100, index=0x0000, length=64]
TRACE   | greatdancer    | EP0/IN: <- b'~\x15*\x90;;\xd1\x88\xbc\x1d\x01\xe7 \xce\x1d\x14\x97$\xdc\x8bM\xd8\x0b62\xe8>@\x0c\xcbG\xf5P\xca\r\x9d4;\x91\x00c\xee\x95\xfa(f\xfb\x1fQ\xeb\x96\xa3K\xb5d\x00\xc2p\x18\xb2\x14$\x98F'
DEBUG   | greatdancer    | Host issued bus reset.
INFO    | device         | Host issued a bus reset; resetting our connection.
DEBUG   | device         | generic device received request: IN STANDARD request GET_DESCRIPTOR (0x06) to DEVICE [value=0x0100, index=0x0000, length=64]
TRACE   | greatdancer    | EP0/IN: <- b'R\x87\x8f\xce\xbe>m\xf5;\x04\xa0\x96\xa0\x8a\xb0k\xa2\xa8\xbc\x84\n\xcaJ\xb5\xd7\xc6\xd0\xc3n\x87\xfc\x9b\x03\xc3n\x81\x8c\xb1\x84\xbe\x94<\xb9g9\xdd\x9f2\x9ap\xaa\x1e2\xf2\x13\x8c\x85\xa0F\xad9\xde\xd4\x95'
DEBUG   | device         | generic device received request: IN STANDARD request GET_DESCRIPTOR (0x06) to DEVICE [value=0x0100, index=0x0000, length=64]
TRACE   | greatdancer    | EP0/IN: <- b'l\x8e\x1bZ\xd5\xfbxz\xfd\xe9rJ9\xf8\xcf\x00D\xc0i\xdd\xcc\x14I\x82\xa8\xc6W\x80\xe60\xeb\xaa\xda\xbd\xb91\x10\x84\x1d\x8c%\x906&\xc2\xcd\xb7\xf1}\xe3h\xb7\xc7#<\x18\xb1PM\x85\x8a\xe7\x98+'
^CINFO    | greatdancer    | Disconnecting from host.

...

  File "/usr/local/lib/python3.8/dist-packages/facedancer-2.9+dirty-py3.8.egg/facedancer/backends/greatdancer.py", line 370, in _handle_setup_events
    self._handle_setup_event_on_endpoint(i)
  File "/usr/local/lib/python3.8/dist-packages/facedancer-2.9+dirty-py3.8.egg/facedancer/backends/greatdancer.py", line 408, in _handle_setup_event_on_endpoint
    self.ack_status_stage(direction=self.DEVICE_TO_HOST)
  File "/usr/local/lib/python3.8/dist-packages/facedancer-2.9+dirty-py3.8.egg/facedancer/backends/greatdancer.py", line 161, in ack_status_stage
    self.read_from_endpoint(endpoint_number)
  File "/usr/local/lib/python3.8/dist-packages/facedancer-2.9+dirty-py3.8.egg/facedancer/backends/greatdancer.py", line 268, in read_from_endpoint
    while not self._transfer_is_complete(ep_num, self.HOST_TO_DEVICE):
  File "/usr/local/lib/python3.8/dist-packages/facedancer-2.9+dirty-py3.8.egg/facedancer/backends/greatdancer.py", line 431, in _transfer_is_complete
    status = self._fetch_transfer_status()
  File "/usr/local/lib/python3.8/dist-packages/facedancer-2.9+dirty-py3.8.egg/facedancer/backends/greatdancer.py", line 418, in _fetch_transfer_status
    return self.api.get_status(self.GET_ENDPTCOMPLETE)
  File "/home/dev/.local/lib/python3.8/site-packages/pygreat/comms.py", line 1108, in method
    return self.execute_command(verb_number, in_format, out_format, name=name, class_name=class_name,
...
KeyboardInterrupt
@straithe straithe added the technical support request for technical support label Nov 4, 2021
@straithe
Copy link
Member

Are you still experiencing this issue, @seanmharrington ?

@straithe straithe self-assigned this Nov 13, 2021
@SoMuchHat
Copy link
Author

Yes, though I haven't dug any deeper into the issue and I no longer have a GreatFET in my possession to assist outside of what I originally provided.

@straithe
Copy link
Member

Thank you for the update. I'll do my best to recreate the concerning behaviour and address this issue.

@straithe straithe added the needs validation described problem or solution requires further validation label Nov 17, 2021
@straithe straithe removed their assignment Oct 11, 2023
@antoinevg antoinevg self-assigned this Aug 15, 2024
@antoinevg
Copy link
Member

"ENDPTCOMPLETE ETCE: Endpoint transmit complete event for physical IN endpoints. This bit is set to 1 by hardware when a transmit event (IN/INTERRUPT) occurred."

Does a "transmit complete" event only occur once the host acks?

This is correct.

If you could please share your code I'll try to replicate the issue on this side.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
needs validation described problem or solution requires further validation technical support request for technical support
Projects
None yet
Development

No branches or pull requests

4 participants
@antoinevg @straithe @SoMuchHat and others