@loicpoulain

During QRB2210 provisioning, a USB write failure has been observed early in the Firehose phase (first block write). This issue has been traced to a timeout during the USB bulk transfer of the Zero-Length Packet (ZLP).

In some conditions, the ZLP transfer may take longer than the current timeout, up to approximately 1.7 seconds.

Increasing the USB bulk transfer timeout to 3 seconds appears to resolve the issue.

The issue specifically occurs after a prior large eMMC write operation (e.g., during a previous QDL session). It may therefore be related to internal eMMC operations or timing delays that affect the USB ack.
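
For context, a bulk OUT transfer whose length is an exact multiple of the endpoint's maximum packet size is normally terminated with a ZLP, so that the device can tell the transfer has ended. The following is a minimal sketch in C of that check, together with a toy model of why a 1.7-second stall fails against a 1-second timeout but passes with 3 seconds. The function names and the delay model are illustrative assumptions, not code from qdl.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: returns true when a bulk OUT transfer of `len` bytes
 * ends on a full-sized packet and therefore needs a trailing zero-length
 * packet to mark the end of the transfer. */
bool needs_zlp(size_t len, size_t max_packet_size)
{
	return len > 0 && max_packet_size > 0 && len % max_packet_size == 0;
}

/* Hypothetical model: a transfer completes only if the device acknowledges
 * it within the host-side timeout. */
bool completes_in_time(unsigned int device_delay_ms, unsigned int timeout_ms)
{
	return device_delay_ms <= timeout_ms;
}
```

With a 512-byte maximum packet size, `needs_zlp(1024, 512)` is true while `needs_zlp(1000, 512)` is false; `completes_in_time(1700, 1000)` is false while `completes_in_time(1700, 3000)` is true.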

@andersson
Collaborator

It seems perfectly reasonable that following certain operations the eMMC might need to do some garbage collection or similar, and hence the I/O might be stalling because of this. It also seems reasonable that the USB URBs leading up to the ZLP fill some buffer, and as the ZLP arrives the buffer is turned into an I/O request that hits this stall.

That said, I believe 1 second was an arbitrary number that happened to work all these years. Increasing it to a little bit more than a doesn't seem adequate across all hardware (up until this week I thought 1 second was enough...). Further, in the recently merged patches the write timeout value affects how fast we give up on the speculatively send , which directly affects the time it takes before we start programming; so changing the value from 1 to 3 globally makes all runs of qdl take 2 seconds longer.

Further, I've never liked the asymmetry of qdl_read() and qdl_write() arguments.

So, can you please extend qdl_write() to take a timeout, keep it at 1 second in all other places and then make it e.g. 10 seconds during the program/rawmode phase?
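
A rough sketch of the shape this request suggests is shown here: the write call takes an explicit timeout, ordinary writes keep 1 second, and raw-mode writes pass 10 seconds. Every name, the signature, and the stub transport are assumptions made for illustration; this is not the merged qdl code.

```c
#include <errno.h>
#include <stddef.h>

/* Illustrative values from the discussion: 1 s by default, 10 s for raw writes. */
#define SKETCH_TIMEOUT_DEFAULT_MS 1000u
#define SKETCH_TIMEOUT_RAWMODE_MS 10000u

/* Stub standing in for the USB layer: the device needs `stall_ms`
 * before it acknowledges a write. */
struct sketch_dev {
	unsigned int stall_ms;
};

/* Hypothetical write with an explicit timeout, mirroring a read that
 * already takes one. Returns the byte count, or -ETIMEDOUT if the
 * simulated stall exceeds the timeout. */
int sketch_write(struct sketch_dev *dev, const void *buf, size_t len,
		 unsigned int timeout_ms)
{
	(void)buf; /* payload is not inspected by the stub */
	if (dev->stall_ms > timeout_ms)
		return -ETIMEDOUT;
	return (int)len;
}

/* Ordinary writes keep the short default timeout. */
int sketch_send_command(struct sketch_dev *dev, const void *buf, size_t len)
{
	return sketch_write(dev, buf, len, SKETCH_TIMEOUT_DEFAULT_MS);
}

/* Raw binary writes tolerate a much longer device stall. */
int sketch_send_raw_chunk(struct sketch_dev *dev, const void *buf, size_t len)
{
	return sketch_write(dev, buf, len, SKETCH_TIMEOUT_RAWMODE_MS);
}
```

Keeping the longer timeout local to the raw-write path means a slow acknowledgement during programming is tolerated without lengthening the timeout on any other write.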

During QRB2210 provisioning, a USB write failure has been
observed early in the Firehose phase (first block write).
This issue has been traced to a timeout during the USB
bulk transfer of the Zero-Length Packet (ZLP).

In some conditions, the ZLP transfer may take longer than
the current timeout, up to approximately 1.7 seconds.

The issue specifically occurs after a prior large eMMC write
operation (e.g., during a previous QDL session). It could then
be related to internal eMMC I/O operations or timing delays
affecting the USB ack.

To resolve this issue, we introduce a timeout parameter to the
qdl_write function, consistent with the existing qdl_read, and
we increase the timeout to 10 seconds for Firehose raw binary
write operations to avoid 'false-positive' timeouts.

Signed-off-by: Loic Poulain <[email protected]>
@loicpoulain
Author

> It seems perfectly reasonable that following certain operations the eMMC might need to do some garbage collection or similar, and hence the I/O might be stalling because of this. It also seems reasonable that the USB URBs leading up to the ZLP fill some buffer, and as the ZLP arrives the buffer is turned into an I/O request that hits this stall.

Yes, that’s very likely what’s happening here.

> So, can you please extend qdl_write() to take a timeout, keep it at 1 second in all other places and then make it e.g. 10 seconds during the program/rawmode phase?

Fair, done!

@loicpoulain changed the title from "usb: increase bulk transfer timeout to 3 seconds" to "usb: increase bulk transfer timeout for firehose raw write" on Oct 27, 2025
@andersson merged commit 8c0fd74 into linux-msm:master on Nov 3, 2025
13 checks passed