fix(api): delete image race condition #6607
Merged
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Summary
fix(api): deleting large images fails
This issue is caused by a race condition. When a large image is served to the client, it is done using a streaming
FileResponse
. This concurrently serves the image straight from disk. The file is kept open by FastAPI until the image is fully served.When a user deletes an image before the file is done serving, the delete fails because the file is still held by FastAPI.
To reproduce the issue:
Failed to delete image file
, and the image does not disappear from the UI.The workaround is to instead serve the image from memory.
Loading the image to memory is very fast, so there is only a tiny window in which we could create the race condition, but it technically could still occur, because FastAPI is asynchronous and handles requests concurrently.
Once we load the image into memory, deletions of that image will work. Then we return a normal
Response
object with the image bytes. This is essentially whatFileResponse
does - except it usesanyio.open_file
, which is async.The tradeoff is that the server thread is blocked while opening the file. I think this is a fair tradeoff.
A future enhancement could be to implement soft deletion of images (db is already set up for this), and then clean up deleted image files on startup/shutdown. We could move back to using the async
FileResponse
for best responsiveness in the server without any risk of race conditions.fix(app): windows indefinite hang while finding port
For some reason, I started getting this indefinite hang when the app checks if port 9090 is available. After some fiddling around, I found that adding a timeout resolves the issue.
I confirmed that the util still works by starting the app on 9090, then starting a second instance. The second instance correctly saw 9090 in use and moved to 9091.
Related Issues / Discussions
Closes #6598 (for real this time)
QA Instructions
See description for a repro.
Merge Plan
I will do RC1 release after this merges
Checklist