Multiple uploads resulting in CANCELED #15841
Comments
Looks like the 30MB file fails when individually uploaded.
I know you mentioned that there aren't any relevant logs, but it seems like it could be related to #12937 -> #13009. Are you using Element to upload the file? Element will cancel the request if it doesn't see any progress for 30 seconds (see …). It seems like TCP dumps from the client <-> server would be useful to debug this further. Other related issues:
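For context on that cancellation behaviour, here is a minimal Python sketch of the pattern being described. This is not Element's code (Element is a TypeScript client and its real implementation differs); the class and the 30-second constant are only illustrative of how a 'no progress' watchdog behaves: the deadline is pushed back on every progress event, so an upload that has sent all of its bytes and is then waiting silently on the server can still be cancelled.

```python
import threading

NO_PROGRESS_TIMEOUT = 30.0  # seconds, mirroring the behaviour described above


class NoProgressWatchdog:
    """Cancel an upload if no progress event arrives within the timeout.

    Illustrative only: each progress callback restarts the timer, so a
    request that has sent 100% of its bytes but is still waiting on the
    server (no further events) gets cancelled ~30s later.
    """

    def __init__(self, cancel_upload):
        self._cancel_upload = cancel_upload  # e.g. aborts the HTTP request
        self._timer = None

    def on_progress(self, bytes_sent: int, bytes_total: int) -> None:
        if self._timer is not None:
            self._timer.cancel()
        self._timer = threading.Timer(NO_PROGRESS_TIMEOUT, self._cancel_upload)
        self._timer.daemon = True
        self._timer.start()

    def on_complete(self) -> None:
        if self._timer is not None:
            self._timer.cancel()
```

This also matches the symptom reported in the next comment: a progress bar sitting at 100% produces no further progress events, so the watchdog fires even though all bytes were sent.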
I'm no web developer, but I checked the debugger and saw that I am indeed triggering the timeout. The upload only takes a few seconds, but the bar sits at 100% for a bit before the timeout kills it. I also moved the media store to an SSD and put …
Is your reverse proxy killing the upload? I think by default they usually have small limits.
@Cyberes Does the media end up on your server after the dust settles? (If so, it feels like it's one of the issues I've already linked.) @clokep also has an interesting angle with the reverse proxy. Otherwise, it seems like a case of tracing or adding logs to see where Synapse is stuck. I've added some extra instrumentation so traces provide a lot more info when using the … Are you able to enable tracing for your Synapse instance (you will need to wait for the PR to merge or apply those changes manually)? Please share the trace JSON files if you get this going.
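For anyone unfamiliar with what such a trace captures, here is a minimal standalone Jaeger sketch in Python. It is not Synapse's instrumentation (Synapse's tracing is enabled through its homeserver configuration rather than by hand like this), and it assumes `jaeger_client` is installed and a local Jaeger agent is running; the point is simply that a span wrapped around the storage step records exactly how long the request blocks there.

```python
import time

from jaeger_client import Config

# Standalone example, not Synapse code: report spans to a local Jaeger
# agent and sample every request.
tracer = Config(
    config={"sampler": {"type": "const", "param": 1}, "logging": True},
    service_name="media-upload-debug",
).initialize_tracer()


def store_file_stub() -> None:
    # Stand-in for whatever the upload handler blocks on.
    time.sleep(2)


with tracer.start_active_span("store_file") as scope:
    scope.span.set_tag("media.size_bytes", 30 * 1024 * 1024)
    store_file_stub()  # the span's duration is what shows up in the Jaeger UI

tracer.close()
```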
There isn't anything in the nginx error logs. I have … And I'll pull that branch and enable tracing.
Add tracing instrumentation to media `/upload` code paths to investigate #15841
Here's the trace: evulid.cc media_worker UploadResource.json.txt From the client's perspective, it was canceled after 33.68s of being in-flight:
Thanks for the trace @Cyberes! It doesn't appear to be a bug in Synapse given that the file ends up on your server and the trace shows that it's working, but there is a long period where it blocks while storing the file (probably the …). What does your … ? I've also made the tracing less ambiguous since there are so many shared names that make my head hurt trying to correlate to the code: #15888. Also added an extra trace around the …
```yaml
media_store_path: "/mnt/matrix-storage/media"
media_storage_providers:
  - module: s3_storage_provider.S3StorageProviderBackend
    store_local: True
    store_remote: True
    store_synchronous: True
    config:
      bucket: matrix-synapse
      endpoint_url: 'https://xxx.r2.cloudflarestorage.com/matrix-synapse'
      access_key_id: 'xxx'
      secret_access_key: 'xxx'
```

Ah, I forgot about S3! It might be slow to send it to S3. evulid.cc media_worker UploadResource less ambiguous.json.txt
It's definitely going to be the S3 storage provider being slow 🎯 I think you could set …
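The suggested setting is cut off above, so the following is only a general asyncio sketch with hypothetical function names, not the provider's actual code. It shows why pushing the file to a remote store synchronously can hold the request open past a 30-second client watchdog, while handing the push off to a background task lets the server respond as soon as the local write finishes.

```python
import asyncio


async def write_local(path: str) -> None:
    # Local disk write: quick compared to the remote push.
    await asyncio.sleep(0.3)


async def push_to_provider(path: str) -> None:
    # Stand-in for the storage provider shipping the file to S3/R2;
    # in the slow case this is what keeps the request open.
    await asyncio.sleep(35)


async def handle_upload_synchronously(path: str) -> str:
    await write_local(path)
    await push_to_provider(path)  # response waits >30s, so the client gives up
    return "mxc://example/abc"


async def handle_upload_with_background_store(path: str) -> str:
    await write_local(path)
    # Fire-and-forget: the remote store continues after we respond.
    asyncio.create_task(push_to_provider(path))
    return "mxc://example/abc"
```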
I disabled S3 and it looks like the duration is now 312.27ms, but Element still says …
@Cyberes Please post the client-side network request alongside the trace. If the request says … From the trace alone, nothing seems wrong. The PR to update the tracing also has a few updates if it's easy for you to pull.
I didn't see the new changes; here's the new trace.
I might have forgotten to do …
evulid.cc.media_worker.UploadResource with media_storage_providers commented out.json.txt
Oddly, after restarting Synapse with the whole …
@Cyberes The sheer bulk of info isn't helpful. Please just boil it down to the simplest configuration that reproduces the problem. It looks like the … Just need one Jaeger trace and the corresponding request timing on the client (like earlier). The …
Okay, wasn't sure what you were looking for. Just stick with evulid.cc.media_worker.UploadResource more.json.txt and the corresponding vector fixed.har.txt.
Synapse trace:
Client-side:
A lot of the functions in this space have the same name, like `store_file`, and we also call them multiple times for different reasons (main media repo, other storage providers, thumbnails, etc.), so it's good to differentiate them so your head doesn't explode. Follow-up to #15850: Tracing instrumentation to media `/upload` code paths to investigate #15841
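As an illustration of the naming problem that PR describes (this is not the PR's actual change), one way to keep repeated helpers like `store_file` distinguishable is to fold the owning class into the span name. A rough sketch, assuming an opentracing-style tracer object is passed in:

```python
import functools


def trace_with_owner(tracer):
    """Decorator that names each span '<ClassName>.<method>', so store_file
    on the main media repo, on a storage provider, and on the thumbnailer
    all show up as distinct operations in the trace."""

    def decorator(func):
        @functools.wraps(func)
        async def wrapper(self, *args, **kwargs):
            span_name = f"{type(self).__name__}.{func.__name__}"
            with tracer.start_active_span(span_name):
                return await func(self, *args, **kwargs)

        return wrapper

    return decorator
```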
Thanks for the clarity @Cyberes! The trace looks good again, and it seems like we need to move to the client-side layer (Element) to see why it's canceling. Add a bunch of logging around what the …
I really appreciate you going out of your way to help, @MadLittleMods! I've sent the Element team a bug report, and I'm sure they'll appreciate the server-side debugging we've done so far: element-hq/element-web#25756
I ran debug logging on the …
I think this may be a Cloudflare issue. I disabled Cloudflare for matrix.evulid.cc and was able to upload a few dozen files without any (CANCELED) errors. I then switched CF back on and got the error again, so something in their middleware is causing the problem. I've never encountered an issue like this with CF. Do you know of any conflicts that may be occurring between Element -> Cloudflare -> Synapse?
Description
I am attempting to upload two video files (each about 30 MB) to a room at the same time. This always fails with a generic error message popup (`The file '[filename]' failed to upload.`). There is no error in the console, and the network tab of the debug window shows the fetch request as `(canceled)`. Synapse logs don't show any errors and neither does Openresty. Uploading the files one at a time works, but uploading them together always fails.

Steps to reproduce
Homeserver
matrix.evulid.cc
Synapse Version
1.86.0
Installation Method
pip (from PyPI)
Database
PostgreSQL
Workers
Multiple workers
Platform
Ubuntu Server LXC container
Configuration
Relevant log output
Anything else that would be useful to know?
No response