Hash mismatch on resumed files #57
Please append '.__incomplete' to a failed file, then retry and see whether the remaining part gets downloaded and the hash is correct.
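For illustration, a minimal sketch of that rename step (the file name is a placeholder; the resume-on-suffix behaviour is as described in the comment above):

```python
import os

# Hypothetical local file name; per the suggestion above, a local file carrying
# the '.__incomplete' suffix is treated as a partial download and resumed on
# the next download attempt.
os.rename('archive.7z.001', 'archive.7z.001.__incomplete')
```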
I tried that on 5 files; it completed them, but it still claimed hash errors for all 5. I checked the file sizes and they are complete/correct, so I'll try extracting a file from inside to make sure they are intact.
It would be nice if you could …
Edit: and now I'm getting Code: 1000, msg: ('Connection aborted.', ResponseNotReady('Request-sent',)). I wonder if something is down.
I will add a check that suppresses the hash error messages when a file download is incomplete for some reason. Why a hashing error occurs for resumed files, I don't know. Regarding the connection error, Amazon has disabled downloads of large files again.
This greatly improves FUSE read speeds by streaming chunks on read operations compared to the experimental FUSE release. Reduces the read timeout to 5 seconds. Moving and renaming should now work.
misc:
- raise RequestError on incomplete download (#57)
- moving nodes is now done using add-child and remove-child (fixes #62)
- cache.query: new resolve method that returns a (node, parent) tuple
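To illustrate the "raise RequestError on incomplete download" item, a minimal sketch of such a check, assuming the expected size comes from the node's metadata (this is not acd_cli's actual code; the exception class here is a stand-in):

```python
import os

class RequestError(Exception):
    """Stand-in for acd_cli's download error type (name/signature assumed)."""

def check_complete(local_path: str, expected_size: int) -> None:
    """Raise instead of reporting a hash mismatch when the file is short.

    A file shorter than the node's reported size means the transfer was cut
    off, so comparing MD5 sums at this point would only produce a misleading
    'hash mismatch' message.
    """
    actual = os.path.getsize(local_path)
    if actual < expected_size:
        raise RequestError('incomplete download: got %d of %d bytes'
                           % (actual, expected_size))
```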
I also have this issue, and more or less the same example of how it happens. However, I may have more info, since I have the original files inside my archive files.

I have a 4 GiB archive file; I download it via acd_cli and it has a wrong hash when completed. It usually drops to 0 B/s download speed for about 30 seconds at some random point, like above, and then resumes itself, saying it dropped the connection. After a while it says it failed, so I start the download again to resume the file to completion; it still reports a failure. I check its size in bytes and it matches the source file size, but the hash is wrong compared to my own hash and to the one that Amazon's metadata reports (which is the same as mine).

I then extracted the files inside the archive, which I also have the original hashes for, and md5sum-checked them: more than half of the files in the archive do not match my originals. The archive is the same size in bytes as my original source file, yet the hash is a mismatch. Just to check further, I downloaded the same archive file from Amazon's Cloud Drive website, and I do get the correct file and hash that way. I can only assume there may be some bug in the API or acd_cli, but I don't really know enough to say.

Hopefully some of this is helpful. If you want, I can test things for you, since I've already got everything set up for that.
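For reference, this is the kind of check being described: hash the local copy in chunks and compare it against the MD5 that Amazon's metadata reports for the node (the file name and digest below are placeholders):

```python
import hashlib

def local_md5(path, chunk_size=8 * 1024 * 1024):
    """Compute the MD5 of a large file without loading it into memory."""
    h = hashlib.md5()
    with open(path, 'rb') as f:
        for chunk in iter(lambda: f.read(chunk_size), b''):
            h.update(chunk)
    return h.hexdigest()

# 'expected' would be the md5 reported in the node's metadata; placeholder here.
expected = '0123456789abcdef0123456789abcdef'
print(local_md5('archive.7z') == expected)
```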
I'm also running into this problem. I have a file that's about 8.38 GiB (8998338695 B) in size. I made 5 attempts to download this file:
Some of these attempts failed midway (or I manually interrupted them), so they had to be resumed, though I don't quite remember which ones did. I know for certain that attempt 5 did not fail at all and attempt 4 did fail midway, so I suspect it's a result of resuming a download. FYI, I am downloading them onto an SSD. (Here, "corruption" means that the majority of the bytes in that range do not match the original file at all.)
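Since the report mentions identifying corrupted byte ranges, here is a minimal sketch of how such ranges can be located when the original file is available (file names are placeholders; this is just one way to do it):

```python
def mismatched_ranges(original, downloaded, chunk_size=1024 * 1024):
    """Yield (start, end) byte offsets of chunks that differ between two files.

    A coarse, chunk-level comparison -- enough to tell whether the corruption
    is one contiguous block or scattered across the file.
    """
    with open(original, 'rb') as a, open(downloaded, 'rb') as b:
        offset = 0
        while True:
            ca, cb = a.read(chunk_size), b.read(chunk_size)
            if not ca and not cb:
                break
            if ca != cb:
                yield offset, offset + max(len(ca), len(cb))
            offset += chunk_size

# File names are placeholders:
# for start, end in mismatched_ranges('original.bin', 'downloaded.bin'):
#     print('mismatch between bytes %d and %d' % (start, end))
```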
Since you were able to identify the offending byte ranges, could you provide some further information on the corruption?
Removes downloaded files that do not match the originals' MD5 sums and adds HASH_MISMATCH to the list of return values that trigger a download retry. Also adds an optional acd_cli.ini config file: having the line "keep_corrupt=True" in the [download] section will safely rename a corrupt file for later inspection. Concerns #57 and #336.
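A rough sketch of the behaviour that commit describes, assuming a config layout like `[download]` with `keep_corrupt=True` (the function names, the numeric value, and the '.corrupt' suffix are assumptions, not acd_cli's actual code):

```python
import configparser
import hashlib
import os

HASH_MISMATCH = 16  # hypothetical return value that the retry logic keys on

def keep_corrupt(ini_path='acd_cli.ini'):
    """Read the [download] keep_corrupt option mentioned in the commit."""
    cp = configparser.ConfigParser()
    cp.read(ini_path)
    return cp.getboolean('download', 'keep_corrupt', fallback=False)

def verify_download(local_path, remote_md5, keep):
    """Return 0 on a hash match, otherwise rename or discard the file."""
    h = hashlib.md5()
    with open(local_path, 'rb') as f:
        for chunk in iter(lambda: f.read(1 << 22), b''):
            h.update(chunk)
    if h.hexdigest() == remote_md5:
        return 0
    if keep:
        os.rename(local_path, local_path + '.corrupt')  # keep for inspection
    else:
        os.remove(local_path)
    return HASH_MISMATCH  # signals the caller to retry the download
```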
While trying to reproduce #336, I was only able to reproduce this issue. It turns out that the faulty byte ranges originally appear in the incompletely downloaded files. In one file, a chunk of approximately 500 MB is missing at a 1500 MB offset. The resuming itself seems to work fine.
I had an inkling about this. Sorry I took so long. Please check whether the latest commit fixes the issues.
I get these errors constantly on 2 different installs. I'm trying to download my data off of ACD; all of my files are 10235 MB split 7-Zip archives. Machine 1 is running Ubuntu 14.04 with 250 Mbps symmetrical bandwidth; machine 2 is running the latest Debian stable with symmetrical gigabit. A file will be downloading, then the speed drops to 0.0 KBps, and it starts downloading again after 15-30 seconds. Then, once it reaches 60-90% downloaded, it fails with [ERROR] [acd_cli] - Hash mismatch between local and remote file for "File_Name".
If I set max connections to anything other than 1, every download is guaranteed to fail; setting max retries doesn't seem to affect the hash error rate. If I queue 25 files to download, I'll get 3-4 without errors; then, if I delete the errored files and redownload, I can get another 3-4 files, with the rest being hash errors. So it's wasting a large amount of bandwidth and time, because once a hash mismatch happens, it stops downloading that file.
I'll see if I can get some verbose logs of the errors. Anybody have any ideas why I'm getting constant errors?