
Hash mismatch on resumed files #57

Open
shadesmore opened this issue Jun 1, 2015 · 10 comments

@shadesmore

I get these errors constantly on 2 different installs. I'm trying to download my data off of ACD; all of my files are 10235 MB split 7zip archives. Machine 1 is running Ubuntu 14.04 with 250 Mbps symmetrical bandwidth; Machine 2 is running the latest Debian stable with gigabit symmetrical. A file will be downloading, then the speed drops to 0.0 KBps, and it starts downloading again after 15-30 seconds. Then, once it reaches 60-90% downloaded, it fails with [ERROR] [acd_cli] - Hash mismatch between local and remote file for "File_Name".

If I set max connections to anything other than 1, everything is guaranteed to fail; setting max retries doesn't seem to affect the hash error rate. If I queue 25 files to download I'll get 3-4 without errors; then, if I delete the errored files and redownload, I can get another 3-4 files, with the rest being hash errors. So it's wasting a large amount of bandwidth and time, because once a hash mismatch happens it stops downloading that file.

I'll see if I can get some verbose logs of the errors. Anybody have any ideas why I'm getting constant errors?
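
For reference, the download command I'm running looks roughly like this (the remote and local paths are just examples; -x is max connections, -r is max retries):

# single connection, a few retries; with -x above 1 every file fails for me
acd_cli download -x 1 -r 4 /test/BD-000-097.7z.001 /data/USER/test1/test/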

@yadayada
Owner

yadayada commented Jun 1, 2015

Please append '.__incomplete' to a failed file, retry and see if the remaining part gets downloaded and the hash is correct.
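
Something along these lines (the file name is a placeholder; downloads that acd_cli interrupts are left with the '.__incomplete' suffix, and a retry resumes them):

# mark the failed file as incomplete so the next attempt resumes it
mv BD-000-097.7z.003 BD-000-097.7z.003.__incomplete
# retry; the remaining part should be appended to the renamed file
acd_cli download /test/BD-000-097.7z.003 .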

@shadesmore
Author

I tried that on 5 files; it completed them, but still claimed hash errors for all 5. I checked file sizes and they are complete/correct, so I'll try extracting a file from inside to make sure they are intact.

@yadayada
Owner

yadayada commented Jun 2, 2015

It would be nice if you could

  1. md5sum the completed files and ascertain whether the hashing is correct and there really is a download error. If so, acd_cli find-md5 should not find the independently computed hash.
  2. Do a binary diff on the files and see where the errors occur, provided that you still have or can recreate the original files (a sketch follows below).
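
For step 2, something like cmp will show where the bytes start to differ (directory names are placeholders):

# report the first mismatching byte offset, if any
cmp originals/BD-000-097.7z.001 downloaded/BD-000-097.7z.001
# list the offsets (1-based) and values of the first few mismatching bytes
cmp -l originals/BD-000-097.7z.001 downloaded/BD-000-097.7z.001 | head -n 5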

@shadesmore
Author

  1. I used md5sum on 10 different files, a few of which were downloaded without error; all were correctly matched via find-md5, even the resumed ones from above.
USER@HO$T:~$ md5sum /data/USER/test1/test/BD-000-097.7z.001
b36de352cb66b7641f736adb847ffc11  /data/USER/test1/test/BD-000-097.7z.001
USER@HO$T:~$ sudo acd_cli find-md5 b36de352cb66b7641f736adb847ffc11
[OK9rN6nzRBCYj4ZC1mXyRw] [A] /test/BD-000-097.7z.001
USER@HO$T:~$ md5sum /data/USER/test1/test/BD-000-097.7z.005
4fd6810060d41c153e719b4236ac4ba9  /data/USER/test1/test/BD-000-097.7z.005
USER@HO$T:~$ sudo acd_cli find-md5 4fd6810060d41c153e719b4236ac4ba9
[4ls8JVeZRAyfxVEji_9Cfw] [A] /test/BD-000-097.7z.005
  2. I do not have the original files, but I did extract one archive set completely and had no errors/issues.

edit: and now I'm getting Code: 1000, msg: ('Connection aborted.', ResponseNotReady('Request-sent',)). I wonder if something is down.

@yadayada
Owner

yadayada commented Jun 4, 2015

I will add a check that suppresses the hash error messages when a file download is incomplete for some reason. Why a hashing error occurs for resumed files, I don't know.

Regarding the connection error, Amazon has disabled downloads of large files again.

yadayada added a commit that referenced this issue Jun 6, 2015
This greatly improves FUSE read speeds by streaming chunks on read operations
compared to the experimental FUSE release. Reduces the read timeout to 5 seconds.
Moving and renaming should now work.

misc:
- raise RequestError on incomplete download (#57)
- moving nodes now done using add-child and remove-child (fixes #62)
- cache.query: new resolve method that returns a (node, parent) tuple
yadayada added a commit that referenced this issue Jun 14, 2015
This greatly improves FUSE read speeds by streaming chunks on read operations
compared to the experimental FUSE release. Reduces the read timeout to 5 seconds.
Moving and renaming should now work.

misc:
- raise RequestError on incomplete download (#57)
- moving nodes now done using add-child and remove-child (fixes #62)
- cache.query: new resolve method that returns a (node, parent) tuple
@Sunako

Sunako commented Jun 30, 2015

I also have this issue, with more or less the same pattern of how it happens. However, I may have more info, since I have the originals of the files that went into my archive files.

Anyway, I have a 4 GiB archive file; I download it via acd_cli and it has a wrong hash when completed. The download usually stalls at 0 B/s for about 30 seconds at some random point, like above, and then resumes itself, saying it dropped the connection. After a while it says it failed, so I start the download again to resume the file to completion, but it still reports a failure. Its size in bytes matches the source file, yet the hash is wrong compared to my own hash and to the one Amazon's metadata reports (which is the same as mine). I then extracted the files inside the archive, which I also have the original hashes for, ran an md5sum check on them, and got this:
md5sum: WARNING: 468 computed checksums did NOT match

More than half of the files in the archive do not match my originals; the archive is the same size in bytes as my original source file, but its hash is a mismatch. As a further check, I downloaded the same archive file from Amazon's Cloud Drive website, and that way I do get the correct file and hash.

I can only assume there may be some bug in the API or acd_cli, but I don't really know enough to say. Hopefully some of this is helpful. If you want, I can test things for you, since I've already got everything set up for that.
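
That WARNING line is the summary md5sum prints in checklist mode; the check was essentially this (the checksum list name is a placeholder):

# verify the extracted files against the checksums recorded for the originals
md5sum -c original_checksums.md5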

@yadayada yadayada changed the title from "Constant Hash mismatch" to "Hash mismatch on resumed files" Jul 9, 2015
@yadayada yadayada added the bug label Jul 11, 2015
@yadayada yadayada added the API label Sep 20, 2015
@Rufflewind

I'm also running into this problem.

I have a file that's about 8.38 GiB (8998338695 B) in size. I made 5 attempts to download this file:

  1. Success (no special flags).
  2. Bytes 4204527168 to 8479162056 are corrupted (no special flags).
  3. Bytes 5772410432 to 8479162056 are corrupted (no special flags).
  4. Bytes 7937719872 to 8467119236 are corrupted (with -r 2 -x 8).
  5. Success (with -r 4 -x 8).

Some of these attempts failed midway (or I manually interrupted them) so they had to be resumed, though I don't quite remember which ones did. I know for certain that attempt 5 did not fail at all and that attempt 4 did fail midway. So I suspect the corruption is a result of resuming a download. FYI, I am downloading onto an SSD.

(Here, "corruption" means the majority of the bytes in that range do not match the original file at all.)

@yadayada
Owner

(Here, "corruption" means the majority of the bytes in that range do not match the original file at all.)

Since you were able to identify the offending byte ranges, could you provide some further information on the corruption?

yadayada added a commit that referenced this issue Jun 11, 2016
Removes downloaded files that do not match the originals' MD5 sums and
adds HASH_MISMATCH to the list of return values that trigger a download retry.

Also adds an optional acd_cli.ini config file. Having the line
"keep_corrupt=True" in the [download] section will safely rename
the corrupt file for later inspection.

Concerns #57 and #336.
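
The option described in that commit amounts to an acd_cli.ini along these lines (where acd_cli looks for the file depends on its settings directory, which isn't spelled out here):

[download]
keep_corrupt=True
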
@yadayada
Owner

While trying to reproduce #336, I was only able to reproduce this issue.

It turns out that the faulty byte ranges already appear in the incompletely downloaded files. In one file, a chunk of approximately 500 MB is missing at a 1500 MB offset.

The resuming itself seems to work fine.
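
If the gap is left as a zero-filled hole (an assumption, not something I verified), a quick check on the partial file would be (file name is a placeholder):

# count non-zero bytes in the ~500 MB region at the 1500 MB offset; 0 means the chunk was never written
dd if=partial.7z.001.__incomplete bs=1M skip=1500 count=500 2>/dev/null | tr -d '\0' | wc -c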

@yadayada
Owner

I had an inkling about this. Sorry I took so long. Please test whether the latest commit fixes the issues.
