The problem is that this size (together with the last modification date) is used to check whether some files are already present in the offline dir (in the case of a failed bank update attempt) => all files get redownloaded on each bank update attempt, even if some are already present locally.
Group file size is (also) obtained during the listing step, and used with the date to decide whether a file should be downloaded (the data provided depends on the protocols and server config...).
Using the content length will be exact, but it may differ from the listing info, in which case the file will be downloaded again on the next check.
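One way around that mismatch could be a tolerance-based comparison instead of an exact one. This is only a sketch (the function name and the 5% tolerance are made up, not anything in biomaj-download): since listing sizes are rounded, treat a small relative difference between the reported size and the local file size as a match.

```python
def sizes_match(listing_size, local_size, tolerance=0.05):
    """Hypothetical check: is the size reported by the remote listing
    close enough to the local file size to skip re-downloading?

    listing_size: size parsed from the listing, in bytes (may be rounded).
    local_size: exact size of the file already on disk, in bytes.
    tolerance: allowed relative difference, since listing sizes are rounded.
    """
    if listing_size <= 0:
        # Unknown/unparseable size: fall back to a date-only comparison.
        return True
    return abs(listing_size - local_size) / listing_size <= tolerance


# A file listed as "14M" (14680064 bytes when parsed as MiB) matches a
# local file of 14680000 bytes, but not one of half that size.
print(sizes_match(14680064, 14680000))  # True
print(sizes_match(14680064, 7340032))   # False
```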
Hmm. I think the listing step uses Unix file information for the files already on disk?
In any case, parsing the HTML info directly will break every time, since the size is not usually printed in bytes.
Hopefully content-length will work a bit better, but we'll see
For example here: https://ftp.ncbi.nlm.nih.gov/blast/db/
-> the file sizes are in KB/MB/GB, which means they're not very precise.
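To make the precision loss concrete, here is a small sketch (the helper name and regex are illustrative, not code from biomaj-download) that converts an Apache-style listing size like "1.2G" to bytes. A value of "1.2G" could correspond to any real size within tens of megabytes of the parsed number, so an exact byte-for-byte comparison against the local file would essentially always fail.

```python
import re

# 1024-based units, as typically used by Apache's FancyIndexing output.
UNITS = {'': 1, 'K': 1024, 'M': 1024**2, 'G': 1024**3, 'T': 1024**4}


def parse_listing_size(text):
    """Hypothetical helper: parse a human-readable size ("3.4K", "14M",
    "1.2G", or a bare byte count) into bytes. Returns -1 if unparseable."""
    m = re.fullmatch(r'([0-9.]+)\s*([KMGT]?)B?', text.strip(), re.IGNORECASE)
    if not m:
        return -1
    value, unit = m.groups()
    return int(float(value) * UNITS[unit.upper()])


print(parse_listing_size("14M"))   # 14680064
print(parse_listing_size("512"))   # 512
print(parse_listing_size("???"))   # -1
```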
Probably needs to be done around https://github.com/genouest/biomaj-download/blob/master/biomaj_download/download/curl.py#L375
It's already implemented in the direct http downloader https://github.com/genouest/biomaj-download/blob/master/biomaj_download/download/direct.py#L233
Not sure if we want it to be configurable. Maybe just do it in this case:
http.group.file.size=-1
in the properties file?
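The proposed property could be handled with a check like the one below. This is only a sketch of the idea (the function and the property lookup are hypothetical, not existing biomaj code): a value of -1 would disable the size comparison entirely, so only the modification date is used to decide whether to re-download.

```python
def size_check_disabled(properties):
    """Hypothetical: return True when the bank properties set
    http.group.file.size=-1, meaning the size comparison should be
    skipped and only the modification date compared."""
    return properties.get('http.group.file.size', '').strip() == '-1'


print(size_check_disabled({'http.group.file.size': '-1'}))  # True
print(size_check_disabled({}))                              # False
```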