Using pydrive with user credentials for authenticated download #3

jeremyfix · 2021-04-21T17:46:56Z

Unfortunately, your code performs an anonymous download. I tried on several consecutive days and always got an exceeded-quota error, so I was unable to download the dataset.

This pull request, which uses code adapted from the FFHQ-Aging repo, downloads the dataset with user credentials instead.

The only requirement is to follow the pydrive quickstart to obtain a client_secrets.json file and place it in the same directory as download_ffhq.py. You can then request pydrive Google authentication by appending the --pydrive command line option.

For example, to download the 1024x1024 images, simply run:

python3 download_ffhq.py -i --pydrive
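
For readers unfamiliar with the library, the authenticated path can be sketched roughly as follows. This is a minimal illustration and not the code in this PR: it assumes pydrive2 is installed and that client_secrets.json sits in the current working directory, and the function name download_drive_file and its parameters are hypothetical.

```python
def download_drive_file(file_id, destination, use_command_line=False):
    """Download one Google Drive file using the user's own credentials.

    Hypothetical helper for illustration; not part of download_ffhq.py.
    """
    # Imported lazily so this module still loads when pydrive2 is missing.
    from pydrive2.auth import GoogleAuth
    from pydrive2.drive import GoogleDrive

    # By default GoogleAuth looks for client_secrets.json in the
    # current working directory (assumption: default settings are used).
    gauth = GoogleAuth()
    if use_command_line:
        # Prints a URL and asks for the verification code on the terminal.
        gauth.CommandLineAuth()
    else:
        # Opens a browser window and listens on a local port for the reply.
        gauth.LocalWebserverAuth()

    drive = GoogleDrive(gauth)
    drive_file = drive.CreateFile({"id": file_id})
    drive_file.GetContentFile(destination)  # writes the file to disk
    return destination
```

Because the request is made on behalf of a signed-in user, it is not subject to the anonymous download path that produced the quota error.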

The code makes several attempts to download each file. Without that retry logic, which is inspired by yours, I got some httplib2.error.ServerNotFoundError: Unable to find the server at www.googleapis.com exceptions. Apparently, when the download is retried a second time, the exception is not raised.
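
The retry idea can be sketched generically as below. This is an illustrative sketch rather than the PR's implementation: call_with_retries and its parameters are hypothetical names, and in practice one would probably pass the httplib2 exception class in retry_on instead of the broad default.

```python
import time


def call_with_retries(fetch, max_attempts=3, delay_seconds=1.0,
                      retry_on=(Exception,)):
    """Call fetch() until it succeeds or max_attempts is exhausted."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return fetch()
        except retry_on:
            if attempt == max_attempts:
                raise  # out of attempts: let the last error propagate
            time.sleep(delay_seconds)  # brief pause before retrying
```

A caller would wrap the per-file download, for example call_with_retries(lambda: download_one(spec)), where download_one stands for whatever function fetches a single file.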

I only tested the download of the images (the command line above), but since the other downloads also go through the download_files function, I hope it works for them as well.

jeremyfix · 2021-04-22T17:13:22Z

Note that, for some reason, after some time (on the order of hours), the script may try to reauthenticate and fail. Relaunching the script resumes the download.

I successfully downloaded the 90 GB of the 1024x1024 images this way.

mmazeika · 2022-04-12T23:41:56Z

This was very helpful for me. I was able to download the 89GB of 1024x1024 images with a restart after a few hours. As an additional step, I had to replace

# Google Drive virus checker nag.
links = [html.unescape(link) for link in data_str.split('"') if 'export=download' in link]
if len(links) == 1:
    if attempts_left:
        file_url = requests.compat.urljoin(file_url, links[0])
        continue

with

# Google Drive virus checker nag.
file_id = re.findall(r'uc\?id=(.*)&amp', data_str)
if len(file_id) == 1:
    file_id = file_id[0]
    if attempts_left:
        file_url = 'https://www.googleapis.com/drive/v3/files/{}/?key=API_KEY&alt=media'.format(file_id)
        continue

This is because the virus checker page changed, so the code for handling it doesn't work anymore. To make this work, I had to follow the instructions in the pydrive quickstart link given above (i.e., use this PR and get a client_secrets.json from the Drive API). The new virus checker workaround uses an API key that you can create in a GCP API project, similar to how you get the client_secrets.json file. You can also use the OAuth key.
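
As a standalone illustration of the workaround, the snippet below extracts the file ID from a warning page and builds the Drive API v3 media URL. It is a sketch under assumptions: build_api_download_url is a hypothetical helper, the regex is a stricter variant of the one above (it unescapes the HTML first and matches only letters, digits, underscores, and hyphens), and api_key is a key you created yourself.

```python
import html
import re
from urllib.parse import urlencode


def build_api_download_url(page_html, api_key):
    """Return a Drive API v3 download URL, or None if no unique ID is found."""
    # Unescape first so '&amp;' becomes '&' and cannot leak into the ID.
    text = html.unescape(page_html)
    file_ids = set(re.findall(r'uc\?id=([\w-]+)', text))
    if len(file_ids) != 1:
        return None  # page layout not recognised, or ambiguous
    file_id = file_ids.pop()
    query = urlencode({'key': api_key, 'alt': 'media'})
    return 'https://www.googleapis.com/drive/v3/files/{}?{}'.format(file_id, query)
```

Returning None when no single ID is found lets the caller fall back to its normal retry path instead of requesting a malformed URL.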

I had to run the download script with the --cmd_auth flag and use a "Desktop" instead of "Web application" setting in the Drive API to make it work. Here is a screenshot of my Drive API page.

jeremyfix added 2 commits April 21, 2021 19:39

using pydrive with user credentials for authenticated download

81fe2f3

using pydrive2 rather than pydrive which ends up with too many open files, see issue #147 of pydrive

4df9dd1

This was referenced Jul 27, 2022

ffhq data download NVlabs/eg3d#34

Closed

About ffhq dataset download NVlabs/eg3d#46

Open

neeek2303 approved these changes Jul 5, 2023
