From 7704399beca344062f74bf07f318c80c8ad815d4 Mon Sep 17 00:00:00 2001
From: dat-a-man <98139823+dat-a-man@users.noreply.github.com>
Date: Sat, 2 Mar 2024 04:01:56 +0000
Subject: [PATCH] Updated the docs for OAuth verification.

---
 .../verified-sources/filesystem.md            | 67 ++++++++++++++++++-
 .../verified-sources/google_analytics.md      |  5 +-
 .../verified-sources/google_sheets.md         |  5 +-
 3 files changed, 72 insertions(+), 5 deletions(-)

diff --git a/docs/website/docs/dlt-ecosystem/verified-sources/filesystem.md b/docs/website/docs/dlt-ecosystem/verified-sources/filesystem.md
index aed19838ef..8c74585028 100644
--- a/docs/website/docs/dlt-ecosystem/verified-sources/filesystem.md
+++ b/docs/website/docs/dlt-ecosystem/verified-sources/filesystem.md
@@ -23,6 +23,7 @@ Sources and resources that can be used with this verified source are:
 | read_jsonl | Resource-transformer | Reads jsonl file content and extract the data |
 | read_parquet | Resource-transformer | Reads parquet file content and extract the data with **Pyarrow** |
 
+
 ## Setup Guide
 
 ### Grab credentials
@@ -49,8 +50,11 @@ For more info, see
 [AWS official documentation.](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_credentials_access-keys.html)
 
 #### Google Cloud Storage / Google Drive credentials
+There are two ways to access GDrive, using:
+  - Service account credentials, which is the preferred option for those who have a GCP account.
+  - OAuth credentials, which is suitable for those who don't have a GCP account.
 
-To get GCS/GDrive access:
+To get GCS/GDrive access using **Service account** credentials:
 
 1. Log in to [console.cloud.google.com](http://console.cloud.google.com/).
 1. Create a [service account](https://cloud.google.com/iam/docs/service-accounts-create#creating).
@@ -63,6 +67,57 @@ To get GCS/GDrive access:
 For more info, see how to
 [create service account](https://support.google.com/a/answer/7378726?hl=en).
 
+To get GCS/GDrive access using **OAuth** credentials:
+
+You need to create a GCP account to get OAuth credentials if you don't have one. To create one,
+follow these steps:
+
+1. Initialize the verified source as explained in ["initialize-the-verified-source"](#initialize-the-verified-source) to set up OAuth credentials.
+
+1. Open a GCP project in your GCP account.
+
+1. Enable the Google Drive API in the project.
+
+1. Search for "credentials" in the search bar and go to Credentials.
+
+1. Go to Credentials -> OAuth client ID -> Select Desktop App from the Application type and give an
+   appropriate name.
+
+1. Download the credentials and fill "client_id", "client_secret" and "project_id" in
+   "secrets.toml". You can comment out the "refresh_token" field since we will grab it
+   in the next steps.
+
+1. Go back to credentials and select the OAuth consent screen on the left.
+
+1. Fill in the App name, user support email (your email), authorized domain (localhost.com), and dev
+   contact info (your email again).
+
+1. To access GDrive, add the following scope:
+
+   ```
+   "https://www.googleapis.com/auth/drive.readonly"
+   ```
+
+1. To access GCS, add the following scope:
+
+   ```
+   "https://www.googleapis.com/auth/devstorage.read_only"
+   ```
+
+1. Add your email as a test user.
+
+1. Generate `refresh_token`:
+
+   After configuring "client_id", "client_secret" and "project_id" in "secrets.toml", run the
+   following script from the root folder to generate the refresh token:
+
+   ```bash
+   python filesystem/setup_script_gcp_oauth.py
+   ```
+
+   Once you have executed the script and completed the authentication, you will receive a "refresh
+   token" that can be used to set up the ".dlt/secrets.toml".
+
 #### Azure Blob Storage credentials
 
 To obtain Azure blob storage access:
@@ -111,11 +166,17 @@ For more information, read the
    aws_access_key_id="Please set me up!"
    aws_secret_access_key="Please set me up!"
 
-   # For GCS bucket / Google Drive access:
+   # For GCS bucket / Google Drive access (Service account method):
    client_email="Please set me up!"
    private_key="Please set me up!"
    project_id="Please set me up!"
 
+   # For GCS bucket / Google Drive access (OAuth method):
+   client_id = "Please set me up!"
+   client_secret = "Please set me up!"
+   refresh_token = "Please set me up!"
+   project_id = "Please set me up!"
+
    # For Azure blob storage access:
    azure_storage_account_name="Please set me up!"
    azure_storage_account_key="Please set me up!"
@@ -164,7 +225,7 @@ For more information, read the
      ```bash
      pip install adlfs>=2023.9.0
      ```
-   - GCS storage: No separate module needed.
+   - GCS, GDrive storage: No separate module needed.
 
 1. You're now ready to run the pipeline! To get started, run the following command:
 
diff --git a/docs/website/docs/dlt-ecosystem/verified-sources/google_analytics.md b/docs/website/docs/dlt-ecosystem/verified-sources/google_analytics.md
index 02d7803a9b..e51a5fc24a 100644
--- a/docs/website/docs/dlt-ecosystem/verified-sources/google_analytics.md
+++ b/docs/website/docs/dlt-ecosystem/verified-sources/google_analytics.md
@@ -63,6 +63,8 @@ You need to create a GCP service account to get API credentials if you don't hav
 You need to create a GCP account to get OAuth credentials if you don't have one. To create one,
 follow these steps:
 
+1. Initialize the verified source as explained in ["initialize-the-verified-source"](#initialize-the-verified-source) to set up OAuth credentials.
+
 1. Ensure your email used for the GCP account has access to the GA4 property.
 
 1. Open a GCP project in your GCP account.
@@ -75,7 +77,8 @@ follow these steps:
    appropriate name.
 
 1. Download the credentials and fill "client_id", "client_secret" and "project_id" in
-   "secrets.toml".
+   "secrets.toml". You can comment out the "refresh_token" field since we will grab it
+   in the next steps.
 
 1. Go back to credentials and select the OAuth consent screen on the left.
 
diff --git a/docs/website/docs/dlt-ecosystem/verified-sources/google_sheets.md b/docs/website/docs/dlt-ecosystem/verified-sources/google_sheets.md
index 2a5d4b03ab..7a4f1d09ec 100644
--- a/docs/website/docs/dlt-ecosystem/verified-sources/google_sheets.md
+++ b/docs/website/docs/dlt-ecosystem/verified-sources/google_sheets.md
@@ -68,6 +68,8 @@ You need to create a GCP service account to get API credentials if you don't hav
 You need to create a GCP account to get OAuth credentials if you don't have one. To create one,
 follow these steps:
 
+1. Initialize the verified source as explained in ["initialize-the-verified-source"](#initialize-the-verified-source) to set up OAuth credentials.
+
 1. Open a GCP project in your GCP account.
 
 1. Enable the Sheets API in the project.
@@ -78,7 +80,8 @@ follow these steps:
    appropriate name.
 
 1. Download the credentials and fill "client_id", "client_secret" and "project_id" in
-   "secrets.toml".
+   "secrets.toml". You can comment out the "refresh_token" field since we will grab it
+   in the next steps.
 
 1. Go back to credentials and select the OAuth consent screen on the left.
 
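
Reviewer note, not part of the patch: the OAuth steps added here leave "refresh_token" commented out or unset until the setup script has been run. A minimal sketch of how one might check which OAuth fields in "secrets.toml" still need a value is shown below. The helper name `missing_oauth_keys` and the section name `sources.filesystem.credentials` are assumptions made for illustration; only the four key names and the "Please set me up!" placeholder come from the patch.

```python
try:
    import tomllib  # standard library in Python 3.11+
except ImportError:  # older interpreters: assumes the third-party "tomli" package is installed
    import tomli as tomllib

# Keys the patch adds for the OAuth method.
OAUTH_KEYS = ("client_id", "client_secret", "refresh_token", "project_id")
PLACEHOLDER = "Please set me up!"


def missing_oauth_keys(toml_text, section="sources.filesystem.credentials"):
    """Return the OAuth keys that are absent or still hold the placeholder.

    The default section name is an assumption, not taken from the patch.
    """
    node = tomllib.loads(toml_text)
    # Walk down the dotted section path; a missing table yields an empty dict.
    for part in section.split("."):
        node = node.get(part, {}) if isinstance(node, dict) else {}
    if not isinstance(node, dict):
        node = {}
    return [key for key in OAUTH_KEYS if node.get(key, PLACEHOLDER) == PLACEHOLDER]


if __name__ == "__main__":
    sample = """
[sources.filesystem.credentials]
client_id = "my-client-id"
client_secret = "my-client-secret"
# refresh_token = "Please set me up!"
project_id = "my-project"
"""
    print(missing_oauth_keys(sample))
```

With the sample above, only "refresh_token" is reported, which matches the state the docs describe just before running `python filesystem/setup_script_gcp_oauth.py`.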