Add drive download utility functions and test #462
base: main
Changes from 3 commits
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,44 @@ | ||
from io import BytesIO | ||
import json | ||
import os | ||
from googleapiclient.discovery import build | ||
from googleapiclient.errors import HttpError | ||
from googleapiclient.http import MediaIoBaseDownload | ||
from google.oauth2.service_account import Credentials | ||
|
||
SERVICE_ACCOUNT_FILE = os.environ['GOOGLE_SERVICE_ACCOUNT_FILE'] | ||
|
||
from logger import create_log | ||
|
||
logger = create_log(__name__) | ||
|
||
def create_drive_service(service_account_info): | ||
scopes = ['https://www.googleapis.com/auth/drive'] | ||
credentials = Credentials.from_service_account_info(service_account_info, scopes=scopes) | ||
|
||
return build('drive', 'v3', credentials=credentials) | ||
|
||
def get_drive_file(file_id: str) -> BytesIO: | ||
drive_service = create_drive_service(service_account_info=json.loads(SERVICE_ACCOUNT_FILE)) | ||
Review comment: I like the use of functions rather than a class here. Could we create the drive_service at the top level of this file so it's used more like a singleton? |
||
request = drive_service.files().get_media(fileId=file_id) | ||
file = BytesIO() | ||
|
||
try: | ||
downloader = MediaIoBaseDownload(file, request) | ||
|
||
done = False | ||
while done is False: | ||
|
||
status, done = downloader.next_chunk() | ||
|
||
except HttpError as error: | ||
logger.warning(f"HTTP error occurred when downloading Drive file {file_id}: {error}") | ||
file = None | ||
Review comment: we can return early here and just have |
||
except Exception as err: | ||
logger.warning(f"Unexpected {type(err)=} occurred when downloading drive file {file_id}: {err}") | ||
Review comment: logger.exception works well in this case to capture the error type and message |
||
|
||
return file | ||
|
||
|
||
|
||
|
||
|
||
Review comment: nit: only need one new line at the end of the file!
Reply: oops, thanks, VS Code was no longer linting on save! |
||
|
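The review comments on this file ask for an early return on the HTTP error and for `logger.exception` in the catch-all branch. A minimal sketch of one way to read those suggestions follows. It is not the PR's final code: `DriveHttpError` and `FakeDownloader` are hypothetical stand-ins for `googleapiclient.errors.HttpError` and `MediaIoBaseDownload` so the example runs without credentials, and returning `None` from the catch-all branch is an assumption.

```python
import logging
from io import BytesIO
from typing import Callable, List, Optional

logger = logging.getLogger(__name__)


class DriveHttpError(Exception):
    """Hypothetical stand-in for googleapiclient.errors.HttpError."""


class FakeDownloader:
    """Hypothetical stand-in that mimics MediaIoBaseDownload.next_chunk()."""

    def __init__(self, target: BytesIO, chunks: List[bytes]):
        self._target = target
        self._chunks = list(chunks)

    def next_chunk(self):
        # Write one chunk, then report (status, done) like the real downloader.
        self._target.write(self._chunks.pop(0))
        return None, not self._chunks


def download_to_buffer(
    file_id: str, make_downloader: Callable[[BytesIO], FakeDownloader]
) -> Optional[BytesIO]:
    file = BytesIO()
    try:
        downloader = make_downloader(file)
        done = False
        while not done:
            _status, done = downloader.next_chunk()
    except DriveHttpError as error:
        logger.warning(f"HTTP error occurred when downloading Drive file {file_id}: {error}")
        return None  # early return instead of reassigning `file`
    except Exception:
        # logger.exception records the exception type, message, and traceback.
        logger.exception(f"Unexpected error occurred when downloading Drive file {file_id}")
        return None  # assumption: a partial download should not be returned
    return file
```

With the early returns, the final `return file` is only reached after a complete download, so callers can treat `None` as the single failure signal.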
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,14 @@ | ||
import os | ||
from load_env import load_env_file | ||
|
||
load_env_file('local-compose', file_string='config/local-compose.yaml') | ||
|
||
from processes.util.google_integration import get_drive_file | ||
|
||
def test_get_drive_file(): | ||
test_id = os.environ['EXAMPLE_FILE_ID'] | ||
Review comment: I'm using a test file ID in my local-compose. Could we upload a small file to test with, assuming the UUID of a drive file is otherwise anonymous (e.g. doesn't expose anything else about our account info or structure)?
Reply: Definitely! |
||
file = get_drive_file(test_id) | ||
|
||
assert file is not None | ||
assert file.seek(0, 2) > 1 | ||
|
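The test's size check relies on `seek(0, 2)` returning the new absolute position, which for a seek to the end of the stream is the stream's length in bytes. A small sketch of that behavior, using only the standard library (`stream_size` is a hypothetical helper, not part of the PR):

```python
import os
from io import BytesIO


def stream_size(stream: BytesIO) -> int:
    """Return the stream length in bytes without disturbing its position."""
    position = stream.tell()
    size = stream.seek(0, os.SEEK_END)  # os.SEEK_END == 2, same as seek(0, 2)
    stream.seek(position)
    return size


buffer = BytesIO()
buffer.write(b"example bytes")

# After writing, the position sits at the end, so read() yields nothing
# until the caller rewinds with seek(0).
leftover = buffer.read()
size = stream_size(buffer)
buffer.seek(0)
content = buffer.read()
```

The same applies to a buffer filled by a download: callers that want the bytes need to rewind with `seek(0)` (or use `getvalue()`) before reading.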
Review comment: I think putting this in the services folder instead is more explicit, e.g. services/google_drive_service.py
Reply: yeah, I think it's probably better to refactor this into a service structure and move it there
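If the code does move to a module such as `services/google_drive_service.py`, the earlier request to build the Drive client once could be met with a lazily cached getter. The sketch below is one possible shape under that assumption: `make_service_getter` and `fake_builder` are hypothetical names, and the builder stands in for a call to `create_drive_service(...)`.

```python
from functools import lru_cache
from typing import Any, Callable, List


def make_service_getter(builder: Callable[[], Any]) -> Callable[[], Any]:
    """Wrap a client builder so the client is created once, on first use."""

    @lru_cache(maxsize=1)
    def get_service() -> Any:
        return builder()

    return get_service


# Hypothetical builder that records how often it runs. In the real module
# this would call create_drive_service(...) with the service account info.
build_calls: List[int] = []


def fake_builder() -> object:
    build_calls.append(1)
    return object()


get_drive_service = make_service_getter(fake_builder)
first = get_drive_service()
second = get_drive_service()
```

Building the client on first use rather than at import time keeps importing the module cheap and avoids requiring credentials in the environment just to import it.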