[LTD-4662] process file uploads from source bucket #1808
Conversation
# staged bucket and write for the processed bucket. This might be something to investigate
# with SRE later.
processed_aws_client, processed_bucket_name = _get_bucket_client("processed")
processed_aws_client.put_object(Bucket=processed_bucket_name, Key=s3_key, Body=staged_document["Body"].read())
Just reading the documentation for put_object: it suggests that you can set Body as a seekable file-like object, and get_object returns a StreamingBody, which I think acts like a seekable file. So you might not need to fully .read() the object; the two interfaces should just stream the file.
for bucket_details in VCAP_SERVICES["aws-s3-bucket"]:
    bucket_name = None
    if FILE_UPLOAD_PROCESSED_NAME in bucket_details["tags"]:
        bucket_name = FILE_UPLOAD_PROCESSED_NAME
    elif FILE_UPLOAD_STAGED_NAME in bucket_details["tags"]:
        bucket_name = FILE_UPLOAD_STAGED_NAME
    else:
        # Skip buckets which are not tagged with the expected names
        continue

    AWS_S3_BUCKETS[bucket_name] = {
        "AWS_ACCESS_KEY_ID": bucket_details["aws_access_key_id"],
        "AWS_SECRET_ACCESS_KEY": bucket_details["aws_secret_access_key"],
        "AWS_REGION": aws_credentials["aws_region"],
        "AWS_STORAGE_BUCKET_NAME": aws_credentials["bucket_name"],
    }
I'm guessing this means we need to do a bit of fiddling first with tags before we can deploy this anywhere?
Is there anything else we need to do upfront? Does this require us to have both buckets created as well?
I am guessing these tags are set by SRE?
Locally when we use Minio, does that allow us to set these tags?
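One way to answer the Minio question locally is to exercise the tag-matching logic against a hand-built VCAP_SERVICES structure; the tag values and credentials below are made up, and the real ones would come from the platform (SRE):

```python
# Hypothetical VCAP_SERVICES payload for local experimentation.
VCAP_SERVICES = {
    "aws-s3-bucket": [
        {"tags": ["processed"], "aws_access_key_id": "key-processed", "aws_secret_access_key": "secret-1"},
        {"tags": ["staged"], "aws_access_key_id": "key-staged", "aws_secret_access_key": "secret-2"},
        {"tags": ["unrelated"], "aws_access_key_id": "key-other", "aws_secret_access_key": "secret-3"},
    ]
}
FILE_UPLOAD_PROCESSED_NAME = "processed"
FILE_UPLOAD_STAGED_NAME = "staged"

AWS_S3_BUCKETS = {}
for bucket_details in VCAP_SERVICES["aws-s3-bucket"]:
    if FILE_UPLOAD_PROCESSED_NAME in bucket_details["tags"]:
        bucket_name = FILE_UPLOAD_PROCESSED_NAME
    elif FILE_UPLOAD_STAGED_NAME in bucket_details["tags"]:
        bucket_name = FILE_UPLOAD_STAGED_NAME
    else:
        continue  # buckets without an expected tag are skipped
    AWS_S3_BUCKETS[bucket_name] = {
        "AWS_ACCESS_KEY_ID": bucket_details["aws_access_key_id"],
        "AWS_SECRET_ACCESS_KEY": bucket_details["aws_secret_access_key"],
    }
```

After running this, only the two expected buckets should appear in AWS_S3_BUCKETS, with the unrelated entry skipped.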
@@ -11,6 +11,16 @@
logger = get_task_logger(__name__)


@shared_task(
    bind=True,
I don't think these tasks need to be bound.
class S3OperationsMoveStagedDocumentToProcessedTests(SimpleTestCase):
    @patch("api.documents.libraries.s3_operations._staged_client")
    @patch("api.documents.libraries.s3_operations._processed_client")
    def test_get_object(self, mock_processed_client, mock_staged_client):
        mock_staged_body = Mock()
        mock_staged_file = {"Body": mock_staged_body}
        mock_staged_client.get_object.return_value = mock_staged_file

        move_staged_document_to_processed("document-id", "s3-key")

        mock_staged_client.get_object.assert_called_with(Bucket="staged", Key="s3-key")
        mock_processed_client.put_object.assert_called_with(
            Bucket="processed", Key="s3-key", Body=mock_staged_body.read()
        )
        mock_staged_client.delete_object.assert_called_with(Bucket="staged", Key="s3-key")
Could we use moto here instead of using mocks and asserting?
Closing for now as we have decided to pivot away from this idea for the time being.
Aim
WIP
LTD-4662