✨ Source S3: Add handling NoSuchBucket error #31383
Conversation
source-s3 test report (commit
Step | Result
---|---
Build source-s3 docker image for platform(s) linux/x86_64 | ❌ |
Code format checks | ✅ |
Validate metadata for source-s3 | ✅ |
Connector version semver check | ✅ |
Connector version increment check | ✅ |
QA checks | ✅ |
☁️ View runs for commit in Dagger Cloud
Please note that tests are only run on PRs that are ready for review. Please set your PR to draft mode to avoid flooding the CI engine and upstream services on subsequent commits.
You can run the same pipeline locally on this branch with the airbyte-ci tool using the following command:
airbyte-ci connectors --name=source-s3 test
@@ -64,6 +64,8 @@ def check_availability_and_parsability(
    def _check_list_files(self, stream: "AbstractFileBasedStream") -> List[RemoteFile]:
        try:
            files = stream.list_files()
        except CustomFileBasedSourceException as custom_exc:
@tolik0 thanks for this PR, I agree that we want to be able to customize the exceptions.
Looking at this code: the error that's relevant to the user is still that there was a problem listing files, so I think we want to keep the CheckAvailabilityError, but we want to make sure we can customize the message, e.g. for things like a missing bucket.
Since we're already raising CheckAvailabilityError ... from exc, we should get the missing bucket error as part of the stack trace when we raise the CheckAvailabilityError, right?
If that's obscuring the error too much, instead of raising CustomFileBasedSourceException, WDYT about giving it a "message" attribute, then here we'll raise CheckAvailabilityError(custom_exc.message, stream=stream.name) from exc?
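The two ideas in this comment can be sketched as a small, self-contained example. The class and attribute names mirror the discussion, but these minimal stand-ins are assumptions, not the real CDK types: `raise ... from` chains the original error into the traceback via `__cause__`, and a `message` attribute lets the caller customize the user-facing text.

```python
class CheckAvailabilityError(Exception):
    """Stand-in for the CDK's availability-check error."""
    def __init__(self, message: str, stream: str):
        super().__init__(f"{message} (stream={stream})")


class CustomFileBasedSourceException(Exception):
    """Carries a user-facing message, as suggested above."""
    def __init__(self, message: str):
        self.message = message
        super().__init__(message)


def check_list_files(list_files, stream_name: str):
    try:
        return list_files()
    except CustomFileBasedSourceException as custom_exc:
        # Re-raise with the custom message; "from" keeps the original
        # error (e.g. a missing bucket) chained into the traceback.
        raise CheckAvailabilityError(custom_exc.message, stream=stream_name) from custom_exc
```

Catching the resulting CheckAvailabilityError and inspecting `err.__cause__` still shows the original missing-bucket exception, so no detail is lost.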
@clnoll Thanks for the feedback!
I've removed CustomFileBasedSourceException. Do you think this is better? Should I split this PR into two: one for the file-based CDK and another for the S3 changes?
Force-pushed from 9f586f9 to db3ff06
Thanks @tolik0! I think I'd actually prefer to keep the CustomFileBasedSourceException. I left an example of what I have in mind in a comment.
I think you will have to split this into 2 PRs, since the new CDK version won't be available yet when the source-s3 connector is published.
@@ -71,7 +71,8 @@ def _check_list_files(self, stream: "AbstractFileBasedStream") -> List[RemoteFil
        try:
            files = stream.list_files()
        except Exception as exc:
            raise CheckAvailabilityError(FileBasedSourceError.ERROR_LISTING_FILES, stream=stream.name) from exc
I think what I'd do here is continue to raise the CustomException from your previous version, but catch it and re-raise with a CheckAvailabilityError, like this:

    try:
        ...
    except CustomFileBasedException as exc:
        raise CheckAvailabilityError(exc.error_message, stream=stream.name) from exc
    except Exception as exc:
        ...

This way we can avoid having to do the hasattr check. Plus, I think we'll probably want to use your CustomFileBasedException in other places too as the file-based CDK matures.
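As a runnable sketch of this pattern (again with minimal stand-in classes rather than the real CDK types): Python evaluates `except` clauses top to bottom, so the specific handler must come before the generic one, and no `hasattr` check is needed because only exceptions guaranteed to carry `error_message` match the first clause.

```python
class FileBasedSourceError:
    # Stand-in for the CDK's error enum; the string value is an assumption.
    ERROR_LISTING_FILES = "Error listing files"


class CheckAvailabilityError(Exception):
    def __init__(self, message: str, stream: str):
        super().__init__(f"{message} (stream={stream})")


class CustomFileBasedException(Exception):
    def __init__(self, error_message: str):
        self.error_message = error_message
        super().__init__(error_message)


def check_list_files(list_files, stream_name: str):
    try:
        return list_files()
    except CustomFileBasedException as exc:
        # Specific clause first: it matches only exceptions known to
        # carry error_message, so no hasattr() check is required.
        raise CheckAvailabilityError(exc.error_message, stream=stream_name) from exc
    except Exception as exc:
        # Everything else falls through to the generic listing error.
        raise CheckAvailabilityError(FileBasedSourceError.ERROR_LISTING_FILES, stream=stream_name) from exc
```

Swapping the two `except` clauses would break this: the generic `Exception` clause would match first and the custom message would never reach the user.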
I've updated the code to use CustomFileBasedException and re-raise as CheckAvailabilityError, removing the hasattr check. Thanks for the suggestion.
yield remote_file | ||
else: | ||
for remote_file in self._page(s3, globs, self.config.bucket, None, seen, logger): | ||
for current_prefix in (prefixes if prefixes else [None]): |
Why is this change needed?
I simplified the code to avoid duplication.
            ) from exc
        self.raise_error_listing_files(FileBasedSourceError.ERROR_LISTING_FILES, globs, exc)

    def raise_error_listing_files(self, error_message: Union[str, FileBasedSourceError], globs: List[str], exc: Optional[Exception] = None):
Why do we need a helper method for this?
I used the helper method to reduce repetitive error-raising code.
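For illustration, the helper from the diff above might look roughly like this. Only the signature appears in the diff, so the exception type it raises and the message format are assumptions; it is also written module-level here (without `self`) to stay self-contained.

```python
from enum import Enum
from typing import List, Optional, Union


class FileBasedSourceError(Enum):
    # Stand-in for the CDK's error enum; the value is an assumption.
    ERROR_LISTING_FILES = "Error listing files"


class ErrorListingFiles(Exception):
    """Hypothetical error type raised by the helper."""


def raise_error_listing_files(
    error_message: Union[str, FileBasedSourceError],
    globs: List[str],
    exc: Optional[Exception] = None,
) -> None:
    # Centralizes the repetitive "format message, chain cause" logic so
    # each call site is a single line.
    message = error_message.value if isinstance(error_message, FileBasedSourceError) else error_message
    raise ErrorListingFiles(f"{message} (globs={globs})") from exc
```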
source-s3 test report (commit
Step | Result
---|---
Build source-s3 docker image for platform(s) linux/x86_64 | ✅ |
Unit tests | ✅ |
Acceptance tests | ❌ |
Code format checks | ✅ |
Validate metadata for source-s3 | ✅ |
Connector version semver check | ✅ |
Connector version increment check | ✅ |
QA checks | ✅ |
Force-pushed from c60529b to 3c18702
source-s3 test report (commit
Step | Result
---|---
Build source-s3 docker image for platform(s) linux/x86_64 | ❌ |
Code format checks | ❌ |
Validate metadata for source-s3 | ✅ |
Connector version semver check | ✅ |
Connector version increment check | ✅ |
QA checks | ✅ |
source-s3 test report (commit
Step | Result
---|---
Build source-s3 docker image for platform(s) linux/x86_64 | ❌ |
Code format checks | ✅ |
Validate metadata for source-s3 | ✅ |
Connector version semver check | ✅ |
Connector version increment check | ✅ |
QA checks | ✅ |
Thanks @tolik0, looks great!
Force-pushed from 30e354a to 3edc8af
source-s3 test report (commit
Step | Result
---|---
Build source-s3 docker image for platform(s) linux/amd64 | ❌ |
Code format checks | ❌ |
Validate metadata for source-s3 | ✅ |
Connector version semver check | ✅ |
Connector version increment check | ✅ |
QA checks | ✅ |
source-s3 test report (commit
Step | Result
---|---
Build source-s3 docker image for platform(s) linux/amd64 | ❌ |
Code format checks | ✅ |
Validate metadata for source-s3 | ✅ |
Connector version semver check | ✅ |
Connector version increment check | ✅ |
QA checks | ✅ |
Force-pushed from 0550b77 to 2c9c7b6
source-s3 test report (commit
Step | Result
---|---
Build source-s3 docker image for platform(s) linux/amd64 | ✅ |
Unit tests | ✅ |
Acceptance tests | ✅ |
Check our base image is used | ✅ |
Code format checks | ✅ |
Validate metadata for source-s3 | ✅ |
Connector version semver check | ✅ |
Connector version increment check | ✅ |
QA checks | ✅ |
What
Raise a config error when the bucket name is wrong (NoSuchBucket).
Closes: #25835
How
Added a new error type to the file-based CDK so that the error is not caught and swallowed inside the CDK, and is instead raised as an AirbyteTracedException config error.
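A hedged sketch of that approach: catch S3's NoSuchBucket error and surface it as a user-facing config error instead of a generic stack trace. The ClientError and ConfigValidationError classes below are minimal stand-ins for the botocore and Airbyte CDK types, not the real implementations.

```python
class ClientError(Exception):
    """Stand-in for botocore.exceptions.ClientError; mimics its .response shape."""
    def __init__(self, error_code: str):
        super().__init__(error_code)
        self.response = {"Error": {"Code": error_code}}


class ConfigValidationError(Exception):
    """Stand-in for a traced 'config error' shown to the user."""


def list_objects(bucket: str, client_call):
    try:
        return client_call(bucket)
    except ClientError as exc:
        if exc.response["Error"]["Code"] == "NoSuchBucket":
            # Translate the S3 error into an actionable config error,
            # keeping the original exception chained for debugging.
            raise ConfigValidationError(
                f"The bucket {bucket} does not exist. Check the bucket name in the connector configuration."
            ) from exc
        # Other client errors (e.g. AccessDenied) propagate unchanged.
        raise
```

Only NoSuchBucket is remapped here; every other ClientError still propagates, so unrelated failures keep their original type.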