Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issue 7935/integrate httpclient to bulk job creation and status update #38685

Conversation

maxi297
Copy link
Contributor

@maxi297 maxi297 commented May 27, 2024

What

Partially address https://github.com/airbytehq/airbyte-internal-issues/issues/7935 (the bulk creation and status update part)

How

By:

  • Moving the HTTP query for bulk job creation outside the stream
  • By using the HttpClient to perform bulk creation and status update requests
  • By assuming all other HTTP response will be handled as part of the current code

Review guide

  1. Removing the query from base_stream: airbyte-integrations/connectors/source-shopify/source_shopify/streams/base_streams.py While doing that, I also took the opportunity to align stream_slices with what cursors in the CDK do so that when we will standardize, there will be this thing less to do
  2. Adding the query as part of the bulk: airbyte-integrations/connectors/source-shopify/source_shopify/shopify_graphql/bulk/job.py

User Impact

The connector should be resilient following this change as TRANSIENT_EXCEPTIONS will be retried (see the updated test in airbyte-integrations/connectors/source-shopify/unit_tests/integration/test_bulk_stream.py). Apart from that, we expect no behavior change.

This will also add retries on 429 + 5XX errors which will affect not only bulk job creation and status update but also access scope (see #38678)

Can this PR be safely reverted and rolled back?

  • YES 💚
  • NO ❌

TODO

  • Migrate the deletion and getting results to HttpClient
  • Ensure that all error handling that is re-usable is in the ShopifyErrorHandler

@maxi297 maxi297 requested a review from a team May 27, 2024 20:17
Copy link

vercel bot commented May 27, 2024

The latest updates on your projects. Learn more about Vercel for Git ↗︎

1 Ignored Deployment
Name Status Preview Comments Updated (UTC)
airbyte-docs ⬜️ Ignored (Inspect) Visit Preview Jun 5, 2024 1:19pm

@octavia-squidington-iv octavia-squidington-iv requested a review from a team May 27, 2024 20:18
@maxi297 maxi297 changed the title Issue 7935/integrate httpclient to bulk job check Issue 7935/integrate httpclient to bulk job creation and status update May 27, 2024
return ErrorResolution(ResponseAction.IGNORE, None, None)
return _NO_ERROR_RESOLUTION

if response.status_code == 429 or response.status_code >= 500:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This behavior was part of the HttpStream which was used for bulk job creation requests. Hence, I moved the logic here so that it applies to all the HTTP requests performed by HttpClient

from requests.exceptions import JSONDecodeError
from source_shopify.utils import ApiTypeEnum
from source_shopify.utils import ShopifyRateLimiter as limiter

from ...http_request import ShopifyErrorHandler
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason why we use relative import here instead of source_shopify.http_request?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Who should answer this question?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know but I haven't encountered any issue associated to this. It just felt odd that we do that here while we use fully qualified import elsewhere

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently, there is no such import (for master), i believe this one was added by you, with this change, if not, please attach the related PR that implemented this line, because i'm confused)

@@ -21,6 +21,8 @@
exceptions.SSLError,
) + RESPONSE_CONSUMPTION_EXCEPTIONS

_NO_ERROR_RESOLUTION = ErrorResolution(ResponseAction.SUCCESS, None, None)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add the comment about this, what it stands for and what is the parrent used?

@bazarnov
Copy link
Collaborator

This change makes sense to me, but I'm concerned about the CAT tests. @maxi297, please take a look. It seems like the API Password expired or something happened in between the reading cycle.

Let's ensure we still pass the correct authentication information to the HttpClient, when:

  • request is retried
  • forced canceled and created another job with another slice range

Once CAT passes, I'll approve this change right away, since it's pretty straightforward. Thanks @maxi297

@maxi297
Copy link
Contributor Author

maxi297 commented May 28, 2024

@bazarnov If you want to recheck this PR, there were two weird cases on error retrying that felt weird so I changed a couple more things:

  • When there was a JSON parsing error on the job creation request, we would not retry (see this test that reproduce the issue). The issues seems to be that the retry was on job_process_created (see this) and this only redo the HTTP request if it is a concurrency error here
  • We didn’t retry on the first status update if there was an issue (see this test). This was caused by self._job_state not being set when a job was created

Let me know if you see issues with this. I'll move the job cancellation and fetching the job results in another PR tomorrow


def _has_running_concurrent_job(self, errors: Optional[Iterable[Mapping[str, Any]]] = None) -> bool:
"""
When concurent BULK Job is already running for the same SHOP we receive:
When concurrent BULK Job is already running for the same SHOP we receive:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why would we change the concurrent to be concurent ? I believe there is an English word concurrent - article

@bazarnov
Copy link
Collaborator

@bazarnov If you want to recheck this PR, there were two weird cases on error retrying that felt weird so I changed a couple more things:

  • When there was a JSON parsing error on the job creation request, we would not retry (see this test that reproduce the issue). The issues seems to be that the retry was on job_process_created (see this) and this only redo the HTTP request if it is a concurrency error here
  • We didn’t retry on the first status update if there was an issue (see this test). This was caused by self._job_state not being set when a job was created

Let me know if you see issues with this. I'll move the job cancellation and fetching the job results in another PR tomorrow

This change looks good to me, left some small comments.

Also, for after the latest CAT, we see this schema change for the countries stream:

ERROR    root:test_core.py:902 
The countries stream has the following schema errors:
0.0 is not of type 'null', 'integer'

Failed validating 'type' in schema['properties']['provinces']['items']['properties']['tax_percentage']:
    {'description': 'Percentage value of tax applicable in the province.',
     'type': ['null', 'integer']}

On instance['provinces'][0]['tax_percentage']:
    0.0

Should we include this as a fix to this PR as well? I believe this one will be a breaking change for the countries stream.

@bazarnov
Copy link
Collaborator

I'll also test the latest changes tomorrow manually to see if there are no regressions on the retry on concurrency and cancelation and retry scenarios. Will update you here.

…issue-7935/integrate-httpclient-to-bulk-job-check
@maxi297
Copy link
Contributor Author

maxi297 commented May 29, 2024

Should we include this as a fix to this PR as well? I believe this one will be a breaking change for the countries stream.

I would create another PR on master for that as have the two changes being independent. Does that make sense to you?

@maxi297
Copy link
Contributor Author

maxi297 commented May 29, 2024

@bazarnov Here is the PR for the breaking change: #38746

@bazarnov
Copy link
Collaborator

@bazarnov Here is the PR for the breaking change: #38746

This one has been approved, we can merge it.

…issue-7935/integrate-httpclient-to-bulk-job-check
@maxi297 maxi297 merged commit aa691e7 into issue-7935/integrate-httpclient-to-access-scopes Jun 5, 2024
16 of 19 checks passed
@maxi297 maxi297 deleted the issue-7935/integrate-httpclient-to-bulk-job-check branch June 5, 2024 13:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants