Incremental sync flaw #109
I'm pretty sure I discovered the issue here is due to a flaw with Jira pagination. When we run a query fetching all issues since the last update, we might get too many records for Jira to return at once, so it returns the first N and then indicates that there are M more records. However, it doesn't look like Jira has a way to cache the query results. So when you re-run the API call and request records starting with the N+1 record, it's possible that the results may have been updated. For example, suppose I have 5 issues (ISSUE-1 through ISSUE-5) and we want to return all of them ordered by the updated timestamp, but we have
The API will return that
The consequence is that ISSUE-4 is never seen in the paginated results. To fix this, I believe we'd have to stop using the Paginator and instead run subsequent queries with the maximum updated timestamp of the previous query.
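To make the skip concrete, here is a small simulation of the scenario. It does not call Jira. `fetch_page` is a hypothetical stand-in for an offset-paginated search ordered by updated time, and the page size of 3 and the choice of ISSUE-2 as the issue edited between requests are assumptions picked to reproduce the missing ISSUE-4.

```python
from dataclasses import dataclass


@dataclass
class Issue:
    key: str
    updated: int  # logical timestamp, an assumption for this sketch


def fetch_page(issues, start_at, max_results):
    """Hypothetical stand-in for an offset-paginated search.

    The query is re-evaluated on every call (no cached result set),
    ordered by updated time ascending.
    """
    ordered = sorted(issues, key=lambda issue: issue.updated)
    return [issue.key for issue in ordered[start_at:start_at + max_results]]


issues = [Issue(f"ISSUE-{n}", updated=n) for n in range(1, 6)]

# First request: offset 0, page size 3.
page1 = fetch_page(issues, 0, 3)

# Someone edits ISSUE-2 before the second request, so it moves to the end
# of the ordering and every later issue shifts up by one position.
issues[1].updated = 6

# Second request: offset 3 now points past ISSUE-4.
page2 = fetch_page(issues, 3, 3)

seen = set(page1) | set(page2)
missed = sorted(issue.key for issue in issues if issue.key not in seen)
print(page1, page2, missed)
```

Because the second request sorts the data again, ISSUE-4 has moved from position 4 to position 3 and falls just before the new offset, so neither page contains it.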
Using the maximum updated timestamp doesn't work either, since JQL can only query by the minute, and it would be easy to get more than a page of records that were all updated in the same minute. Instead, we can query the data in descending order.
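One way to see why descending order could help is the sketch below. It is a simulation, not the tap's code. `search` and `sync` are hypothetical helpers, and the sketch assumes the saved state (bookmark) is the largest updated timestamp among the records actually fetched. Under that assumption, an edit during pagination pushes records toward later offsets, so the run may see duplicates or defer a record to the next run, but it should not leave a record behind the bookmark.

```python
def search(updated_by_key, start_at, max_results, descending):
    """Hypothetical offset-paginated search sorted by updated time."""
    ordered = sorted(updated_by_key, key=updated_by_key.get, reverse=descending)
    return ordered[start_at:start_at + max_results]


def sync(descending):
    """Fetch two pages of size 3 with an edit to ISSUE-2 in between.

    Returns the bookmark (assumed to be the max updated value fetched)
    and the issues that were never fetched yet sit at or before the
    bookmark, meaning a later incremental run would not pick them up.
    """
    updated = {f"ISSUE-{n}": n for n in range(1, 6)}
    seen = {}

    for key in search(updated, 0, 3, descending):
        seen[key] = updated[key]

    updated["ISSUE-2"] = 6  # edit lands between the two page requests

    for key in search(updated, 3, 3, descending):
        seen[key] = updated[key]

    bookmark = max(seen.values())
    lost = sorted(
        key for key, ts in updated.items()
        if key not in seen and ts <= bookmark
    )
    return bookmark, lost


ascending_result = sync(descending=False)
descending_result = sync(descending=True)
print(ascending_result, descending_result)
```

In the ascending run the bookmark advances past ISSUE-4 without ever fetching it. In the descending run ISSUE-2 is not fetched either, but its new timestamp is later than the bookmark, so the next incremental run would return it, and ISSUE-3 simply appears twice.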
I'm running this tap through meltano. I've been running since March 2023 and have noticed that it occasionally misses records. If I do a full re-sync, the records show up, but something about the state is not working properly. Have you seen this before?