Critical Bug - Using the skip_to_timestamp option causes the information schema to be queried a huge number of times #383

Open
shivamgly opened this issue Oct 5, 2022 · 1 comment

@shivamgly

Overview

We use the library to periodically sync updated rows to another database, and we pass the skip_to_timestamp option to skip the binary log events that were already synced in the previous cycle. With this option set, we see the library querying the information schema far more often than expected.

Bug description

According to the logic in row_event.py (line 613), it appears the schema is fetched for every TABLE_MAP_EVENT whose table is not already present in table_map. But in binlogstream.py (lines 551 to 557), the fetched schema is discarded if the event timestamp is less than the skip_to_timestamp value. Because the table is never added to table_map in that case, the schema is fetched again on the next TABLE_MAP_EVENT for the same table.
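To make the effect concrete, here is a minimal toy model of the behavior described above. It is not the library's code: `count_schema_fetches`, the `(timestamp, table_id)` tuples, and the placeholder schema dict are all hypothetical stand-ins, assuming one information schema query per uncached table.

```python
def count_schema_fetches(events, skip_to_timestamp):
    """Count simulated schema fetches for a list of (timestamp, table_id)
    tuples, each standing in for one TABLE_MAP_EVENT. Hypothetical model."""
    table_map = {}
    fetches = 0
    for timestamp, table_id in events:
        # Parsing step: fetch the schema when the table is not cached.
        schema = table_map.get(table_id)
        if schema is None:
            fetches += 1  # stands in for one information schema query
            schema = {"table_id": table_id}
        # Skip step: the event is dropped before table_map is updated,
        # so the schema fetched above is lost.
        if timestamp < skip_to_timestamp:
            continue
        table_map[table_id] = schema
    return fetches


# 1000 events for the same table, with the first 990 being skipped.
events = [(t, 7) for t in range(1000)]
fetch_count = count_schema_fetches(events, skip_to_timestamp=990)
print(fetch_count)  # 991: one fetch per skipped event, plus one that is cached
```

In this model the number of fetches grows with the number of skipped events rather than with the number of distinct tables, which matches the symptom reported here.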

Resolution

Populating table_map before continuing the loop in binlogstream.py (lines 551 to 557) should fix the issue.
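A sketch of the proposed ordering, again as a hypothetical toy model rather than a patch against binlogstream.py: `count_schema_fetches_fixed` and its data shapes are invented for illustration, and the only point is that the cache is written before the timestamp check.

```python
def count_schema_fetches_fixed(events, skip_to_timestamp):
    """Same toy model of TABLE_MAP_EVENT handling, but table_map is
    populated before the skip check. Hypothetical names throughout."""
    table_map = {}
    fetches = 0
    delivered = []
    for timestamp, table_id in events:
        if table_id not in table_map:
            fetches += 1  # stands in for one information schema query
            # Cache the schema first, whether or not the event is skipped.
            table_map[table_id] = {"table_id": table_id}
        if timestamp < skip_to_timestamp:
            continue
        delivered.append((timestamp, table_id))
    return fetches, delivered


events = [(t, 7) for t in range(1000)]
fixed_fetches, delivered = count_schema_fetches_fixed(events, skip_to_timestamp=990)
print(fixed_fetches, len(delivered))  # 1 10
```

With the cache filled before the `continue`, the model performs one fetch per distinct table, and the events at or after skip_to_timestamp are still delivered unchanged.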

@julien-duponchelle
Owner

Hi @shivamgly, thanks for the report. Would you like to propose a PR?
