Fixed first/last_indexed, improved import perf, support for mariadb connector #4108

Open
wants to merge 1 commit into
base: dev

damien-git
Contributor

VuFind 10 includes a PR that modifies UpdateDateTracker.java to avoid a deprecated method. After we started using that version in production, our Galera cluster crashed. Disabling first/last_indexed fixed it. Our guess is that Galera could not keep up with the frequent statement creation during import.

This PR reverts the UpdateDateTracker PR. The finalize() code could have been simply removed, because DatabaseManager closes the database connection and hence the statements on shutdown.

It also improves performance by sending database updates in batches. This required changing DatabaseManager to call the UpdateDateTracker shutdown before the connection is closed; otherwise the statements could not be used reliably to send the remaining batched data. In one test, with an empty change_tracker table so that only database inserts were used, import performance improved by 30%.
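To illustrate the batching and shutdown ordering described above, here is a minimal sketch, not the code in this PR. `BatchBuffer` and its `sink` callback are hypothetical names; in a JDBC setting the sink would typically be the place where `PreparedStatement.addBatch()` and `executeBatch()` are called on statements that are still open.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/**
 * Hypothetical helper (not part of VuFind): collects rows and hands them
 * to a sink in groups, so the database sees one round trip per batch
 * instead of one per row.
 */
class BatchBuffer<T> {
    private final int batchSize;
    private final Consumer<List<T>> sink;
    private final List<T> pending = new ArrayList<>();

    BatchBuffer(int batchSize, Consumer<List<T>> sink) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be >= 1");
        }
        this.batchSize = batchSize;
        this.sink = sink;
    }

    /** Queue one row; send the batch once the threshold is reached. */
    void add(T row) {
        pending.add(row);
        if (pending.size() >= batchSize) {
            flush();
        }
    }

    /** Send whatever is still queued. Must run before the connection closes. */
    void flush() {
        if (pending.isEmpty()) {
            return;
        }
        sink.accept(new ArrayList<>(pending)); // pass a copy; the buffer is reused
        pending.clear();
    }
}
```

The final `flush()` is why the shutdown order matters: if the connection were closed first, the last partial batch would have nowhere to go.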

Batching works a bit differently in MySQL and MariaDB. We use MariaDB, so I had to add MariaDB-specific support and include the MariaDB JDBC connector. In the PHP code the mysql driver can still be used, but support for the mariadb connection string is needed.

NOTE 1: this change will require MariaDB users to change their config to use mariadb as the database driver (or in the connection string). If they don't, first/last_indexed import might fail with a syntax error.

NOTE 2: this code has only been tested with MariaDB so far, but it also changes execution for MySQL and PostgreSQL.

@demiankatz
Member

Thanks, @damien-git! Since I'm out of office this week, I haven't had time to give this an especially close look yet, but I did run the full integration test suite on this branch in my local test environment using MySQL and everything passed. That's hardly an exhaustive test, but at least there's no catastrophic issue there.

I would like to do the same thing for PostgreSQL, but upgrading to Ubuntu 24 in my VM seems to have broken my PostgreSQL install; I'll need to figure that out when time permits so I can do further testing. I'll try to find time next week when I'm back in office, though with things piling up, it's possible it will take me a little longer!
