Add ability to set character set for load data infile. #1942

Open: wants to merge 1 commit into base: main

@jpwhite4 commented Nov 5, 2024

Description

This change is needed to load UTF-8 data from XRAS correctly. The MySQL and MariaDB docs state that the character_set_database variable determines the character set used by LOAD DATA INFILE. However, I could not get that to work. The only thing that did work was explicitly setting the character set in the LOAD DATA statement.
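
For reviewers unfamiliar with the clause, here is a minimal sketch of where an explicit CHARACTER SET sits in a LOAD DATA statement. This is not the ETLv2 code from this PR: the function name `build_load_data_sql`, the file path, the table and column names, and the `utf8mb4` value are all hypothetical, chosen only to illustrate the clause order (after INTO TABLE, before FIELDS).

```python
import re


def build_load_data_sql(infile, table, columns, charset=None):
    """Build a LOAD DATA LOCAL INFILE statement as a string.

    Hypothetical helper for illustration only. When `charset` is given,
    a CHARACTER SET clause is emitted right after INTO TABLE, which is
    where the MySQL/MariaDB grammar expects it.
    """
    # Escape backslashes and single quotes in the file name literal.
    quoted_file = infile.replace("\\", "\\\\").replace("'", "\\'")
    parts = [
        "LOAD DATA LOCAL INFILE '{}'".format(quoted_file),
        "INTO TABLE `{}`".format(table.replace("`", "``")),
    ]
    if charset is not None:
        # Charset names are plain identifiers; reject anything else
        # rather than interpolating arbitrary text into the statement.
        if not re.fullmatch(r"[A-Za-z0-9_]+", charset):
            raise ValueError("invalid character set name: {!r}".format(charset))
        parts.append("CHARACTER SET {}".format(charset))
    # Tab-separated fields and newline-terminated lines (assumed format).
    parts.append("FIELDS TERMINATED BY '\\t'")
    parts.append("LINES TERMINATED BY '\\n'")
    column_list = ", ".join("`{}`".format(c.replace("`", "``")) for c in columns)
    parts.append("({})".format(column_list))
    return "\n".join(parts)


if __name__ == "__main__":
    print(build_load_data_sql(
        "/tmp/example.tsv", "example_table", ["id", "name"], charset="utf8mb4"
    ))
```

Leaving `charset` as `None` omits the clause entirely, so the server falls back to its default behaviour, which is the case that did not work in the testing described above.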

Note that the old-school ingestors supported setting the character set in the LOAD DATA statement, and we use this capability in classes/DB/PDODBUtf8MultiIngestor.php in the xsede module. It looks like an oversight that this support was not added to the ETLv2 code.

Tests performed

Tested on NAIRR XDMoD:

Old code (note the character_set_* variables appear to be the correct values):

image

New code:

image

@jpwhite4 requested a review from aaronweeden November 5, 2024 16:10
@jpwhite4 added the bug Bugfixes label Nov 5, 2024
@jpwhite4 added this to the 11.5.0 milestone Nov 5, 2024