
MySQL_notes

Steve Gallo edited this page Apr 7, 2017 · 1 revision

Handling of invalid UTF-8 characters

A September 2016 MySQL update changed how invalid characters are handled, which caused XRAS ingestion to fail. The XRAS request and publication tables contain invalid UTF-8 characters, so these must be stripped out on ingest. An alternative is to add the directive "character set 'latin1'" to the LOAD DATA INFILE command, but as things stand this would apply that character set to every table ingestion, which is not desirable.
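As an illustration of what "stripping on ingest" can mean at the byte level, here is a minimal Python sketch. It is not the code used by the ingestor. The function names `strip_invalid_utf8` and `clean_file` are hypothetical, and the sketch assumes the data passes through an intermediate file before LOAD DATA INFILE reads it.

```python
def strip_invalid_utf8(raw: bytes) -> str:
    """Decode bytes as UTF-8, dropping any bytes that are not valid UTF-8."""
    # errors="ignore" discards undecodable bytes instead of raising
    # UnicodeDecodeError; valid multi-byte characters are preserved.
    return raw.decode("utf-8", errors="ignore")


def clean_file(src_path: str, dst_path: str) -> None:
    """Hypothetical helper: copy src to dst, removing invalid UTF-8 bytes."""
    # Read in binary mode so that bad bytes do not raise while reading.
    # newline="" keeps the original line endings untouched on write.
    with open(src_path, "rb") as src, \
            open(dst_path, "w", encoding="utf-8", newline="") as dst:
        for line in src:
            dst.write(strip_invalid_utf8(line))
```

This approach removes the bad bytes outright rather than replacing them, so affected strings become slightly shorter. Passing `errors="replace"` instead would substitute U+FFFD and make the damaged spots visible.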

The solution adopted was to strip the bad characters when querying the data in Postgres, by converting the column with TO_ASCII using the latin-1 encoding:

    SELECT
        TO_ASCII(rp.publication::text, 'latin-1') AS publication
    FROM xras.request_publications rp;