
Python app works with 11.4, fails with 11.5 #604

Closed
lovette opened this issue Aug 16, 2024 · 7 comments

Comments

@lovette

lovette commented Aug 16, 2024

I have a Python app that worked fine with Docker images 10.x through 11.4, but after it upgraded to 11.5 today it started failing. (The error message mentions utf8mb4, so I suspect a charset issue.) I don't see anything in the Release Notes that jumps out at me as a potential cause. I tried the 11.6-rc image and it fails too.

As background, it's a 10-year-old codebase written in Python 2.7 and uses oursql, which hasn't been updated since 2012. The app has worked with every version of MariaDB since 2014 until today 😕

Not the end of the world if I'm now stuck on 11.4, but maybe there's a release issue at play?

@grooverdan
Member

There's a default collation change in https://mariadb.com/kb/en/mariadb-11-5-1-release-notes/.

Exact error message would be useful.

Can you confirm it's 11.4.3 that is OK?

@lovette
Author

lovette commented Aug 17, 2024

Thanks for pointing that out.

The error itself, a Python KeyError triggered deep within oursql, is not very helpful 🙄

unknown encoding: utf8mb4

There are only a few differences of SHOW VARIABLES between 11.4.3 and 11.5.2 and the only charset related one is character_set_collations.

| Version | Value of character_set_collations |
|---|---|
| 11.4.3 | utf8mb4=utf8mb4_uca1400_ai_ci |
| 11.5.2 | utf8mb3=utf8mb3_uca1400_ai_ci,ucs2=ucs2_uca1400_ai_ci,utf8mb4=utf8mb4_uca1400_ai_ci,utf16=utf16_uca1400_ai_ci,utf32=utf32_uca1400_ai_ci |
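
To make the difference between the two values easier to see, here is a minimal sketch that parses each `character_set_collations` string into a dictionary and lists the character sets that appear only in 11.5.2. The helper name `parse_collations` is hypothetical; the input strings are copied from the values reported above.

```python
def parse_collations(value):
    """Split 'charset=collation,charset=collation' into a dict."""
    pairs = (item.split("=", 1) for item in value.split(",") if item)
    return {charset.strip(): collation.strip() for charset, collation in pairs}


# Values copied from the SHOW VARIABLES comparison in this thread.
V11_4_3 = "utf8mb4=utf8mb4_uca1400_ai_ci"
V11_5_2 = (
    "utf8mb3=utf8mb3_uca1400_ai_ci,ucs2=ucs2_uca1400_ai_ci,"
    "utf8mb4=utf8mb4_uca1400_ai_ci,utf16=utf16_uca1400_ai_ci,"
    "utf32=utf32_uca1400_ai_ci"
)

old = parse_collations(V11_4_3)
new = parse_collations(V11_5_2)

# Character sets that only the newer server lists.
added = sorted(set(new) - set(old))
# Character sets present in both but mapped to a different collation.
changed = sorted(k for k in old if k in new and old[k] != new[k])

print("added:", added)
print("changed:", changed)
```

The `utf8mb4` entry is identical in both versions, so the only difference in this variable is the extra entries for the other Unicode character sets.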

I tried setting and changing various charset settings and nothing makes a difference 🥲 I'm happy to take any suggestions you may have, but realize this is not an image related issue and have no problem laying the blame on oursql 😁

@grooverdan
Member

oursql:

      property charset:
          """charset -> str
          
          Get or set the connection's current encoding. If use_unicode is 
          enabled, this is the encoding that will be used to decode incoming
          strings.
          """
          def __get__(self):
              self._charset = PyString_FromString(
                  mysql_character_set_name(self.conn))
              return self._charset
          def __set__(self, value):
              cdef char *svalue
              self._check_closed()
              svalue = PyString_AsString(value)
              if mysql_set_character_set(self.conn, svalue):
                  self._raise_error()
              self._charset = value

So an error on mysql_set_character_set will raise the KeyError you experienced.

The current implementation (https://mariadb.com/kb/en/mysql_set_character_set/) accepts utf8mb4, however that may not be the case for you.

The lack of backtrace means I can't see where this is coming from. Work out how to change it where it occurs.

A change in the code that hacks utf8mb4 back to utf8 might be one option, or identify the source where utf8mb4 comes into the codebase.

@lovette
Author

lovette commented Aug 19, 2024

Expanding on your observation, I see that the charset property is the value returned by mysql_character_set_name.

When I call oursql.connect I set charset=utf8 which is passed into mysql_options to set MYSQL_SET_CHARSET_NAME. In the past, this resulted in mysql_character_set_name returning utf8, but with 11.5 it returns utf8mb4. 🤨

The good news is, if I set connection.charset=utf8 again after the connection is made, oursql is happy! So perhaps setting MYSQL_SET_CHARSET_NAME is failing or it accepts setting it to utf8 but then reports it as being utf8mb4.
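
The workaround described above can be sketched as a small wrapper that re-applies the charset when the connection reports a different one. This is an illustration, not oursql code: `connect_with_charset` and `FakeConnection` are hypothetical names, the `host` argument is a placeholder, and the fake class only mimics the behaviour reported with 11.5 so the example runs without a server.

```python
def connect_with_charset(connect, charset="utf8", **kwargs):
    """Open a connection, then re-apply the charset if another one is reported."""
    conn = connect(charset=charset, **kwargs)
    if conn.charset != charset:
        # With oursql, assigning to .charset calls mysql_set_character_set.
        conn.charset = charset
    return conn


class FakeConnection(object):
    """Stand-in for a driver connection (NOT oursql), for demonstration only."""

    def __init__(self, charset=None, **kwargs):
        self.requested = charset
        # Mimic the behaviour seen with 11.5: utf8 requested, utf8mb4 reported.
        self.charset = "utf8mb4"


conn = connect_with_charset(FakeConnection, charset="utf8", host="db")
print(conn.charset)
```

With the real driver, `oursql.connect` would be passed as the `connect` argument along with the usual connection keyword arguments.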

Not only that, but 11.5 also reports utf8mb3 for variables it used to report as utf8mb4. I sanity check a few other variables after I connect...

[WARNING] MySQL server variable 'character_set_client' is utf8mb3, expected utf8mb4
[WARNING] MySQL server variable 'character_set_connection' is utf8mb3, expected utf8mb4
[WARNING] MySQL server variable 'character_set_results' is utf8mb3, expected utf8mb4
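
A sanity check of this kind could look roughly like the sketch below. It is a hypothetical reconstruction, not the app's actual code: the function name `charset_warnings` is made up, and the `variables` dictionary is assumed to hold the rows returned by something like `SHOW VARIABLES LIKE 'character_set_%'`.

```python
EXPECTED_CHARSET = "utf8mb4"
CHECKED_VARIABLES = (
    "character_set_client",
    "character_set_connection",
    "character_set_results",
)


def charset_warnings(variables, expected=EXPECTED_CHARSET):
    """Return one warning line per checked variable that differs from `expected`."""
    warnings = []
    for name in CHECKED_VARIABLES:
        actual = variables.get(name)
        if actual != expected:
            warnings.append(
                "[WARNING] MySQL server variable '%s' is %s, expected %s"
                % (name, actual, expected)
            )
    return warnings


# Values as reported after connecting to 11.5 in this thread.
reported = dict.fromkeys(CHECKED_VARIABLES, "utf8mb3")
for line in charset_warnings(reported):
    print(line)
```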

The only references to utf8mb3 in SHOW VARIABLES are in character-set-collations, character-set-system and old-mode (which is UTF8_IS_UTF8MB3); the last two are the same as shown with 11.4.

My app sets everything to utf8mb4 from top to bottom, tables and all. These are the settings I set explicitly.

[mysqld]
character-set-server = utf8mb4
collation-server = utf8mb4_unicode_ci
init-connect = 'SET COLLATION_CONNECTION = utf8mb4_unicode_ci, NAMES utf8mb4'

[client]
default-character-set=utf8mb4

Lastly, I notice the official Python 2.7 image is based on Debian 10, which includes the MariaDB 10.3 client libraries (but not the client itself). Could there be some incompatibility between the 10.3 client libraries interacting with an 11.5 server?

@grooverdan
Member

Sorry, lost track of this.

> Could there be some incompatibility between the 10.3 client libraries interacting with an 11.5 server?

No idea. I doubt it's actively tested.

@lovette
Author

lovette commented Nov 19, 2024

Thanks for the help! I'm happy to say it's been running fine since I last posted. I'll close this issue and get to work porting our codebase to PY3 😁

@lovette lovette closed this as completed Nov 19, 2024
@grooverdan
Member

> Thanks for the help! I'm happy to say it's been running fine since I last posted.

Glad to hear it.

> ... get to work porting our codebase to PY3 😁

Or just wait until python 4 😺 (joke, really, py3 is a good move).
