Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add do command to change MySQL charset to utf8mb4 #1079

Conversation

Danyal-Faheem
Copy link
Contributor

@Danyal-Faheem Danyal-Faheem commented Jun 14, 2024

Additional PR to close #938.

The first part of this work can be found in #1084.

Introduces a new do command convert_mysql_utf8mb4_charset that converts the default and current character and collation of the tables to utf8mb4.

The command has the following options:

  • database: the database in which the tables are to be converted. Default: {{ OPENEDX_MYSQL_DATABSE }}
  • include: comma separated list of tables on which the upgrade process is to be applied
  • exclude: comma separated list of tables to exclude from the upgrade process

The command will perform the following upgrades:
- Upgrade all utf8mb3 charset to utf8mb4
- Upgrade collation utf8mb3_general_ci to utf8mb4_unicode_ci
- Upgrade collation utf8mb3_bin to utf8mb4_bin
- Upgrade collation utf8mb3_* to utf8mb4_*

@DawoudSheraz
Copy link
Contributor

@ormsbee FYI

@ztraboo
Copy link

ztraboo commented Aug 6, 2024

Will these changes work with the Nutmeg release? I'm considering this to help with putting emojis in section and subsection header titles.
https://discuss.openedx.org/t/emoji-in-subsection-title-not-supported-since-nutmeg/7856

docs/local.rst Outdated
Changing the mysql charset and collation
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. note:: This command is only for users upgrading from quince.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not entirely sure if this is correct.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've updated this note.

Comment on lines 375 to 380
if not config["RUN_MYSQL"]:
fmt.echo_info(
f"You are not running MySQL (RUN_MYSQL=false). It is your "
f"responsibility to update your MySQL instance to {charset} charset and {collation} collation."
)
return
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we not move this on top? If MySQL is not running, we don't need to do any actions.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we should. Thank you for picking this out. I will update it.

tutor/utils.py Outdated
Comment on lines 406 to 410
CASE
WHEN COLLATION_NAME LIKE CONCAT('{charset_to_upgrade_from}', '_general_ci') THEN 'utf8mb4_unicode_ci'
WHEN COLLATION_NAME LIKE CONCAT('{charset_to_upgrade_from}', '_%') THEN CONCAT('{charset}', SUBSTRING_INDEX(COLLATION_NAME, '{charset_to_upgrade_from}', -1))
ELSE COLLATION_NAME
END AS COLLATION_NAME
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, I did not know about CASE.

tutor/utils.py Outdated
Comment on lines 437 to 438
# Then, upgrade the default charset and collation of each column
# This sequence of table -> column is necessary to preserve column defaults
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IF we are updating table again in UpdateTables, how will it not cause any issues?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

UpdateTables will only update the tables that have the default collation of something other than utf8mb4_*, i.e, tables that weren't updated in UpdateColumns.

Reference: https://github.com/overhangio/tutor/pull/1079/files#diff-46b7a04b09b0ffeb02b4f36ed494cbed6eaa7ea1efb4c46f9951b4ff6b0a3a37R464

@Danyal-Faheem Danyal-Faheem force-pushed the danyalfaheem/mysql-charset-change-command branch from f4fbab9 to 7dc81d9 Compare August 21, 2024 13:39
@ormsbee
Copy link
Contributor

ormsbee commented Aug 30, 2024

I am not really competent at writing MySQL stored procedures, but this seems right to me. Certainly the logical mappings in the description and the options you give in the Python script for inclusion/exclusion are good. I realize this is a pain to get right–thank you for sticking to it.

@Danyal-Faheem
Copy link
Contributor Author

Hi @regisb, can you review this PR for me?

tutor/commands/jobs.py Outdated Show resolved Hide resolved
tutor/commands/jobs.py Outdated Show resolved Hide resolved
tutor/commands/jobs.py Outdated Show resolved Hide resolved
tutor/utils.py Outdated Show resolved Hide resolved
docs/local.rst Outdated
Comment on lines 146 to 169

.. note:: This command has been tested only for users upgrading from Quince. While it is expected to work for users on earlier releases, please use it with caution as it has not been tested with those versions.

Your database's charset and collation might not support specific characters or emojis. Tutor will run fine without this change unless you explicity use specific characters in your instance.

.. warning:: This change is potentially irreversible. It is recommended to make a backup of the MySQL database. See the :ref:`database dump instructions <database_dumps>` to create a DB dump.

To change the charset and collation of all the tables in the openedx database, run::

tutor local do convert-mysql-utf8mb4-charset

Alternatively, if you only want to change the charset and collation of certain tables or exclude certain tables, you can use the ``--include`` or ``--exclude`` options. These options take comma separated names of tables/apps with no space in-between. To upgrade the ``courseware_studentmodule`` and ``courseware_studentmodulehistory`` tables, run::

tutor local do convert-mysql-utf8mb4-charset --include=courseware_studentmodule,courseware_studentmodulehistory

Tutor performs pattern matching from the start of the table name so you can just enter the name of the app to include/exclude all the tables under that app. To upgrade all the tables in the database except the ones under the student and wiki apps, run::

tutor local do convert-mysql-utf8mb4-charset --exclude=student,wiki

In the above command, all the tables whose name starts with either student or wiki will be excluded from the upgrade process.

By default, only the tables in the openedx database are changed. If you are running any plugins with their own databases, you can upgrade them by utilizing the ``--database`` option. To upgrade all the tables in the discovery database, run::

tutor local do convert-mysql-utf8mb4-charset --database=discovery
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: the usage of you/second person pronouns in the guide should be reduced/removed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've reworded the docs to reduce the usage.

charset_to_upgrade_from: str,
) -> str:
"""
Helper function to generate the mysql query to upgrade the charset and collation of tables
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it is not just tables, it is columns, tables and database.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated.

@@ -164,6 +164,10 @@ def upgrade_from_quince(context: click.Context, config: Config) -> None:
upgrade_mongodb(context, config, "5.0.26", "5.0")
upgrade_mongodb(context, config, "6.0.14", "6.0")
upgrade_mongodb(context, config, "7.0.7", "7.0")
fmt.echo_alert(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: this would be moved to upgrade from sumac section because we are targeting this for nightly and makes sense to have this there.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You mean to upgrade to Sumac, right? I agree, but we don't have an upgrade_from_redwood command yet, so we can make that change later in the sumac branch.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh yeah, right. I meant upgrade_from_redwood section of sumac branch. I will take care of that in sumac branch once this branch is merged.

Copy link
Contributor

@regisb regisb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Awesome work.

@DawoudSheraz DawoudSheraz merged commit d4e6c6c into overhangio:nightly Nov 11, 2024
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Development

Successfully merging this pull request may close these issues.

5 participants