feature: new commands to help migrating database schema (#2002)
Co-authored-by: Joongi Kim <[email protected]>
Backported-from: main (24.09)
Backported-to: 24.03
kyujin-cho and achimnol committed Apr 16, 2024
1 parent af42d4a commit a58c599
Showing 7 changed files with 1,447 additions and 3 deletions.
1 change: 1 addition & 0 deletions changes/2002.feature.md
@@ -0,0 +1 @@
Introduce the `mgr schema dump-history` and `mgr schema apply-missing-revisions` commands to ease major upgrades that involve diverged database migration histories
15 changes: 15 additions & 0 deletions docs/dev/daily-workflows.rst
@@ -718,6 +718,21 @@ Making a new release
* Push the commit and tag. The GitHub Actions workflow will build the packages
and publish them to PyPI.

* When making a new major release, dump a snapshot of the prior release's final DB migration
  history. This snapshot later helps fill in missing DB revisions when upgrading an outdated
  cluster. Commit the output to the **next** major release.

  .. code-block:: console

     $ ./backend.ai mgr schema dump-history > src/ai/backend/manager/models/alembic/revision_history/<version>.json

  Suppose you are releasing both a brand-new 24.09.0 and a maintenance 24.03.10.
  In that case, first release 24.03.10, switch back to the latest branch, run the command above
  with ``<version>`` set to ``24.03.10``, and then release 24.09.0 including the dump.

  For this workflow to be effective, backporting DB revisions to older major releases is no longer
  permitted once the major release version has been switched.
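Before committing a dump, it can be worth checking that the file has the expected shape. The following is a minimal sketch, assuming the JSON layout matches the ``RevisionHistory`` and ``RevisionDump`` structures that ``mgr schema dump-history`` writes (a ``manager_version`` string plus a list of ``revisions``). The helper name ``check_revision_history`` is hypothetical and not part of Backend.AI.

```python
import json

# Keys each revision entry is expected to carry, per the RevisionDump structure.
REQUIRED_REVISION_KEYS = {
    "down_revision",
    "revision",
    "is_head",
    "is_branch_point",
    "is_merge_point",
    "doc",
}


def check_revision_history(raw: str) -> list[str]:
    """Return a list of problems found in a dumped revision history (hypothetical helper)."""
    problems: list[str] = []
    data = json.loads(raw)
    if not isinstance(data.get("manager_version"), str):
        problems.append("manager_version is missing or not a string")
    revisions = data.get("revisions")
    if not isinstance(revisions, list):
        problems.append("revisions is missing or not a list")
        return problems
    for idx, rev in enumerate(revisions):
        missing = REQUIRED_REVISION_KEYS - set(rev)
        if missing:
            problems.append(f"revision #{idx} lacks keys: {sorted(missing)}")
    return problems
```

An empty result means the dump looks structurally sound; it does not verify that the revisions themselves are correct.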

Backporting to legacy per-pkg repositories
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

1 change: 1 addition & 0 deletions docs/install/index.rst
@@ -9,4 +9,5 @@ Installation Guides
install-on-clouds
install-on-premise
install-monitoring-and-logging-tools
upgrade-existing-cluster
env-wsl2
116 changes: 116 additions & 0 deletions docs/install/upgrade-existing-cluster.rst
@@ -0,0 +1,116 @@
Upgrade existing Backend.AI cluster
===================================

.. note::

   Ideally, terminate every workload (including compute sessions) before starting an upgrade.
   A rolling upgrade may cause unexpected side effects.

.. note::

   Unless you know how each component interacts with the others, it is best to keep a single
   version installed across every part of the Backend.AI cluster.


Performing minor upgrade
------------------------

A minor upgrade means upgrading a Backend.AI cluster while keeping the same major version (e.g. 24.03.0 to 24.03.1).
Minor upgrades usually fix critical bugs rather than introduce new features.
In general, changes between minor versions should be trivial and should not affect how users interact with the software.
To plan the upgrade, first check the following:

* Read the release changelog thoroughly.

* Run the minor upgrade consecutively, version by version.

  Do not skip intermediate versions, even when upgrading an outdated cluster.

* Check whether the database schema has changed.

  Although minor versions are meant to carry only trivial changes and it is best to keep the
  database schema stable, in rare situations altering it is inevitable.

* Make sure every mission-critical workload is shut down when performing a rolling upgrade.
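The "version by version" rule can be sketched as a small planning helper. This is an illustrative sketch only: it assumes versions follow the ``YY.MM.patch`` pattern used in the examples here, and that the list of released versions is supplied by the operator. The function names are hypothetical and not part of Backend.AI.

```python
def parse_version(version: str) -> tuple[int, int, int]:
    """Split a 'YY.MM.patch' string such as '24.03.1' into integers."""
    year, month, patch = version.split(".")
    return int(year), int(month), int(patch)


def minor_upgrade_steps(installed: str, target: str, released: list[str]) -> list[str]:
    """List every released version to install, in order, to go from installed to target."""
    cur, dst = parse_version(installed), parse_version(target)
    if cur[:2] != dst[:2]:
        raise ValueError("installed and target versions belong to different major releases")
    if dst <= cur:
        raise ValueError("target must be newer than the installed version")
    candidates = sorted(parse_version(v) for v in released)
    # Keep only versions of the same major release that lie between installed and target.
    steps = [v for v in candidates if v[:2] == cur[:2] and cur < v <= dst]
    return [f"{y:02d}.{m:02d}.{p}" for y, m, p in steps]
```

For example, going from 24.03.0 to 24.03.3 with 24.03.1 and 24.03.2 released in between yields three steps, each of which is installed in turn.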


Upgrading Backend.AI Manager
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

1. Stop the manager process running on the server.
2. Upgrade the Python package by executing ``pip install -U backend.ai-manager==<target version>``.
3. Bring the database schema up to date by executing ``alembic upgrade head``.
4. Restart the process.


Upgrading other Backend.AI components
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

1. Stop the ongoing server process.
2. Upgrade the Python package by executing ``pip install -U backend.ai-<component name>==<target version>``.
3. Restart the process.


Others
~~~~~~

Depending on the situation, additional steps may be required that the system administrator must perform manually.
Always check the release changelog to see whether it calls for any.


Performing major upgrade
------------------------

A major upgrade involves significant feature additions and structural changes.
DO NOT perform a rolling upgrade under any circumstances.
Make sure to shut down every workload in the cluster and notify users of a relatively prolonged downtime.

To plan the upgrade, first check the following:

* Upgrade the Backend.AI cluster to the very latest minor version of the prior release before starting the major version upgrade.

  By policy, a cluster running an outdated minor version must not be upgraded directly to the latest major version.

* Do not skip intermediate major versions.

  Every intermediate ("stop-gap") major version must be installed in turn.


Example of allowed upgrade paths
~~~~~~~~~~~~~~~~~~~~~
* **23.09.10 (latest in the previous major)** -> 24.03.0
* **23.09.10 (latest in the previous major)** -> 24.03.5
* 23.09.9 -> **23.09.10 (latest in the previous major)** -> 24.03.0
* 23.03.11 -> 23.09.0 -> 23.09.1 -> ... -> **23.09.10 (latest in the previous major)** -> 24.03.0
* ...

Example of forbidden upgrade paths
~~~~~~~~~~~~~~~~~~~~~~~
* 23.09.9 (a non-latest minor version of the prior release) -> 24.03.0
* 23.03.0 (not a direct prior release) -> 24.03.0
* ...
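The two rules behind these examples can be expressed as a single check per upgrade hop. This is a sketch under stated assumptions: versions follow the ``YY.MM.patch`` pattern, the first two fields identify the major release, and the operator supplies the list of released versions. ``is_allowed_major_upgrade`` is a hypothetical name, not a Backend.AI API.

```python
def parse_version(version: str) -> tuple[int, int, int]:
    """Split a 'YY.MM.patch' string such as '23.09.10' into integers."""
    year, month, patch = version.split(".")
    return int(year), int(month), int(patch)


def is_allowed_major_upgrade(installed: str, target: str, released: list[str]) -> bool:
    """Check one hop of a major upgrade against the two rules described above."""
    cur, dst = parse_version(installed), parse_version(target)
    versions = [parse_version(v) for v in released]
    majors = sorted({v[:2] for v in versions})
    if cur[:2] not in majors or dst[:2] not in majors:
        return False
    # The target must be the major release directly after the installed one.
    if majors.index(dst[:2]) - majors.index(cur[:2]) != 1:
        return False
    # The installed version must be the latest minor version of its major release.
    latest_patch = max(v[2] for v in versions if v[:2] == cur[:2])
    return cur[2] == latest_patch
```

A multi-hop path such as 23.03.11 -> 23.09.10 -> 24.03.0 is then valid only if every hop either passes this check or is a minor upgrade within one major release.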


Upgrading Backend.AI Manager
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

1. Stop the manager process running on the server.
2. Upgrade the Python package by executing ``pip install -U backend.ai-manager==<target version>``.
3. Bring the database schema up to date by executing ``alembic upgrade head``.
4. Fill in any missing DB revisions by executing ``backend.ai mgr schema apply-missing-revisions <version number of previous Backend.AI software>``.
5. Start the process again.
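The idea behind step 4 is a set difference: revisions shipped with the new release that are absent from the dumped history of the previous release are the ones still to be applied. The sketch below illustrates that comparison on plain data. It is a simplification rather than the command's actual code, which works on Alembic script objects, and the revision IDs in the usage note are made up.

```python
def find_missing_revisions(current_revisions: list[str], dumped_history: dict) -> list[str]:
    """Return revisions of the current tree that the dumped history does not contain.

    The order of ``current_revisions`` is preserved in the result.
    """
    applied = {rev["revision"] for rev in dumped_history["revisions"]}
    return [rev for rev in current_revisions if rev not in applied]
```

For instance, with a current tree of ``["a1", "b2", "c3", "d4"]`` and a dump listing ``a1`` and ``c3``, the missing revisions are ``b2`` and ``d4``. Passing ``--dry-run`` to the real command reports the unapplied revisions without touching the database.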


Upgrading other Backend.AI components
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

1. Stop the ongoing server process.
2. Upgrade the Python package by executing ``pip install -U backend.ai-<component name>==<target version>``.
3. Restart the process.


Others
~~~~~~

Depending on the situation, additional steps may be required that the system administrator must perform manually.
Always check the release changelog to see whether it calls for any.
135 changes: 132 additions & 3 deletions src/ai/backend/manager/cli/dbschema.py
@@ -1,17 +1,22 @@
from __future__ import annotations

import asyncio
import importlib.resources
import json
import logging
from typing import TYPE_CHECKING
import sys
from typing import TYPE_CHECKING, TypedDict

import click
from alembic import command
from alembic.config import Config
from alembic.runtime.migration import MigrationContext
from alembic.script import ScriptDirectory
from alembic.runtime.environment import EnvironmentContext
from alembic.runtime.migration import MigrationContext, MigrationStep
from alembic.script import Script, ScriptDirectory
from sqlalchemy.engine import Connection, Engine

from ai.backend.common.logging import BraceStyleAdapter
from ai.backend.manager import __version__

from ..models.alembic import invoked_programmatically
from ..models.base import metadata
@@ -23,6 +28,20 @@
log = BraceStyleAdapter(logging.getLogger(__spec__.name)) # type: ignore[name-defined]


class RevisionDump(TypedDict):
    down_revision: str | None
    revision: str
    is_head: bool
    is_branch_point: bool
    is_merge_point: bool
    doc: str


class RevisionHistory(TypedDict):
    manager_version: str
    revisions: list[RevisionDump]


@click.group()
def cli(args) -> None:
    pass
@@ -62,6 +81,116 @@ async def _show(sa_url: str) -> None:
    asyncio.run(_show(sa_url))


@cli.command()
@click.option(
    "-f",
    "--alembic-config",
    default="alembic.ini",
    type=click.Path(exists=True, dir_okay=False),
    metavar="PATH",
    help="The path to Alembic config file. [default: alembic.ini]",
)
@click.option(
    "--output",
    "-o",
    default="-",
    type=click.Path(dir_okay=False, writable=True),
    help="Output file path (default: stdout)",
)
@click.pass_obj
def dump_history(cli_ctx: CLIContext, alembic_config: str, output: str) -> None:
    """Dump the current Alembic history in a serializable format."""

    alembic_cfg = Config(alembic_config)
    script = ScriptDirectory.from_config(alembic_cfg)
    serialized_revisions = []

    # Walk every revision from the heads down to the base and record its metadata.
    for sc in script.walk_revisions(base="base", head="heads"):
        revision_dump = RevisionDump(
            down_revision=sc._format_down_revision() if sc.down_revision else None,
            revision=sc.revision,
            is_head=sc.is_head,
            is_branch_point=sc.is_branch_point,
            is_merge_point=sc.is_merge_point,
            doc=sc.doc,
        )
        serialized_revisions.append(revision_dump)

    dump = RevisionHistory(manager_version=__version__, revisions=serialized_revisions)

    # "-" means standard output.
    if output == "-" or output is None:
        print(json.dumps(dump, ensure_ascii=False, indent=2))
    else:
        with open(output, mode="w") as fw:
            fw.write(json.dumps(dump, ensure_ascii=False, indent=2))


@cli.command()
@click.argument("previous_version", type=str, metavar="VERSION")
@click.option(
    "-f",
    "--alembic-config",
    default="alembic.ini",
    type=click.Path(exists=True, dir_okay=False),
    metavar="PATH",
    help="The path to Alembic config file. [default: alembic.ini]",
)
@click.option(
    "--dry-run",
    default=False,
    is_flag=True,
    help="When specified, this command only reports unapplied revisions without actually applying them to the database.",
)
@click.pass_obj
def apply_missing_revisions(
    cli_ctx: CLIContext, previous_version: str, alembic_config: str, dry_run: bool
) -> None:
    """
    Compare the current Alembic revision paths with the given serialized
    Alembic revision history and try to execute every missing revision.
    """
    # Load the revision history dump bundled for the given previous version.
    with importlib.resources.as_file(
        importlib.resources.files("ai.backend.manager.models.alembic.revision_history")
    ) as f:
        try:
            with open(f / f"{previous_version}.json", "r") as fr:
                revision_history: RevisionHistory = json.loads(fr.read())
        except FileNotFoundError:
            log.error(
                "Could not find revision history dump as of Backend.AI version {}. Make sure you have upgraded this Backend.AI cluster to the very latest version of the prior major release before initiating this major upgrade.",
                previous_version,
            )
            sys.exit(1)

    alembic_cfg = Config(alembic_config)
    script_directory = ScriptDirectory.from_config(alembic_cfg)
    revisions_to_apply: dict[str, Script] = {}

    # Start from every revision in the current tree...
    for sc in script_directory.walk_revisions(base="base", head="heads"):
        revisions_to_apply[sc.revision] = sc

    # ...and drop those already present in the previous release's history.
    for applied_revision in revision_history["revisions"]:
        del revisions_to_apply[applied_revision["revision"]]

    # walk_revisions() yields newest first, so reverse to apply oldest first.
    scripts = list(revisions_to_apply.values())[::-1]

    if not scripts:
        # Nothing to apply; this also avoids using an unbound loop variable below.
        log.info("No missing revisions to apply.")
        return

    log.info("Applying following revisions:")
    for script_to_apply in scripts:
        log.info(" {}", str(script_to_apply))

    if not dry_run:
        with EnvironmentContext(
            alembic_cfg,
            script_directory,
            fn=lambda rev, con: [
                MigrationStep.upgrade_from_script(script_directory.revision_map, script_to_apply)
                for script_to_apply in scripts
            ],
            destination_rev=scripts[-1].revision,
        ):
            script_directory.run_env()


@cli.command()
@click.option(
    "-f",
1 change: 1 addition & 0 deletions src/ai/backend/manager/models/alembic/BUILD
@@ -15,5 +15,6 @@ resources(
    name="resources",
    sources=[
        "script.py.mako",
        "revision_history/*.json",
    ],
)