Vitess v18.0.0
Release of Vitess v18.0.0
Summary
Table of Contents
- Major Changes
Major Changes
Breaking Changes
Local examples now use etcd v3 storage and API
In previous releases the local examples were
explicitly using etcd v2 storage (etcd --enable-v2=true
) and API (ETCDCTL_API=2
) mode. We have now
removed this legacy etcd usage and instead use the new (default) etcd v3 storage and API. Please see
PR #13791 for details. If you are using the local
examples in any sort of long-term non-testing capacity, then you will need to explicitly use the v2 storage
and API mode or migrate your existing data from v2 to v3.
New command line flags and behavior
VTOrc flag --allow-emergency-reparent
VTOrc has a new flag --allow-emergency-reparent
that specifies whether VTOrc is allowed to run emergency
failover operations. Users that want VTOrc to fix replication issues, but don't want it to run any failovers
should use this flag. This flag defaults to true
which corresponds to the default behavior from prior releases.
VTOrc flag --change-tablets-with-errant-gtid-to-drained
VTOrc has a new flag --change-tablets-with-errant-gtid-to-drained
that allows users to choose whether VTOrc should change the
tablet type of tablets with errant GTIDs to DRAINED
. By default, this flag is disabled.
This feature allows users to configure VTOrc such that any tablet that encounters errant GTIDs is automatically taken out of the
serving graph. These tablets can then be inspected for what the errant GTIDs are, and once fixed, they can rejoin the cluster.
ERS sub flag --wait-for-all-tablets
vtctldclient command EmergencyReparentShard
has a new sub-flag --wait-for-all-tablets
that makes EmergencyReparentShard
wait
for a response from all the tablets. Originally EmergencyReparentShard
was meant only to be run when a primary tablet is unreachable.
We have realized now that there are cases when replication is broken but all tablets are reachable. In these cases, it is advisable to
call EmergencyReparentShard
with --wait-for-all-tablets
so that it does not ignore any of the tablets.
VTGate GRPC stream execute session flag --grpc-send-session-in-streaming
This flag enables transaction support on VTGate's StreamExecute
gRPC API.
When this is enabled, StreamExecute
will return the session in the last packet of the response.
Users should enable this flag only after client code has been changed to expect such a packet.
It is disabled by default.
Experimental Foreign Key Support
A new optional field foreignKeyMode
has been added to the VSchema. This field can be provided for each keyspace. The VTGate flag --foreign_key_mode
has been deprecated in favor of this field.
There are 3 foreign key modes now supported in Vitess -
unmanaged
-
This mode represents the default behavior in Vitess, where it does not manage foreign key column references. Users are responsible for configuring foreign keys in MySQL in such a way that related rows, as determined by foreign keys, reside within the same shard.managed
[EXPERIMENTAL] -
In this experimental mode, Vitess is fully aware of foreign key relationships and actively tracks foreign key constraints using the schema tracker. VTGate will handle DML operations with foreign keys and correctly cascade updates and deletes.
It will also verifyrestrict
constraints and validate the existence of parent rows before inserting child rows.
This ensures that all child operations are logged in binary logs, unlike the InnoDB implementation of foreign keys.
This allows the usage of VReplication workflows with foreign keys.
Implementation details are documented in the RFC for foreign keys.disallow
-
In this mode Vitess explicitly disallows any DDL statements that try to create a foreign key constraint. This mode is equivalent to running VTGate with the flag--foreign_key_mode=disallow
.
In addition to query support, there is a new flag to MoveTables
called --atomic-copy
which should be used to import data into Vitess from databases which have foreign keys defined in the schema.
Upgrade process
After upgrading from v17 to v18, users should specify the correct foreign key mode for all their keyspaces in the VSchema using the new property.
Once this change has taken effect, the deprecated flag --foreign_key_mode
can be dropped from all VTGates. Note that this is only required if running in disallow
mode.
No action is needed to use unmanaged
mode.
VTAdmin
vtadmin-web updated to node v18.16.0 (LTS)
Building vtadmin-web
now requires node >= v18.16.0 (LTS). Breaking changes from v16 to v18 are listed
in https://nodejs.org/en/blog/release/v18, but none apply to VTAdmin. Full details on node v18.16.0 are listed
on https://nodejs.org/en/blog/release/v18.16.0.
Deprecations and Deletions
Legacy Client Binaries
vtctldclient
is our new modern Vitess controller daemon (vtctld
) client – which you will use to perform commands
and take actions in your Vitess clusters. It is replacing the legacy vtctl
/vtctlclient
binaries.
Some of the benefits are:
- Dedicated RPCs for each command
that are used betweenvtctldclient
andvtctld
– this offers clean separation of commands and makes it easier to
develop new features without impacting other commands. This also presents an API that other clients (both Vitess and
3rd party) can use to interface with Vitess. - Use of modern frameworks:
pFlag
,Cobra
, andViper
.
This makes development easier while also offering a better UX. For example, this offers a way to use
configuration files with support for
dynamic configuration (see also). - The reference documentation is now built through code. This
removes a burden from developers while helping users by ensuring the docs are always correct and up-to-date.
In Vitess 18 we have completed migrating all client commands to vtctldclient
– the last ones being the OnlineDDL
and VReplication commands. With this work now completed, the
legacy vtctl
/vtctlclient
binaries are now fully deprecated and we plan to remove them in Vitess 19. You should
begin your transition before upgrading to Vitess 18.
Deprecated Command Line Flags
Throttler related vttablet
flags:
--throttle_threshold
is deprecated and will be removed inv19
--throttle_metrics_query
is deprecated and will be removed inv19
--throttle_metrics_threshold
is deprecated and will be removed inv19
--throttle_check_as_check_self
is deprecated and will be removed inv19
--throttler-config-via-topo
is deprecated after assumedtrue
inv17
. It will be removed in a future version.
Cache related vttablet
flags:
--queryserver-config-query-cache-lfu
is deprecated and will be removed inv19
. The query cache always uses a LFU implementation now.--queryserver-config-query-cache-size
is deprecated and will be removed inv19
. This option only applied to LRU caches, which are now unsupported.
Buffering related vtgate
flags:
--buffer_implementation
is deprecated and will be removed inv19
Cache related vtgate
flags:
--gate_query_cache_lfu
is deprecated and will be removed inv19
. The query cache always uses a LFU implementation now.--gate_query_cache_size
is deprecated and will be removed inv19
. This option only applied to LRU caches, which are now unsupported.
VTGate flags:
--schema_change_signal_user
is deprecated and will be removed inv19
--foreign_key_mode
is deprecated and will be removed inv19
. For more detail read the foreign keys section.
VDiff v1:
VDiff v2 was added in Vitess 15 and marked as GA in 16.
The legacy v1 client command is now deprecated in Vitess 18 and will be removed in Vitess 19.
Please switch all of your usage to the new VDiff client command ASAP.
Deprecated Stats
The following EmergencyReparentShard
stats are deprecated in v18
and will be removed in v19
:
ers_counter
ers_success_counter
ers_failure_counter
These metrics are replaced by new reparenting stats introduced in v18
.
VTBackup stat DurationByPhase
is deprecated. Use the binary-valued Phase
stat instead.
Deleted Command Line Flags
Flags in vtcombo
:
--vtctld_addr
Flags in vtctldclient ApplySchema
:
--skip-preflight
Flags in vtctl ApplySchema
:
--skip_preflight
Flags in vtgate
:
--vtctld_addr
Flags in vttablet
:
--vtctld_addr
--use_super_read_only
--disable-replication-manager
--init_populate_metadata
--queryserver-config-pool-prefill-parallelism
--queryserver-config-stream-pool-prefill-parallelism
--queryserver-config-transaction-pool-prefill-parallelism
--queryserver-config-schema-change-signal-interval
--enable-lag-throttler
Flags in vtctld
:
--vtctld_show_topology_crud
--durability_policy
Flags in vtorc
:
--lock-shard-timeout
--orc_web_dir
Deleted v3
planner
The Gen4
planner has been the default planner since Vitess 14. The v3
planner was deprecated in Vitess 15 and has been removed in Vitess 18.
Deleted k8stopo
k8stopo
was deprecated in Vitess 17, see #13298. It has now been removed.
Deleted vtgr
vtgr
was deprecated in Vitess 17, see #13300. It has now been removed.
Deleted query_analyzer
The undocumented query_analyzer
binary has been removed in Vitess 18, see #14054.
New stats
VTGate Vindex unknown parameters
The VTGate stat VindexUnknownParameters
gauges unknown Vindex parameters found in the latest VSchema pulled from the topology.
VTBackup Phase
stat
In v17, the vtbackup
stat DurationByPhase
stat was added to measure the time spent by vtbackup
in each phase. This stat turned out to be awkward to use in production, and has been replaced in v18 by a binary-valued Phase
stat.
Phase
reports a 1 (active) or a 0 (inactive) for each of the following phases:
CatchupReplication
InitialBackup
RestoreLastBackup
TakeNewBackup
To calculate how long vtbackup
has spent in a given phase, sum the 1-valued data points over time and multiply by the data collection or reporting interval. For example, in Prometheus:
sum_over_time(vtbackup_phase{phase="TakeNewBackup"}) * <interval>
VTBackup PhaseStatus
stat
PhaseStatus
reports a 1 (active) or a 0 (inactive) for each of the following phases and statuses:
CatchupReplication
phase has statusesStalled
andStopped
.Stalled
is set to1
when replication stops advancing.Stopped
is set to1
when replication stops beforevtbackup
catches up with the primary.
Backup and restore metrics for AWS S3
Requests to AWS S3 are instrumented in backup and restore metrics. For example:
vtbackup_backup_count{component="BackupStorage",implementation="S3",operation="AWS:Request:Send"} 823
vtbackup_backup_duration_nanoseconds{component="BackupStorage",implementation="S3",operation="AWS:Request:Send"} 1.33632421437e+11
vtbackup_restore_count{component="BackupStorage",implementation="S3",operation="AWS:Request:Send"} 165
vtbackup_restore_count{component="BackupStorage",implementation="S3",operation="AWS:Request:Send"} 165
VTCtld and VTOrc reparenting stats
New VTCtld and VTOrc stats were added to measure frequency of reparents by keyspace/shard:
emergency_reparent_counts
- Number of timesEmergencyReparentShard
has been run. It is further subdivided by the keyspace, shard and the result of the operation.planned_reparent_counts
- Number of timesPlannedReparentShard
has been run. It is further subdivided by the keyspace, shard and the result of the operation.
Also, the reparent_shard_operation_timings
stat was added to provide per-operation timings of reparent operations.
VTTablet
New ResetSequences rpc
A new VTTablet RPC ResetSequences
has been added, which is being used by MoveTables
and Migrate
for workflows
where a sequence
table is being moved (#13238). This has an impact on the
Vitess upgrade process from an earlier version if you need to use such a workflow before the entire cluster is upgraded.
Any MoveTables or Migrate workflow that moves a sequence table should only be run after all vitess components have been
upgraded, and no upgrade should be done while such a workflow is in progress.
New Dry-run/monitoring-only mode for the transaction throttler
A new CLI flag --tx-throttler-dry-run
to set the Transaction Throttler to monitoring-only/dry-run mode has been added.
If the transaction throttler is enabled with --enable-tx-throttler
and the new dry-run flag is also specified, the
tablet will not actually throttle any transactions; however, it will increase the counters for transactions throttled
(vttablet_transaction_throttler_throttled
). This allows users to deploy the transaction throttler in production and
gain observability on how much throttling would take place, without actually throttling any requests.
Docker Builds
Bookworm added and made default
Bookworm was released on 2023-06-10, and will be the new default base container for Docker builds.
Bullseye images will still be built and available as long as the OS build is current, tagged with the -bullseye
suffix.
Buster removed
Buster LTS supports will stop in June 2024, and Vitess 18 will be supported through October 2024.
To prevent supporting a deprecated buster build for several months after June 2024, we are preemptively
removing Vitess support for Buster.
Durability Policies
New Durability Policies
Two new built-in durability policies have been added in Vitess 18: semi_sync_with_rdonly_ack
and cross_cell_with_rdonly_ack
.
These policies are similar to semi_sync
and cross_cell
respectively, the only difference is that rdonly
tablets can also send semi-sync ACKs.
The entire changelog for this release can be found here.
The release includes 420 merged Pull Requests.
Thanks to all our contributors: @GuptaManan100, @Juneezee, @L3o-pold, @adsr, @ajm188, @app/dependabot, @app/github-actions, @app/vitess-bot, @arvind-murty, @austenLacy, @brendar, @davidpiegza, @dbussink, @deepthi, @derekperkins, @ejortegau, @frouioui, @harshit-gangal, @hkdsun, @jfg956, @jspawar, @mattlord, @maxenglander, @mdlayher, @notfelineit, @olyazavr, @pbibra, @peterlyoo, @rafer, @rohit-nayak-ps, @shlomi-noach, @systay, @timvaillancourt, @vmg, @yields