docs: update what is database schema drift
tianzhou committed Jan 21, 2024
1 parent 51afbb9 commit f7a9624
Showing 1 changed file with 16 additions and 7 deletions.
23 changes: 16 additions & 7 deletions content/blog/what-is-database-schema-drift.md
@@ -1,7 +1,7 @@
---
title: What is Database Schema Drift?
author: Tianzhou
published_at: 2022/01/29 12:29:20
published_at: 2023/12/20 12:29:20
feature_image: /content/blog/what-is-database-schema-drift/database-schema-drift.webp
tags: Explanation
description: Database schema drift is the case where the actual schema in the live database is different from the source of truth. It's also one of the most frequent root causes of database-related outages.
@@ -18,20 +18,21 @@ This is a series of articles about database version control and database-as-code

---

Database schema drift or just schema drift is the case where the actual schema in the live database (the actual state) is different from the source of truth (the desired state). It's also one of the most frequent root causes of the database related outages.
**Database schema drift**, or just **schema drift**, is the case where the actual schema in the live database (the actual state) is different from the source of truth (the desired state). It's also one of the most frequent root causes of database-related outages.

The **live database part** is easy to understand:

- For MySQL, it's the output of  *mysqldump --no-data*
- For PostgreSQL, it's the output of _pg_dump --schema-only_
- For MySQL, it's the output of `mysqldump --no-data`
- For PostgreSQL, it's the output of `pg_dump --schema-only`

While the **source of truth part** is more complex...

## The source of truth

One may first wonder why we need to keep a separate source of truth. The reason is that the schema in the live database may not always be the desired state: human error or software bugs can accidentally change the database schema. So it's better to have a separate source of truth. This idea is similar to the classic [double-entry bookkeeping](https://en.wikipedia.org/wiki/Double-entry_bookkeeping) used in accounting.
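
To make the comparison concrete, here is a minimal sketch of drift detection. It is an illustration only: it uses an in-memory SQLite database as a stand-in for a live MySQL or PostgreSQL instance, and the names `SOURCE_OF_TRUTH`, `dump_schema` and `detect_drift` are made up for this example rather than taken from any tool.

```python
import difflib
import sqlite3
from typing import List

# Hypothetical source of truth, as it might be stored in version control.
SOURCE_OF_TRUTH = "CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT NOT NULL);"


def dump_schema(conn: sqlite3.Connection) -> List[str]:
    """Return the DDL statements of a database, ordered by object name."""
    rows = conn.execute(
        "SELECT sql FROM sqlite_master WHERE sql IS NOT NULL ORDER BY name"
    )
    return [sql for (sql,) in rows]


def detect_drift(source_of_truth: str, live: sqlite3.Connection) -> List[str]:
    """Diff the desired schema against the live one; an empty list means no drift."""
    # Load the source of truth into a scratch database so that both sides
    # are dumped by the same code path and compare cleanly.
    desired = sqlite3.connect(":memory:")
    desired.executescript(source_of_truth)
    return list(
        difflib.unified_diff(
            dump_schema(desired),
            dump_schema(live),
            fromfile="source_of_truth",
            tofile="live_database",
            lineterm="",
        )
    )


live_db = sqlite3.connect(":memory:")
live_db.executescript(SOURCE_OF_TRUTH)
no_drift = detect_drift(SOURCE_OF_TRUTH, live_db)  # schemas match

# A manual change that bypasses version control introduces drift.
live_db.execute("ALTER TABLE users ADD COLUMN nickname TEXT")
drift = detect_drift(SOURCE_OF_TRUTH, live_db)
print("\n".join(drift))
```

Against a real server, the live side would come from a schema-only dump such as `pg_dump --schema-only`, and the diff would likely need to ignore cosmetic differences like comments and statement ordering.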

Naturally, a good place to store this source of truth is the version control system (VCS), the same place where the application code is stored. This is known as [database-as-code](/blog/database-as-code), a GitOps practice. Solutions like Liquibase, Flyway and etc all support this approach. Bytebase also supports this and even go a step further to provide point-and-click UI to configure this [VCS integration](/docs/vcs-integration/overview).
Naturally, a good place to store this source of truth is the version control system (VCS), the same place where the application code is stored. This is known as database-as-code, a GitOps practice. Solutions like Liquibase and Flyway support this approach. Bytebase also supports it and goes a step further by providing a point-and-click UI to configure it.

![_](/content/blog/what-is-database-schema-drift/project-vcs.webp)

## The format of source of truth
@@ -41,7 +42,7 @@ After we figure out where to store the source of truth, next we need to decide w
- The state-based approach stores the desired end state of the entire schema in the code repository.
- The migration-based approach stores the migration scripts in the repository. Each script contains a set of DDL statements such as CREATE/ALTER/DROP TABLE. The desired schema state is achieved by executing each of those scripts in a deterministic order.
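
The two formats can be sketched side by side. This is an illustrative example, not how any particular tool works: it uses SQLite from the Python standard library, the file names and schema are hypothetical, and "deterministic order" is assumed to mean sorting scripts by a numeric file name prefix.

```python
import sqlite3
from typing import Dict, List

# State-based: a single definition of the desired end state (hypothetical schema).
STATE = "CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, nickname TEXT);"

# Migration-based: versioned scripts, listed here out of order on purpose.
MIGRATIONS: Dict[str, str] = {
    "0002_add_nickname.sql": "ALTER TABLE users ADD COLUMN nickname TEXT;",
    "0001_create_users.sql": "CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT);",
}


def apply_migrations(conn: sqlite3.Connection, migrations: Dict[str, str]) -> List[str]:
    """Run every script in file name order and return the order used."""
    order = sorted(migrations)
    for name in order:
        conn.executescript(migrations[name])
    return order


def columns(conn: sqlite3.Connection, table: str) -> List[str]:
    """Return the column names of a table."""
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]


state_db = sqlite3.connect(":memory:")
state_db.executescript(STATE)

migrated_db = sqlite3.connect(":memory:")
applied = apply_migrations(migrated_db, MIGRATIONS)

# Both routes end at the same schema.
same_schema = columns(state_db, "users") == columns(migrated_db, "users")
print(applied, same_schema)
```

The ordering matters: running `0002_add_nickname.sql` before `0001_create_users.sql` would fail because the table would not exist yet.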

State-based approach is a more intuitive format since a single file corresponds to a database schema. However, state-based approach has its limitation and is hard to get right in all scenarios (check [state-based or migration-based](/blog/database-version-control-state-based-vs-migration-based) post for details). That's why Liquibase, Flyway as well as Bytebase all choose migration-based approach.
The state-based approach is more intuitive, since a single file corresponds to a database schema. However, it has limitations. For example, it can't distinguish a table rename from a drop followed by a create. Bytebase supports both approaches.
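
The rename ambiguity can be shown with a deliberately naive state diff. This is a sketch under simplifying assumptions: a schema is modeled as a mapping from table name to column definitions, and `plan_changes` is a hypothetical helper. Real state-based tools may apply heuristics or ask for hints that this example leaves out.

```python
from typing import Dict, List

Schema = Dict[str, str]  # table name -> column definitions


def plan_changes(actual: Schema, desired: Schema) -> List[str]:
    """Derive DDL from two end states by comparing table names only."""
    dropped = sorted(actual.keys() - desired.keys())
    created = sorted(desired.keys() - actual.keys())
    return [f"DROP TABLE {t};" for t in dropped] + [
        f"CREATE TABLE {t} ({desired[t]});" for t in created
    ]


actual = {"users": "id INTEGER PRIMARY KEY, email TEXT"}

# Intent A: rename users to accounts and keep the rows.
# Intent B: drop users and create an empty accounts table.
# Both intents are written down as the same desired end state:
desired = {"accounts": "id INTEGER PRIMARY KEY, email TEXT"}

plan = plan_changes(actual, desired)
print(plan)
```

Because the end state carries no record of intent, the plan contains a drop and a create, which would discard the existing rows if the author meant a rename. A migration script avoids the ambiguity by stating the operation explicitly, for example `ALTER TABLE users RENAME TO accounts;`.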

## Schema drift detection

@@ -62,4 +63,12 @@ The detailed drift👇

Database schema drift is one of the most frequent root causes of database-related outages. Recording the desired database schema state in a version control system like GitLab is the first stepping stone to tackling it. Bytebase goes the extra mile to present an easy-to-use UI for users to adopt this practice and surfaces detailed schema drift info once detected.

If this interests you, please do check out our [demo](/view-live-demo) or use 1 line command to [deploy yourself](/docs/get-started/self-host/#docker).
If this interests you, please do check out our [demo](/view-live-demo) or use a one-liner to [deploy it yourself](/docs/get-started/self-host/#docker).

![change-query-secure-govern-database-all-in-one](/images/db-scheme-lg.png)

## Further Readings

- [Database-as-Code](/blog/database-as-code)
- [Bytebase Database-as-Code VCS integration](/docs/vcs-integration/overview)
- [Database Version Control, State-based or Migration-based?](/blog/database-version-control-state-based-vs-migration-based)
