add doc for schema cache #18646

Merged
merged 22 commits into from
Oct 15, 2024
Changes from 4 commits
1 change: 1 addition & 0 deletions TOC.md
@@ -1026,6 +1026,7 @@
- [`schema_unused_indexes`](/sys-schema/sys-schema-unused-indexes.md)
- [Metadata Lock](/metadata-lock.md)
- [TiDB Accelerated Table Creation](/accelerated-table-creation.md)
- [Schema Cache](/schema-cache.md)
- UI
- TiDB Dashboard
- [Introduction](/dashboard/dashboard-intro.md)
32 changes: 32 additions & 0 deletions schema-cache.md
@@ -0,0 +1,32 @@
---
title: Schema Cache
aliases: ['/docs-cn/dev/information-schema-cache']
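summary: TiDB caches schema information. In scenarios with a large number of databases and tables, this mechanism significantly reduces the memory used by schema information and improves performance.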
---

# Schema Cache

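In some multi-tenant scenarios, there might be hundreds of thousands or even millions of databases and tables. Loading the schema information of all these databases and tables into memory would consume a large amount of memory and degrade the performance of related accesses. To address this problem, TiDB introduces a schema cache mechanism. It uses an LRU-like policy to keep only the schema information of recently used databases and tables in memory.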

## Configuration

You can enable the schema cache feature by configuring the system variable [`tidb_schema_cache_size`](/system-variables.md#tidb_schema_cache_size-从-v800-版本开始引入).
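
As a minimal sketch of this setting, the following statements set the variable at the global scope and then read it back. The value 536870912 (512 MiB, assuming the variable is expressed in bytes) is only an illustrative choice, not a recommendation from this document. Pick a size that fits your workload.

```sql
-- Illustrative value only: 536870912 bytes = 512 MiB (assumed unit: bytes).
SET GLOBAL tidb_schema_cache_size = 536870912;

-- Read the global value back to confirm that the change took effect.
SHOW GLOBAL VARIABLES LIKE 'tidb_schema_cache_size';
```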

## Best practices

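- In scenarios with a large number of databases and tables (for example, more than 100,000 databases and tables in total), or when the number of databases and tables is large enough to affect system performance, it is recommended to enable the schema cache feature.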
- You can check the hit rate of the schema cache in the Infoschema v2 Cache Operation sub-panel under Schema load in TiDB monitoring. If the hit rate is low, increase [`tidb_schema_cache_size`](/system-variables.md#tidb_schema_cache_size-从-v800-版本开始引入).
- You can check the size of the schema cache currently in use in the Infoschema v2 Cache Size sub-panel under Schema load in TiDB monitoring.
Review comment (Contributor): Should we add a screenshot here?

Reply (Member Author): Probably not needed. Users who are familiar with the monitoring panels can find it on their own.

- It is recommended to disable [`performance.force-init-stats`](/tidb-configuration-file.md#force-init-stats-从-v657-和-v710-版本开始引入) to reduce TiDB startup time.
- It is recommended to disable [`split-table`](/tidb-configuration-file.md#split-table) to reduce the number of Regions, which lowers TiKV memory usage.
Review comment (Collaborator): Suggest adding the scenario: "If you need to create a large number of tables (for example, more than 100,000), it is recommended to set this parameter to false."

Reply (Member Author): Could you make the change using suggestion mode?
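
Before changing the two configuration items recommended in the list above, it can help to check their current values on each TiDB instance. The sketch below uses `SHOW CONFIG`. The item names are assumed to appear as `performance.force-init-stats` and `split-table` in its output, so adjust the filter if your version reports them differently.

```sql
-- Assumption: SHOW CONFIG reports nested configuration items in dotted form.
SHOW CONFIG WHERE type = 'tidb'
  AND (name = 'performance.force-init-stats' OR name = 'split-table');
```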

## Known limitations

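- In scenarios with a large number of databases and tables, statistics might not be collected in time.
- In scenarios with a large number of databases and tables, access to some metadata becomes slower.
- In scenarios with a large number of databases and tables, TiDB takes longer to start. Enabling the schema cache mitigates this problem.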
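- In scenarios with a large number of databases and tables, switching the schema cache on or off takes some time to complete.
- In scenarios with a large number of databases and tables, operations that enumerate all metadata become slower, for example:
    - `SHOW FULL TABLES`
    - `FLASHBACK`
    - `ALTER TABLE ... SET TIFLASH MODE ...`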