update TTL doc to allow split tasks for utf8 column #18713

Merged
merged 5 commits into from
Oct 16, 2024
Changes from 3 commits
11 changes: 10 additions & 1 deletion time-to-live.md
@@ -257,7 +257,16 @@
* The TTL attribute cannot be set on temporary tables, including local temporary tables and global temporary tables.
* A table with the TTL attribute does not support being referenced by other tables as the primary table in a foreign key constraint.
* It is not guaranteed that all expired data is deleted immediately. The time when expired data is deleted depends on the scheduling interval and scheduling window of the background cleanup job.
* For tables that use [clustered indexes](/clustered-indexes.md), if the primary key is neither an integer nor a binary string type, the TTL job cannot be split into multiple tasks. This will cause the TTL job to be executed sequentially on a single TiDB node. If the table contains a large amount of data, the execution of the TTL job might become slow.
* For tables that use [clustered indexes](/clustered-indexes.md), a TTL job can be split into multiple subtasks only in the following scenarios:
    - The first column of the primary key or composite primary key is of the `INTEGER` type or a binary string type. The binary string types mainly refer to the following:
        - `CHAR(N) CHARACTER SET BINARY`
        - `VARCHAR(N) CHARACTER SET BINARY`
        - `BINARY(N)`
        - `VARBINARY(N)`
        - `BIT(N)`
    - The character set of the first column of the primary key or composite primary key is `utf8` or `utf8mb4`, and the collation is `utf8_bin`, `utf8mb4_bin`, or `utf8mb4_0900_bin`.
    - For tables where the primary key column uses the `utf8` or `utf8mb4` character set, subtasks are split only based on the range of visible ASCII characters. If many primary key values have the same ASCII prefix, the task splitting might be uneven.

Check warning on line 268 in time-to-live.md (GitHub Actions / vale, reported by reviewdog):
[PingCAP.Ambiguous] Consider using a clearer word than 'many' because it may cause confusion. (severity: INFO; line 268, column 151)

    - For tables that do not support splitting a TTL job into multiple subtasks, the TTL job is executed sequentially on a single TiDB node. If the table contains a large amount of data, the TTL job might run slowly.
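
As an illustration of the conditions above, the following sketch defines two tables with clustered primary keys. The table names, column names, and the 3-month interval are hypothetical and chosen only for this example. Based on the listed conditions, the TTL job for `ttl_splittable` is expected to be eligible for splitting, because the first primary key column uses `utf8mb4` with the `utf8mb4_bin` collation. The TTL job for `ttl_sequential` is expected to run sequentially on a single TiDB node, because the first primary key column uses a non-binary collation.

```sql
-- Hypothetical table: the first (and only) primary key column uses
-- utf8mb4 with the utf8mb4_bin collation, which is one of the listed
-- scenarios where a TTL job can be split into subtasks.
CREATE TABLE ttl_splittable (
    id VARCHAR(64) CHARACTER SET utf8mb4 COLLATE utf8mb4_bin NOT NULL,
    created_at DATETIME NOT NULL,
    PRIMARY KEY (id) CLUSTERED
) TTL = `created_at` + INTERVAL 3 MONTH;

-- Hypothetical table: the primary key column uses utf8mb4_general_ci,
-- which is not one of the listed collations, so the TTL job is expected
-- to run sequentially on a single TiDB node.
CREATE TABLE ttl_sequential (
    id VARCHAR(64) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci NOT NULL,
    created_at DATETIME NOT NULL,
    PRIMARY KEY (id) CLUSTERED
) TTL = `created_at` + INTERVAL 3 MONTH;
```

Even for `ttl_splittable`, if the `id` values share a common ASCII prefix (for example, a fixed `user_` prefix), the subtasks might be uneven, because splitting is based only on the range of visible ASCII characters.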

## FAQs
