system tables: add a note for DATA_LENGTH, INDEX_LENGTH, and TABLE_SIZE #19391
base: master
[APPROVALNOTIFIER] This PR is NOT APPROVED.
I think this should:
- Link to the FAQ about storage size.
- Explain the storage size (possibly on another page):
  - 2a. The size of the table as it currently is.
  - 2b. The size of the table including deleted/updated rows (before they are removed by GC). Maybe link to `garbage-collection-overview.md`.
  - 2c. The size on disk of the various levels in RocksDB, including compression. Maybe link to `compression-per-level`.
- Link from I_S.tables to `TABLE_STORAGE_STATS`.
- Explain how TiFlash storage size affects this.
- Explain how PD's `max-replicas` causes up to 3 copies of the data and how this influences I_S.tables.DATA_LENGTH.
- As storage efficiency is a major advantage of our product, maybe list this on `overview.md`.
- Maybe explain why DATA_FREE is 0 for TiKV. For MySQL/InnoDB, tablespaces never shrink, and one needs to rebuild the table to free up space. This is also an advantage of our product.
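To make the distinction in the points above concrete, here is a rough back-of-the-envelope sketch in Python. It is not how TiDB, TiKV, or PD compute anything. The function name, its parameters, and the formula are all hypothetical, and it assumes, purely for illustration, that the logical estimate counts one uncompressed copy of the live rows. Whether DATA_LENGTH actually includes replicas or stale versions is exactly what the documentation should clarify.

```python
def estimated_on_disk_mib(logical_mib, replicas=3, compression_ratio=1.0,
                          stale_fraction=0.0):
    """Illustrative only: relate a logical size estimate to an on-disk footprint.

    Hypothetical model, not TiDB's actual calculation:
      - logical_mib: uncompressed size of the live rows (assumed meaning).
      - replicas: number of copies kept (for example, PD's max-replicas).
      - compression_ratio: uncompressed size divided by compressed size.
      - stale_fraction: extra data from deleted/updated rows not yet
        removed by GC, as a fraction of logical_mib.
    """
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    if compression_ratio <= 0:
        raise ValueError("compression_ratio must be positive")
    if stale_fraction < 0:
        raise ValueError("stale_fraction must not be negative")
    # Live data plus not-yet-collected versions, copied to every replica,
    # then shrunk by compression.
    return logical_mib * (1.0 + stale_fraction) * replicas / compression_ratio


if __name__ == "__main__":
    # Made-up numbers: 1000 MiB logical, 3 replicas, 2.5x compression,
    # 25% extra data awaiting GC.
    print(estimated_on_disk_mib(1000, replicas=3, compression_ratio=2.5,
                                stale_fraction=0.25))
```

Even in this simplified model the logical figure and the on-disk figure differ by several independent factors, which is why the note should say what the reported value measures and not only that it is an estimate.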
@@ -59,5 +59,5 @@ Fields in the `TABLE_STORAGE_STATS` table are described as follows:
 * `PEER_COUNT`: The number of replicas of the table.
 * `REGION_COUNT`: The number of Regions.
 * `EMPTY_REGION_COUNT`: The number of Regions that do not contain data in this table.
-* `TABLE_SIZE`: The total size of the table, in the unit of MiB.
+* `TABLE_SIZE`: The total size of the table, in the unit of MiB. Note that this value is a logical estimate based on table statistics and does not represent the actual compressed physical size in storage.
The fact that this is based on statistics is good. However, the reason it doesn't represent the "actual compressed physical size in storage" is not that the statistics aren't precise enough. It is that the size in storage measures something different.
So it represents the statistical size of the uncompressed data? With or without changed rows or tombstones?
@mjonss I assume that partitioning and global indexes work as expected when it comes to the storage size explained here?
The result is that the size calculation would fail with this error:
If a custom Prometheus is used, the address might not be configured correctly in pd/etcd. If there is no Prometheus, we may want to see whether any alternatives are available.
@AilinKid: adding LGTM is restricted to approvers and reviewers in OWNERS files.
First-time contributors' checklist
What is changed, added or deleted? (Required)
Add a note for DATA_LENGTH, INDEX_LENGTH, and TABLE_SIZE explaining that they are logical estimates rather than actual physical sizes.
Which TiDB version(s) do your changes apply to? (Required)