system tables: add a note for DATA_LENGTH, INDEX_LENGTH, and TABLE_SIZE #19391

Open
wants to merge 1 commit into
base: master
from


qiancai
Collaborator

@qiancai qiancai commented Nov 13, 2024

First-time contributors' checklist

What is changed, added or deleted? (Required)

Add a note for DATA_LENGTH, INDEX_LENGTH, and TABLE_SIZE to explain they are logical estimates instead of actual physical sizes.

Which TiDB version(s) do your changes apply to? (Required)

Tips for choosing the affected version(s):

By default, CHOOSE MASTER ONLY so your changes will be applied to the next TiDB major or minor releases. If your PR involves a product feature behavior change or a compatibility change, CHOOSE THE AFFECTED RELEASE BRANCH(ES) AND MASTER.

For details, see tips for choosing the affected versions.

  • master (the latest development version)
  • v8.5 (TiDB 8.5 versions)
  • v8.4 (TiDB 8.4 versions)
  • v8.3 (TiDB 8.3 versions)
  • v8.2 (TiDB 8.2 versions)
  • v8.1 (TiDB 8.1 versions)
  • v7.5 (TiDB 7.5 versions)
  • v7.1 (TiDB 7.1 versions)
  • v6.5 (TiDB 6.5 versions)
  • v6.1 (TiDB 6.1 versions)
  • v5.4 (TiDB 5.4 versions)
  • v5.3 (TiDB 5.3 versions)

What is the related PR or file link(s)?

  • This PR is translated from:
  • Other reference link(s):

Do your changes match any of the following descriptions?

  • Delete files
  • Change aliases
  • Need modification after applied to another branch
  • Might cause conflicts after applied to another branch


ti-chi-bot bot commented Nov 13, 2024

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please ask for approval from qiancai, ensuring that each of them provides their approval before proceeding. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ti-chi-bot ti-chi-bot bot added missing-translation-status This PR does not have translation status info. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Nov 13, 2024
@qiancai qiancai added type/enhancement The issue or PR belongs to an enhancement. needs-cherry-pick-release-8.1 Should cherry pick this PR to release-8.1 branch. needs-cherry-pick-release-8.4 Should cherry pick this PR to release-8.4 branch. translation/doing This PR's assignee is translating this PR. labels Nov 13, 2024
@ti-chi-bot ti-chi-bot bot removed the missing-translation-status This PR does not have translation status info. label Nov 13, 2024
@qiancai qiancai self-assigned this Nov 13, 2024
Contributor

@dveeden dveeden left a comment


I think this should:

  1. Link to the FAQ about storage size
  2. Explain the storage size (possibly on another page):
    2a. The size of the table as it is currently.
    2b. The size of the table including deleted/updated rows (before they are removed by GC). Maybe link to garbage-collection-overview.md.
    2c. The size on disk of various levels in RocksDB, including compression. Maybe link to compression-per-level.
  3. Link from I_S.tables to TABLE_STORAGE_STATS
  4. Explain how TiFlash storage size affects this
  5. Explain how PD's max-replicas causes up to 3 copies of the data and how this influences I_S.tables.DATA_LENGTH
  6. As storage efficiency is a major advantage of our product, maybe list this on overview.md
  7. Maybe explain why DATA_FREE is 0 for TiKV. In MySQL/InnoDB, tablespaces never shrink and one needs to rebuild the table to free up space. This is also an advantage of our product.

@@ -59,5 +59,5 @@ Fields in the `TABLE_STORAGE_STATS` table are described as follows:
 * `PEER_COUNT`: The number of replicas of the table.
 * `REGION_COUNT`: The number of Regions.
 * `EMPTY_REGION_COUNT`: The number of Regions that do not contain data in this table.
-* `TABLE_SIZE`: The total size of the table, in the unit of MiB.
+* `TABLE_SIZE`: The total size of the table, in the unit of MiB. Note that this value is a logical estimate based on table statistics and does not represent the actual compressed physical size in storage.
Contributor


Mentioning that this is based on statistics is good. However, the reason the value doesn't represent the "actual compressed physical size in storage" is not that the statistics aren't precise enough. It is that the size in storage measures something different.

Contributor


So it represents the statistical size of the uncompressed data? With or without possibly changed rows or tombstones?
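
When comparing the values discussed in this thread, note that the units differ: `TABLE_SIZE` is documented in MiB, while `DATA_LENGTH` and `INDEX_LENGTH` are assumed here to be in bytes, as in MySQL. The following is a minimal sketch of putting both on the same scale. The helper names `bytes_to_mib` and `compare_sizes` are hypothetical, and the sketch says nothing about replicas, GC, or compression, which are the open questions above.

```python
MIB = 1024 * 1024  # bytes per MiB


def bytes_to_mib(n_bytes: int) -> float:
    """Convert a byte count to MiB."""
    return n_bytes / MIB


def compare_sizes(data_length: int, index_length: int, table_size_mib: float) -> dict:
    """Put I_S.TABLES and TABLE_STORAGE_STATS values on the same MiB scale.

    data_length, index_length: values from INFORMATION_SCHEMA.TABLES
        (assumed to be in bytes, as in MySQL).
    table_size_mib: TABLE_SIZE from TABLE_STORAGE_STATS (documented in MiB).
    """
    tables_mib = bytes_to_mib(data_length + index_length)
    return {
        "tables_mib": tables_mib,
        "storage_stats_mib": table_size_mib,
        "difference_mib": tables_mib - table_size_mib,
    }
```

Both inputs are estimates, so a nonzero difference does not by itself indicate a problem.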

@dveeden dveeden requested a review from mjonss November 13, 2024 06:43
Contributor

dveeden commented Nov 13, 2024

@mjonss I assume that partitioning and global index work as expected when it comes to the storage size explained here?

Contributor

dveeden commented Nov 13, 2024

TIKV_REGION_STATUS seems to depend on Prometheus being available. However, this isn't noted on https://docs.pingcap.com/tidb/stable/information-schema-tikv-region-status

The result is that size calculation would fail with this error:

ERROR 1105 (HY000): query metric error: [domain:9009]Prometheus address is not set in PD and etcd

If a custom Prometheus is used, the address might not be configured correctly in pd/etcd. If there is no Prometheus, we may want to see if there are any alternatives available.
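
As a rough illustration, client-side tooling could recognize this specific failure and report it separately from other query errors. This is a sketch only: the function name `is_missing_prometheus_error` is hypothetical, and matching on the message text assumes the wording of the error quoted above stays the same.

```python
# Message fragment taken from the error quoted above.
PROMETHEUS_NOT_SET = "Prometheus address is not set in PD and etcd"


def is_missing_prometheus_error(errno: int, message: str) -> bool:
    """Return True if a MySQL-protocol error looks like the missing-Prometheus case.

    errno: the numeric error code (1105 in the report above).
    message: the error text returned by the server.
    """
    return errno == 1105 and PROMETHEUS_NOT_SET in message
```

Error 1105 is a generic code, so the message check is what actually narrows this down.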


ti-chi-bot bot commented Nov 14, 2024

@AilinKid: adding LGTM is restricted to approvers and reviewers in OWNERS files.

In response to this:

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@qiancai qiancai added the needs-cherry-pick-release-8.5 Should cherry pick this PR to release-8.5 branch. label Dec 3, 2024
Labels
needs-cherry-pick-release-8.1 Should cherry pick this PR to release-8.1 branch. needs-cherry-pick-release-8.4 Should cherry pick this PR to release-8.4 branch. needs-cherry-pick-release-8.5 Should cherry pick this PR to release-8.5 branch. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. translation/doing This PR's assignee is translating this PR. type/enhancement The issue or PR belongs to an enhancement.

4 participants