[docs](stats) Update statistics related content apache#21766
1. Update grammar of `ANALYZE`
2. Add a description of the command for deleting an analyze job
Kikyou1997 authored Jul 13, 2023
1 parent e167394 commit 06d129c
Showing 2 changed files with 32 additions and 16 deletions.
24 changes: 16 additions & 8 deletions docs/en/docs/query-acceleration/statistics.md
@@ -187,7 +187,7 @@ mysql> ANALYZE TABLE stats_test.example_tbl(city, age, sex);

##### Collect histogram information

Column histogram information describes how the values of a column are distributed. It divides the data into several intervals (buckets) by value and uses simple statistics to represent the characteristics of the data in each interval. It is collected with the `ANALYZE TABLE` statement combined with `UPDATE HISTOGRAM`.
Column histogram information describes how the values of a column are distributed. It divides the data into several intervals (buckets) by value and uses simple statistics to represent the characteristics of the data in each interval. It is collected with the `ANALYZE TABLE` statement combined with `WITH HISTOGRAM`.

Columns can be specified to collect their histogram information in the same way that normal statistics are collected. Collecting histogram information takes longer than normal statistics, so to reduce overhead, we can just collect histogram information for specific columns for the optimizer to use.

@@ -196,7 +196,7 @@ Example:
- Collect histograms for all columns of the `example_tbl` table, using the following syntax:

```SQL
mysql> ANALYZE TABLE stats_test.example_tbl UPDATE HISTOGRAM;
mysql> ANALYZE TABLE stats_test.example_tbl WITH HISTOGRAM;
+--------+
| job_id |
+--------+
@@ -207,7 +207,7 @@ mysql> ANALYZE TABLE stats_test.example_tbl UPDATE HISTOGRAM;
- Collect histograms for the `city`, `age`, and `sex` columns of the `example_tbl` table, using the following syntax:

```SQL
mysql> ANALYZE TABLE stats_test.example_tbl(city, age, sex) UPDATE HISTOGRAM;
mysql> ANALYZE TABLE stats_test.example_tbl(city, age, sex) WITH HISTOGRAM;
+--------+
| job_id |
+--------+
@@ -219,15 +219,15 @@ mysql> ANALYZE TABLE stats_test.example_tbl(city, age, sex) UPDATE HISTOGRAM;

```SQL
-- use with buckets
mysql> ANALYZE TABLE stats_test.example_tbl UPDATE HISTOGRAM WITH BUCKETS 2;
mysql> ANALYZE TABLE stats_test.example_tbl WITH HISTOGRAM WITH BUCKETS 2;
+--------+
| job_id |
+--------+
| 52018 |
+--------+

-- configure num.buckets
mysql> ANALYZE TABLE stats_test.example_tbl UPDATE HISTOGRAM PROPERTIES("num.buckets" = "2");
mysql> ANALYZE TABLE stats_test.example_tbl WITH HISTOGRAM PROPERTIES("num.buckets" = "2");
+--------+
| job_id |
+--------+
@@ -330,7 +330,7 @@ mysql> ANALYZE TABLE stats_test.example_tbl PROPERTIES("sample.percent" = "50");
- Collect histogram information for the `example_tbl` table by sampling, similar to normal statistics, using the following syntax:

```SQL
mysql> ANALYZE TABLE stats_test.example_tbl UPDATE HISTOGRAM WITH SAMPLE ROWS 5;
mysql> ANALYZE TABLE stats_test.example_tbl WITH HISTOGRAM WITH SAMPLE ROWS 5;
+--------+
| job_id |
+--------+
@@ -357,7 +357,7 @@ mysql> ANALYZE TABLE stats_test.example_tbl PROPERTIES("sync" = "true");
- Collect histogram information for the `example_tbl` table by sampling, similar to normal statistics, using the following syntax:

```SQL
mysql> ANALYZE TABLE stats_test.example_tbl UPDATE HISTOGRAM WITH SYNC;
mysql> ANALYZE TABLE stats_test.example_tbl WITH HISTOGRAM WITH SYNC;
```

### Automatic collection
@@ -393,7 +393,7 @@ mysql> ANALYZE TABLE stats_test.example_tbl PROPERTIES("period.seconds" = "86400
- Collect histogram information for the `example_tbl` table periodically (once a day), similar to normal statistics, using the following syntax:

```SQL
mysql> ANALYZE TABLE stats_test.example_tbl UPDATE HISTOGRAM WITH PERIOD 86400;
mysql> ANALYZE TABLE stats_test.example_tbl WITH HISTOGRAM WITH PERIOD 86400;
+--------+
| job_id |
+--------+
@@ -861,6 +861,14 @@ mysql> DROP STATS stats_test.example_tbl;
mysql> DROP STATS stats_test.example_tbl(city, age, sex);
```

## Delete Analyze Job

Users can delete automatic or periodic Analyze jobs by job ID.

```sql
DROP ANALYZE JOB [JOB_ID]
```
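
For illustration, a minimal sketch of the statement with a concrete ID filled in. It assumes the job to delete is identified by the `job_id` value that `ANALYZE` returns when the job is submitted; the value `52018` is borrowed from the sample output shown earlier and is only a placeholder.

```sql
-- Hypothetical job ID: replace 52018 with the job_id returned by your own ANALYZE statement.
DROP ANALYZE JOB 52018;
```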

## ANALYZE configuration item

To be added.
24 changes: 16 additions & 8 deletions docs/zh-CN/docs/query-acceleration/statistics.md
@@ -218,7 +218,7 @@ mysql> ANALYZE TABLE stats_test.example_tbl(city, age, sex);

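##### Collect histogram information

Column histogram information describes how the values of a column are distributed. It divides the data into several intervals (buckets) by value and uses simple statistics to represent the characteristics of the data in each interval. It is collected with the `ANALYZE TABLE` statement combined with `UPDATE HISTOGRAM`.
Column histogram information describes how the values of a column are distributed. It divides the data into several intervals (buckets) by value and uses simple statistics to represent the characteristics of the data in each interval. It is collected with the `ANALYZE TABLE` statement combined with `WITH HISTOGRAM`.

As with normal statistics, columns can be specified to collect their histogram information. Collecting histogram information takes longer than collecting normal statistics, so to reduce overhead, we can collect histogram information only for specific columns for the optimizer to use.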

@@ -227,7 +227,7 @@ mysql> ANALYZE TABLE stats_test.example_tbl(city, age, sex);
- Collect histograms for all columns of the `example_tbl` table, using the following syntax:

```SQL
mysql> ANALYZE TABLE stats_test.example_tbl UPDATE HISTOGRAM;
mysql> ANALYZE TABLE stats_test.example_tbl WITH HISTOGRAM;
+--------+
| job_id |
+--------+
@@ -238,7 +238,7 @@ mysql> ANALYZE TABLE stats_test.example_tbl UPDATE HISTOGRAM;
- Collect histograms for the `city`, `age`, and `sex` columns of the `example_tbl` table, using the following syntax:

```SQL
mysql> ANALYZE TABLE stats_test.example_tbl(city, age, sex) UPDATE HISTOGRAM;
mysql> ANALYZE TABLE stats_test.example_tbl(city, age, sex) WITH HISTOGRAM;
+--------+
| job_id |
+--------+
@@ -250,15 +250,15 @@ mysql> ANALYZE TABLE stats_test.example_tbl(city, age, sex) UPDATE HISTOGRAM;

```SQL
-- use with buckets
mysql> ANALYZE TABLE stats_test.example_tbl UPDATE HISTOGRAM WITH BUCKETS 2;
mysql> ANALYZE TABLE stats_test.example_tbl WITH HISTOGRAM WITH BUCKETS 2;
+--------+
| job_id |
+--------+
| 52018 |
+--------+

-- configure num.buckets
mysql> ANALYZE TABLE stats_test.example_tbl UPDATE HISTOGRAM PROPERTIES("num.buckets" = "2");
mysql> ANALYZE TABLE stats_test.example_tbl WITH HISTOGRAM PROPERTIES("num.buckets" = "2");
+--------+
| job_id |
+--------+
@@ -361,7 +361,7 @@ mysql> ANALYZE TABLE stats_test.example_tbl PROPERTIES("sample.percent" = "50");
- Collect histogram information for the `example_tbl` table by sampling, similar to normal statistics, using the following syntax:

```SQL
mysql> ANALYZE TABLE stats_test.example_tbl UPDATE HISTOGRAM WITH SAMPLE ROWS 5;
mysql> ANALYZE TABLE stats_test.example_tbl WITH HISTOGRAM WITH SAMPLE ROWS 5;
+--------+
| job_id |
+--------+
@@ -388,7 +388,7 @@ mysql> ANALYZE TABLE stats_test.example_tbl PROPERTIES("sync" = "true");
- Collect histogram information for the `example_tbl` table by sampling, similar to normal statistics, using the following syntax:

```SQL
mysql> ANALYZE TABLE stats_test.example_tbl UPDATE HISTOGRAM WITH SYNC;
mysql> ANALYZE TABLE stats_test.example_tbl WITH HISTOGRAM WITH SYNC;
```

### Automatic collection
@@ -424,7 +424,7 @@ mysql> ANALYZE TABLE stats_test.example_tbl PROPERTIES("period.seconds" = "86400
- Collect histogram information for the `example_tbl` table periodically (once a day), similar to normal statistics, using the following syntax:

```SQL
mysql> ANALYZE TABLE stats_test.example_tbl UPDATE HISTOGRAM WITH PERIOD 86400;
mysql> ANALYZE TABLE stats_test.example_tbl WITH HISTOGRAM WITH PERIOD 86400;
+--------+
| job_id |
+--------+
@@ -924,6 +924,14 @@ mysql> DROP STATS stats_test.example_tbl;
mysql> DROP STATS stats_test.example_tbl(city, age, sex);
```

## Delete Analyze Job

Deletes automatic or periodic Analyze jobs by job ID.

```sql
DROP ANALYZE JOB [JOB_ID]
```

## ANALYZE configuration item

To be added.
