Commit
…into menu-improvement
Blargian committed Jan 14, 2025
2 parents bdb1ff1 + a7d1b45 commit b56eccd
Showing 151 changed files with 5,116 additions and 4,640 deletions.
2 changes: 1 addition & 1 deletion clickhouseapi.js
@@ -52,7 +52,7 @@ function generateDocusaurusMarkdown(spec, groupedEndpoints, prefix) {

markdownContent += `| Method | Path |\n`
markdownContent += `| :----- | :--- |\n`
markdownContent += `| ${method.toUpperCase()} | ${path} |\n\n`
markdownContent += `| ${method.toUpperCase()} | \`${path}\` |\n\n`

markdownContent += `### Request\n\n`;

3 changes: 2 additions & 1 deletion copyClickhouseRepoDocs.sh
@@ -1,7 +1,8 @@
#! ./bin/bash
#! /bin/bash

SCRIPT_NAME=$(basename "$0")

rm -rf ClickHouse
echo "[$SCRIPT_NAME] Start tasks for copying docs from ClickHouse repo"

# Clone ClickHouse repo
2 changes: 1 addition & 1 deletion docs/en/cloud/reference/changelog.md
@@ -957,7 +957,7 @@ Adds support for a subset of features in ClickHouse 23.1, for example:
- New functions, including `age()`, `quantileInterpolatedWeighted()`, `quantilesInterpolatedWeighted()`
- Ability to use structure from insertion table in `generateRandom` without arguments
- Improved database creation and rename logic that allows the reuse of previous names
- See the 23.1 release [webinar slides](https://presentations.clickhouse.com/release_23.1/#cover) and [23.1 release changelog](/docs/en/whats-new/changelog/index.md/#clickhouse-release-231) for more details
- See the 23.1 release [webinar slides](https://presentations.clickhouse.com/release_23.1/#cover) and [23.1 release changelog](/docs/en/whats-new/changelog/index.md#clickhouse-release-231) for more details

### Integrations changes
- [Kafka-Connect](/docs/en/integrations/data-ingestion/kafka/index.md): Added support for Amazon MSK
2 changes: 1 addition & 1 deletion docs/en/data-compression/compression-modes.md
@@ -41,7 +41,7 @@ From [facebook benchmarks](https://facebook.github.io/zstd/#benchmarks):
| mode | byte | Compression mode |
| compressed_data | binary | Block of compressed data |

![compression block diagram](../native-protocol/images/ch_compression_block.drawio.svg)
![compression block diagram](./images/ch_compression_block.png)

Header is (raw_size + data_size + mode), raw size consists of len(header + compressed_data).

2 changes: 1 addition & 1 deletion docs/en/data-modeling/backfilling.md
@@ -448,7 +448,7 @@ GROUP BY

Here, we create a Null table, `pypi_v2`, to receive the rows that will be used to build our materialized view. Note how we limit the schema to only the columns we need. Our materialized view performs an aggregation over rows inserted into this table (one block at a time), sending the results to our target table, `pypi_downloads_per_day`.

::note
:::note
We have used `pypi_downloads_per_day` as our target table here. For additional resiliency, users could create a duplicate table, `pypi_downloads_per_day_v2`, and use this as the target table of the view, as shown in previous examples. On completion of the insert, partitions in `pypi_downloads_per_day_v2` could, in turn, be moved to `pypi_downloads_per_day`. This would allow recovery in the case our insert fails due to memory issues or server interruptions, i.e. we just truncate `pypi_downloads_per_day_v2`, tune settings, and retry.
:::
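
For reference, a minimal sketch of the pattern described above (a Null table feeding a materialized view into the target table) could look like the following; the column names and the aggregation are illustrative, not the actual `pypi` schema:

```sql
-- Null engine table: rows are never stored, only forwarded to dependent materialized views
CREATE TABLE pypi_v2
(
    `timestamp` DateTime,
    `project` String
)
ENGINE = Null;

-- The materialized view aggregates each inserted block and writes the result to the target table
CREATE MATERIALIZED VIEW pypi_downloads_per_day_mv TO pypi_downloads_per_day AS
SELECT
    toStartOfDay(timestamp) AS day,
    project,
    count() AS count
FROM pypi_v2
GROUP BY day, project;
```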

@@ -63,12 +63,10 @@ Each node has corresponding children and the overall tree represents the overall

## Analyzer

<BetaBadge />

ClickHouse currently has two architectures for the Analyzer. You can use the old architecture by setting: `allow_experimental_analyzer=0`. If you want to use the new architecture, you should set `allow_experimental_analyzer=1`. We are going to describe only the new architecture here, given the old one is going to be deprecated once the new analyzer is generally available.
ClickHouse currently has two architectures for the Analyzer. You can use the old architecture by setting: `enable_analyzer=0`. The new architecture is enabled by default. We are going to describe only the new architecture here, given the old one is going to be deprecated once the new analyzer is generally available.

:::note
The new analyzer is in Beta. The new architecture should provide us with a better framework to improve ClickHouse's performance. However, given it is a fundamental component of the query processing steps, it also might have a negative impact on some queries. After moving to the new analyzer, you may see performance degradation, queries failing, or queries giving you an unexpected result. You can revert back to the old analyzer by changing the `allow_experimental_analyzer` setting at the query or user level. Please report any issues in GitHub.
The new architecture should provide us with a better framework to improve ClickHouse's performance. However, given it is a fundamental component of the query processing steps, it also might have a negative impact on some queries and there are [known incompatibilities](/docs/en/operations/analyzer#known-incompatibilities). You can revert back to the old analyzer by changing the `enable_analyzer` setting at the query or user level.
:::
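
As a small, hedged illustration of the note above, the setting can be flipped per query or per user; the table function and user name below are placeholders:

```sql
-- Run a single query with the old analyzer
SELECT count()
FROM numbers(10)
SETTINGS enable_analyzer = 0;

-- Or revert for a specific (hypothetical) user
ALTER USER analytics_user SETTINGS enable_analyzer = 0;
```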

The analyzer is an important step of the query execution. It takes an AST and transforms it into a query tree. The main benefit of a query tree over an AST is that a lot of the components will be resolved, like the storage for instance. We also know from which table to read, aliases are also resolved, and the tree knows the different data types used. With all these benefits, the analyzer can apply optimizations. The way these optimizations work is via “passes”. Every pass is going to look for different optimizations. You can see all the passes [here](https://github.com/ClickHouse/ClickHouse/blob/76578ebf92af3be917cd2e0e17fea2965716d958/src/Analyzer/QueryTreePassManager.cpp#L249), let’s see it in practice with our previous query:
2 changes: 1 addition & 1 deletion docs/en/guides/inserting-data.md
@@ -89,7 +89,7 @@ With asynchronous inserts, data is inserted into a buffer first and then written
<br />

<img src={require('./images/postgres-inserts.png').default}
class="image"
className="image"
alt="NEEDS ALT"
style={{width: '600px'}}
/>
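
A brief, hedged sketch of enabling this buffering for a single insert (the table and columns are hypothetical):

```sql
-- Buffer this insert server-side; wait_for_async_insert = 1 makes the client
-- wait until the buffered data has actually been flushed to the table
INSERT INTO logs (timestamp, message)
SETTINGS async_insert = 1, wait_for_async_insert = 1
VALUES (now(), 'example row');
```
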
4 changes: 0 additions & 4 deletions docs/en/guides/sre/user-management/index.md
@@ -200,10 +200,6 @@ This article shows the basics of defining SQL users and roles and applying those
GRANT ALL ON *.* TO clickhouse_admin WITH GRANT OPTION;
```

<Content />



## ALTER permissions

This article is intended to provide you with a better understanding of how to define permissions, and how permissions work when using `ALTER` statements for privileged users.
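
As a short, hedged example of the kind of statement discussed there (the database, table, and role names are hypothetical), individual `ALTER` sub-privileges can be granted rather than `ALTER` as a whole:

```sql
-- Allow a role to run ALTER TABLE ... UPDATE / DELETE mutations on a single table only
GRANT ALTER UPDATE, ALTER DELETE ON db1.table1 TO my_alter_role;
```
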
6 changes: 3 additions & 3 deletions docs/en/integrations/data-ingestion/clickpipes/index.md
@@ -18,7 +18,7 @@ import PostgresSVG from "../../images/logos/postgresql.svg";

## Introduction

[ClickPipes](https://clickhouse.com/cloud/clickpipes) is a managed integration platform that makes ingesting data from a diverse set of sources as simple as clicking a few buttons. Designed for the most demanding workloads, ClickPipes's robust and scalable architecture ensures consistent performance and reliability. ClickPipes can be used for long-term streaming needs or one-time data loading job.
[ClickPipes](/docs/en/integrations/clickpipes) is a managed integration platform that makes ingesting data from a diverse set of sources as simple as clicking a few buttons. Designed for the most demanding workloads, ClickPipes's robust and scalable architecture ensures consistent performance and reliability. ClickPipes can be used for long-term streaming needs or one-time data loading job.

![ClickPipes stack](./images/clickpipes_stack.png)

@@ -64,7 +64,7 @@ Steps:
![Assign a custom role](./images/cp_custom_role.png)

## Error reporting
ClickPipes will create a table next to your destination table with the postfix `<destination_table_name>_clickpipes_error`. This table will contain any errors from the operations of your ClickPipe (network, connectivity, etc.) and also any data that don't conform to the schema. The error table has a [TTL](https://clickhouse.com/docs/en/engines/table-engines/mergetree-family/mergetree#table_engine-mergetree-ttl) of 7 days.
ClickPipes will create a table next to your destination table with the postfix `<destination_table_name>_clickpipes_error`. This table will contain any errors from the operations of your ClickPipe (network, connectivity, etc.) and also any data that don't conform to the schema. The error table has a [TTL](/docs/en/engines/table-engines/mergetree-family/mergetree#table_engine-mergetree-ttl) of 7 days.
If ClickPipes cannot connect to a data source or destination after 15 minutes, the ClickPipes instance stops and stores an appropriate message in the error table (provided the ClickHouse instance is available).
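
To inspect recent failures, the error table can be queried directly; a minimal sketch, assuming a destination table named `events` (so the error table is `events_clickpipes_error`):

```sql
-- Error rows expire automatically after the 7-day TTL mentioned above
SELECT *
FROM events_clickpipes_error
LIMIT 10;
```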

## F.A.Q
@@ -74,7 +74,7 @@

- **Does ClickPipes support data transformation?**

Yes, ClickPipes supports basic data transformation by exposing the DDL creation. You can then apply more advanced transformations to the data as it is loaded into its destination table in a ClickHouse Cloud service leveraging ClickHouse's [materialized views feature](https://clickhouse.com/docs/en/guides/developer/cascading-materialized-views).
Yes, ClickPipes supports basic data transformation by exposing the DDL creation. You can then apply more advanced transformations to the data as it is loaded into its destination table in a ClickHouse Cloud service leveraging ClickHouse's [materialized views feature](/docs/en/guides/developer/cascading-materialized-views).

- **Does using ClickPipes incur an additional cost?**

@@ -4,12 +4,14 @@ description: Seamlessly connect your Postgres to ClickHouse Cloud.
slug: /en/integrations/clickpipes/postgres
---

import PrivatePreviewBadge from '@theme/badges/PrivatePreviewBadge';

# Ingesting Data from Postgres to ClickHouse (using CDC)

:::info
<PrivatePreviewBadge/>

:::info
Currently, ingesting data from Postgres to ClickHouse Cloud via ClickPipes is in Private Preview. If you are interested in trying it out, please sign up [here](https://clickpipes.peerdb.io/).

:::


@@ -40,7 +40,7 @@ FROM INFILE 'data_small.csv'
FORMAT CSV
```

Here, we use the `FORMAT CSV` clause so ClickHouse understands the file format. We can also load data directly from URLs using [url()](/docs/en/sql-reference/table-functions/url.md/) function or from S3 files using [s3()](/docs/en/sql-reference/table-functions/s3.md/) function.
Here, we use the `FORMAT CSV` clause so ClickHouse understands the file format. We can also load data directly from URLs using [url()](/docs/en/sql-reference/table-functions/url.md) function or from S3 files using [s3()](/docs/en/sql-reference/table-functions/s3.md) function.
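
A hedged sketch of the same idea with the `url()` table function (the URL is a placeholder); the format is given explicitly here, although it can often be inferred from the file extension:

```sql
-- Count the rows of a remote CSV file without creating a table first
SELECT count()
FROM url('https://example.com/data_small.csv', 'CSV');
```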

:::tip
We can skip explicit format setting for `file()` and `INFILE`/`OUTFILE`.
@@ -273,7 +273,7 @@ Note that `JSONAsString` works perfectly fine in cases we have JSON object-per-l

## Schema for nested objects

In cases when we're dealing with [nested JSON objects](../assets/list-nested.json), we can additionally define schema and use complex types ([`Array`](/docs/en/sql-reference/data-types/array.md/), [`Object Data Type`](/en/sql-reference/data-types/object-data-type) or [`Tuple`](/docs/en/sql-reference/data-types/tuple.md/)) to load data:
In cases when we're dealing with [nested JSON objects](../assets/list-nested.json), we can additionally define schema and use complex types ([`Array`](/docs/en/sql-reference/data-types/array.md), [`Object Data Type`](/en/sql-reference/data-types/object-data-type) or [`Tuple`](/docs/en/sql-reference/data-types/tuple.md)) to load data:

```sql
SELECT *
@@ -7,7 +7,7 @@ keywords: [json, clickhouse, inserting, loading, formats, schema]

# Designing your schema

While [schema inference](/docs/en/integrations/data-formats/JSON/inference) can be used to establish an initial schema for JSON data and query JSON data files in place, e.g., in S3, users should aim to establish an optimized versioned schema for their data. We discuss the options for modeling JSON structures below.
While [schema inference](/docs/en/integrations/data-formats/json/inference) can be used to establish an initial schema for JSON data and query JSON data files in place, e.g., in S3, users should aim to establish an optimized versioned schema for their data. We discuss the options for modeling JSON structures below.
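
For instance, a hedged sketch of letting schema inference propose types for a JSON file queried in place (the S3 path is a placeholder):

```sql
-- DESCRIBE shows the column names and types that schema inference would assign
DESCRIBE s3('https://mybucket.s3.amazonaws.com/sample.json.gz', 'JSONEachRow');
```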

## Extract where possible

@@ -126,7 +126,7 @@ FORMAT Template SETTINGS format_template_resultset = 'output.results',
```

### Exporting to HTML files
Template-based results can also be exported to files using an [`INTO OUTFILE`](/docs/en/sql-reference/statements/select/into-outfile.md/) clause. Let's generate HTML files based on given [resultset](assets/html.results) and [row](assets/html.row) formats:
Template-based results can also be exported to files using an [`INTO OUTFILE`](/docs/en/sql-reference/statements/select/into-outfile.md) clause. Let's generate HTML files based on given [resultset](assets/html.results) and [row](assets/html.row) formats:

```sql
SELECT
@@ -39,4 +39,4 @@ The following table shows the equivalent ClickHouse data types for Postgres.
| JSON* | [String](/en/sql-reference/data-types/string), [Variant](/en/sql-reference/data-types/variant), [Nested](/en/sql-reference/data-types/nested-data-structures/nested#nestedname1-type1-name2-type2-), [Tuple](/en/sql-reference/data-types/tuple) |
| JSONB | [String](/en/sql-reference/data-types/string) |

*\* Production support for JSON in ClickHouse is under development. Currently users can either map JSON as String, and use [JSON functions](/en/sql-reference/functions/json-functions), or map the JSON directly to [Tuples](/en/sql-reference/data-types/tuple) and [Nested](/en/sql-reference/data-types/nested-data-structures/nested) if the structure is predictable. Read more about JSON [here](/en/integrations/data-formats/json#handle-as-structured-data).*
*\* Production support for JSON in ClickHouse is under development. Currently users can either map JSON as String, and use [JSON functions](/en/sql-reference/functions/json-functions), or map the JSON directly to [Tuples](/en/sql-reference/data-types/tuple) and [Nested](/en/sql-reference/data-types/nested-data-structures/nested) if the structure is predictable. Read more about JSON [here](/en/integrations/data-formats/json/overview).*
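
A hedged sketch of the "map JSON as String and use JSON functions" approach mentioned above (the table name and JSON keys are illustrative):

```sql
-- Store the raw JSON as a String column and extract typed fields at query time
CREATE TABLE events_json
(
    `payload` String
)
ENGINE = MergeTree
ORDER BY tuple();

INSERT INTO events_json VALUES ('{"user": "alice", "duration_ms": 120}');

SELECT
    JSONExtractString(payload, 'user') AS user_name,
    JSONExtractUInt(payload, 'duration_ms') AS duration_ms
FROM events_json;
```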