Commit

Minor doc changes
Signed-off-by: Marcel Coetzee <[email protected]>
Pipboyguy committed Jun 20, 2024
1 parent 825982b commit 4912173
Showing 1 changed file with 10 additions and 8 deletions.
18 changes: 10 additions & 8 deletions docs/website/docs/dlt-ecosystem/destinations/clickhouse.md
@@ -37,7 +37,7 @@ or with `pip install "dlt[clickhouse]"`, which installs the `dlt` library and th

### 2. Setup ClickHouse database

- To load data into ClickHouse, you need to create a ClickHouse database. While we recommend asking our GPT-4 assistant for details, weve provided a general outline of the process below:
+ To load data into ClickHouse, you need to create a ClickHouse database. While we recommend asking our GPT-4 assistant for details, we've provided a general outline of the process below:

1. You can use an existing ClickHouse database or create a new one.

@@ -91,7 +91,7 @@ To load data into ClickHouse, you need to create a ClickHouse database. While we
2. You can pass a database connection string similar to the one used by the `clickhouse-driver` library. The credentials above will look like this:

```toml
- # keep it at the top of your toml file, before any section starts.
+ # keep it at the top of your toml file before any section starts.
destination.clickhouse.credentials="clickhouse://dlt:Dlt*12345789234567@localhost:9000/dlt?secure=1"
```

@@ -111,7 +111,8 @@ Data is loaded into ClickHouse using the most efficient method depending on the

`Clickhouse` does not support multiple datasets in one database, dlt relies on datasets to exist for multiple reasons.
To make `clickhouse` work with `dlt`, tables generated by `dlt` in your `clickhouse` database will have their name prefixed with the dataset name separated by
- the configurable `dataset_table_separator`. Additionally, a special sentinel table that doesn’t contain any data will be created, so dlt knows which virtual datasets already exist in a
+ the configurable `dataset_table_separator`.
+ Additionally, a special sentinel table that doesn't contain any data will be created, so dlt knows which virtual datasets already exist in a
clickhouse
destination.
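
The naming scheme described in this hunk can be sketched in a few lines of Python. This is only an illustration, not dlt's implementation: the helper name `qualified_table_name` is hypothetical, and the default separator `___` is an assumption that should be checked against the `dataset_table_separator` setting in your own configuration.

```python
def qualified_table_name(dataset: str, table: str, separator: str = "___") -> str:
    """Build the physical table name for a table inside a virtual dataset.

    Hypothetical helper: the dataset name is prepended to the table name,
    joined by the (configurable) separator. The default value is an assumption.
    """
    return f"{dataset}{separator}{table}"


# A table "orders" in the dataset "sales" becomes one flat table name.
print(qualified_table_name("sales", "orders"))
```

Because the dataset exists only as a name prefix, something else has to record that the dataset itself exists, which is the role of the sentinel table mentioned above.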

@@ -122,14 +123,15 @@ destination.

The `clickhouse` destination has a few specific deviations from the default sql destinations:

- 1. `Clickhouse` has an experimental `object` datatype, but we’ve found it to be a bit unpredictable, so the dlt clickhouse destination will load the complex datatype to a `text` column. If you need
+ 1. `Clickhouse` has an experimental `object` datatype, but we've found it to be a bit unpredictable, so the dlt clickhouse destination will load the complex datatype to a `text` column.
+ If you need
this feature, get in touch with our Slack community, and we will consider adding it.
2. `Clickhouse` does not support the `time` datatype. Time will be loaded to a `text` column.
3. `Clickhouse` does not support the `binary` datatype. Binary will be loaded to a `text` column. When loading from `jsonl`, this will be a base64 string, when loading from parquet this will be
the `binary` object converted to `text`.
4. `Clickhouse` accepts adding columns to a populated table that aren’t null.
5. `Clickhouse` can produce rounding errors under certain conditions when using the float / double datatype. Make sure to use decimal if you can’t afford to have rounding errors. Loading the value
- 12.7001 to a double column with the loader file format jsonl set will predictbly produce a rounding error for example.
+ 12.7001 to a double column with the loader file format jsonl set will predictably produce a rounding error, for example.
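
As a rough illustration of item 5, the snippet below shows in plain Python why 12.7001 cannot be stored exactly in a binary double, while a decimal type keeps it intact. It does not reproduce the dlt or ClickHouse loading path; it only demonstrates the underlying representation issue, and the variable names are illustrative.

```python
from decimal import Decimal

as_float = 12.7001               # stored as a binary double
as_decimal = Decimal("12.7001")  # stored as an exact decimal value

# Decimal(float) exposes the exact value the double actually holds.
exact_binary_value = Decimal(as_float)

print(exact_binary_value)                # a long expansion, not exactly 12.7001
print(exact_binary_value == as_decimal)  # False
```

If the exact decimal digits matter, declaring the column as decimal avoids this class of error.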

## Supported column hints

@@ -154,8 +156,8 @@ clickhouse_adapter(my_resource, table_engine_type="merge_tree")
Supported values for `table_engine_type` are:

- `merge_tree` (default) - creates tables using the `MergeTree` engine, suitable for most use cases. [Learn more about MergeTree](https://clickhouse.com/docs/en/engines/table-engines/mergetree-family/mergetree).
- - `shared_merge_tree` - creates tables using the `SharedMergeTree` engine, optimized for cloud-native environments with shared storage. This table is **only** available on ClickHouse Cloud and it the default selection if `merge_tree` is selected. [Learn more about SharedMergeTree](https://clickhouse.com/docs/en/cloud/reference/shared-merge-tree).
- - `replicated_merge_tree` - creates tables using the `ReplicatedMergeTree` engine, which supports data replication across multiple nodes for high availability. [Learn more about ReplicatedMergeTree](https://clickhouse.com/docs/en/engines/table-engines/mergetree-family/replication). This also defaults to `shared_merge_tree` on ClickHouse Cloud.
+ - `shared_merge_tree` - creates tables using the `SharedMergeTree` engine, optimized for cloud-native environments with shared storage. This table is **only** available on ClickHouse Cloud, and it the default selection if `merge_tree` is selected. [Learn more about SharedMergeTree](https://clickhouse.com/docs/en/cloud/reference/shared-merge-tree).
+ - `replicated_merge_tree` - creates tables using the `ReplicatedMergeTree` engine, which supports data replication across multiple nodes for high availability. [Learn more about ReplicatedMergeTree](https://clickhouse.com/docs/en/engines/table-engines/mergetree-family/replication). This defaults to `shared_merge_tree` on ClickHouse Cloud.
- Experimental support for the `Log` engine family with `stripe_log` and `tiny_log`.

For local development and testing with ClickHouse running locally, the `MergeTree` engine is recommended.
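
The engine selection rules listed above can be summarized in a small sketch. This is not dlt's code: `ENGINE_NAMES` and `effective_engine` are hypothetical names, and leaving the `Log` family untouched on ClickHouse Cloud is an assumption, since the documentation in this diff does not say how those engines behave there.

```python
# Maps each documented `table_engine_type` value to its ClickHouse engine name.
ENGINE_NAMES = {
    "merge_tree": "MergeTree",
    "shared_merge_tree": "SharedMergeTree",
    "replicated_merge_tree": "ReplicatedMergeTree",
    "stripe_log": "StripeLog",
    "tiny_log": "TinyLog",
}


def effective_engine(table_engine_type: str = "merge_tree", on_cloud: bool = False) -> str:
    """Return the engine name a table would end up with (illustrative only)."""
    if table_engine_type not in ENGINE_NAMES:
        raise ValueError(f"unsupported table_engine_type: {table_engine_type!r}")
    # Per the list above, merge_tree and replicated_merge_tree both fall back
    # to shared_merge_tree on ClickHouse Cloud.
    if on_cloud and table_engine_type in ("merge_tree", "replicated_merge_tree"):
        table_engine_type = "shared_merge_tree"
    return ENGINE_NAMES[table_engine_type]


print(effective_engine("merge_tree"))                 # local: MergeTree
print(effective_engine("merge_tree", on_cloud=True))  # cloud: SharedMergeTree
```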
@@ -222,7 +224,7 @@ dlt's staging mechanisms for ClickHouse.

### dbt support

- Integration with [dbt](../transformations/dbt/dbt.md) is generally supported via dbt-clickhouse, but not tested by us.
+ Integration with [dbt](../transformations/dbt/dbt.md) is generally supported via dbt-clickhouse but not tested by us.

### Syncing of `dlt` state

