Commit

Merge remote-tracking branch 'origin/follow-up-of-#2762' into follow-up-of-#2762
KeXiangWang committed Nov 25, 2024
2 parents 7b11543 + 18f0582 commit 3a75371
Showing 9 changed files with 105 additions and 12 deletions.
1 change: 1 addition & 0 deletions .wordlist.txt
@@ -209,6 +209,7 @@ OpenSearch
codebase
Databricks
SDKs
sdk
RWUs
roadmap
terraform
31 changes: 21 additions & 10 deletions README.md
@@ -3,18 +3,29 @@

This repository contains the latest RisingWave documentation. [The old repository](https://github.com/risingwavelabs/risingwave-docs-legacy) now hosts the archived documentation up to v2.0 of RisingWave.

# Documentation structure

Below are the main topic groups. Some groups are promoted to tabs shown at the top of a documentation page.

- get-started
- demos
- sql
- ingestion
- processing
- delivery
- deploy
- operate
- python-sdk
- client-libraries
- performance
- troubleshoot
- integrations
- faq
- reference
- cloud
- changelog


# Mintlify Starter Kit

Click on `Use this template` to copy the Mintlify starter kit. The starter kit contains examples including

- Guide pages
- Navigation
- Customizations
- API Reference pages
- Use of popular components

### Development

Install the [Mintlify CLI](https://www.npmjs.com/package/mintlify) to preview the documentation changes locally. To install, use the following command
1 change: 1 addition & 0 deletions changelog/product-lifecycle.mdx
@@ -22,6 +22,7 @@ Below is a list of all features in the public preview phase:

| Feature name | Start version |
| :-- | :-- |
| [Shared source](/sql/commands/sql-create-source/#shared-source) | 2.1 |
| [ASOF join](/docs/current/query-syntax-join-clause/#asof-joins) | 2.1 |
| [Partitioned Postgres CDC table](/docs/current/ingest-from-postgres-cdc/) | 2.1 |
| [Map type](/docs/current/data-type-map/) | 2.0 |
2 changes: 1 addition & 1 deletion get-started/quickstart.mdx
@@ -15,7 +15,7 @@ For extensive testing or single-machine deployment, consider [starting RisingWav
Open a terminal and run the following `curl` command.

```bash
curl https://risingwave.com/sh | sh
curl -L https://risingwave.com/sh | sh
```

To start a RisingWave instance, run the following command.
Binary file added images/non-shared-source.png
Binary file added images/shared-source.png
Binary file added images/table-with-connectors.png
80 changes: 80 additions & 0 deletions sql/commands/sql-create-source.mdx
@@ -88,6 +88,86 @@ If Kafka is part of your technical stack, you can also use the Kafka connector i

For complete step-by-step guides on ingesting MySQL and PostgreSQL data using both approaches, see [Ingest data from MySQL](/docs/current/ingest-from-mysql-cdc/) and [Ingest data from PostgreSQL](/docs/current/ingest-from-postgres-cdc/).

## Shared source

Shared source improves resource utilization and data consistency when working with Kafka sources in RisingWave. It only affects Kafka sources created after you upgrade to a version that includes this feature; existing Kafka sources are not affected.

<Note>
**PUBLIC PREVIEW**

This feature is in the public preview stage, meaning it's nearing the final product but is not yet fully stable. If you encounter any issues or have feedback, please contact us through our [Slack channel](https://www.risingwave.com/slack). Your input is valuable in helping us improve the feature. For more information, see our [Public preview feature list](/product-lifecycle/#features-in-the-public-preview-stage).
</Note>

### Configure

Shared source is enabled by default. You can also set the session variable `streaming_use_shared_source` to control whether to enable it.

```sql
-- Change the setting in the current session
SET streaming_use_shared_source=[true|false];

-- Change the default value of the session variable in the cluster
-- (the current session is not affected)
ALTER SYSTEM SET streaming_use_shared_source=[true|false];
```

To completely disable it at the cluster level, edit the [`risingwave.toml`](https://github.com/risingwavelabs/risingwave/blob/main/src/config/example.toml#L146) configuration file and set `stream_enable_shared_source` to `false`.

### Compared with non-shared source

With non-shared sources, when using the `CREATE SOURCE` statement:
- No streaming job is instantiated. A source is just a set of metadata stored in the catalog.
- A `SourceExecutor` is created to start ingesting data only when a materialized view or sink references the source.

This leads to increased resource usage and potential inconsistencies:
- Each `SourceExecutor` consumes Kafka resources independently, adding pressure to both the Kafka broker and RisingWave.
- Independent `SourceExecutor` instances can be at different points in their consumption progress, causing temporary inconsistencies when joining materialized views.

<Frame>
<img src="/images/non-shared-source.png"/>
</Frame>

With shared sources, when using the `CREATE SOURCE` statement:
- It will instantiate a single `SourceExecutor` immediately.
- All materialized views referencing the same source share the `SourceExecutor`.
- The downstream materialized views only forward data from the upstream source, instead of consuming from Kafka independently.

This improves resource utilization and consistency.

<Frame>
<img src="/images/shared-source.png"/>
</Frame>
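
As a minimal sketch of the behavior described above, the following statements create one Kafka source and two materialized views that read from it. The source name `orders_src`, the topic `orders`, the broker address, and the columns are hypothetical placeholders, not values from this documentation. With shared source enabled, both views are expected to receive data from the single `SourceExecutor` created by `CREATE SOURCE`.

```sql
-- Hypothetical example: names, topic, and broker address are assumptions.
SET streaming_use_shared_source = true;

-- With shared source enabled, this starts a single SourceExecutor.
CREATE SOURCE orders_src (
    order_id BIGINT,
    amount DOUBLE PRECISION,
    created_at TIMESTAMP
) WITH (
    connector = 'kafka',
    topic = 'orders',
    properties.bootstrap.server = 'localhost:9092',
    scan.startup.mode = 'earliest'
) FORMAT PLAIN ENCODE JSON;

-- Both materialized views share the source's SourceExecutor
-- instead of each consuming from Kafka independently.
CREATE MATERIALIZED VIEW order_count AS
SELECT COUNT(*) AS cnt FROM orders_src;

CREATE MATERIALIZED VIEW order_total AS
SELECT SUM(amount) AS total FROM orders_src;
```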

When creating a materialized view, RisingWave backfills historical data from Kafka. The process blocks the DDL statement until backfill completes.

- To configure this behavior, use the [SET BACKGROUND_DDL](/sql/commands/sql-set-background-ddl) command. This is similar to the backfilling procedure when creating a materialized view on tables and materialized views.

- To monitor backfill progress, use the [SHOW JOBS](/sql/commands/sql-show-jobs) command or check `Kafka Consumer Lag Size` in the Grafana dashboard (under `Streaming`).
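
The two options above might be combined as in the following sketch. It assumes a shared Kafka source named `my_kafka_source` with an `amount` column already exists; both names are hypothetical.

```sql
-- Hypothetical example: my_kafka_source and amount are assumed to exist.

-- Run subsequent DDL in the background so that CREATE MATERIALIZED VIEW
-- returns without waiting for the backfill to complete.
SET BACKGROUND_DDL = true;

CREATE MATERIALIZED VIEW amount_total AS
SELECT SUM(amount) AS total FROM my_kafka_source;

-- List in-progress streaming jobs and their progress.
SHOW JOBS;
```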


<Note>If you set up a retention policy or if the external system can only be accessed once (like message queues), and the data is no longer available, any newly created materialized views won’t be able to backfill the complete historical data. This can lead to inconsistencies with earlier materialized views.</Note>


### Compared with table

A `CREATE TABLE` statement can provide similar benefits to shared sources, except that it needs to persist all consumed data.

For a table with a connector, downstream materialized views backfill historical data from the table instead of from the external source, which may be more efficient and put less pressure on the external system. This also gives tables a stronger consistency guarantee, because the historical data is guaranteed to be present.

Tables offer other features that enhance their utility in data ingestion workflows. See [Table with connectors](/ingestion/overview#table-with-connectors).

<Frame>
<img src="/images/table-with-connectors.png"/>
</Frame>
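
For comparison, a table with a Kafka connector could look like the sketch below. The table name `orders_tbl`, the topic, the broker address, and the columns are hypothetical. Because the table persists the consumed data, the materialized view backfills from the table rather than from Kafka.

```sql
-- Hypothetical example: names, topic, and broker address are assumptions.

-- The table ingests from Kafka and persists all consumed data.
CREATE TABLE orders_tbl (
    order_id BIGINT,
    amount DOUBLE PRECISION
) WITH (
    connector = 'kafka',
    topic = 'orders',
    properties.bootstrap.server = 'localhost:9092'
) FORMAT PLAIN ENCODE JSON;

-- Historical data for this view comes from the table, not from Kafka.
CREATE MATERIALIZED VIEW order_total_from_tbl AS
SELECT SUM(amount) AS total FROM orders_tbl;
```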

<Note>
**LIMITATION**

Currently, shared source is only applicable to Kafka sources. Other sources are unaffected. We plan to gradually upgrade other sources to be shared as well.

Shared sources do not support `ALTER SOURCE`. Use non-shared sources if you require this functionality.
</Note>
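
If you need `ALTER SOURCE`, one possible approach is to turn the session variable off before creating the source, as sketched below. The source name, topic, broker address, and columns are hypothetical, and the `ADD COLUMN` statement is only an illustration of an `ALTER SOURCE` operation.

```sql
-- Hypothetical example: names, topic, and broker address are assumptions.

-- Sources created in this session after this point are non-shared.
SET streaming_use_shared_source = false;

CREATE SOURCE clicks_src (
    user_id BIGINT,
    url VARCHAR
) WITH (
    connector = 'kafka',
    topic = 'clicks',
    properties.bootstrap.server = 'localhost:9092'
) FORMAT PLAIN ENCODE JSON;

-- ALTER SOURCE remains available because the source is non-shared.
ALTER SOURCE clicks_src ADD COLUMN referrer VARCHAR;
```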

## See also

<CardGroup>
2 changes: 1 addition & 1 deletion sql/query-syntax/value-exp.mdx
@@ -110,7 +110,7 @@ expression::type
| Parameter | Description |
| :----------- | :-------------------------------------------------------------------------------------------------------------------------------- |
| _expression_ | The expression whose data type is to be converted. |
| _type_ | The data type of the returned value.For the types you can cast the value to, see \[Casting\](/sql/data-types/data-type-casting.md |
| _type_ | The data type of the returned value. For the types you can cast the value to, see [Casting](/sql/data-types/casting). |

## Row constructors
