feat: new documentation structure + other issues #168

Merged
merged 35 commits into from
Oct 21, 2024
Changes from 26 commits
27637ca
add headers and refactor
raulb Oct 17, 2024
6d9563a
remove unused images
raulb Oct 17, 2024
8826941
different structure
raulb Oct 17, 2024
e08eaef
add link to conduit platform
raulb Oct 17, 2024
5671d4e
tweak css
raulb Oct 17, 2024
dc2f185
style differently the first element
raulb Oct 17, 2024
b1ed215
fix menu items on small viewports
raulb Oct 17, 2024
07b72c2
disable until we finish so we have preview deploys
raulb Oct 17, 2024
b3c8356
fix flickering
raulb Oct 18, 2024
ebc5bf8
another iteration
raulb Oct 18, 2024
d5fa6aa
leave sidebar tidy
raulb Oct 18, 2024
3d4769a
structure pipeline pages
raulb Oct 18, 2024
5df19b6
update header nav bar
raulb Oct 18, 2024
c9bc04f
document pipeline statuses
raulb Oct 18, 2024
734b65f
document CLI flags
raulb Oct 18, 2024
47feab8
structure finished
raulb Oct 18, 2024
b82116e
add link to opencdc
raulb Oct 18, 2024
d34d196
consistent formatting
raulb Oct 18, 2024
5965254
better urls
raulb Oct 18, 2024
60608e3
Merge branch 'main' into raul/restructure-pages
raulb Oct 18, 2024
6f937c8
update some links and rename pages
raulb Oct 18, 2024
d90d8d9
upate links
raulb Oct 19, 2024
6027845
fix more broken links
raulb Oct 19, 2024
737bd75
fix redirect
raulb Oct 19, 2024
89fddea
finish broken links
raulb Oct 19, 2024
b3ccb9f
remove conclusion
raulb Oct 19, 2024
cd43783
fix statuses definition
raulb Oct 21, 2024
307f178
update again
raulb Oct 21, 2024
a823ef5
update links
raulb Oct 21, 2024
ddac170
add some redirects
raulb Oct 21, 2024
4b86cc7
tweak implication
raulb Oct 21, 2024
810279b
update specs
raulb Oct 21, 2024
ffde855
sort redirects
raulb Oct 21, 2024
d94296d
fix redirect
raulb Oct 21, 2024
b3177bb
fix broken links
raulb Oct 21, 2024
2 changes: 1 addition & 1 deletion changelog/2024-03-24-conduit-0-9-0-release.md
Original file line number Diff line number Diff line change
Expand Up @@ -19,5 +19,5 @@ Revolutionize your data processing with [**Conduit v0.9**](https://github.com/Co
- **Getting Started Guide**: A user-friendly guide is available to help new users set up Conduit and explore the latest features quickly.

:::tip
For an in-depth look at how the enhanced processors can transform your data processing workflows, check out our [blog post](https://meroxa.com/blog/introducing-conduit-0.9-revolutionizing-data-processing-with-enhanced-processors/), and visit our [Processors documentation page](/docs/processors).
For an in-depth look at how the enhanced processors can transform your data processing workflows, check out our [blog post](https://meroxa.com/blog/introducing-conduit-0.9-revolutionizing-data-processing-with-enhanced-processors/), and visit our [Processors documentation page](/docs/using/processors/getting-started).
:::
2 changes: 1 addition & 1 deletion changelog/2024-08-19-conduit-0-11-0-release.md
Original file line number Diff line number Diff line change
Expand Up @@ -16,5 +16,5 @@ We’re thrilled to announce the release of [**Conduit v0.11**](https://github.c
- **Enhanced Transformation Capabilities:** Easily transform data as it flows through your pipelines, making integration smoother and more efficient.

:::tip
For an in-depth look at how these new features can elevate your data integration processes, check out our [blog post](https://meroxa.com/blog/conduit-v0.11-unveils-powerful-schema-support-for-enhanced-data-integration/), our [Schema Support documentation page](/docs/features/schema-support).
For an in-depth look at how these new features can elevate your data integration processes, check out our [blog post](https://meroxa.com/blog/conduit-v0.11-unveils-powerful-schema-support-for-enhanced-data-integration/) and our [Schema Support documentation page](/docs/using/other-features/schema-support).
:::
2 changes: 1 addition & 1 deletion changelog/2024-10-10-conduit-0-12-0-release.md
Original file line number Diff line number Diff line change
Expand Up @@ -16,5 +16,5 @@ We’re excited to announce the release of [**Conduit v0.12.0**](https://github.
- **Smart Retry Management:** Limits on retries prevent indefinite restarts, keeping your pipelines efficient and reliable.

:::tip
For a detailed overview of how Pipeline Recovery works and its benefits, check out our [blog post](https://meroxa.com/blog/unlocking-resilience:-conduit-v0.12.0-introduces-pipeline-recovery/), or our documentation for [Pipeline Recovery](/docs/features/pipeline-recovery) and learn how to make your data streaming experience smoother than ever!
For a detailed overview of how Pipeline Recovery works and its benefits, check out our [blog post](https://meroxa.com/blog/unlocking-resilience:-conduit-v0.12.0-introduces-pipeline-recovery/), or our documentation for [Pipeline Recovery](/docs/using/other-features/pipeline-recovery) and learn how to make your data streaming experience smoother than ever!
:::
4 changes: 2 additions & 2 deletions changelog/2024-10-15-pipelines-exit-on-degraded.md
Original file line number Diff line number Diff line change
Expand Up @@ -19,7 +19,7 @@ $ conduit --help
...
```

If you were using a [Conduit Configuration file](/docs/features/configuration) this should look like:
If you were using a [Conduit Configuration file](/docs/configuration#configuration-file) this should look like:

```yaml title="conduit.yaml"
# ...
Expand All @@ -28,7 +28,7 @@ pipelines:
# ...
```

Previously, this functionality was handled by `pipelines.exit-on-error`. However, with the introduction of [Pipeline Recovery](/docs/features/pipeline-recovery), the old description no longer accurately reflected the behavior, as a pipeline may not necessarily exit even in the presence of an error.
Previously, this functionality was handled by `pipelines.exit-on-error`. However, with the introduction of [Pipeline Recovery](/docs/using/other-features/pipeline-recovery), the old description no longer accurately reflected the behavior, as a pipeline may not necessarily exit even in the presence of an error.

:::warning
The previous flag `pipelines.exit-on-error` will still be valid but is now hidden. We encourage all users to transition to `pipelines.exit-on-degraded` for improved clarity and functionality.
Expand Down
28 changes: 14 additions & 14 deletions docs/introduction.mdx → docs/0-what-is/0-introduction.mdx
Original file line number Diff line number Diff line change
Expand Up @@ -3,7 +3,7 @@ sidebar_position: 0
hide_title: true
title: 'Introduction'
sidebar_label: "Introduction"
slug: /
slug: '/'
---

<img
Expand All @@ -12,23 +12,23 @@ slug: /
src="/img/conduit/on-white-conduit-logo.png"
/>

Conduit is a data integration tool for software engineers. Its purpose is to
Conduit is a data integration tool for software engineers, powered by [Meroxa](https://meroxa.io). Its purpose is to
help you move data from A to B. You can use Conduit to send data from Kafka to
Postgres, between files and APIs,
between [supported connectors](/docs/connectors/connector-list),
and [any datastore you can build a plugin for](/docs/connectors/building-connectors/).
between [supported connectors](/docs/using/connectors/list),
and [any datastore you can build a plugin for](/docs/developing/connectors/).

It's written in [Go](https://go.dev/), compiles to a binary, and is designed to
be easy to use and [deploy](/docs/getting-started/installing-and-running?option=binary).
be easy to use and [deploy](/docs/installing-and-running?option=binary).

Out of the box, Conduit comes with:

- A UI
- Common connectors
- Processors
- Observability
- Schema Support

In this getting started guide we'll use a pre-built binary, but Conduit can also be run using [Docker](/docs/getting-started/installing-and-running?option=docker).
In this getting started guide we'll use a pre-built binary, but Conduit can also be run using [Docker](/docs/installing-and-running?option=docker).

## Some of its features

Expand All @@ -49,7 +49,7 @@ allows your data applications to act upon those changes in real-time.
Conduit connectors are plugins that communicate with Conduit via a gRPC
interface. This means that plugins can be written in any language as long as
they conform to the required interface. Check out
our [connector docs](/docs/connectors)!
our [connector docs](/docs/using/connectors/getting-started)!

## Installing

Expand All @@ -63,7 +63,7 @@ curl https://conduit.io/install.sh | bash

If you're not using a macOS or Linux system, you can still install Conduit
following one of the different options provided
in [our installation page](/docs/getting-started/installing-and-running).
in [our installation page](/docs/installing-and-running).

## Starting Conduit
Now that we have Conduit installed let's start it up to see what happens.
Expand Down Expand Up @@ -116,7 +116,7 @@ Now that we have Conduit up and running you can now navigate to `http://localhos
![Conduit Pipeline](/img/conduit/pipeline.png)

## Building a pipeline
While you can provision pipelines via Conduit's UI, the recommended way to do so is using a [pipeline configuation file](/docs/pipeline-configuration-files/getting-started).
While you can provision pipelines via Conduit's UI, the recommended way to do so is using a [pipeline configuration file](/docs/using/pipelines/configuration-file).

For this example we'll create a pipeline that will move data from one file to another.

Expand Down Expand Up @@ -267,9 +267,9 @@ Congratulations! You've pushed data through your first Conduit pipeline.
Looking for more examples? Check out the examples in our [repo](https://github.com/ConduitIO/conduit/tree/main/examples).

Now that you've covered the basics of running Conduit and creating a pipeline, here are a few places to dive in deeper:
- [Connectors](/docs/connectors/getting-started)
- [Pipelines](/docs/pipeline-configuration-files/getting-started)
- [Processors](/docs/processors/getting-started)
- [Conduit Architecture](/docs/getting-started/architecture)
- [Connectors](/docs/using/connectors/getting-started)
- [Pipelines](/docs/using/pipelines/configuration-file)
- [Processors](/docs/using/processors/getting-started)
- [Conduit Architecture](/docs/core-concepts/architecture)

![scarf pixel conduit-site-docs-introduction](https://static.scarf.sh/a.png?x-pxid=01346572-0d57-4df3-8399-1425db913a0a)
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
---
title: "Conduit Architecture"
sidebar_position: 2
slug: '/core-concepts/architecture'
---

Here is an overview of the internal Conduit Architecture.
Expand Down Expand Up @@ -93,14 +93,14 @@ as soon as possible without draining the pipeline.

This layer is used directly by the [Orchestration layer](#orchestration-layer) and indirectly by the [Core layer](#core-layer), and [Schema registry service](#schema-registry-service) (through stores) to persist data. It provides the functionality of creating transactions and storing, retrieving and deleting arbitrary data like configurations or state.

More information on [storage](/docs/features/storage).
More information on [storage](/docs/using/other-features/storage).

## Connector utility services

### Schema registry service

The schema service is responsible for managing the schema of the records that flow through the pipeline. It provides functionality to infer a schema from a record. The schema is stored in the schema store and can be referenced by connectors and processors. By default, Conduit provides a built-in schema registry, but this service can be run separately from Conduit.

More information on [Schema Registry](/docs/features/schema-support#schema-registry).
More information on [Schema Registry](/docs/using/other-features/schema-support#schema-registry).

![scarf pixel conduit-site-docs-introduction](https://static.scarf.sh/a.png?x-pxid=01346572-0d57-4df3-8399-1425db913a0a)
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
---
title: "Pipeline Semantics"
sidebar_position: 6
slug: '/core-concepts/pipeline-semantics'
---

This document describes the inner workings of a Conduit pipeline, its structure, and behavior. It also describes a
Expand Down
3 changes: 3 additions & 0 deletions docs/0-what-is/1-core-concepts/_category_.json
Original file line number Diff line number Diff line change
@@ -0,0 +1,3 @@
{
  "label": "Core concepts"
}
38 changes: 38 additions & 0 deletions docs/0-what-is/1-core-concepts/index.mdx
Original file line number Diff line number Diff line change
@@ -0,0 +1,38 @@
---
title: "Core concepts"
slug: '/core-concepts'
---

## Pipeline

A pipeline receives records from one or more source connectors and pushes them through zero
or more processors until they reach one or more destination connectors.

## Connector

A connector is the internal entity that communicates with a connector plugin and either pushes
records from the plugin into the pipeline (source connector) or the other way around
(destination connector).

## Connector plugin

A connector plugin, sometimes referred to simply as a "plugin", is an external process that communicates with Conduit
and knows how to read/write records from/to a data source/destination (e.g. a database).

## Processor

A component that executes an operation on a single record that flows through the pipeline.
It can either change the record or filter it out based on some criteria.

## OpenCDC Record

A record represents a single piece of data that flows through a pipeline (e.g. one database row).
[More info here](/docs/core-concepts/opencdc-record).

## Collection

A generic term used in Conduit to describe an entity in a 3rd party system from which records
are read or to which records are written. Examples are: topics (in Kafka), tables
(in a database), indexes (in a search engine), collections (in NoSQL databases), etc.

![scarf pixel conduit-site-docs-introduction](https://static.scarf.sh/a.png?x-pxid=01346572-0d57-4df3-8399-1425db913a0a)
Original file line number Diff line number Diff line change
@@ -1,7 +1,8 @@
---
title: 'Getting Started with Pipeline Configuration Files'
title: 'Getting Started'
sidebar_label: "Getting Started"
sidebar_position: 0
slug: '/getting-started'
---

Pipeline configuration files give you the ability to define pipelines that are
Expand All @@ -13,7 +14,7 @@ configurations.

:::tip

In our [Conduit repository](https://github.com/ConduitIO/conduit), you can find [more examples](https://github.com/ConduitIO/conduit/tree/main/examples/pipelines), but to ilustrate a simple use case we'll show a pipeline using a file as a source, and another file as a destination. Check out the different [specifications](/docs/pipeline-configuration-files/specifications) to see the different configuration options.
In our [Conduit repository](https://github.com/ConduitIO/conduit), you can find [more examples](https://github.com/ConduitIO/conduit/tree/main/examples/pipelines), but to illustrate a simple use case we'll show a pipeline using a file as a source, and another file as a destination. Check out the different [specifications](/docs/using/pipelines/configuration-file) to see the different configuration options.

:::

Expand Down
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
---
title: "Installing and running"
sidebar_position: 0
hide_table_of_contents: true
slug: '/installing-and-running'
---

import Tabs from '@theme/Tabs';
Expand Down Expand Up @@ -155,11 +155,11 @@ You should now be able to interact with the Conduit UI and HTTP API on port 8080
## Next Steps

Now that you have Conduit installed you can
learn [how to build a pipeline](/docs/how-to/build-generator-to-log-pipeline).
learn [how to get started](/docs/getting-started).
You can also explore some other topics, such as:

- [Pipelines](/docs/pipeline-configuration-files/getting-started)
- [Connectors](/docs/connectors/getting-started)
- [Processors](/docs/processors/getting-started)
- [Pipelines](/docs/using/pipelines/configuration-file)
- [Connectors](/docs/using/connectors/getting-started)
- [Processors](/docs/using/processors/getting-started)

![scarf pixel conduit-site-docs-running](https://static.scarf.sh/a.png?x-pxid=db6468a8-7998-463e-800f-58a619edd9b3)
94 changes: 94 additions & 0 deletions docs/1-using/1-configuration.mdx
Original file line number Diff line number Diff line change
@@ -0,0 +1,94 @@
---
title: 'How to configure Conduit'
sidebar_label: 'Configuration'
slug: '/configuration'
---

Conduit accepts CLI flags, environment variables and a configuration file to
configure its behavior. Each CLI flag has a corresponding environment variable
and a corresponding field in the configuration file. Conduit uses the value for
each configuration option based on the following priorities:

## CLI flags

**CLI flags** (highest priority) - if a CLI flag is provided it will always be
respected, regardless of the environment variable or configuration file. To
see a full list of available flags run `conduit --help`:


```bash
$ conduit --help
Usage of conduit:
  -api.enabled
        enable HTTP and gRPC API (default true)
  -config string
        global config file (default "conduit.yaml")
  -connectors.path string
        path to standalone connectors' directory (default "./connectors")
  -db.badger.path string
        path to badger DB (default "conduit.db")
  -db.postgres.connection-string string
        postgres connection string, may be a database URL or in PostgreSQL keyword/value format
  -db.postgres.table string
        postgres table in which to store data (will be created if it does not exist) (default "conduit_kv_store")
  -db.sqlite.path string
        path to sqlite3 DB (default "conduit.db")
  -db.sqlite.table string
        sqlite3 table in which to store data (will be created if it does not exist) (default "conduit_kv_store")
  -db.type string
        database type; accepts badger,postgres,inmemory,sqlite (default "badger")
  -grpc.address string
        address for serving the gRPC API (default ":8084")
  -http.address string
        address for serving the HTTP API (default ":8080")
  -log.format string
        sets the format of the logging; accepts json, cli (default "cli")
  -log.level string
        sets logging level; accepts debug, info, warn, error, trace (default "info")
  -pipelines.error-recovery.backoff-factor int
        backoff factor applied to the last delay (default 2)
  -pipelines.error-recovery.max-delay duration
        maximum delay before restart (default 10m0s)
  -pipelines.error-recovery.max-retries int
        maximum number of retries (default -1)
  -pipelines.error-recovery.max-retries-window duration
        amount of time running without any errors after which a pipeline is considered healthy (default 5m0s)
  -pipelines.error-recovery.min-delay duration
        minimum delay before restart (default 1s)
  -pipelines.exit-on-degraded
        exit Conduit if a pipeline enters a degraded state
  -pipelines.path string
        path to the directory that has the yaml pipeline configuration files, or a single pipeline configuration file (default "./pipelines")
  -processors.path string
        path to standalone processors' directory (default "./processors")
  -schema-registry.confluent.connection-string string
        confluent schema registry connection string
  -schema-registry.type string
        schema registry type; accepts builtin,confluent (default "builtin")
  -version
        prints current Conduit version
```
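
To get a feel for how the `pipelines.error-recovery.*` flags above might interact, here is a rough sketch that prints a sequence of restart delays. It assumes the delay starts at `min-delay`, is multiplied by `backoff-factor` after each attempt, and is capped at `max-delay`; that reading comes only from the flag descriptions, and Conduit's actual recovery logic may differ. `backoff_delays` is a hypothetical helper, not a Conduit command, and all values are in seconds.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of Conduit): print one delay per restart attempt.
# Assumption: delay = min(previous delay * factor, max delay), starting at min delay.
backoff_delays() {
  local min_delay="$1" max_delay="$2" factor="$3" attempts="$4"
  local delay="$min_delay" i=1
  while [ "$i" -le "$attempts" ]; do
    printf '%s\n' "$delay"
    delay=$((delay * factor))
    # Cap the delay so it never exceeds the configured maximum.
    if [ "$delay" -gt "$max_delay" ]; then
      delay="$max_delay"
    fi
    i=$((i + 1))
  done
}

# Defaults from the help output: min-delay 1s, max-delay 10m (600s), backoff-factor 2.
backoff_delays 1 600 2 12
```

With the defaults, the delay would double from 1 second until it reaches the 600-second cap, where it stays for every later attempt.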

## Environment variables

**Environment variables** (lower priority) - an environment variable is only
used if no CLI flag is provided for the same option. Environment variables
have the prefix `CONDUIT` and contain underscores instead of dots and
hyphens (e.g. the flag `-db.postgres.connection-string` corresponds
to `CONDUIT_DB_POSTGRES_CONNECTION_STRING`).
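
The naming rule above can be sketched as a small shell function. `flag_to_env` is a hypothetical helper written for illustration and is not shipped with Conduit. It assumes the prefix is joined with an underscore and the name is upper-cased, as in the example above.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of Conduit): derive the environment variable
# name that corresponds to a CLI flag name.
flag_to_env() {
  local name="${1#-}"   # drop one leading dash, if present
  name="${name#-}"      # drop a second dash, for --flag style input
  # Replace dots and hyphens with underscores, upper-case, add the CONDUIT prefix.
  printf 'CONDUIT_%s\n' "$(printf '%s' "$name" | tr '.-' '__' | tr '[:lower:]' '[:upper:]')"
}

flag_to_env -db.postgres.connection-string   # prints: CONDUIT_DB_POSTGRES_CONNECTION_STRING
flag_to_env pipelines.exit-on-degraded       # prints: CONDUIT_PIPELINES_EXIT_ON_DEGRADED
```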

## Configuration file

**Configuration file** (lowest priority) - Conduit by default loads the
file `conduit.yaml` placed in the same folder as Conduit. The path to the file
can be customized using the CLI flag `-config`. It is not required to provide
a configuration file and any value in the configuration file can be overridden
by an environment variable or a flag. The file content should be a YAML
document where keys can be hierarchically split on `.`. For example:

```yaml
db:
  type: postgres # corresponds to flag -db.type and env variable CONDUIT_DB_TYPE
  postgres:
    connection-string: postgres://localhost:5432/conduitdb # -db.postgres.connection-string or CONDUIT_DB_POSTGRES_CONNECTION_STRING
```
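
The priority order described on this page (CLI flag, then environment variable, then configuration file) can be pictured as "first provided value wins". The sketch below is a simplification: `resolve_option` is a hypothetical helper, not a Conduit command, and it treats an empty string as "not provided", which is not necessarily how Conduit decides whether an option was set.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of Conduit): return the first non-empty value.
# Argument order mirrors the priority: flag, environment variable, file, default.
resolve_option() {
  local candidate
  for candidate in "$@"; do
    if [ -n "$candidate" ]; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  return 1
}

# -db.type not passed, CONDUIT_DB_TYPE=sqlite, conduit.yaml says postgres, default is badger.
resolve_option "" "sqlite" "postgres" "badger"   # prints: sqlite
# Nothing set anywhere, so the default applies.
resolve_option "" "" "" "badger"                 # prints: badger
```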
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
---
title: 'OpenCDC record'
sidebar_position: 4
slug: '/core-concepts/opencdc-record'
---

An OpenCDC record in Conduit aims to standardize the format of data records
Expand Down Expand Up @@ -130,7 +130,7 @@ the `opencdc.StructuredData` type.

The supported data types for values in `opencdc.StructuredData` depend on the following:
- connector or processor type (built-in or standalone)
- [schema support](/docs/features/schema-support) (enabled or disabled).
- [schema support](/docs/using/other-features/schema-support) (enabled or disabled).

In built-in connectors, the field values can be of any Go type, given that
there's no (de)serialization involved.
Expand Down Expand Up @@ -357,7 +357,7 @@ The version of the destination plugin that has written the record.
```

### `conduit.dlq.nack.error`
Contains the error that caused a record to be nacked and pushed to the [dead-letter queue (DLQ)](/docs/features/dead-letter-queue).
Contains the error that caused a record to be nacked and pushed to the [dead-letter queue (DLQ)](/docs/using/other-features/dead-letter-queue).

### `conduit.dlq.nack.node.id`
The ID of the internal node that nacked the record.