renameDocsForKafakConnector #42775

Merged · 1 commit · Nov 1, 2024
4 changes: 2 additions & 2 deletions sdk/cosmos/azure-cosmos-kafka-connect/dev/README_Sink.md
@@ -167,8 +167,8 @@ To delete the created Azure Cosmos DB service and its resource group using Azure

The following settings are used to configure the Cosmos DB Kafka Sink Connector. These configuration values determine which Kafka topics data is consumed from, which Cosmos DB containers data is written into, and the formats used to serialize the data. For an example configuration file with the default values, refer to [this config](https://github.com/Azure/azure-sdk-for-java/blob/main/sdk/cosmos/azure-cosmos-kafka-connect/src/docker/resources/sink.example.json).

- - [Generic Configs For Sink And Source](https://github.com/Azure/azure-sdk-for-java/blob/main/sdk/cosmos/azure-cosmos-kafka-connect/doc/configuration-reference.md#generic-configurations)
- - [Configs only for Sink](https://github.com/Azure/azure-sdk-for-java/blob/main/sdk/cosmos/azure-cosmos-kafka-connect/doc/configuration-reference.md#sink-connector-configurations)
+ - [Generic Configs For Sink And Source]<!--(https://github.com/Azure/azure-sdk-for-java/blob/main/sdk/cosmos/azure-cosmos-kafka-connect/doc/configuration-reference.md#generic-configurations)-->
+ - [Configs only for Sink]<!--(https://github.com/Azure/azure-sdk-for-java/blob/main/sdk/cosmos/azure-cosmos-kafka-connect/doc/configuration-reference.md#sink-connector-configurations)-->

Data will always be written to Cosmos DB as JSON without any schema.

4 changes: 2 additions & 2 deletions sdk/cosmos/azure-cosmos-kafka-connect/dev/README_Source.md
@@ -167,5 +167,5 @@ To delete the created Azure Cosmos DB service and its resource group using Azure

The following settings are used to configure the Cosmos DB Kafka Source Connector. These configuration values determine which Cosmos DB container is consumed from, which Kafka topics data is written into, and the formats used to serialize the data. For an example configuration file with the default values, refer to [this config](https://github.com/Azure/azure-sdk-for-java/blob/main/sdk/cosmos/azure-cosmos-kafka-connect/src/docker/resources/source.example.json).

- - [Generic Configs For Sink And Source](https://github.com/Azure/azure-sdk-for-java/blob/main/sdk/cosmos/azure-cosmos-kafka-connect/doc/configuration-reference.md#generic-configurations)
- - [Configs only for Source](https://github.com/Azure/azure-sdk-for-java/blob/main/sdk/cosmos/azure-cosmos-kafka-connect/doc/configuration-reference.md#source-connector-configurations)
+ - [Generic Configs For Sink And Source]<!--(https://github.com/Azure/azure-sdk-for-java/blob/main/sdk/cosmos/azure-cosmos-kafka-connect/doc/configuration-reference.md#generic-configurations)-->
+ - [Configs only for Source]<!--(https://github.com/Azure/azure-sdk-for-java/blob/main/sdk/cosmos/azure-cosmos-kafka-connect/doc/configuration-reference.md#source-connector-configurations)-->
132 changes: 132 additions & 0 deletions sdk/cosmos/azure-cosmos-kafka-connect/docs/Confluent_Cloud_Setup.md
@@ -0,0 +1,132 @@
# Confluent Cloud Setup

This guide walks through setting up a Confluent Cloud cluster and running the Cosmos DB Kafka connector integration tests and workloads against it.

## Prerequisites

- Bash shell
- Will not work in Cloud Shell or WSL1
- Java 11+ ([download](https://www.oracle.com/java/technologies/javase-jdk11-downloads.html))
- Maven ([download](https://maven.apache.org/download.cgi))
- Docker ([download](https://www.docker.com/products/docker-desktop))
- Cosmos DB instance ([Setting up an Azure Cosmos DB Instance](https://github.com/Azure/azure-sdk-for-java/blob/main/sdk/cosmos/azure-cosmos-kafka-connect/dev/CosmosDB_Setup.md))

## Setup

### Create Confluent Cloud Account and Setup Cluster
Go to [create account](https://www.confluent.io/get-started/) and fill out the appropriate fields.

![SignupConfluentCloud](images/SignUpConfluentCloud.png)

---

Select environments.

![EnvironmentClick](images/environment-click.png)

---

Select the default environment, which is automatically set up by Confluent.

![DefaultClick](images/click-default.png)

---

- Select add cluster.

![Add Cluster](images/click-add-cluster.png)

---

- Select Azure as the cloud provider, and choose the same region as the Cosmos DB instance you created.

![Select Azure](images/select-azure.png)

---

- Name the cluster, and then select launch cluster.

![Name and Launch](images/select-name-launch.png)


### Create ksqlDB Cluster
From inside the cluster, select ksqlDB, then select add cluster. Select continue, name the cluster, and then select launch.

![ksqlDB](images/select-ksqlDB.png)

### Update Configurations
- The cluster key and secret can be found under API keys in the cluster; choose the one for ksqlDB. Alternatively, generate a client config using CLI and Tools. ![CLI and Tools](images/cli-and-tools.png)
- The `BOOTSTRAP_SERVERS` endpoint can be found in the cluster under cluster settings and endpoints, or in the same generated client config; a sketch of that config appears after this list.
- The schema registry key and secret can be found on the bottom of the right panel inside the confluent environment under credentials.
- The schema registry url can be found on the bottom of the right panel inside the confluent environment under Endpoint.

![Schema Registry url](images/schema-registry.png)
![Schema Registry key and secret](images/schema-key-and-secret.png)
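For reference, a client configuration generated through CLI and Tools typically has the following shape — the values below are illustrative placeholders, not real credentials:

```
bootstrap.servers=pkc-xxxxx.centralus.azure.confluent.cloud:9092
security.protocol=SASL_SSL
sasl.mechanism=PLAIN
sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username="<CLUSTER_API_KEY>" password="<CLUSTER_API_SECRET>";
```

The `sasl.jaas.config` line is typically the value to use for `SASL_JAAS` in the integration test properties below.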

### Run Integration Tests
To run the integration tests against a Confluent Cloud cluster, create `~/kafka-cosmos-local.properties` with the following content:
```
ACCOUNT_HOST=[emulator endpoint or your cosmos endpoint]
ACCOUNT_KEY=[emulator masterKey or your cosmos masterKey]
ACCOUNT_TENANT_ID=[update if AAD auth is required in the integration tests]
ACCOUNT_AAD_CLIENT_ID=[update if AAD auth is required in the integration tests]
ACCOUNT_AAD_CLIENT_SECRET=[update if AAD auth is required in the integration tests]
SASL_JAAS=[credential configured on the confluent cloud cluster]
BOOTSTRAP_SERVER=[bootstrap server endpoint of the confluent cloud cluster]
SCHEMA_REGISTRY_URL=[schema registry url of the cloud cluster]
SCHEMA_REGISTRY_KEY=[schema registry key of the cloud cluster]
SCHEMA_REGISTRY_SECRET=[schema registry secret of the cloud cluster]
CONNECT_CONFIG_STORAGE_REPLICATION_FACTOR=3
CONNECT_OFFSET_STORAGE_REPLICATION_FACTOR=3
CONNECT_STATUS_STORAGE_REPLICATION_FACTOR=3
```
Integration tests have the ITest suffix. Use the following command to run them ([create the topic ahead of time](#create-topic-in-confluent-cloud-ui)):
```bash
mvn -e -Dgpg.skip -Dmaven.javadoc.skip=true -Dcodesnippet.skip=true -Dspotbugs.skip=true -Dcheckstyle.skip=true -Drevapi.skip=true -pl ,azure-cosmos-kafka-connect test package -Pkafka-integration
```
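To run a single integration test class rather than the whole suite, Maven's standard `-Dtest` filter can be appended; the class name below is hypothetical — substitute one of the actual *ITest classes:

```bash
# run only one integration test class (class name is a placeholder)
mvn -e -Dgpg.skip -Dmaven.javadoc.skip=true -Dcodesnippet.skip=true -Dspotbugs.skip=true -Dcheckstyle.skip=true -Drevapi.skip=true -pl ,azure-cosmos-kafka-connect -Dtest="CosmosSinkConnectorITest" test -Pkafka-integration
```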

### Run a local sink/source workload using Confluent Platform
- Follow [Install Confluent Platform using ZIP and TAR](https://docs.confluent.io/platform/current/installation/installing_cp/zip-tar.html#prod-kafka-cli-install) to download Confluent Platform
- Copy src/docker/resources/sink.example.json to the unzipped Confluent folder
- Copy src/docker/resources/source.example.json to the unzipped Confluent folder
- Update sink.example.json and source.example.json with your Cosmos DB endpoint (see the sketch after this list)
- Build the Cosmos DB Kafka connector jar
```bash
mvn -e -DskipTests -Dgpg.skip -Dmaven.javadoc.skip=true -Dcodesnippet.skip=true -Dspotbugs.skip=true -Dcheckstyle.skip=true -Drevapi.skip=true -pl ,azure-cosmos,azure-cosmos-tests -am clean install
mvn -e -DskipTests -Dgpg.skip -Dmaven.javadoc.skip=true -Dcodesnippet.skip=true -Dspotbugs.skip=true -Dcheckstyle.skip=true -Drevapi.skip=true -pl ,azure-cosmos-kafka-connect clean install
```
- Copy the built connector jar into the plugin path folder (the plugin.path config in etc/distributed.properties points to it)
- `cd` into the unzipped Confluent folder
- Update the etc/distributed.properties file with your Confluent Cloud cluster config
- Run `./bin/connect-distributed ./etc/distributed.properties`
- Start your sink or source connector: ```curl -s -H "Content-Type: application/json" -X POST -d @<path-to-JSON-config-file> http://localhost:8083/connectors/ | jq .```
- Monitor the logs for exceptions, and monitor throughput and other metrics from your Confluent Cloud cluster
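
For reference, the JSON config posted to the REST endpoint has roughly the shape below. This is a minimal sketch: the property names are assumptions for illustration only, and sink.example.json together with the configuration reference remains the authoritative source.

```json
{
  "name": "cosmosdb-sink-connector-v2",
  "config": {
    "connector.class": "com.azure.cosmos.kafka.connect.CosmosSinkConnector",
    "topics": "kafka",
    "azure.cosmos.account.endpoint": "https://<your-account>.documents.azure.com:443/",
    "azure.cosmos.account.key": "<your-account-key>",
    "azure.cosmos.sink.database.name": "kafkaconnect",
    "azure.cosmos.sink.containers.topicMap": "kafka#kafka"
  }
}
```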

> If you want to delete your connector: ```curl -X DELETE http://localhost:8083/connectors/cosmosdb-source-connector-v2```. The connector name should match the one in your JSON config.

> If you want to restart your connector: ```curl -s -H "Content-Type: application/json" -X POST http://localhost:8083/connectors/cosmosdb-source-connector-v2/restart | jq .```

> Follow [Kafka Connect REST Interface for Confluent Platform](https://docs.confluent.io/platform/current/connect/references/restapi.html) to check other options.
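
> To check the status of your connector and its tasks: ```curl -s http://localhost:8083/connectors/cosmosdb-source-connector-v2/status | jq .```. As above, the connector name should match the one in your JSON config.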

### Create Topic in Confluent Cloud UI
For some cluster types, you will need to create the topic ahead of time. You can use the UI or the [Confluent CLI](https://docs.confluent.io/cloud/current/client-apps/topics/manage.html#:~:text=Confluent%20CLI%20Follow%20these%20steps%20to%20create%20a,aren%E2%80%99t%20any%20topics%20created%20yet%2C%20click%20Create%20topic.) (requires installing the Confluent CLI first); a CLI sketch appears at the end of this section.

Inside the Cluster Overview, scroll down and select topics and partitions.

![topic-partition](images/Topics-Partitions.png)

---

Select add topic.

![add-topic](images/add-topic.png)

---

Name the topic and select create with defaults. Afterward, a prompt will appear about creating a schema. This can be
skipped as the tests will create the schemas.
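
If you would rather use the Confluent CLI, a topic can be created along these lines — a sketch that assumes you are already logged in; the cluster id, topic name, and partition count are placeholders:

```bash
# point the CLI at your cluster (id is a placeholder)
confluent kafka cluster use lkc-xxxxx

# create the topic used by the tests (name/partitions are placeholders)
confluent kafka topic create kafka --partitions 6
```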

## Resources to Improve Infrastructure
- [Docker Configurations](https://docs.confluent.io/platform/current/installation/docker/config-reference.html)
- [Configuration Options](https://docs.confluent.io/platform/current/installation/configuration/index.html)
- [Connect Confluent Platform Components to Confluent Cloud](https://docs.confluent.io/cloud/current/cp-component/index.html)
68 changes: 68 additions & 0 deletions sdk/cosmos/azure-cosmos-kafka-connect/docs/Confluent_Platform_Setup.md
@@ -0,0 +1,68 @@
# Confluent Platform Setup

This guide walks through setting up Confluent Platform using Docker containers.

## Prerequisites

- Bash shell
- Will not work in Cloud Shell or WSL1
- Java 11+ ([download](https://www.oracle.com/java/technologies/javase-jdk11-downloads.html))
- Maven ([download](https://maven.apache.org/download.cgi))
- Docker ([download](https://www.docker.com/products/docker-desktop))
- Powershell (optional) ([download](https://learn.microsoft.com/powershell/scripting/install/installing-powershell))

### Startup

> Running either script for the first time may take several minutes while the Docker images for the Confluent Platform components are downloaded.
```bash

cd $REPO_ROOT/src/docker

# Option 1: Use the bash script to setup
./startup.sh

# Option 2: Use the powershell script to setup
pwsh startup.ps1

# verify the services are up and running
docker-compose ps

```

> You may need to increase the memory allocation for Docker to 3 GB or more.
>
> Rerun the startup script to reinitialize the Docker containers.

Your Confluent Platform setup is now ready to use!

### Running Kafka Connect standalone mode

The Kafka Connect container included with the Confluent Platform setup runs Kafka Connect in `distributed mode`. Running Kafka Connect in `distributed mode` is *recommended*, since you can interact with connectors using the Control Center UI.

If you instead would like to run Kafka Connect as `standalone mode`, which is useful for quick testing, continue through this section. For more information on Kafka Connect standalone and distributed modes, refer to these [Confluent docs](https://docs.confluent.io/home/connect/userguide.html#standalone-vs-distributed-mode).
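
A minimal sketch of launching a standalone worker from the unzipped Confluent Platform folder — standalone mode takes a worker config followed by one or more connector property files as positional arguments; the connector properties file name here is illustrative:

```bash
# worker config first, then connector property files
./bin/connect-standalone ./etc/schema-registry/connect-avro-standalone.properties ./etc/cosmos-sink.properties
```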

### Access Confluent Platform components

| Name | Address | Description |
| --- |-------------------------| --- |
| Control Center | <http://localhost:9021> | The main webpage for all Confluent services where you can create topics, configure connectors, interact with the Connect cluster (only for distributed mode) and more. |
| Kafka Topics UI | <http://localhost:9000> | Useful for viewing Kafka topics and the messages within them. |
| Schema Registry UI | <http://localhost:9001> | Can view and create new schemas, ideal for interacting with Avro data. |
| ZooNavigator | <http://localhost:9004> | Web interface for Zookeeper. Refer to the [docs](https://zoonavigator.elkozmon.com/en/stable/) for more information. |

### Cleanup

Tear down the Confluent Platform setup and clean up any unneeded resources:

```bash

cd $REPO_ROOT/src/docker

# bring down all docker containers
docker-compose down

# remove dangling volumes and networks
docker system prune -f --volumes --filter "label=io.confluent.docker"

```
95 changes: 95 additions & 0 deletions sdk/cosmos/azure-cosmos-kafka-connect/docs/CosmosDB_Setup.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,95 @@
# Setting up an Azure Cosmos DB Instance

## Prerequisites

- Azure subscription with permissions to create:
- Resource Groups, Cosmos DB
- Bash shell (tested on Visual Studio Codespaces, Cloud Shell, Mac, Ubuntu, Windows with WSL2)
- Will not work with WSL1
- Azure CLI ([download](https://learn.microsoft.com/cli/azure/install-azure-cli?view=azure-cli-latest))

## Create Azure Cosmos DB Instance, Database and Container

Login to Azure and select subscription.

```bash

az login

# show your Azure accounts
az account list -o table

# select the Azure subscription if necessary
az account set -s {subscription name or Id}

```

Create a new Azure Resource Group for this quickstart, then add a Cosmos DB Account, Database and Container to it using the Azure CLI.

> The `az cosmosdb sql` extension is currently in preview and is subject to change

```bash

# replace with a unique name
# do not use punctuation or uppercase (a-z, 0-9)
export Cosmos_Name={your Cosmos DB name}

# if the check returns true, change the name to avoid a DNS failure on create
az cosmosdb check-name-exists -n ${Cosmos_Name}

# set environment variables
export Cosmos_Location="centralus"
export Cosmos_Database="kafkaconnect"
export Cosmos_Container="kafka"

# Resource Group Name
export Cosmos_RG=${Cosmos_Name}-rg-cosmos

# create a new resource group
az group create -n $Cosmos_RG -l $Cosmos_Location

# create the Cosmos DB server
# this command takes several minutes to run
az cosmosdb create -g $Cosmos_RG -n $Cosmos_Name

# create the database
# 400 is the minimum --throughput (RUs)
az cosmosdb sql database create -a $Cosmos_Name -n $Cosmos_Database -g $Cosmos_RG --throughput 400

# create the container
# /id is the partition key (case sensitive)
az cosmosdb sql container create -p /id -g $Cosmos_RG -a $Cosmos_Name -d $Cosmos_Database -n $Cosmos_Container

# OPTIONAL: Enable Time to Live (TTL) on the container
export Cosmos_Container_TTL=1000
az cosmosdb sql container update -g $Cosmos_RG -a $Cosmos_Name -d $Cosmos_Database -n $Cosmos_Container --ttl=$Cosmos_Container_TTL

```

With the Azure Cosmos DB instance set up, you will need to get the Cosmos DB endpoint URI and primary connection key. These values will be used to set up the Cosmos DB Source and Sink connectors.

```bash

# Keep note of both of the following values as they will be used later

# get Cosmos DB endpoint URI
echo https://${Cosmos_Name}.documents.azure.com:443/

# get Cosmos DB primary connection key
az cosmosdb keys list -n $Cosmos_Name -g $Cosmos_RG --query primaryMasterKey -o tsv

```
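
If you want to capture these values for the connector setup steps later, they can be stored in environment variables — the variable names here are illustrative:

```bash

# store the endpoint URI and primary key for later use
export Cosmos_Endpoint="https://${Cosmos_Name}.documents.azure.com:443/"
export Cosmos_Key=$(az cosmosdb keys list -n $Cosmos_Name -g $Cosmos_RG --query primaryMasterKey -o tsv)

```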

### Cleanup

Remove the Cosmos DB instance and the associated resource group

```bash

# delete Cosmos DB instance
az cosmosdb delete -g $Cosmos_RG -n $Cosmos_Name

# delete Cosmos DB resource group
az group delete --no-wait -y -n $Cosmos_RG

```