
Rework the Kafka Guide #124

Merged: 2 commits, Sep 17, 2024

Conversation

@matkuliak (Contributor) commented on Sep 16, 2024

Summary of the changes / Why this is an improvement

The original guide no longer worked. This version is similar, just dockerized and updated.

Preview

https://cratedb-guide--124.org.readthedocs.build/integrate/etl/kafka-connect.html


@amotl (Member) left a comment

Hi. Thanks a stack for bringing this up to speed. I did not verify the procedure, but the document looks good. In two spots, I added suggestions about converting to active voice, but overall it is good to go.

Comment on lines -149 to -222
For this example, this Python script will simulate the creation
of random sensor data and push it into the ``metrics`` topic:

.. code-block:: python

   import time
   import random

   from confluent_kafka import avro
   from confluent_kafka.avro import AvroProducer

   # Define the Avro schema we want our produced records to conform to.
   VALUE_SCHEMA_STR = """
   {
       "namespace": "cratedb.metrics",
       "name": "value",
       "type": "record",
       "fields": [
           {"name": "id", "type": "string"},
           {"name": "timestamp", "type": "float"},
           {"name": "payload", "type": {
               "type": "record",
               "name": "payload",
               "fields": [
                   {"name": "temperature", "type": "float"},
                   {"name": "humidity", "type": "float"},
                   {"name": "pressure", "type": "float"},
                   {"name": "luminosity", "type": "float"}
               ]
           }}
       ]
   }
   """

   # Load the Avro schema.
   VALUE_SCHEMA = avro.loads(VALUE_SCHEMA_STR)

   # Create an Avro producer using the defined schema, assuming that our
   # Kafka servers are running at localhost:9092 and the Schema Registry
   # server is running at localhost:8081.
   AVRO_PRODUCER = AvroProducer(
       {
           "bootstrap.servers": "localhost:9092",
           "schema.registry.url": "http://localhost:8081",
       },
       default_value_schema=VALUE_SCHEMA,
   )

   # Create a metric payload from a simulated sensor device.
   def create_metric():
       return {
           "id": "sensor-" + str(random.choice(list(range(1, 21)))),
           "timestamp": int(time.time()),
           "payload": {
               "temperature": random.uniform(-5, 35),
               "humidity": random.uniform(0, 100),
               "pressure": random.uniform(1000, 1030),
               "luminosity": random.uniform(0, 65000),
           },
       }

   # Create a new metric every 0.25 seconds and push it to the metrics topic.
   while True:
       AVRO_PRODUCER.produce(topic="metrics", value=create_metric())
       AVRO_PRODUCER.flush()
       time.sleep(0.25)

Install the dependencies, then run the script:

.. code-block:: console

   $ pip install "confluent-kafka[avro]" "avro-python3"
   $ python simulator.py
Member

I am sad to see this Python snippet removed. Maybe we can bring it back in one way or another, if not within this tutorial, maybe at another spot.

@matkuliak (Contributor, Author)

Absolutely. The current setup is pretty crude; I'm sure we can improve it with a similar script.
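A simpler, Avro-free variant of such a script might look like the sketch below. It is illustrative only: the topic name `metrics`, the broker address `localhost:9092`, and the metric shape are carried over from the removed snippet, and `confluent-kafka` must be installed for the producing part to run.

```python
import json
import random
import time


def create_metric():
    """Build one simulated sensor reading, mirroring the removed snippet."""
    return {
        "id": "sensor-" + str(random.choice(range(1, 21))),
        "timestamp": int(time.time()),
        "payload": {
            "temperature": random.uniform(-5, 35),
            "humidity": random.uniform(0, 100),
            "pressure": random.uniform(1000, 1030),
            "luminosity": random.uniform(0, 65000),
        },
    }


def run(bootstrap="localhost:9092", topic="metrics", interval=0.25):
    # Import deferred so create_metric() is usable without Kafka installed.
    from confluent_kafka import Producer

    producer = Producer({"bootstrap.servers": bootstrap})
    while True:
        # Plain JSON payload instead of Avro; no Schema Registry needed.
        producer.produce(topic, value=json.dumps(create_metric()))
        producer.flush()
        time.sleep(interval)


if __name__ == "__main__":
    run()
```

Dropping Avro trades schema enforcement for a smaller dependency footprint, which may fit a dockerized tutorial better.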

Comment on lines +4 to +6
This guide describes a dockerized procedure for integrating CrateDB with Kafka
Connect. By following these steps, you will set up a pipeline to ingest data
from Kafka topics into CrateDB seamlessly.
Member

Active voice, very good.
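The pipeline the quoted excerpt describes is typically wired up by registering a sink connector with the Kafka Connect REST API. The sketch below is an assumption-laden illustration, not taken from the guide: the connector name `cratedb-sink`, the connector class, the connection URL, and the Connect endpoint `localhost:8083` are all hypothetical placeholders.

```python
import json
import urllib.request

# Illustrative values only; the real connector class, connection URL,
# and endpoint come from the guide's dockerized setup.
CONNECT_URL = "http://localhost:8083/connectors"
CONNECTOR = {
    "name": "cratedb-sink",  # hypothetical connector name
    "config": {
        "connector.class": "io.confluent.connect.jdbc.JdbcSinkConnector",
        "connection.url": "jdbc:crate://localhost:5432/",  # assumed CrateDB JDBC URL
        "topics": "metrics",
        "tasks.max": "1",
        "auto.create": "true",
    },
}


def register(url=CONNECT_URL, payload=CONNECTOR):
    """POST the connector definition to the Kafka Connect REST API."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Posting the same definition twice returns an error from Connect, so tutorials often pair this with a `DELETE /connectors/<name>` cleanup step.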

docs/integrate/etl/kafka-connect.md: two outdated review threads, resolved.
@matkuliak matkuliak merged commit eb62278 into main Sep 17, 2024
4 checks passed
@matkuliak matkuliak deleted the mm/kafka-connect-update branch September 17, 2024 12:32