Example of three Kafka brokers depending on five Zookeeper instances.
To get consistent service DNS names kafka-N.broker.kafka (.svc.cluster.local), run everything in a namespace:
kubectl create -f 00namespace.yml
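The manifest presumably contains nothing more than the namespace itself; a minimal sketch (the real file may differ):

apiVersion: v1
kind: Namespace
metadata:
  name: kafka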
You may add a storage class to the kafka StatefulSet declaration to enable automatic volume provisioning.
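A sketch of what that could look like in the StatefulSet's volume claim template; the claim name datadir matches the PV names used below, but the standard storage class and the 10Gi size are assumptions, not values from this repo:

volumeClaimTemplates:
- metadata:
    name: datadir
    annotations:
      volume.beta.kubernetes.io/storage-class: standard
  spec:
    accessModes: [ "ReadWriteOnce" ]
    resources:
      requests:
        storage: 10Gi

On newer clusters the annotation has been replaced by the spec.storageClassName field.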
Alternatively, create PVs and PVCs manually; for example, in Minikube:
./bootstrap/pv.sh
kubectl create -f ./bootstrap/pvc.yml
# check that claims are bound
kubectl get pvc
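On Minikube the bootstrap script presumably creates hostPath volumes matching those claims; a single PV could look roughly like this (the name follows the datadir-kafka-N pattern used in teardown below; path and size are illustrative):

apiVersion: v1
kind: PersistentVolume
metadata:
  name: datadir-kafka-0
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: /data/datadir-kafka-0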
There is a Zookeeper+StatefulSet blog post and example, but it appears tuned for workloads heavier than Kafka topic metadata.
The Kafka book (Definitive Guide, O'Reilly 2016) recommends that Kafka have its own Zookeeper cluster, so we use the official Docker image, but with a startup script change that guesses the node id from the hostname.
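The gist of that startup change is roughly the following (a sketch, assuming each instance's hostname ends with its node id, e.g. zookeeper-1; the actual script may differ):

# derive the Zookeeper node id from the trailing number in the hostname
HOST=$(hostname -s)
# ZOO_DATA_DIR is set by the official image
echo "${HOST##*-}" > "$ZOO_DATA_DIR/myid"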
Zookeeper runs as a Deployment without persistent storage:
kubectl create -f ./zookeeper/
If you lose your Zookeeper cluster, Kafka will be unaware that persisted topics exist. The data is still there, but you need to re-create the topics.
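Re-creating a topic can be done with Kafka's stock scripts, for example (the zookeeper service address, partition count and replication factor here are assumptions):

kafka-topics.sh --zookeeper zookeeper:2181 --create --topic test1 --partitions 1 --replication-factor 3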
Assuming your PVCs are Bound, or that you enabled automatic provisioning (see above), go ahead and:
kubectl create -f ./
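StatefulSet pods are created in order, so you can watch kafka-0 through kafka-2 come up one by one:

kubectl get pods -w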
You might want to verify in the logs that Kafka found its own DNS name(s) correctly. Look for records like:
kubectl logs kafka-0 | grep "Registered broker"
# INFO Registered broker 0 at path /brokers/ids/0 with addresses: PLAINTEXT -> EndPoint(kafka-0.broker.kafka.svc.cluster.local,9092,PLAINTEXT)
There's a Kafka pod that doesn't start the server, so you can use it to invoke the various shell scripts:
kubectl create -f test/99testclient.yml
See ./test/test.sh for some sample commands.
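For example, to send and read a message through the console scripts (the pod name testclient and the script location are assumptions; check the manifest and test.sh for the real invocations):

kubectl exec -ti testclient -- ./bin/kafka-console-producer.sh --broker-list kafka-0.broker.kafka.svc.cluster.local:9092 --topic test1
kubectl exec -ti testclient -- ./bin/kafka-console-consumer.sh --bootstrap-server kafka-0.broker.kafka.svc.cluster.local:9092 --topic test1 --from-beginning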
Topic creation has been automated, but it's still WIP. Note that, as a Job, it will restart if the command fails, including when the topic already exists :(
kubectl create -f test/11topic-create-test1.yml
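One way to make the Job's command idempotent is the --if-not-exists flag of kafka-topics.sh, supported in recent Kafka versions (the zookeeper address is an assumption):

kafka-topics.sh --zookeeper zookeeper:2181 --create --if-not-exists --topic test1 --partitions 1 --replication-factor 3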
Pods that keep consuming messages (but won't exit on cluster failures):
kubectl create -f test/21consumer-test1.yml
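The consumed messages go to the pod's log; the pod name is generated, so look it up first:

kubectl get pods | grep consumer
kubectl logs --follow <consumer pod name>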
Testing and retesting... delete the namespace. PVs are outside namespaces, so delete them too:
kubectl delete namespace kafka
rm -R ./data/ && kubectl delete pv datadir-kafka-0 datadir-kafka-1 datadir-kafka-2