Investigate support for distributed kafka connect #109
Comments
This needs to be mulled over a bit. Pros and cons as I see them, using the current Dockerfiles:
I can see this as an alternate run mode that could be added to the respective charts. The operator would then be in charge of sending the appropriate JSON file every time a pod restarts, which looks very error-prone to me. Alternatively, each relevant Dockerfile is adapted to use distributed mode. The entry point script then polls until the distributed connector engine has started and sends a JSON file from a predefined path. If you could make the relevant change to the connector you want to run in distributed mode, we could adapt the Helm chart to cater for this change.
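To make the second option concrete, here is a rough sketch of what the polling step in an entry point could look like. It is not taken from any of the existing Dockerfiles. It assumes the Kafka Connect worker exposes its REST API on port 8083, and that the JSON file has the usual `{"name": ..., "config": {...}}` shape. The function names, the `CONNECT_URL` and `CONNECTOR_JSON` environment variables, and the default path are all hypothetical. It uses `PUT /connectors/{name}/config` rather than `POST /connectors`, so that re-sending the file after a pod restart updates the connector instead of failing because it already exists.

```python
import json
import os
import time
import urllib.error
import urllib.parse
import urllib.request


def _http_probe(base_url):
    """Return True once the Connect REST root answers with HTTP 200."""
    try:
        with urllib.request.urlopen(base_url + "/", timeout=5) as resp:
            return resp.status == 200
    except OSError:  # URLError and connection errors are OSError subclasses
        return False


def wait_for_connect(base_url, timeout_s=120.0, interval_s=2.0,
                     probe=_http_probe, sleep=time.sleep):
    """Poll until the worker is up; return False if timeout_s elapses first."""
    deadline = time.monotonic() + timeout_s
    while True:
        if probe(base_url):
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval_s)


def build_register_request(base_url, connector_json):
    """Build an idempotent PUT /connectors/{name}/config request."""
    name = urllib.parse.quote(connector_json["name"], safe="")
    body = json.dumps(connector_json["config"]).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/connectors/{name}/config",
        data=body,
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def main():
    # Hypothetical environment variables and default path, for illustration only.
    base_url = os.environ.get("CONNECT_URL", "http://localhost:8083")
    path = os.environ.get("CONNECTOR_JSON", "/etc/connector/connector.json")
    if not wait_for_connect(base_url):
        raise SystemExit("Kafka Connect REST API did not come up in time")
    with open(path, encoding="utf-8") as fh:
        connector_json = json.load(fh)
    request = build_register_request(base_url, connector_json)
    with urllib.request.urlopen(request, timeout=10) as resp:
        print(f"connector registered, HTTP {resp.status}")
```

The entry point would start the distributed worker in the background, call `main()`, and then wait on the worker process so the container stays alive.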
If we alter the Dockerfiles to use distributed mode, can we still run them with only a single instance?
If it automatically starts up the actual component during startup, then from the Kubernetes point of view it would be the same. The only difference is that it would then store offsets in Kafka instead of in a persistent volume.
The Kafka connectors (e.g. Fitbit) are currently deployed in standalone mode. To take full advantage of the scalability of Kubernetes, they can be deployed in distributed mode, in which the connectors themselves are stateless and store their state in Kafka. At KCL we have some use cases where running in distributed mode is necessary.