not able to run Kong Gateway Operator (KGO) #1068

Closed
joran-fonjallaz opened this issue May 16, 2024 · 6 comments

Comments

@joran-fonjallaz

following the official docs, KGO remains in a broken state: the controller-manager fails with an error, and the controlplane and dataplane deployments never become ready.

Steps to reproduce. Create a new GKE cluster. No special config.

  1. copy-paste the commands from Install KIC with Kong Gateway Operator
  2. copy-paste the commands from Create a GatewayClass
  3. copy-paste the commands from Create a Route

the gateway-operator (container manager) throws a few errors, such as

"Internal error occurred: failed calling webhook "gateway-operator-validation.konghq.com": failed to call webhook: Post "https://gateway-operator-validating-webhook.kong-system.svc:443/validate?timeout=5s": no endpoints available for service "gateway-operator-validating-webhook""

with the stack trace

"sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
	/home/runner/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:329
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
	/home/runner/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:266
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
	/home/runner/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:227"

or messages such as

level: "info"
logger: "controlplane"
msg: "no ingress services found for dataplane"
name: "kong"
namespace: "default"

the dataplane never becomes ready with Readiness probe failed: HTTP probe failed with statuscode: 503

and similarly for the controlplane with Readiness probe failed: HTTP probe failed with statuscode: 404.

I've also tried playing with the values.yaml, but after many hours I'm resorting to asking for help. I can't even make it work by following your docs on a fresh GKE cluster.

What am I missing? Any pointers would be greatly appreciated! Many thanks.

@joran-fonjallaz
Author

providing more info.

Controlplane

the response on the controlplane's readiness probe at :10254/readyz is a 404 page not found. The container logs show that the controlplane fails to reach the dataplane's admin API on port 8444:

2024-05-19T07:28:04Z	info	setup	Retrying kong admin api client call after error	{"v": 0, "retries": "50/60", "error": "making HTTP request: Get \"https://10-52-2-37.dataplane-admin-kong-wgz7n-mpvzh.default.svc:8444/\": dial tcp: lookup 10-52-2-37.dataplane-admin-kong-wgz7n-mpvzh.default.svc on 10.17.176.10:53: no such host"}

the format of the URI looks wrong https://10-52-2-37.dataplane-admin-kong-wgz7n-mpvzh.default.svc:8444, but maybe it's only the log format. I didn't dive into the operator's source code.
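For readers wondering the same thing about the hostname: judging only from the log line above, it appears to follow the pattern `<pod IP with dashes>.<service>.<namespace>.svc`. The sketch below reconstructs and decomposes such a name. The function names are hypothetical, and the pattern is inferred from this one log entry rather than taken from the operator's source code.

```python
import ipaddress


def service_scoped_pod_dns(pod_ip: str, service: str, namespace: str) -> str:
    """Build a name of the form <dashed-pod-ip>.<service>.<namespace>.svc.

    Assumption: this format is inferred from the log line in this issue.
    """
    ip = ipaddress.IPv4Address(pod_ip)  # raises ValueError on a malformed address
    return f"{str(ip).replace('.', '-')}.{service}.{namespace}.svc"


def split_service_scoped_pod_dns(host: str) -> tuple[str, str, str]:
    """Recover (pod_ip, service, namespace) from such a name."""
    dashed_ip, service, namespace, suffix = host.split(".")
    if suffix != "svc":
        raise ValueError(f"not a service-scoped name: {host!r}")
    pod_ip = str(ipaddress.IPv4Address(dashed_ip.replace("-", ".")))
    return pod_ip, service, namespace


# Values taken from the controlplane log in this issue.
host = service_scoped_pod_dns("10.52.2.37", "dataplane-admin-kong-wgz7n-mpvzh", "default")
print(f"https://{host}:8444/")
```

Under that assumption the URI in the log is well-formed, so the failure would lie in the DNS lookup rather than in how the name is written.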

Dataplane

the response on the dataplane's readiness probe at :8100/status/ready is a 503 Service Temporarily Unavailable with body {"message":"no configuration available (empty configuration present)"}

which makes sense, since the controlplane cannot reach the dataplane on its admin port and thus cannot configure it.

@pmalek
Member

pmalek commented May 21, 2024

Hi @joran-fonjallaz,

Due to limitations in kube-dns (which is used by default on GKE), the Admin API endpoints are unreachable using the service-scoped DNS names, which is what ControlPlane (KIC) uses by default (as defined by --gateway-discovery-dns-strategy).

We have 2 issues tracking this, Kong/gateway-operator#179 and Kong/gateway-operator#140, and a workaround which uses coredns instead: Kong/gateway-operator#179 (comment).
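To confirm whether a given cluster is affected, one option is to test name resolution from a pod inside the cluster. The following is a minimal sketch, not part of KGO or KIC: `can_resolve` is a hypothetical helper, and the injectable `resolver` parameter exists only so the function can be exercised without a live cluster.

```python
import socket
from typing import Callable


def can_resolve(host: str, port: int = 8444,
                resolver: Callable[..., object] = socket.getaddrinfo) -> bool:
    """Return True if `host` resolves through the configured DNS, else False."""
    try:
        resolver(host, port)
        return True
    except socket.gaierror:
        # Raised when the lookup fails, e.g. "no such host".
        return False


# Example (run inside a pod; hostname copied from the log in this issue):
# can_resolve("10-52-2-37.dataplane-admin-kong-wgz7n-mpvzh.default.svc")
```

If this returns False for the service-scoped name while the plain service name resolves, the symptoms match the limitation described above.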

@joran-fonjallaz
Author

thank you @pmalek for getting back to me regarding this issue. Switching to CoreDNS is not an option for me. Do you have any idea whether this issue will get solved for the native kube-dns? If yes, is there a ticket I can track?

@pmalek
Member

pmalek commented May 22, 2024

I don't believe this is going to change for kube-dns anytime soon. I've created kubernetes/dns#633 to track this feature request.

In the meantime I'm going to close this issue as it's already tracked under Kong/gateway-operator#179 and Kong/gateway-operator#140.

@pmalek closed this as completed May 22, 2024

@joran-fonjallaz
Author

thanks again @pmalek! So do I understand correctly that Kong has no plan to support GKE for the gateway-operator any time soon?

@pmalek
Member

pmalek commented May 22, 2024

We do want to support GKE, but as of now the only option is to use coredns instead of kube-dns.

When Kong/gateway-operator#179 gets resolved we'll have a solution for GKE without the mentioned workaround.
