Use structured logging everywhere #113

Open
astefanutti opened this issue May 30, 2023 · 8 comments

@astefanutti
Contributor

astefanutti commented May 30, 2023

The operator pod logs contain unstructured statements like:

I0530 13:59:55.649049 1 request.go:690] Waited for 1.040481717s due to client-side throttling, not priority and fairness, request: GET:https://172.30.0.1:443/apis/cloud.network.openshift.io/v1?timeout=32s

Or:

I0530 13:59:57.605555 1 leaderelection.go:248] attempting to acquire leader lease openshift-operators/5a3ca514.codeflare.dev...
E0530 14:00:57.607293 1 leaderelection.go:330] error retrieving resource lock openshift-operators/5a3ca514.codeflare.dev: the server was unable to return a response in the time allotted, but may still be processing the request (get leases.coordination.k8s.io 5a3ca514.codeflare.dev)
E0530 14:02:01.484918 1 leaderelection.go:330] error retrieving resource lock openshift-operators/5a3ca514.codeflare.dev: the server was unable to return a response in the time allotted, but may still be processing the request (get leases.coordination.k8s.io 5a3ca514.codeflare.dev)
E0530 14:04:16.991007 1 leaderelection.go:330] error retrieving resource lock openshift-operators/5a3ca514.codeflare.dev: Get "https://172.30.0.1:443/apis/coordination.k8s.io/v1/namespaces/openshift-operators/leases/5a3ca514.codeflare.dev": dial tcp 172.30.0.1:443: connect: connection refused - error from a previous attempt: unexpected EOF
E0530 14:04:19.050293 1 leaderelection.go:330] error retrieving resource lock openshift-operators/5a3ca514.codeflare.dev: Get "https://172.30.0.1:443/apis/coordination.k8s.io/v1/namespaces/openshift-operators/leases/5a3ca514.codeflare.dev": dial tcp 172.30.0.1:443: connect: connection refused
I0530 14:04:49.264891 1 leaderelection.go:258] successfully acquired lease openshift-operators/5a3ca514.codeflare.dev

These come from components that rely on klog, which isn't configured to use the correct logging backend.

Also, the klog CLI flags should be bound, so it's possible, for example, to turn on JSON output for easier parsing by downstream log management tools.

Some information about Kubernetes structured logging can be found at https://kubernetes.io/blog/2020/09/04/kubernetes-1-19-introducing-structured-logs/.
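
For illustration, the wiring could look roughly like this in a standard controller-runtime `main.go` (a sketch only, assuming the usual zap backend; this is not the operator's actual code):

```go
package main

import (
	"flag"

	"k8s.io/klog/v2"
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/log/zap"
)

func main() {
	opts := zap.Options{}
	opts.BindFlags(flag.CommandLine) // registers --zap-encoder=json, --zap-log-level, ...
	klog.InitFlags(flag.CommandLine) // registers -v, --vmodule, ... from klog itself
	flag.Parse()

	logger := zap.New(zap.UseFlagOptions(&opts))
	ctrl.SetLogger(logger) // backend for controller-runtime and its controllers
	klog.SetLogger(logger) // redirect client-go, leader election, and other klog callers

	// ... create and start the manager as usual
}
```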

@KPostOffice
Collaborator

I think that this might just require adding a line like this to here.

KPostOffice moved this from Todo to In Progress in Project CodeFlare Sprint Board on Oct 31, 2023
@KPostOffice
Collaborator

@astefanutti Is this actually two separate issues? One issue being the configuration of klog (referenced above), and the other being using klog.InfoS instead of klog.Infof when logging in MCAD and InstaScale?

@KPostOffice
Collaborator

> @astefanutti Is this actually two separate issues? One issue being the configuration of klog (referenced above), and the other being using klog.InfoS instead of klog.Infof when logging in MCAD and InstaScale?

@anishasthana mentioned holding off on this aspect until the MCAD refactor.

@astefanutti
Contributor Author

@KPostOffice right, I think that should break down into:

  • Set the klog logger to that of controller-runtime
  • Migrate InstaScale to controller-runtime logging (instead of using klog directly), and use a named logger in the InstaScale controller (so each controller's logs can be filtered), as sketched below
  • Ditto for MCAD, which may be tactically delayed until the MCAD refactor is completed, as per @anishasthana's suggestion.
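
A rough sketch of what that migration could look like in an InstaScale-style reconciler (the type name, message, and keys are illustrative, not the actual InstaScale code):

```go
package controllers

import (
	"context"

	ctrl "sigs.k8s.io/controller-runtime"
)

// AppWrapperReconciler is a stand-in for the InstaScale controller.
type AppWrapperReconciler struct{}

func (r *AppWrapperReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
	// before: klog.Infof("reconciling AppWrapper %s/%s", req.Namespace, req.Name)
	log := ctrl.LoggerFrom(ctx)
	log.Info("Reconciling AppWrapper", "namespace", req.Namespace, "name", req.Name)
	return ctrl.Result{}, nil
}
```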

@KPostOffice
Collaborator

@astefanutti I'm confused about "use a named logger...so each controller logs can be filtered"

@astefanutti
Contributor Author

> @astefanutti I'm confused about "use a named logger...so each controller logs can be filtered"

@KPostOffice it's a fancy way of saying we should use go-logr's WithName API to create a logger for each controller, so it's easy to filter the logs from one or the other.
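
For example, roughly (a sketch only; the logger names, messages, and key/value pairs are illustrative, not from the actual code):

```go
package main

import (
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/log/zap"
)

func main() {
	ctrl.SetLogger(zap.New())

	instascaleLog := ctrl.Log.WithName("controllers").WithName("instascale")
	mcadLog := ctrl.Log.WithName("controllers").WithName("mcad")

	// The logger name is emitted with each record (the "logger" field with the
	// JSON encoder), so one controller's output can be filtered with grep or jq.
	instascaleLog.Info("scaling up", "machineSet", "gpu-workers", "replicas", 3)
	mcadLog.Info("dispatching AppWrapper", "namespace", "team-a", "name", "job-42")
}
```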

@KPostOffice
Collaborator

> @astefanutti I'm confused about "use a named logger...so each controller logs can be filtered"

> @KPostOffice it's a fancy way of saying we should use go-logr's WithName API to create a logger for each controller, so it's easy to filter the logs from one or the other.

like this?

@astefanutti
Contributor Author

> @KPostOffice it's a fancy way of saying we should use go-logr's WithName API to create a logger for each controller, so it's easy to filter the logs from one or the other.

> like this?

@KPostOffice yes.
