Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Monitoring deployment #481

Open
wants to merge 13 commits into
base: main
Choose a base branch
from
Open

Monitoring deployment #481

wants to merge 13 commits into from

Conversation

d7oc
Copy link
Contributor

@d7oc d7oc commented Feb 8, 2024

Description

This PR adds a monitoring example deployment to the deployment section. It includes the whole Grafana stack. So a Grafana frontend. Backends for Loki, Mimir and Tempo to collect logs, metrics and traces. And Grafana agents to collect the logs, metrics and traces and send them over to Loki, Mimir and Tempo.

Motivation and Context

As monitoring is a quite complex topic it makes sense to have some starting point.

How Has This Been Tested?

Tested with k3d locally.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Technical debt
  • Tests only (no source changes)

Checklist:

@d7oc d7oc requested a review from wkloucek February 9, 2024 18:27
d7oc added 6 commits February 12, 2024 17:26
…dentials

- added s3 storage to tempo as default doesn't configure this for minio even if minio is enabled
- added affinity documentation
- changed endpoint for tempo as new version listens on /v1/traces instead of /otlp/v1/traces
- removed grpc for tracing
- updated agend for rules section for k8s attributes

### Logging in

You can get the admin password with the following command:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we probably should add that for grafana, too:

kubectl -n grafana get secrets/grafana --template='{{ index .data "admin-password" | base64decode | printf "%s\n" }}'

Comment on lines +66 to +69
{{- with .Values.imagePullSecrets }}
imagePullSecrets:
{{- toYaml . | nindent 8 }}
{{- end }}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

doesn't fail but is not defined in the values.yaml

agent: grafana-agent-logs
clients:
# default value, is overwritten in LogsInstance
- url: http://{{ .Values.remoteLokiHost }}/loki/api/v1/push
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm trying it in minkube and having trouble on the grafana-agent-logs-logs pod:

grafana-agent-logs-logs-wjhr4 grafana-agent ts=2024-02-16T15:04:40.879266992Z caller=client.go:430 level=error component=logs logs_config=grafana-agent/primary component=client host=loki.kube.owncloud.test msg="final error sending batch" status=308 tenant= error="server returned HTTP status 308 Permanent Redirect (308): <html>"

Probably this might be a HTTP -> HTTPS redirect?

Curl at least supports that:

curl http://loki.kube.owncloud.test -vv 
* Host loki.kube.owncloud.test:80 was resolved.
* IPv6: (none)
* IPv4: 192.168.49.2
*   Trying 192.168.49.2:80...
* Connected to loki.kube.owncloud.test (192.168.49.2) port 80
> GET / HTTP/1.1
> Host: loki.kube.owncloud.test
> User-Agent: curl/8.5.0
> Accept: */*
> 
< HTTP/1.1 308 Permanent Redirect
< Date: Fri, 16 Feb 2024 15:17:15 GMT
< Content-Type: text/html
< Content-Length: 164
< Connection: keep-alive
< Location: https://loki.kube.owncloud.test
< 
<html>
<head><title>308 Permanent Redirect</title></head>
<body>
<center><h1>308 Permanent Redirect</h1></center>
<hr><center>nginx</center>
</body>
</html>
* Connection #0 to host loki.kube.owncloud.test left intact
*

Turns out I need to explicitly turn off the HTTP -> HTTPS redirect by adding ssl-redirect: "false" on kubectl -n ingress-nginx edit cm/ingress-nginx-controller

Comment on lines +35 to +41
- datasourceUid: tempo
matcherRegex: '"traceid":"([^"]*)"'
name: TraceID
# url will be interpreted as query for the datasource
url: '$${__value.raw}'
# optional for URL Label to set a custom display label for the link.
urlDisplayLabel: 'View Trace'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldn't I have a "View Trace" button somewhere here?

image

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should. We need to check next week why this is missing.

@nazliyanik as you also ran that: Was the button available for you?

@wkloucek wkloucek changed the base branch from master to main March 14, 2024 14:02
@wkloucek
Copy link
Contributor

wkloucek commented Jul 9, 2024

@d7oc maybe we could replace your efforts by the LGTM-umbrella Chart, see https://github.com/grafana/helm-charts/blob/lgtm-distributed-2.1.0/charts/lgtm-distributed/values.yaml

@d7oc
Copy link
Contributor Author

d7oc commented Jul 9, 2024

In this case we would add another "component". I don't know the shape this charts is in. And as we use the same manually setup stack internally I would like to benefit from a playground in here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants