5xx errors in load generator #1800

Open
agardnerIT opened this issue Nov 28, 2024 · 6 comments
Labels
bug Something isn't working

Comments

agardnerIT commented Nov 28, 2024

Bug Report

helm list -A
my-otel-demo    ......   deployed    opentelemetry-demo-0.33.4     1.12.0

Symptom

Lots and lots of 5xx errors.

Failed to load resource: the server responded with a status of 503 (Service Unavailable)
Failed to load resource: net::ERR_CONNECTION_REFUSED
Failed to load resource: net::ERR_CONNECTION_REFUSED
Failed to load resource: net::ERR_CONNECTION_REFUSED
Failed to load resource: net::ERR_CONNECTION_REFUSED
Failed to load resource: the server responded with a status of 503 (Service Unavailable)
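The two messages above describe different failure modes: a 503 is an HTTP response, so something answered the request, while `net::ERR_CONNECTION_REFUSED` means no TCP connection was established at all. A minimal sketch for tallying them from a saved console log, assuming the log is plain text with one message per line (`classify` and `tally` are hypothetical helper names, not part of the demo):

```python
import re
from collections import Counter
from typing import Optional

# Matches "status of 503 (Service Unavailable)" style console lines.
STATUS_RE = re.compile(r"status of (\d{3})")
# Matches browser network error codes such as net::ERR_CONNECTION_REFUSED.
NET_ERR_RE = re.compile(r"net::(ERR_[A-Z_]+)")


def classify(line: str) -> Optional[str]:
    """Return e.g. 'http_503' or 'ERR_CONNECTION_REFUSED', or None if unrecognised."""
    match = STATUS_RE.search(line)
    if match:
        return f"http_{match.group(1)}"
    match = NET_ERR_RE.search(line)
    if match:
        return match.group(1)
    return None


def tally(log_text: str) -> Counter:
    """Count recognised error kinds across all lines of the log."""
    kinds = (classify(line) for line in log_text.splitlines())
    return Counter(kind for kind in kinds if kind is not None)


# The six console lines quoted in this report.
SAMPLE = """\
Failed to load resource: the server responded with a status of 503 (Service Unavailable)
Failed to load resource: net::ERR_CONNECTION_REFUSED
Failed to load resource: net::ERR_CONNECTION_REFUSED
Failed to load resource: net::ERR_CONNECTION_REFUSED
Failed to load resource: net::ERR_CONNECTION_REFUSED
Failed to load resource: the server responded with a status of 503 (Service Unavailable)
"""

print(tally(SAMPLE))
```

Splitting the counts this way can help decide whether to look at the proxy's upstreams (503) or at whether the target port is listening at all (connection refused).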

What is the expected behavior?

No 5xx errors

What do you expect to see?

What is the actual behavior?

Please describe the actual behavior experienced.

Reproduce

Could you provide the minimum required steps to resolve the issue you're seeing?

We will close this issue if:

  • The steps you provided are complex.
  • If we can not reproduce the behavior you're reporting.

Additional Context

kubectl -n <REDACTED> describe pod/my-otel-demo-loadgenerator-867c949bd7-dcgtt
Name:             my-otel-demo-loadgenerator-867c949bd7-dcgtt
Namespace:        <REDACTED>
Priority:         0
Service Account:  my-otel-demo
Start Time:       Thu, 28 Nov 2024 14:16:51 +1000
Labels:           app.kubernetes.io/component=loadgenerator
                  app.kubernetes.io/instance=my-otel-demo
                  app.kubernetes.io/name=my-otel-demo-loadgenerator
                  opentelemetry.io/name=my-otel-demo-loadgenerator
                  pod-template-hash=867c949bd7
Annotations:      <none>
Status:           Running
IP:               <REDACTED>
IPs:
  IP:           <REDACTED>
Controlled By:  ReplicaSet/my-otel-demo-loadgenerator-867c949bd7
Containers:
  loadgenerator:
    Container ID:   containerd://344df4921a5b3b84577b3a85bed54f9c6db1ef1547c5f46a47c6798fbc24912b
    Image:          ghcr.io/open-telemetry/demo:1.12.0-loadgenerator
    Image ID:       ghcr.io/open-telemetry/demo@sha256:85c9935ff31b7ab575903fbd0b56a3161ec13e508966df25dc68fcfe7af5ec98
    Port:           8089/TCP
    Host Port:      0/TCP
    State:          Running
      Started:      Thu, 28 Nov 2024 14:16:52 +1000
    Ready:          True
    Restart Count:  0
    Limits:
      memory:  1500Mi
    Requests:
      memory:  1500Mi
    Environment:
      OTEL_SERVICE_NAME:                                   (v1:metadata.labels['app.kubernetes.io/component'])
      OTEL_COLLECTOR_NAME:                                my-otel-demo-otelcol
      OTEL_EXPORTER_OTLP_METRICS_TEMPORALITY_PREFERENCE:  cumulative
      LOCUST_WEB_PORT:                                    8089
      LOCUST_USERS:                                       10
      LOCUST_SPAWN_RATE:                                  1
      LOCUST_HOST:                                        http://my-otel-demo-frontendproxy.<REDACTED>.svc.cluster.local:8080
      LOCUST_HEADLESS:                                    false
      LOCUST_AUTOSTART:                                   true
      LOCUST_BROWSER_TRAFFIC_ENABLED:                     true
      PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION:             python
      FLAGD_HOST:                                         my-otel-demo-flagd
      FLAGD_PORT:                                         8013
      OTEL_EXPORTER_OTLP_ENDPOINT:                        http://$(OTEL_COLLECTOR_NAME):4317
      OTEL_RESOURCE_ATTRIBUTES:                           service.name=$(OTEL_SERVICE_NAME),service.namespace=opentelemetry-demo,service.version=1.12.0
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-qzcxc (ro)
Conditions:
  Type                        Status
  PodReadyToStartContainers   True
  Initialized                 True
  Ready                       True
  ContainersReady             True
  PodScheduled                True
Volumes:
  kube-api-access-qzcxc:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   Burstable
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type    Reason     Age    From               Message
  ----    ------     ----   ----               -------
  Normal  Scheduled  4m14s  default-scheduler  Successfully assigned <REDACTED>/my-otel-demo-loadgenerator-867c949bd7-dcgtt to <REDACTED>
  Normal  Pulled     4m14s  kubelet            Container image "ghcr.io/open-telemetry/demo:1.12.0-loadgenerator" already present on machine
  Normal  Created    4m14s  kubelet            Created container loadgenerator
  Normal  Started    4m14s  kubelet            Started container loadgenerator
kubectl -n <REDACTED> get svc
NAME                                 TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)                                                                     AGE
my-otel-demo-adservice               ClusterIP   <REDACTED>   <none>        8080/TCP                                                                    169m
my-otel-demo-cartservice             ClusterIP   <REDACTED>    <none>        8080/TCP                                                                    169m
my-otel-demo-checkoutservice         ClusterIP   <REDACTED>     <none>        8080/TCP                                                                    169m
my-otel-demo-currencyservice         ClusterIP   <REDACTED>    <none>        8080/TCP                                                                    169m
my-otel-demo-emailservice            ClusterIP   <REDACTED>    <none>        8080/TCP                                                                    169m
my-otel-demo-flagd                   ClusterIP   <REDACTED>     <none>        8013/TCP,4000/TCP                                                           169m
my-otel-demo-frontend                ClusterIP   <REDACTED>   <none>        8080/TCP                                                                    169m
my-otel-demo-frontendproxy           ClusterIP   <REDACTED>    <none>        8080/TCP                                                                    169m
my-otel-demo-imageprovider           ClusterIP   <REDACTED>    <none>        8081/TCP                                                                    169m
my-otel-demo-kafka                   ClusterIP   <REDACTED>     <none>        9092/TCP,9093/TCP                                                           169m
my-otel-demo-loadgenerator           ClusterIP   <REDACTED>     <none>        8089/TCP                                                                    169m
my-otel-demo-otelcol                 ClusterIP   <REDACTED>   <none>        6831/UDP,14250/TCP,14268/TCP,8888/TCP,4317/TCP,4318/TCP,9464/TCP,9411/TCP   169m
my-otel-demo-paymentservice          ClusterIP   <REDACTED>    <none>        8080/TCP                                                                    169m
my-otel-demo-productcatalogservice   ClusterIP   <REDACTED>   <none>        8080/TCP                                                                    169m
my-otel-demo-quoteservice            ClusterIP   <REDACTED>   <none>        8080/TCP                                                                    169m
my-otel-demo-recommendationservice   ClusterIP   <REDACTED>     <none>        8080/TCP                                                                    169m
my-otel-demo-shippingservice         ClusterIP   <REDACTED>    <none>        8080/TCP                                                                    169m
my-otel-demo-valkey                  ClusterIP  <REDACTED>      <none>        6379/TCP                                                                    169m

curl from on-cluster pod

It works:

$ kubectl -n <REDACTED> run mycurlpod --image=curlimages/curl -i --tty -- sh
~ $ curl http://my-otel-demo-frontendproxy.<REDACTED>.svc.cluster.local:8080
<!DOCTYPE html><html><head><meta charSet="utf-8"/><meta name.....
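
A single successful curl may not catch intermittent failures. A minimal sketch that repeats the request and tallies the outcomes, assuming it runs from a pod that has Python available (the curl image above does not) and that the URL is passed in by the caller. `fetch_status` and `probe` are hypothetical helper names:

```python
import urllib.error
import urllib.request
from collections import Counter
from typing import Callable


def fetch_status(url: str, timeout: float = 5.0) -> str:
    """Return the HTTP status code as a string, or a short error label
    when no HTTP response arrives (refused connection, timeout, DNS failure)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return str(resp.status)
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status such as 503.
        return str(exc.code)
    except OSError as exc:
        # URLError is a subclass of OSError; its .reason may wrap the real cause.
        reason = getattr(exc, "reason", exc)
        if isinstance(reason, BaseException):
            return type(reason).__name__
        return str(reason)


def probe(url: str, attempts: int,
          fetch: Callable[[str], str] = fetch_status) -> Counter:
    """Request the URL `attempts` times and count each distinct outcome."""
    return Counter(fetch(url) for _ in range(attempts))


# Example call (the URL is whatever LOCUST_HOST points at in your cluster):
# print(probe("http://<frontendproxy-service>:8080", attempts=50))
```

The `fetch` parameter exists only so the tally logic can be exercised without a network; in normal use the default is enough.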

agardnerIT added the bug label Nov 28, 2024

@julianocosta89
Member

@agardnerIT are all pods running?

@agardnerIT
Author

I've turned some off (like OpenSearch, Jaeger, Grafana and accounting service because it keeps crashing) but other than that, yes they are.

components:
  grafana:
    enabled: false
  opensearch:
    enabled: false
  jaeger:
    enabled: false
  prometheus:
    enabled: false
  accountingService:
    enabled: false
$ kubectl -n <REDACTED> get pods
NAME                                                        READY   STATUS      RESTARTS        AGE
my-otel-demo-adservice-6f4f57766f-nq8bj                     1/1     Running     0               6h27m
my-otel-demo-cartservice-6bfc654788-l64ft                   1/1     Running     0               6h27m
my-otel-demo-checkoutservice-5cdc66f5cc-kdckm               1/1     Running     0               6h27m
my-otel-demo-currencyservice-cfd644bbf-hcljk                1/1     Running     0               6h27m
my-otel-demo-emailservice-5955c8ddfd-mf9h2                  1/1     Running     0               6h27m
my-otel-demo-flagd-5b9f48f7b5-ft7rf                         2/2     Running     15              6h27m
my-otel-demo-frauddetectionservice-6895d4c998-5c5d4         1/1     Running     0               6h27m
my-otel-demo-frontend-65c644ffdf-ntvsq                      1/1     Running     0               3h50m
my-otel-demo-frontendproxy-fbb8588cf-c8s2m                  1/1     Running     0               3h50m
my-otel-demo-imageprovider-65dd7698c9-bkfk7                 1/1     Running     0               6h27m
my-otel-demo-kafka-6678c45b5c-6x588                         1/1     Running     1 (59m ago)     6h27m
my-otel-demo-loadgenerator-867c949bd7-b5bsn                 1/1     Running     0               3h24m
my-otel-demo-otelcol-7cb964855d-sd2p9                       1/1     Running     0               3h55m
my-otel-demo-paymentservice-69fb7df989-mnfwd                1/1     Running     0               6h27m
my-otel-demo-productcatalogservice-6f4b457bfd-82t4v         1/1     Running     0               3h52m
my-otel-demo-quoteservice-7d4f6f9666-2ppx2                  1/1     Running     0               6h27m
my-otel-demo-recommendationservice-9f4496497-72hs8          1/1     Running     0               6h27m
my-otel-demo-shippingservice-695d794b8d-4wrkc               1/1     Running     0               6h27m
my-otel-demo-valkey-6cf4dcccbf-nxbcv                        1/1     Running     0               6h27m
mycurlpod                                                   1/1     Running     1 (3h12m ago)   3h18m
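
The listing above shows a few pods with non-zero restart counts (flagd at 15), which may be worth checking alongside the 5xx errors. A minimal sketch for pulling those out of saved `kubectl get pods` output, assuming the default column layout shown above (NAME, READY, STATUS, RESTARTS, AGE). `pods_with_restarts` is a hypothetical helper name:

```python
def pods_with_restarts(table: str) -> dict:
    """Map pod name -> restart count for every pod that restarted at least once."""
    restarts = {}
    for line in table.splitlines():
        parts = line.split()
        # Skip blank lines, the header row, and anything too short to be a pod row.
        if len(parts) < 5 or parts[0] == "NAME":
            continue
        # RESTARTS is the fourth column; a trailing "(59m ago)" lands in later fields.
        if not parts[3].isdigit():
            continue
        count = int(parts[3])
        if count > 0:
            restarts[parts[0]] = count
    return restarts


# A few rows copied from the listing in this comment.
SAMPLE = """\
NAME                                                        READY   STATUS      RESTARTS        AGE
my-otel-demo-flagd-5b9f48f7b5-ft7rf                         2/2     Running     15              6h27m
my-otel-demo-frontend-65c644ffdf-ntvsq                      1/1     Running     0               3h50m
my-otel-demo-kafka-6678c45b5c-6x588                         1/1     Running     1 (59m ago)     6h27m
mycurlpod                                                   1/1     Running     1 (3h12m ago)   3h18m
"""

for name, count in pods_with_restarts(SAMPLE).items():
    print(f"{name}: {count} restart(s)")
```

For any pod this flags, `kubectl describe pod` and `kubectl logs --previous` are the usual next steps for finding out why it restarted.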

@julianocosta89
Member

@puckpuck any ideas here?
I'm starting to suspect PR #1785.

Have you redeployed your running demo in K8s after that was merged?

@agardnerIT
Author

> Have you redeployed your running demo in K8s after that was merged?

Have I?

I deployed this today.

@julianocosta89
Member

> Have you redeployed your running demo in K8s after that was merged? Have I?
>
> I deployed this today.

Sorry @agardnerIT, this question was for @puckpuck 😅 .
He has the demo running also using Helm.

Locally, we have one issue with accountingservice, but yours is new.

@agardnerIT
Author

agardnerIT commented Nov 29, 2024

Side note: It would be good to document which services can be disabled and what effects (if any) that has on the core demo / use cases.

For example, I've turned off my accounting service - but I have no idea whether that matters to the "core system" or not.

Perhaps a new column / table on this page like:

| service           | core service | technology |
|-------------------|--------------|------------|
| accountingservice |              | .NET       |
| grafana           | 🔴           | ...        |

Note: Turning off core services may affect the demo system. Non-core services are "safe" to disable and cause minimal disruption to the core use cases for the demo.
