
Application pods are not communicating using service name. #506

Open
click2cloud-rajat opened this issue Jun 11, 2021 · 5 comments

@click2cloud-rajat

Repro steps:

  1. Ubuntu 18.04.5 with the 5.6.0-rc2 kernel
  2. Start a K8s 1.21.1 master via kubeadm (steps followed)
  3. Install Mizar v0.8 using https://github.com/CentaurusInfra/mizar/blob/dev-next/etc/deploy/deploy.mizar.yaml
  4. Deploy two applications, where Application A (Kafka) depends on Application B (Zookeeper); the service name of Application B is provided in the configuration of Application A.
  5. Output: Application B comes up in the Running state while Application A goes into Error or CrashLoopBackOff.
root@node1:~# kubectl get po
NAME                              READY   STATUS             RESTARTS   AGE
kafka-7b88788f47-89qbc            0/1     CrashLoopBackOff   24         102m
mizar-daemon-dxdgk                1/1     Running            0          21h
mizar-operator-79d4846f95-6lvvx   1/1     Running            0          21h
zookeeper-7856868df8-5czvs        1/1     Running            0          103m

root@node1:~# kubectl get svc
NAME         TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)    AGE
kafka        ClusterIP   10.97.10.23    <none>        9092/TCP   105m
kubernetes   ClusterIP   10.96.0.1      <none>        443/TCP    21h
zookeeper    ClusterIP   10.96.148.83   <none>        2181/TCP   105m

Events of Kafka:

Events:
  Type     Reason     Age                From               Message
  ----     ------     ----               ----               -------
  Normal   Scheduled  55s                default-scheduler  Successfully assigned default/kafka-7b88788f47-6dntk to node1
  Normal   Logging    54s                kopf               Creation event is processed: 1 succeeded; 0 failed.
  Normal   Logging    54s                kopf               Handler 'builtins_on_pod' succeeded.
  Normal   Pulled     15s (x3 over 51s)  kubelet            Container image "wurstmeister/kafka:2.11-2.0.1" already present on machine
  Normal   Created    15s (x3 over 51s)  kubelet            Created container kafka
  Normal   Started    14s (x3 over 51s)  kubelet            Started container kafka
  Warning  BackOff    5s (x2 over 28s)   kubelet            Back-off restarting failed container
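
The events only show the back-off; the underlying failure appears in the container log. For example, using the pod name from the `kubectl get po` output above:

# Inspect the previous (crashed) run of the kafka container for the root-cause error
kubectl logs kafka-7b88788f47-89qbc --previous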

  • If we change Kafka's configuration YAML to use the cluster IP of the Zookeeper service instead of the service name, the Kafka pod comes up in the Running state.
  • This is due to the 'Not Ready' state of the CoreDNS service, because of which it is not resolving the Zookeeper service name.
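
A quick way to confirm this, assuming a busybox image can be pulled, is to check the CoreDNS pods and try resolving the Service name from a throwaway pod:

# CoreDNS pods should be 1/1 Ready for Service names to resolve
kubectl -n kube-system get pods -l k8s-app=kube-dns

# Test name resolution of the zookeeper Service from inside the cluster
kubectl run dns-test --rm -it --restart=Never --image=busybox:1.28 -- \
  nslookup zookeeper.default.svc.cluster.local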
@vinaykul
Member

vinaykul commented Oct 5, 2021

C2C, we have fixed this issue in release v0.9. Can you please verify?
Thanks.
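
To check which Mizar release is actually deployed (assuming mizar-daemon is a DaemonSet and mizar-operator is a Deployment in the default namespace, as the pod names above suggest), the image tags can be listed with:

# The IMAGES column should reflect the v0.9 release
kubectl get ds/mizar-daemon deploy/mizar-operator -o wide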

@click2cloud-rajat
Author

Sure, we will verify on our side and get back to you if any issue arises.

@Hong-Chang
Collaborator

The following is the script for step 4 (it applies a YAML file to deploy zookeeper and kafka):

cat > /tmp/tmp.yaml

apiVersion: apps/v1
kind: Deployment
metadata:
  namespace: default
  labels:
    io.soda.service: zookeeper
    app: soda-zookeeper
  name: zookeeper
spec:
  replicas: 1
  selector:
    matchLabels:
      io.soda.service: zookeeper
  strategy: {}
  template:
    metadata:
      labels:
        io.soda.service: zookeeper
        app: soda-zookeeper
    spec:
      containers:
      - image: wurstmeister/zookeeper
        name: zookeeper
        ports:
        - containerPort: 2181
        resources: {}
      restartPolicy: Always
---
apiVersion: v1
kind: Service
metadata:
  namespace: default
  creationTimestamp: null
  labels:
    io.soda.service: zookeeper
  name: zookeeper
spec:
  ports:
  - name: "2181"
    port: 2181
    targetPort: 2181
  selector:
    io.soda.service: zookeeper
status:
  loadBalancer: {}
---
apiVersion: apps/v1
kind: Deployment
metadata:
  namespace: default
  labels:
    io.soda.service: kafka
  name: kafka
spec:
  replicas: 1
  selector:
    matchLabels:
      io.soda.service: kafka
  strategy: {}
  template:
    metadata:
      labels:
        io.soda.service: kafka
    spec:
      containers:
      - env:
        - name: KAFKA_PORT
          value: "9092"
        - name: KAFKA_ADVERTISED_LISTENERS
          value: PLAINTEXT://kafka:9092
        - name: KAFKA_LISTENERS
          value: PLAINTEXT://:9092
        - name: KAFKA_ZOOKEEPER_CONNECT
          value: zookeeper:2181
        image: wurstmeister/kafka:2.11-2.0.1
        name: kafka
        ports:
        - containerPort: 9092
        resources: {}
      restartPolicy: Always
---
apiVersion: v1
kind: Service
metadata:
  namespace: default
  labels:
    io.soda.service: kafka
  name: kafka
spec:
  ports:
  - name: "9092"
    port: 9092
    targetPort: 9092
  selector:
    io.soda.service: kafka
status:
  loadBalancer: {}

(press Ctrl+D to finish the input)

kubectl apply -f /tmp/tmp.yaml
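
After applying, both Services should get endpoints once their pods are Ready; this can be checked with, for example:

# Each Service should list a pod IP under ENDPOINTS
kubectl get endpoints zookeeper kafka
kubectl get pods -l 'io.soda.service in (zookeeper, kafka)'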

@Hong-Chang
Collaborator

I reproduced the issue and have the following findings:

  1. The kafka pod depends on the zookeeper pod, and also on the coredns pods. When the coredns pods are running as 1/1, the kafka pod runs well.
  2. Currently, because of the coredns issue #549, this issue always reproduces.
  3. The coredns issue #549 is caused by commit c7d6ab7. If I sync the code to the state right before that commit, this issue is instead affected by issue #545 (Instable and restarting pods coredns-* and local-path-provisioner-* after deployment), which is a timing issue: if the coredns pods are deleted, they are restarted and run well, and kafka then immediately runs well too (see the commands below).
  4. After commit c7d6ab7, however, the coredns pods do not run well, and deleting them does not bring them back to a healthy state. Hence this issue always reproduces.

We need to address issue #549 first, then issue #545; after that this issue will be resolved.
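
For reference, the workaround described in finding 3 (on the code state before commit c7d6ab7) amounts to deleting the coredns pods and letting their Deployment recreate them:

# Delete the coredns pods; the kube-system Deployment recreates them
kubectl -n kube-system delete pods -l k8s-app=kube-dns

# Watch until they are 1/1 Running again; kafka should then recover
kubectl -n kube-system get pods -l k8s-app=kube-dns -w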

@click2cloud-rajat
Author

Thanks for the update. Our team has also tested the scenarios below:

  • K8s (v1.21.1) + Mizar -- service communication not working
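
To separate a DNS failure from a data-path failure, the zookeeper Service can be probed both by name and by its ClusterIP (10.96.148.83 in the output above); assuming a busybox image is available and ZooKeeper's four-letter-word commands are enabled (the default in older releases), each probe should print "imok" if the Service is reachable:

# Probing by name exercises DNS resolution; probing by ClusterIP exercises only the data path
kubectl run conn-test --rm -it --restart=Never --image=busybox:1.28 -- \
  sh -c 'echo ruok | nc zookeeper 2181; echo ruok | nc 10.96.148.83 2181'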
