Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add label values into traffic in agent side and retrieve the values from traffic in transit side #478

Merged
merged 3 commits into from
May 12, 2021

Conversation

Hong-Chang
Copy link
Collaborator

What does this PR do?

  1. Add necessary geneve option structs for label values
  2. Add label values intro traffic in agent side
  3. Retrieve label values from traffic in transit side

How was this tested?
I did manual e2e test and label values can be retrieved correctly in transit side from packet.

Are there any user facing / API changes?
No.

@vinaykul
Copy link
Member

vinaykul commented May 7, 2021

Can you see the labels on the wire? (e.g. tcpdump capture on the egress NIC of the node)
[Hong] The data is on the wire because it's correctly retrieved in transit side. I checked tcpdump data but it's difficult to recognize because there are many other data in the packet.

Copy link
Member

@vinaykul vinaykul left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should handle the cases where there are no labels, and cases where just one of the labels (pod / ns) are present as well.

[Hong] label value = 0 covers the cases.

@vinaykul
Copy link
Member

vinaykul commented May 7, 2021

Please add unit tests.

[Hong] There is no unit test for this part (src/xdp/*). And this part was written in a way of non unit testable. Let's consider whether we should put some efforts to enforce unit test for this part.

@vinaykul
Copy link
Member

vinaykul commented May 7, 2021

Please add unit tests.

[Hong] There is no unit test for this part (src/xdp/*). And this part was written in a way of non unit testable. Let's consider whether we should put some efforts to enforce unit test for this part.

What do we have under mizar/src/tests and mizar/tests and mizar/testse2e ?
[Hong] I don't see mizar/src/tests. For mizar/tests, it's for python code. I don't think mizar/testse2e applies for this pr.

@vinaykul
Copy link
Member

vinaykul commented May 7, 2021

Please add unit tests.
[Hong] There is no unit test for this part (src/xdp/*). And this part was written in a way of non unit testable. Let's consider whether we should put some efforts to enforce unit test for this part.

What do we have under mizar/src/tests and mizar/tests and mizar/testse2e ?
[Hong] I don't see mizar/src/tests. For mizar/tests, it's for python code. I don't think mizar/testse2e applies for this pr.

In the description, can you please add tcpdump trace of packets captured on the wire for each of the 4 cases for simple ping test? Scenarios: No labels, both ns and pod labels, only pod label, only ns label.

You can create two pods so that they are scheduled on different worker nodes, and ping one pod from the other using this example yaml:

apiVersion: v1
kind: Pod
metadata:
  name: netpod1
  labels:
    podkey: netpodkey1
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: kubernetes.io/hostname
            operator: In
            values:
            - kind-worker
  restartPolicy: OnFailure
  terminationGracePeriodSeconds: 2
  containers:
  - name: netctr
    image: yuvalif/fedora-tcpdump
    command: ["tail", "-f", "/dev/null"]
---
apiVersion: v1
kind: Pod
metadata:
  name: netpod2
spec:
  affinity:
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - key: podkey
            operator: In
            values:
            - netpodkey1
        topologyKey: kubernetes.io/hostname
  restartPolicy: OnFailure
  terminationGracePeriodSeconds: 2
  containers:
  - name: netctr
    image: yuvalif/fedora-tcpdump
    command: ["tail", "-f", "/dev/null"]

Then look for ICMP packets doing tcpdump on the veth of the kind-node with the ping source pod. e.g

ubuntu on ip-192-168-1-59 in ~ 
❯ docker exec -ti $(docker ps | grep kind-worker$ | awk '{print $1}') ip a s | grep ": eth0"                   
1698: eth0@if1699: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 xdpgeneric/id:1877 qdisc noqueue state UP group default 

ubuntu on ip-192-168-1-59 in ~ 
❯ ip a s | grep "1699:"                                                                                                           
1699: veth53c269b@if1698: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc noqueue master br-fea389efa3d5 state UP group default 

ubuntu on ip-192-168-1-59 in ~ 
❯ kubectl get no -owide
NAME                 STATUS   ROLES    AGE   VERSION   INTERNAL-IP   EXTERNAL-IP   OS-IMAGE           KERNEL-VERSION   CONTAINER-RUNTIME
kind-control-plane   Ready    master   13m   v1.18.2   172.18.0.2    <none>        Ubuntu 20.04 LTS   5.6.0-rc2        containerd://1.3.3-14-g449e9269
kind-worker          Ready    <none>   13m   v1.18.2   172.18.0.3    <none>        Ubuntu 20.04 LTS   5.6.0-rc2        containerd://1.3.3-14-g449e9269
kind-worker2         Ready    <none>   13m   v1.18.2   172.18.0.4    <none>        Ubuntu 20.04 LTS   5.6.0-rc2        containerd://1.3.3-14-g449e9269

ubuntu on ip-192-168-1-59 in ~ 
> sudo tcpdump -nvXi veth53c269b ip dst 172.18.0.4

@vinaykul
Copy link
Member

vinaykul commented May 7, 2021

Please add unit tests.
[Hong] There is no unit test for this part (src/xdp/*). And this part was written in a way of non unit testable. Let's consider whether we should put some efforts to enforce unit test for this part.

What do we have under mizar/src/tests and mizar/tests and mizar/testse2e ?
[Hong] I don't see mizar/src/tests. For mizar/tests, it's for python code. I don't think mizar/testse2e applies for this pr.

Also, if you want to respond to my comment, please reply below or quote-reply. If you edit my comment, I don't get notification.

I don't think we should even have that permission to edit someone else's comments. Let me investigate that ..

@vinaykul vinaykul requested a review from w-yue May 7, 2021 23:59
Copy link
Contributor

@phudtran phudtran left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks ok to me

}

pkt->namespace_label_value_opt = (void *)pkt->pod_label_value_opt + sizeof(*pkt->pod_label_value_opt);
if (pkt->namespace_label_value_opt + 1 > pkt->data_end) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also check the option class like above

Copy link
Collaborator

@w-yue w-yue May 10, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also I see option type is not checked or used anywhere in transit xdp. Should we check here as well?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added check for both class and type.

@@ -729,6 +729,22 @@ static __inline int trn_process_geneve(struct transit_packet *pkt)
return XDP_ABORTED;
}

pkt->pod_label_value_opt = (void *)pkt->scaled_ep_opt + sizeof(*pkt->scaled_ep_opt);
if (pkt->pod_label_value_opt + 1 > pkt->data_end) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also check the option class like above

Copy link
Collaborator

@w-yue w-yue May 10, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above for option type comment.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added check for both class and type.

@Hong-Chang
Copy link
Collaborator Author

Please add unit tests.
[Hong] There is no unit test for this part (src/xdp/*). And this part was written in a way of non unit testable. Let's consider whether we should put some efforts to enforce unit test for this part.

What do we have under mizar/src/tests and mizar/tests and mizar/testse2e ?
[Hong] I don't see mizar/src/tests. For mizar/tests, it's for python code. I don't think mizar/testse2e applies for this pr.

In the description, can you please add tcpdump trace of packets captured on the wire for each of the 4 cases for simple ping test? Scenarios: No labels, both ns and pod labels, only pod label, only ns label.

You can create two pods so that they are scheduled on different worker nodes, and ping one pod from the other using this example yaml:

apiVersion: v1
kind: Pod
metadata:
  name: netpod1
  labels:
    podkey: netpodkey1
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: kubernetes.io/hostname
            operator: In
            values:
            - kind-worker
  restartPolicy: OnFailure
  terminationGracePeriodSeconds: 2
  containers:
  - name: netctr
    image: yuvalif/fedora-tcpdump
    command: ["tail", "-f", "/dev/null"]
---
apiVersion: v1
kind: Pod
metadata:
  name: netpod2
spec:
  affinity:
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - key: podkey
            operator: In
            values:
            - netpodkey1
        topologyKey: kubernetes.io/hostname
  restartPolicy: OnFailure
  terminationGracePeriodSeconds: 2
  containers:
  - name: netctr
    image: yuvalif/fedora-tcpdump
    command: ["tail", "-f", "/dev/null"]

Then look for ICMP packets doing tcpdump on the veth of the kind-node with the ping source pod. e.g

ubuntu on ip-192-168-1-59 in ~ 
❯ docker exec -ti $(docker ps | grep kind-worker$ | awk '{print $1}') ip a s | grep ": eth0"                   
1698: eth0@if1699: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 xdpgeneric/id:1877 qdisc noqueue state UP group default 

ubuntu on ip-192-168-1-59 in ~ 
❯ ip a s | grep "1699:"                                                                                                           
1699: veth53c269b@if1698: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc noqueue master br-fea389efa3d5 state UP group default 

ubuntu on ip-192-168-1-59 in ~ 
❯ kubectl get no -owide
NAME                 STATUS   ROLES    AGE   VERSION   INTERNAL-IP   EXTERNAL-IP   OS-IMAGE           KERNEL-VERSION   CONTAINER-RUNTIME
kind-control-plane   Ready    master   13m   v1.18.2   172.18.0.2    <none>        Ubuntu 20.04 LTS   5.6.0-rc2        containerd://1.3.3-14-g449e9269
kind-worker          Ready    <none>   13m   v1.18.2   172.18.0.3    <none>        Ubuntu 20.04 LTS   5.6.0-rc2        containerd://1.3.3-14-g449e9269
kind-worker2         Ready    <none>   13m   v1.18.2   172.18.0.4    <none>        Ubuntu 20.04 LTS   5.6.0-rc2        containerd://1.3.3-14-g449e9269

ubuntu on ip-192-168-1-59 in ~ 
> sudo tcpdump -nvXi veth53c269b ip dst 172.18.0.4

I added a new issue and let's work on this later.
#480 Optimize Geneve option to be dynamic spaced

Copy link
Collaborator Author

@Hong-Chang Hong-Chang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed comments

}

pkt->namespace_label_value_opt = (void *)pkt->pod_label_value_opt + sizeof(*pkt->pod_label_value_opt);
if (pkt->namespace_label_value_opt + 1 > pkt->data_end) {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added check for both class and type.

@@ -729,6 +729,22 @@ static __inline int trn_process_geneve(struct transit_packet *pkt)
return XDP_ABORTED;
}

pkt->pod_label_value_opt = (void *)pkt->scaled_ep_opt + sizeof(*pkt->scaled_ep_opt);
if (pkt->pod_label_value_opt + 1 > pkt->data_end) {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added check for both class and type.

@vinaykul
Copy link
Member

/approve

@vinaykul
Copy link
Member

@phudtran Does it look good to you now?

@vinaykul vinaykul merged commit 6c290df into CentaurusInfra:dev-next May 12, 2021
@Hong-Chang Hong-Chang deleted the hc-p-2 branch May 14, 2021 22:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants