RTT value needs loki as the backend, so we install loki first.

## install a minio as backend

we need s3 storage; we will use minio as the backend, with local disk as storage.

deploy a minio for testing only, not for production. because the official minio image enables https by default, which brings a lot of trouble into app integration, we use an old version of minio.

```bash
oc create -n netobserv -f ${BASE_DIR}/data/install/s3-codellama.yaml

```
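
for reference, a minimal sketch of what such a plain-http minio deployment could look like. the image tag, credentials and service below are illustrative assumptions, not the exact content of s3-codellama.yaml:

```bash

# a sketch only, not the exact content of s3-codellama.yaml;
# the image tag, credentials and names below are assumptions
cat << EOF > ${BASE_DIR}/data/install/minio-sketch.yaml
---
kind: Deployment
apiVersion: apps/v1
metadata:
  name: minio
  namespace: netobserv
spec:
  replicas: 1
  selector:
    matchLabels:
      app: minio
  template:
    metadata:
      labels:
        app: minio
    spec:
      containers:
      - name: minio
        # assumption: an old release that still serves plain http on :9000
        image: quay.io/minio/minio:RELEASE.2021-06-17T00-10-46Z
        args: [ "server", "/data" ]
        env:
        - name: MINIO_ROOT_USER
          value: admin
        - name: MINIO_ROOT_PASSWORD
          value: redhatocp
        ports:
        - containerPort: 9000
        volumeMounts:
        - name: data
          mountPath: /data
      volumes:
      - name: data
        # local disk for testing; use a pvc for anything more serious
        emptyDir: {}
---
kind: Service
apiVersion: v1
metadata:
  name: minio
  namespace: netobserv
spec:
  selector:
    app: minio
  ports:
  - port: 9000
    targetPort: 9000
EOF

oc create -n netobserv -f ${BASE_DIR}/data/install/minio-sketch.yaml

```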

## install loki operator

we have s3 storage in place, and now we will install the loki operator.

![](imgs/2024-04-04-00-15-56.png)
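
the lokistack reads its s3 connection info from a secret. a minimal sketch, assuming a local minio endpoint and the credentials from the sketch above (the secret name loki-s3 is also an assumption):

```bash

# a sketch only: the secret keys follow the loki-operator s3 secret format;
# the endpoint, bucket and credentials are assumptions for a local minio
oc create secret generic loki-s3 -n netobserv \
  --from-literal=endpoint=http://minio.netobserv.svc:9000 \
  --from-literal=bucketnames=loki \
  --from-literal=access_key_id=admin \
  --from-literal=access_key_secret=redhatocp

```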

```bash

oc create --save-config -n netobserv -f ${BASE_DIR}/data/install/loki-netobserv.yaml

# run below, if reinstall
oc adm groups new cluster-admin

oc adm groups add-users cluster-admin admin

oc adm policy add-cluster-role-to-group cluster-admin cluster-admin
```
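
for reference, loki-netobserv.yaml is in essence a LokiStack CR pointing at that s3 secret. a minimal sketch, where the size, schema date and storage class are assumptions, not necessarily the exact content of loki-netobserv.yaml:

```bash

# sketch of a LokiStack CR for netobserv; size, schema effectiveDate
# and storageClassName below are assumptions
cat << EOF > ${BASE_DIR}/data/install/loki-netobserv-sketch.yaml
apiVersion: loki.grafana.com/v1
kind: LokiStack
metadata:
  name: loki
  namespace: netobserv
spec:
  size: 1x.extra-small
  storage:
    schemas:
    - version: v12
      effectiveDate: "2022-06-01"
    secret:
      name: loki-s3
      type: s3
  storageClassName: hostpath-csi
  tenants:
    mode: openshift-network
EOF

```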

# install net observ

we will install the net observ operator, the installation is simple, just follow the official document. If you use the eBPF agent, there seem to be bugs in the installation steps; it is better to restart the nodes to make the eBPF agent function well.
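
a simple way to restart the nodes one by one, assuming the workloads can tolerate the disruption:

```bash

# restart one worker at a time so the ebpf agent comes up clean
oc adm cordon worker-01-demo
oc adm drain worker-01-demo --ignore-daemonsets --delete-emptydir-data

# reboot through a debug pod
oc debug node/worker-01-demo -- chroot /host systemctl reboot

# after the node is back
oc adm uncordon worker-01-demo

```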

![](imgs/2024-04-04-00-24-27.png)

enable rtt tracing, following the official document.

![](imgs/2024-04-07-13-49-51.png)

or, if you want to change the yaml directly:

```yaml
apiVersion: flows.netobserv.io/v1beta2
kind: FlowCollector
metadata:
  name: cluster
spec:
  agent:
    ebpf:
      features:
        # per the official docs, the FlowRTT feature enables rtt measurement
        - FlowRTT
```
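
a quick check that the feature is set, assuming the default FlowCollector name, cluster:

```bash

# should print something like ["FlowRTT"]
oc get flowcollector cluster -o jsonpath='{.spec.agent.ebpf.features}'

```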
## RTT
we can see RTT based on each flow. At this point, we have not introduced network latency on the backend service, so the RTT is very low.
![](imgs/2024-04-07-16-11-33.png)
## deploy egress IP
next, we will deploy an egress IP on worker-02, make traffic from worker-01 go out through worker-02, and see the RTT value.
```bash

# label a node to host egress ip
oc get egressip -o json | jq -r '.items[] | [.status.items[].egressIP, .status.items[].node] | @tsv'

```
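
for reference, a minimal sketch of the egress ip setup; the namespace selector label is an assumption for this lab:

```bash

# sketch of the EgressIP CR; 172.21.6.22 is the egress ip we grep for
# in the ovn dumps later, the selector label env=egress-demo is an assumption
cat << EOF > ${BASE_DIR}/data/install/egressip-sketch.yaml
apiVersion: k8s.ovn.org/v1
kind: EgressIP
metadata:
  name: egressip-demo
spec:
  egressIPs:
  - 172.21.6.22
  namespaceSelector:
    matchLabels:
      env: egress-demo
EOF
oc create -f ${BASE_DIR}/data/install/egressip-sketch.yaml

# only nodes labeled egress-assignable can host the egress ip
oc label node worker-02-demo k8s.ovn.org/egress-assignable=""

```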

## make traffic and see result

then, we create a backend http service, and introduce a network latency of 1s.

```bash

# on the backend host, 172.21.6.8
sudo tc qdisc show dev ens192
# qdisc htb 1: root refcnt 3 r2q 10 default 0 direct_packets_stat 453 direct_qlen 1000
# qdisc netem 10: parent 1:1 limit 1000 delay 1s

```
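
for reference, a sketch of tc commands that would produce the qdisc layout shown above; the htb rate and the port filter are assumptions:

```bash

# htb root qdisc with a netem child adding 1s of delay, matching the
# "qdisc htb 1: root / qdisc netem 10: parent 1:1" output above;
# the class rate and the sport filter are assumptions
sudo tc qdisc add dev ens192 root handle 1: htb default 0
sudo tc class add dev ens192 parent 1: classid 1:1 htb rate 1000mbit
sudo tc qdisc add dev ens192 parent 1:1 handle 10: netem delay 1s

# steer traffic from the backend port 13000 into the delayed class
sudo tc filter add dev ens192 parent 1: protocol ip u32 match ip sport 13000 0xffff flowid 1:1

# to remove the delay afterwards
sudo tc qdisc del dev ens192 root

```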

and then, we create a testing pod, and curl from the pod to the backend service

```bash

# go back to helper
# create a dummy pod
cat << EOF > ${BASE_DIR}/data/install/demo1.yaml
while true; do curl http://172.21.6.8:13000 && sleep 1; done;

```
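
for reference, a minimal sketch of such a test pod; the image, namespace and labels are assumptions, not the exact content of demo1.yaml:

```bash

# sketch of a minimal curl pod; image, namespace and labels are assumptions
cat << EOF > ${BASE_DIR}/data/install/demo1-sketch.yaml
apiVersion: v1
kind: Pod
metadata:
  name: demo1
  namespace: demo
  labels:
    env: egress-demo
spec:
  # pinned to worker-01 so traffic leaves via the egress node worker-02
  nodeName: worker-01-demo
  containers:
  - name: demo1
    image: registry.access.redhat.com/ubi8/ubi
    command: [ "sleep", "infinity" ]
EOF
oc create -f ${BASE_DIR}/data/install/demo1-sketch.yaml

# curl in a loop so the flows keep showing up in the console
oc exec -it demo1 -n demo -- bash -c 'while true; do curl http://172.21.6.8:13000 && sleep 1; done;'

```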

and you get the result like this; you can see the RTT becomes 1s:

![](imgs/2024-04-08-12-00-50.png)

after setting the columns, we can see something interesting.

![](imgs/2024-04-08-12-03-32.png)

```bash

# search on worker-01
VAR_POD=`oc get pod -n openshift-ovn-kubernetes -o wide | grep worker-01-demo | awk '{print $1}'`

oc exec -it ${VAR_POD} -c ovn-controller -n openshift-ovn-kubernetes -- ovn-nbctl lr-nat-list GR_worker-01-demo
# snat 172.21.6.26 10.133.0.90
# snat 172.21.6.26 10.133.0.93

oc exec -it ${VAR_POD} -c ovn-controller -n openshift-ovn-kubernetes -- ovn-sbctl dump-flows | grep 172.21.6.22
# no result


# search on worker-02
VAR_POD=`oc get pod -n openshift-ovn-kubernetes -o wide | grep worker-02-demo | awk '{print $1}'`
oc exec -it ${VAR_POD} -c ovn-controller -n openshift-ovn-kubernetes -- ovn-nbctl list logical_router_port
# mac: "0a:58:64:40:00:01"
# networks: ["100.64.0.1/16"]


oc exec -it ${VAR_POD} -c ovn-controller -n openshift-ovn-kubernetes -- ovn-sbctl dump-flows | grep 172.21.6.22
# table=3 (lr_in_ip_input ), priority=90 , match=(arp.op == 1 && arp.tpa == 172.21.6.22), action=(eth.dst = eth.src; eth.src = xreg0[0..47]; arp.op = 2; /* ARP reply */ arp.tha = arp.sha; arp.sha = xreg0[0..47]; arp.tpa <-> arp.spa; outport = inport; flags.loopback = 1; output;)
# table=4 (lr_in_unsnat ), priority=90 , match=(ip && ip4.dst == 172.21.6.22), action=(ct_snat;)
# table=3 (lr_out_snat ), priority=33 , match=(ip && ip4.src == 10.133.0.43 && (!ct.trk || !ct.rpl)), action=(ct_snat(172.21.6.22);)
# table=27(ls_in_l2_lkup ), priority=80 , match=(flags[1] == 0 && arp.op == 1 && arp.tpa == 172.21.6.22), action=(clone {outport = "etor-GR_worker-02-demo"; output; }; outport = "_MC_flood_l2"; output;)
# table=27(ls_in_l2_lkup ), priority=80 , match=(flags[1] == 0 && arp.op == 1 && arp.tpa == 172.21.6.22), action=(outport = "jtor-GR_worker-02-demo"; output;)


```

Here is the ovn logical network topology:

![](imgs/2024-04-08-13-42-36.png)

based on our case, let's draw the network topology:

![](dia/4.14.netobserv.case.drawio.svg)

<!-- ![](imgs/2024-04-07-20-24-21.png) -->