From 56ea32a7589bcf27f79831502637e4799ef1ee48 Mon Sep 17 00:00:00 2001
From: Junhua Zhai
Date: Tue, 23 Jul 2024 16:56:03 +0000
Subject: [PATCH 1/5] Initial bmv2 dpapp HLD

---
 .../dataplane/dash-bmv2-data-plane-app.md     | 341 ++++++++++++++++++
 .../dash-bmv2-data-plane-app-arch.drawio.svg  |   4 +
 ...sh-bmv2-data-plane-app-overview.drawio.svg |   4 +
 3 files changed, 349 insertions(+)
 create mode 100644 documentation/dataplane/dash-bmv2-data-plane-app.md
 create mode 100644 documentation/dataplane/images/dash-bmv2-data-plane-app-arch.drawio.svg
 create mode 100644 documentation/dataplane/images/dash-bmv2-data-plane-app-overview.drawio.svg

diff --git a/documentation/dataplane/dash-bmv2-data-plane-app.md b/documentation/dataplane/dash-bmv2-data-plane-app.md
new file mode 100644
index 000000000..cf7a3291a
--- /dev/null
+++ b/documentation/dataplane/dash-bmv2-data-plane-app.md
@@ -0,0 +1,341 @@
+# DASH BMv2 Data Plane App HLD
+
+| Rev | Date | Author | Change Description |
+| --- | ---- | ------ | ------------------ |
+| 0.1 | 06/29/2024 | Junhua Zhai | Initial version |
+
+1. [Terminology](#1-terminology)
+2. [Background](#2-background)
+3. [Project scenario](#3-project-scenario)
+4. [Resource modeling, requirement, and SLA](#4-resource-modeling-requirement-and-sla)
+5. [System Architecture overview](#5-system-architecture-overview)
+6. [Detailed design](#6-detailed-design)
+    - [6.1. DASH metadata](#61-dash-metadata)
+    - [6.2. Basic Flow](#62-basic-flow)
+    - [6.3. Flow resimulation](#63-flow-resimulation)
+    - [6.4. HA flow](#64-ha-flow)
+    - [6.5. HA flow resimulation](#65-ha-flow-resimulation)
+    - [6.6. SAI](#66-sai)
+7. [Test Plan](#7-test-plan)
+8. [Appendix](#8-appendix)
+
+
+## 1. Terminology
+
+| Term | Explanation |
+| --- | --- |
+| VPP | Vector Packet Processing |
+| DPAPP | Data plane Application |
+
+
+## 2. Background
+
+The data path logical architecture is shown in the figure below:
+
+![dash_dpapp_overview](images/dash-bmv2-data-plane-app-overview.drawio.svg)
+
+The data plane app is another packet processing engine running on CPUs. It adds extra capabilities to a DASH-capable ASIC:
+- It serves the slow (exception) path of packet processing to implement complicated pipeline logic that is hard to implement in an ASIC, for example adding, updating, or removing table entries inline.
+- It optionally implements some DASH-SAI APIs that are not suitable or easy to implement on top of the ASIC SDK.
+
+### 2.1. Scope
+
+This document only focuses on describing the design of a data-plane app example, as a proof of concept, and how it cooperates with [DASH pipeline BMv2](https://github.com/sonic-net/DASH/tree/main/dash-pipeline/bmv2) to implement the DASH data plane. The app will be based on [VPP](https://fd.io/).
+
+## 3. Project scenario
+
+### 3.1. Stateful packet process - flow
+- Flow Creation
+In DASH pipeline, after 5-tuple flow keys are well extracted, packet goes to flow lookup stage. It does the flow lookup. If any flow is matched, packet is marked a flow-hit flag, otherwise flow-miss flag. The packet continues to go to next stages, like ACL, (destination) NAT, routing, etc. After routing stage, if route is found and packet is flow-miss, it will bypass the rest stages and be forwarded to data-plane app. The data-plane app will use dash-sai APIs to create flow in flow table, and then re-inject the packet back to pipeline.
+- Flow Deletion
+In flow lookup stage, TCP FIN/RST packet is always marked flow-miss and later forwarded to data-plane app.
+- Flow Age-out
+In flow lookup stage, if packet hits one flow, it will refresh flow timestamp. Data-plane app periodically scans flow table and check if flow is timed out according to (current timestamp - flow timestamp) vs idle timeout value.
+
+### 3.2. HA
+- Inline flow replication
+In HA context, Active data-plane app creates flow, replicates the flow in metadata, glues it with original packet, and sends the packet to Standby data-plane app via DPU data-plane channel. Standby data-plane app recreates the flow, and acknowledges Active data-plane app to finish flow creation. The same logics apply for flow deletion, flow age-out.
+- Flow bulk sync
+Flow bulk sync replicates batch flows from one DPU to another to make flow table consistency on Active and Standby DPUs. When HA agents starts a bulk sync via DASH SAI, Active data-plane app will walk flow table based on sync method (perfect/range), generate batch flows and send them to Standby data-plane app with gRPC via control-plane channel. Standby date-plane app will create flows in order.
+
+### 3.3. Flow re-simulation
+When SONiC changes polices via DASH SAI, flow could be impacted. Data-plane app is raised to re-simulate flow. In HA context, Active data-plane app also needs to sync the updated flows to Standby.
+
+## 4. Resource modeling, requirement, and SLA
+Refer to [SONiC DASH HLD](https://github.com/sonic-net/DASH/blob/main/documentation/general/dash-sonic-hld.md)
+
+## 5. System Architecture overview
+
+![dash_dpapp_arch](images/dash-bmv2-data-plane-app-arch.drawio.svg)
+
+Referring to the above figure, data-plane app overall is a multi-thread vpp application, running in a standalone container. It includes these components:
+
+- vpp master, it runs dashsai server to receive dashsai requests (dash object CRUD) via northbound RPC channel and then invoke DASH SAI APIs to handle them. The server also processes flow creation/deletion notification from vpp workers.
+- vpp workers, they serve as an exception path of packet processing, running on multi-cpus. It creates a flow in local flow table and notifies dashsai server to offload it to BMv2 flow table. The packet is temporarily queued. After workers know the success of flow offloading to BMv2, they deque the packet and send it back to P4 pipeline via VPP port. The workers also do flow age-out task with proper scheduling.
+- flow table, is a local cache of BMv2 flow table.
+- DASH SAI, is a unique interface for DASH object CRUD of DASH pipeline, implemented by DASH BMv2.
+- VPP port, is a veth interface and connects to BMv2 via veth pair. It serves as datapath channel to receive/send all packets between date-plane app and BMv2. Generally the port supports multi RSS queues, each queue binds to one vpp worker.
+
+**Note:** For simplicity and concept verification, vpp workers may directly call DASH SAI to offload flow to BMv2. The concern is that DASH SAI blocking API can block packet processing of vpp workers.
+
+## 6. Detailed design
+
+Referring to the below figure from [HA API HLD], it greatly outlines the whole packet flow in data plane for both standalone and HA context.
+
+![packet flow in data plane](https://github.com/sonic-net/DASH/blob/main/documentation/high-avail/images/ha-bm-packet-flow.svg)
+
+From the perspective of DPAPP, its core task in slow path is to create flow in flow table in case of flow miss in Flow (ConnTrack) lookup stage, and then maintain flow state. The sub sections will depict these functions around flow object.
+
+
+### 6.1. DASH metadata
+DASH metadata records the packet processing result of DASH pipeline. It can have the following info to help flow creation/removal, etc operations:
+- flow key entry - eni_mac, vnet_id, 5-tuples
+- common flow data - state, direction, eni, flow actions, metering class, policy ID
+- flow overlay rewrite data
+- flow underlay encapsulation data
+- flow tunnel encapsulation data
+
+When DASH pipeline requests DPAPP for flow creation, it encapsulates DASH metadata in an ethernet frame with EtherType DASH_METADATA and appends the original customer packet. The packet sent to DPAPP is like:
+
+
+Ethernet HEADER|DASH metadata|customer packet
+
+
+ The number of DASH_METADATA is 0x876D, which reuses the number of EtherType SECURE_DATA (vpp/src/vnet/ethernet/types.def at master · FDio/vpp · GitHub).
+DASH metadata is encoded in protocol DASH_METADATA, whose message format is defined as the below figure:
+
+```
+     0                   1                   2                   3
+     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+    |     Source    | Type  |Subtype|            Length             |
+    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+    |                        Flow key entry                         |
+    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+    |                           Flow data                           :
+    :                                                               |
+    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+
+    Source: Where is packet from, dash_packet_source_t
+    Type: Packet type under certain source
+    Subtype: Packet subtype under certain type
+    Length: Total length of DASH header
+    Flow key entry: eni_mac, 5-tuples
+    Flow data: As next figure
+```
+
+```
+     0                   1                   2                   3
+     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+    |D|                           Version                           |
+    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+    |                            Action                             |
+    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+    |                          Meter Class                          |
+    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+    |                       [overlay rewrite]                       :
+    :                                                               |
+    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+    |                        [tunnel encap]                         :
+    :                                                               |
+    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+    |                       [underlay encap]                        :
+    :                                                               |
+    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+
+    D: flow direction, 0, outbound; 1, inbound
+    Version: flow version
+    Action: flow data – action
+    Meter: Meter class
+    ...
+```
+
+### 6.2. Basic Flow
+
+Basic flow means basic operation of flow, i.e. CRUD. Generally DPAPP may run a flow state machine to manage flow state and perform the proper CRUD operation. State transitions are triggered by these events:
+- Packets with DASH metadata from pipeline, for flow creation/deletion/refresh
+- Timer, for flow age-out
+
+The sequence chart below shows the detailed steps of flow creation, after which the flow state becomes FLOW_CREATED. Other operations are similar.
+
+```mermaid
+sequenceDiagram
+    participant P as P4 Pipeline
+    participant R as P4 Runtime
+    participant W as VPP worker
+    participant M as VPP master
+    autonumber
+    P->>+W: DASH metadata (source PIPELINE) + customer packet
+    W->>+W: Create flow in local flow table
+    W->>+M: Request current flow offload
+    M->>+R: Call DASH SAI create_flow (gRPC)
+    R->>R: Write flow to P4 flow table
+    R->>M: create_flow returns OK
+    M->>W: Reply flow offload done
+    W->>P: DASH metadata (source DPAPP) + customer packet
+```
+
+Note that the VPP worker should not call the DASH SAI API directly; otherwise DASH SAI may block the VPP worker from handling other packets. Each flow has a packet queue. In step 3, the worker first enqueues the current packet and then requests flow offload. In step 8, it dequeues the packet and sends it back to the P4 pipeline.
+
+### 6.3. Flow resimulation
+
+```mermaid
+sequenceDiagram
+    participant C as DASH SAI Client
+    participant P as P4 Pipeline
+    participant R as P4 Runtime
+    participant W as VPP worker
+    participant S as DASH SAI Server
+    autonumber
+    C->>+S: ENI_ATTR_FULL_FLOW_RESIMULATION_REQUESTED
+    S->>+R: Call DASH SAI set_eni_attr (epoch)
+    R->>R: Update eni epoch
+    R->>-S: set_eni_attr returns OK
+    S->>-C: FULL_FLOW_RESIMULATION_REQUESTED return OK
+    P->>+P: packet comes and starts flow resimulation due to eni.epoch > flow.epoch
+    P->>+W: slow-path, update flow data
+```
+
+### 6.4. HA flow
+Base on basic flow, HA flow adds an extra FLOW_SYNCED state, which involves extra sync req/response ping-pang between DPAPP and PEER DPAPP.
+
+### 6.5. HA flow resimulation
+Same as section 6.3, but also do the below:
+- Sync eni epoch
+- Update flow data in PEER
+
+### 6.6. SAI
+DPAPP will implement a dashsai server, which is a shim layer of underlying DASH SAI. It works as below:
+1. Accept the request from dashsai client
+2. Parse the request, identify DASH SAI object and construct its attr list
+3. If the SAI object is implemented in DPAPP itself, call its function
+4. Else invoke DASH SAI API of the object
+5. Reply dashsai client
+
+## 7. Test Plan
+Test objective:
+1. Verify flow CRUD in standalone and HA
+2. Verify DASH SAI
+
+Test scope:
+The test only covers the functionality verification of DPAPP. The test of performance and capacity is out of scope.
+
+Test environment:
+- BMv2 P4 pipeline + DPAPP
+- scapy
+
+The below table lists all test cases, all of which must be scripted based on DASH test infrastructure.
+

+<table>
+  <tr>
+    <td>Test Suite</td>
+    <td>Test Case</td>
+    <td>Test Description</td>
+  </tr>
+  <tr>
+    <td rowspan="7">1. Unit Test</td>
+    <td>1.1 DASH SAI</td>
+    <td>
+      <p>Target: Verify dashsai server behaving well as DASH SAI proxy</p>
+      <p>Steps:</p>
+      <ol>
+        <li>Start P4 BMv2 and DPAPP</li>
+        <li>dashsai client sends request for DASH SAI objects CRUD</li>
+        <li>Verify dashsai server handles them correctly</li>
+        <li>Verify dashsai server calls proper DASH SAI APIs</li>
+        <li>Verify dashsai server replys dashsai client</li>
+      </ol>
+    </td>
+  </tr>
+  <tr>
+    <td>1.2 Flow Creation</td>
+    <td>
+      <p>Target: Verify flow creation OK</p>
+      <p>Steps:</p>
+      <ol>
+        <li>Start P4 BMv2 and DPAPP</li>
+        <li>Configure vnet2vnet via DASH SAI</li>
+        <li>Send TCP SYN packet to BMV2 external port</li>
+        <li>Verify flow via vppctl “show dash flow”</li>
+        <li>Verify flow via DASH SAI flow get</li>
+      </ol>
+    </td>
+  </tr>
+  <tr>
+    <td>1.3 Flow Deletion</td>
+    <td>
+      <p>Target: Verify flow deletion OK</p>
+      <p>Steps:</p>
+      <ol>
+        <li>Same test steps as test case 1.2</li>
+        <li>Send TCP FIN packet to BMV2 external port</li>
+        <li>Verify flow not found via vppctl “show dash flow”</li>
+        <li>Verify flow not found via DASH SAI flow get</li>
+      </ol>
+    </td>
+  </tr>
+  <tr>
+    <td>1.4 Flow Age-out</td>
+    <td>
+      <p>Target: Verify flow age-out OK</p>
+      <p>Steps:</p>
+      <ol>
+        <li>Same test steps as test case 1.2</li>
+        <li>Change flow timeout to 10s</li>
+        <li>Wait 10s</li>
+        <li>Verify flow not found via DASH SAI flow get</li>
+      </ol>
+    </td>
+  </tr>
+  <tr>
+    <td>1.5 Flow resimulation</td>
+    <td>
+      <p>Target: Verify flow resimulation</p>
+      <p>Steps:</p>
+      <ol>
+        <li>Same test steps as test case 1.2</li>
+        <li>Update outbound CA->PA mapping via SAI</li>
+        <li>Verify underly DIP is changed in CA entry of table outbound_ca_to_pa</li>
+        <li>Get flow attr UNDERLAY_DIP via SAI flow get</li>
+        <li>Verify the above attr value unchanged</li>
+        <li>Send TCP DATA packet</li>
+        <li>Get flow attr UNDERLAY_DIP again via SAI flow get</li>
+        <li>Verify the attr value changed to new PA</li>
+        <li>Triger multiple flows of one eni</li>
+        <li>Set eni attr FULL_RESIMULATION_REQ via SAI</li>
+      </ol>
+    </td>
+  </tr>
+  <tr>
+    <td>1.6 1.2-1.5 in HA</td>
+    <td>Target: Verify flow CRUD in HA</td>
+  </tr>
+  <tr>
+    <td>1.7 bulk sync in HA</td>
+    <td>Target: Verify bulk sync in HA</td>
+  </tr>
+  <tr>
+    <td>2. Negative</td>
+    <td>2.1 Flow not creation</td>
+    <td>
+      <p>Target: Verify flow not creation OK</p>
+      <p>Steps:</p>
+      <ol>
+        <li>Send an illegal TCP packet (only ACK flag) as the first packet to the external port</li>
+        <li>P4 pipeline drop it or let DPAPP determines to drop it ??</li>
+      </ol>
+    </td>
+  </tr>
+</table>
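The age-out rule that test case 1.4 exercises (a flow expires once the current timestamp minus the flow timestamp reaches the idle timeout) can be sketched in Python as below. This is a minimal illustration and not DPAPP code: the `Flow` class, the `expired_flows` helper, the key layout, and the choice of `>=` for the comparison are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Flow:
    """Hypothetical flow record; only the fields needed for age-out."""
    key: tuple        # e.g. (eni, src, dst, proto, sport, dport); illustrative layout
    last_seen: float  # timestamp refreshed whenever a packet hits the flow


def expired_flows(flows, now, idle_timeout):
    """Return the flows that have been idle for at least idle_timeout seconds.

    Implements the check (current timestamp - flow timestamp) vs idle
    timeout; using >= rather than > is an assumption of this sketch.
    """
    return [f for f in flows if now - f.last_seen >= idle_timeout]


flows = [
    Flow(key=("eni-a", "10.0.0.1", "10.0.0.2", 6, 1234, 80), last_seen=0.0),
    Flow(key=("eni-a", "10.0.0.1", "10.0.0.3", 6, 1235, 80), last_seen=5.0),
]
# With a 10 s idle timeout at t=10 s, only the first flow has aged out.
aged = expired_flows(flows, now=10.0, idle_timeout=10.0)
```

A periodic scan in a worker thread would call such a helper with the current time and then delete the returned flows, both locally and in the BMv2 flow table.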
+
+## 8. Appendix
+- [DASH FLOW API](https://github.com/sonic-net/DASH/blob/main/documentation/dataplane/dash-flow-api.md)
+- [SONiC DASH HLD](https://github.com/sonic-net/DASH/blob/main/documentation/general/dash-sonic-hld.md)
+- [DASH scaling requirements](https://github.com/sonic-net/SONiC/blob/master/doc/dash/dash-sonic-hld.md#14-scaling-requirements)
+- [Smart Switch HA HLD](https://github.com/sonic-net/SONiC/blob/master/doc/smart-switch/high-availability/smart-switch-ha-hld.md)
+- [HA API HLD](https://github.com/sonic-net/DASH/blob/main/documentation/high-avail/ha-api-hld.md)
+- [Flow Resimulation](https://github.com/sonic-net/DASH/blob/main/documentation/dataplane/dash-flow-resimulation.md)
+- [DASH-SAI pipeline packet flow](https://github.com/sonic-net/DASH/blob/main/documentation/general/dash-sai-pipeline-packet-flow.md)
+
diff --git a/documentation/dataplane/images/dash-bmv2-data-plane-app-arch.drawio.svg b/documentation/dataplane/images/dash-bmv2-data-plane-app-arch.drawio.svg
new file mode 100644
index 000000000..0de61c614
--- /dev/null
+++ b/documentation/dataplane/images/dash-bmv2-data-plane-app-arch.drawio.svg
@@ -0,0 +1,4 @@
+
+
+
+
[SVG text residue from dash-bmv2-data-plane-app-arch.drawio.svg. Figure labels: DASH BMv2 (P4 tables, P4 pipeline, P4 Runtime, DPU-Port1 ... DPU-PortN, CPU Port); Dataplane App (VPP port, vpp workers, Flow table, DASH SAI, vpp master, dashsai server); links: veth pair, GRPC, RPC, RD, RW; clients: sairedis, dashsai client]
\ No newline at end of file
diff --git a/documentation/dataplane/images/dash-bmv2-data-plane-app-overview.drawio.svg b/documentation/dataplane/images/dash-bmv2-data-plane-app-overview.drawio.svg
new file mode 100644
index 000000000..bfa2dbf63
--- /dev/null
+++ b/documentation/dataplane/images/dash-bmv2-data-plane-app-overview.drawio.svg
@@ -0,0 +1,4 @@
+
+
+
+
[SVG text residue from dash-bmv2-data-plane-app-overview.drawio.svg. Figure labels: Hardware (DASH capable ASICs); Kernel Space (ASIC Driver); User Space (Technology Provider ASIC SDK, DASH-SAI APIs, Technology Provider Data Plane App)]
\ No newline at end of file

From fdd642c77f9733136c63e88a354e223e12784258 Mon Sep 17 00:00:00 2001
From: Junhua Zhai
Date: Tue, 20 Aug 2024 16:39:25 +0000
Subject: [PATCH 2/5] Fix spelling check error

---
 .wordlist.txt                                 | 17 ++++++++++++++
 .../dataplane/dash-bmv2-data-plane-app.md     | 22 +++++++++++++------
 2 files changed, 32 insertions(+), 7 deletions(-)

diff --git a/.wordlist.txt b/.wordlist.txt
index 44ec48bb5..19462aaf8 100644
--- a/.wordlist.txt
+++ b/.wordlist.txt
@@ -100,6 +100,8 @@ configs
 Conntrack
 Containerlab
 CP
+cpus
+CPUs
 CreatedHalfOpenFlow
 CreatedOtherFlow
 CreatedTcpFlow
@@ -120,6 +122,7 @@ CurrentUdpFlow
 customizable
 Cx
 cyberithub
+dashsai
 dashsubmodule
 DASHOrch
 dashorch
@@ -146,6 +149,8 @@ decap'ing
 decaps
 deliverables
 DemoDays
+dequeue
+dequeues
 designator
 DEST
 dest
@@ -171,6 +176,8 @@ dockerized
 DoS
 DotNet
 downcasting
+dpapp
+DPAPP
 DPDK
 DPU
 dpu
@@ -215,11 +222,14 @@ eni
 Eni
 ENIs
 ENI's
+enqueue
+enqueues
 entrypoints
 enum
 EPUs
 EPYC
 ethernet
+EtherType
 executables
 ExpressRoute
 extern
@@ -229,6 +239,7 @@ failovers
 fakesai
 FastPath
 fc
+FDio
 FEC
 ffe
 FINPackets
@@ -315,6 +326,7 @@ Jinja
 jitter
 journaled
 json
+Junhua
 keepalive
 keepalives
 Keysight
@@ -507,6 +519,7 @@ routable
 RPC
 RPCs
 RPF
+RSS
 RST
 RSTPackets
 RSTs
@@ -695,6 +708,9 @@ vport
 VPort
 vPORT
 VPorts
+vpp
+VPP
+vppctl
 VTEP
 VTEPs
 VxLAN
@@ -738,6 +754,7 @@ Gan
 Ze
 AppDBMemoryEstimation
 Ivantsiv
+Zhai
 Zhixiong
 Niu
 validonly
diff --git a/documentation/dataplane/dash-bmv2-data-plane-app.md b/documentation/dataplane/dash-bmv2-data-plane-app.md
index cf7a3291a..c0c7d4854 100644
--- a/documentation/dataplane/dash-bmv2-data-plane-app.md
+++ b/documentation/dataplane/dash-bmv2-data-plane-app.md
@@ -54,7 +54,11 @@ In flow lookup stage, if packet hits one flow, it will refresh flow timestamp. D
 
 ### 3.2. HA
 - Inline flow replication
-In HA context, Active data-plane app creates flow, replicates the flow in metadata, glues it with original packet, and sends the packet to Standby data-plane app via DPU data-plane channel. Standby data-plane app recreates the flow, and acknowledges Active data-plane app to finish flow creation. The same logics apply for flow deletion, flow age-out.
+In HA context, Active data-plane app creates flow, replicates the flow in
+metadata, glues it with original packet, and sends the packet to Standby
+data-plane app via DPU data-plane channel. Standby data-plane app recreates
+the flow, and acknowledges Active data-plane app to finish flow creation. The
+same logic can apply for flow deletion, flow age-out.
 - Flow bulk sync
 Flow bulk sync replicates batch flows from one DPU to another to make flow table consistency on Active and Standby DPUs. When HA agents starts a bulk sync via DASH SAI, Active data-plane app will walk flow table based on sync method (perfect/range), generate batch flows and send them to Standby data-plane app with gRPC via control-plane channel. Standby date-plane app will create flows in order.
 
@@ -71,7 +75,10 @@ Refer to [SONiC DASH HLD](https://github.com/sonic-net/DASH/blob/main/documentat
 Referring to the above figure, data-plane app overall is a multi-thread vpp application, running in a standalone container. It includes these components:
 
 - vpp master, it runs dashsai server to receive dashsai requests (dash object CRUD) via northbound RPC channel and then invoke DASH SAI APIs to handle them. The server also processes flow creation/deletion notification from vpp workers.
-- vpp workers, they serve as an exception path of packet processing, running on multi-cpus. It creates a flow in local flow table and notifies dashsai server to offload it to BMv2 flow table. The packet is temporarily queued. After workers know the success of flow offloading to BMv2, they deque the packet and send it back to P4 pipeline via VPP port. The workers also do flow age-out task with proper scheduling.
+- vpp workers, they serve as an exception path of packet processing, running
+on multi-cpus. It creates a flow in local flow table and notifies dashsai
+server to offload it to BMv2 flow table. The packet is temporarily queued.
+After workers know the success of flow offloading to BMv2, they dequeue the packet and send it back to P4 pipeline via VPP port. The workers also do flow age-out task with proper scheduling.
 - flow table, is a local cache of BMv2 flow table.
 - DASH SAI, is a unique interface for DASH object CRUD of DASH pipeline, implemented by DASH BMv2.
 - VPP port, is a veth interface and connects to BMv2 via veth pair. It serves as datapath channel to receive/send all packets between date-plane app and BMv2. Generally the port supports multi RSS queues, each queue binds to one vpp worker.
@@ -84,7 +91,7 @@ Referring to the below figure from [HA API HLD], it greatly outlines the whole p
 
 ![packet flow in data plane](https://github.com/sonic-net/DASH/blob/main/documentation/high-avail/images/ha-bm-packet-flow.svg)
 
-From the perspective of DPAPP, its core task in slow path is to create flow in flow table in case of flow miss in Flow (ConnTrack) lookup stage, and then maintain flow state. The sub sections will depict these functions around flow object.
+From the perspective of DPAPP, its core task in slow path is to create flow in flow table in case of flow miss in Flow (Conntrack) lookup stage, and then maintain flow state. The sub sections will depict these functions around flow object.
 
 
 ### 6.1. DASH metadata
@@ -198,7 +205,8 @@ sequenceDiagram
 ```
 
 ### 6.4. HA flow
-Base on basic flow, HA flow adds an extra FLOW_SYNCED state, which involves extra sync req/response ping-pang between DPAPP and PEER DPAPP.
+Based on basic flow, HA flow adds an extra FLOW_SYNCED state, which involves
+extra sync for request/response ping-pang between DPAPP and PEER DPAPP.
 
 ### 6.5. HA flow resimulation
 Same as section 6.3, but also do the below:
@@ -244,7 +252,7 @@ The below table lists all test cases, all of which must be scripted based on DAS
 <li>dashsai client sends request for DASH SAI objects CRUD
 <li>Verify dashsai server handles them correctly
 <li>Verify dashsai server calls proper DASH SAI APIs
-<li>Verify dashsai server replys dashsai client
+<li>Verify dashsai server replies dashsai client
 </ol>
 </td>
 </tr>
@@ -295,13 +303,13 @@ The below table lists all test cases, all of which must be scripted based on DAS
 <ol>
 <li>Same test steps as test case 1.2
 <li>Update outbound CA->PA mapping via SAI
-<li>Verify underly DIP is changed in CA entry of table outbound_ca_to_pa
+<li>Verify underlay DIP is changed in CA entry of table outbound_ca_to_pa
 <li>Get flow attr UNDERLAY_DIP via SAI flow get
 <li>Verify the above attr value unchanged
 <li>Send TCP DATA packet
 <li>Get flow attr UNDERLAY_DIP again via SAI flow get
 <li>Verify the attr value changed to new PA
-<li>Triger multiple flows of one eni
+<li>Trigger multiple flows of one eni
 <li>Set eni attr FULL_RESIMULATION_REQ via SAI
 </ol>
 </td>

From 245ea4d5ed8c9b4067af7dfbf667652795bad26e Mon Sep 17 00:00:00 2001
From: Junhua Zhai
Date: Thu, 29 Aug 2024 16:58:09 +0000
Subject: [PATCH 3/5] Fix per review comments

---
 .../dataplane/dash-bmv2-data-plane-app.md     | 69 +++++++++++++------
 1 file changed, 47 insertions(+), 22 deletions(-)

diff --git a/documentation/dataplane/dash-bmv2-data-plane-app.md b/documentation/dataplane/dash-bmv2-data-plane-app.md
index c0c7d4854..6a0d3573d 100644
--- a/documentation/dataplane/dash-bmv2-data-plane-app.md
+++ b/documentation/dataplane/dash-bmv2-data-plane-app.md
@@ -45,25 +45,49 @@ This document only focuses on describing the design of a data-plane app example,
 ## 3. Project scenario
 
 ### 3.1. Stateful packet process - flow
-- Flow Creation
    -In DASH pipeline, after 5-tuple flow keys are well extracted, packet goes to flow lookup stage. It does the flow lookup. If any flow is matched, packet is marked a flow-hit flag, otherwise flow-miss flag. The packet continues to go to next stages, like ACL, (destination) NAT, routing, etc. After routing stage, if route is found and packet is flow-miss, it will bypass the rest stages and be forwarded to data-plane app. The data-plane app will use dash-sai APIs to create flow in flow table, and then re-inject the packet back to pipeline. -- Flow Deletion
    -In flow lookup stage, TCP FIN/RST packet is always marked flow-miss and later forwarded to data-plane app. -- Flow Age-out
    -In flow lookup stage, if packet hits one flow, it will refresh flow timestamp. Data-plane app periodically scans flow table and check if flow is timed out according to (current timestamp - flow timestamp) vs idle timeout value. +- Flow Creation + + In DASH pipeline, after flow keys are well extracted, packet goes to flow + lookup stage. It does the flow lookup. If any flow is matched, packet is + marked a flow-hit flag, otherwise flow-miss flag. If flow-hit, the packet + should bypass policy matching stages, such as ACL, routing, etc, and + continue doing the rest stages, like flow actions. If flow-miss, the packet + should do policy matching stages and then be forwarded to data-plane app. + The data-plane app will use dash-sai APIs to create flow in flow table, and + then re-inject the packet back to pipeline. + +- Flow Deletion + + In flow lookup stage, TCP FIN/RST packet should be treated specially with + FLOW-DELETE hint and forwarded to data-plane app. +- Flow Age-out + + In flow lookup stage, if packet hits one flow, it will refresh flow + timestamp. Data-plane app periodically scans flow table and check if flow is + timed out according to (current timestamp - flow timestamp) vs idle timeout value. ### 3.2. HA -- Inline flow replication
    -In HA context, Active data-plane app creates flow, replicates the flow in -metadata, glues it with original packet, and sends the packet to Standby -data-plane app via DPU data-plane channel. Standby data-plane app recreates +- Inline flow replication + + In HA context, Active data-plane app creates flow, serializes the flow in +metadata header, glues it with original packet, and sends the packet to Standby +via DPU data-plane channel. Standby data-plane app recreates the flow, and acknowledges Active data-plane app to finish flow creation. The same logic can apply for flow deletion, flow age-out. -- Flow bulk sync
    -Flow bulk sync replicates batch flows from one DPU to another to make flow table consistency on Active and Standby DPUs. When HA agents starts a bulk sync via DASH SAI, Active data-plane app will walk flow table based on sync method (perfect/range), generate batch flows and send them to Standby data-plane app with gRPC via control-plane channel. Standby date-plane app will create flows in order. +- Flow bulk sync + + Flow bulk sync replicates batch flows from one DPU to another to make flow + table consistent on Active and Standby DPUs. When HA agents starts a bulk + sync via DASH SAI, Active data-plane app will walk flow table based on sync + method (perfect/range), generate batch flows and send them to Standby + data-plane app with gRPC via control-plane channel. Standby date-plane app + will create flows in order. ### 3.3. Flow re-simulation -When SONiC changes polices via DASH SAI, flow could be impacted. Data-plane app is raised to re-simulate flow. In HA context, Active data-plane app also needs to sync the updated flows to Standby. +When SONiC changes polices via DASH SAI, some flows might need to be updated +to get the latest policy applied. Data-plane app is raised to re-simulate +flow. In HA context, Active data-plane app also needs to sync the updated +flows to Standby. ## 4. Resource modeling, requirement, and SLA Refer to [SONiC DASH HLD](https://github.com/sonic-net/DASH/blob/main/documentation/general/dash-sonic-hld.md) @@ -87,7 +111,7 @@ After workers know the success of flow offloading to BMv2, they dequeue the pack ## 6. Detailed design -Referring to the below figure from [HA API HLD], it greatly outlines the whole packet flow in data plane for both standalone and HA context. +Referring to the below figure from [HA API HLD](https://github.com/sonic-net/DASH/blob/main/documentation/high-avail/ha-api-hld.md), it greatly outlines the whole packet flow in data plane for both standalone and HA context. 
![packet flow in data plane](https://github.com/sonic-net/DASH/blob/main/documentation/high-avail/images/ha-bm-packet-flow.svg) @@ -104,11 +128,11 @@ DASH metadata records the packet processing result of DASH pipeline. It can have When DASH pipeline requests DPAPP for flow creation, it encapsulates DASH metadata in an ethernet frame with EtherType DASH_METADATA and appends the original customer packet. The packet sent to DPAPP is like: -
    -Ethernet HEADER|DASH metadata|customer packet -
    +``` + Ethernet HEADER|DASH metadata|customer packet +``` - The number of DASH_METADATA is 0x876D, which reuses the number of EtherType SECURE_DATA (vpp/src/vnet/ethernet/types.def at master · FDio/vpp · GitHub). +The number of DASH_METADATA is 0x876D, which reuses the number of EtherType SECURE_DATA (vpp/src/vnet/ethernet/types.def at master · FDio/vpp · GitHub). DASH metadata is encoded in protocol DASH_METADATA, whose message format is defined as the below figure: ``` @@ -206,7 +230,7 @@ sequenceDiagram ### 6.4. HA flow Base on basic flow, HA flow adds an extra FLOW_SYNCED state, which involves -extra sync for request/response ping-pang between DPAPP and PEER DPAPP. +extra sync for request/response ping-pong between DPAPP and PEER DPAPP. ### 6.5. HA flow resimulation Same as section 6.3, but also do the below: @@ -226,10 +250,11 @@ Test objective: 1. Verify flow CRUD in standalone and HA 2. Verify DASH SAI -Test scope:
    -The test only covers the functionality verification of DPAPP. The test of performance and capacity is out of scope. +Test scope: + + The test only covers the functionality verification of DPAPP. The test of performance and capacity is out of scope. -Test environment:
    +Test environment: - BMv2 P4 pipeline + DPAPP - scapy From 1f51ca31605d587c2ba6032e01cb3d5d5116a167 Mon Sep 17 00:00:00 2001 From: Junhua Zhai Date: Wed, 4 Sep 2024 13:46:43 +0000 Subject: [PATCH 4/5] Fix per comments --- .wordlist.txt | 1 - .../dataplane/dash-bmv2-data-plane-app.md | 35 +- .../dash-bmv2-data-plane-app-arch.drawio.svg | 469 +++++++++++++++++- 3 files changed, 485 insertions(+), 20 deletions(-) diff --git a/.wordlist.txt b/.wordlist.txt index 19462aaf8..91ecc084a 100644 --- a/.wordlist.txt +++ b/.wordlist.txt @@ -100,7 +100,6 @@ configs Conntrack Containerlab CP -cpus CPUs CreatedHalfOpenFlow CreatedOtherFlow diff --git a/documentation/dataplane/dash-bmv2-data-plane-app.md b/documentation/dataplane/dash-bmv2-data-plane-app.md index 6a0d3573d..ff99ac205 100644 --- a/documentation/dataplane/dash-bmv2-data-plane-app.md +++ b/documentation/dataplane/dash-bmv2-data-plane-app.md @@ -62,9 +62,12 @@ This document only focuses on describing the design of a data-plane app example, FLOW-DELETE hint and forwarded to data-plane app. - Flow Age-out - In flow lookup stage, if packet hits one flow, it will refresh flow - timestamp. Data-plane app periodically scans flow table and check if flow is - timed out according to (current timestamp - flow timestamp) vs idle timeout value. + Data-plane app can help the data plane implement the flow age-out + mechanism by bridging the gap in the current data plane engine - bmv2. + For example, in flow lookup stage, if packet hits one flow, pipeline may refresh + flow timestamp via p4 extern function. Data-plane app periodically scans flow table + and check if flow is timed out according to (current timestamp - flow timestamp) vs idle + timeout value. ### 3.2. 
HA - Inline flow replication @@ -96,18 +99,18 @@ Refer to [SONiC DASH HLD](https://github.com/sonic-net/DASH/blob/main/documentat ![dash_dpapp_arch](images/dash-bmv2-data-plane-app-arch.drawio.svg) -Referring to the above figure, data-plane app overall is a multi-thread vpp application, running in a standalone container. It includes these components: +Referring to the above figure, data-plane app overall is a multi-thread application based on vpp, running in a standalone container. It includes these components: -- vpp master, it runs dashsai server to receive dashsai requests (dash object CRUD) via northbound RPC channel and then invoke DASH SAI APIs to handle them. The server also processes flow creation/deletion notification from vpp workers. -- vpp workers, they serve as an exception path of packet processing, running -on multi-cpus. It creates a flow in local flow table and notifies dashsai +- master thread, it runs dashsai server to receive dashsai requests (dash object CRUD) via northbound RPC channel and then invoke DASH SAI APIs to handle them. The server also processes flow creation/deletion notification from workers. +- worker threads, they serve as an exception (slow) path of packet processing, running +on multiple CPUs. It creates a flow in local flow table and notifies dashsai server to offload it to BMv2 flow table. The packet is temporarily queued. -After workers know the success of flow offloading to BMv2, they dequeue the packet and send it back to P4 pipeline via VPP port. The workers also do flow age-out task with proper scheduling. +After workers know the success of flow offloading to BMv2, they dequeue the packet and send it back to P4 pipeline via DPAPP port. The workers also do flow age-out task with proper scheduling. - flow table, is a local cache of BMv2 flow table. - DASH SAI, is a unique interface for DASH object CRUD of DASH pipeline, implemented by DASH BMv2. -- VPP port, is a veth interface and connects to BMv2 via veth pair. 
    It serves as datapath channel to receive/send all packets between date-plane app and BMv2. Generally the port supports multi RSS queues, each queue binds to one vpp worker. +- DPAPP port, is a veth interface and connects to BMv2 via veth pair. It serves as datapath channel to receive/send all packets between data-plane app and BMv2. Generally the port supports multi RSS queues, each queue binds to one worker thread. -**Note:** For simplicity and concept verification, vpp workers may directly call DASH SAI to offload flow to BMv2. The concern is that DASH SAI blocking API can block packet processing of vpp workers. +**Note:** For simplicity and concept verification, worker threads may directly call DASH SAI to offload flow to BMv2. The concern is that DASH SAI blocking API can block packet processing of workers. ## 6. Detailed design @@ -129,7 +132,7 @@ DASH metadata records the packet processing result of DASH pipeline. It can have When DASH pipeline requests DPAPP for flow creation, it encapsulates DASH metadata in an ethernet frame with EtherType DASH_METADATA and appends the original customer packet. The packet sent to DPAPP is like: ``` - Ethernet HEADER|DASH metadata|customer packet + Ethernet HEADER | DASH metadata | customer packet ``` The number of DASH_METADATA is 0x876D, which reuses the number of EtherType SECURE_DATA (vpp/src/vnet/ethernet/types.def at master · FDio/vpp · GitHub). 
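Reviewer note: the encapsulation described in this hunk (`Ethernet HEADER | DASH metadata | customer packet`, EtherType 0x876D) can be illustrated with a minimal Python sketch. It is not taken from the patch. The function names, the opaque treatment of the DASH metadata bytes, and the explicit `metadata_len` argument are assumptions made for illustration; a real parser would presumably derive the metadata length from the header's Length field.

```python
import struct

# EtherType value stated in the HLD for DASH_METADATA.
DASH_METADATA_ETHERTYPE = 0x876D

# Standard untagged Ethernet II header: dst MAC, src MAC, EtherType.
ETH_HEADER = struct.Struct("!6s6sH")


def encapsulate(dst_mac: bytes, src_mac: bytes,
                dash_metadata: bytes, customer_packet: bytes) -> bytes:
    """Build: Ethernet header | DASH metadata | customer packet.

    Hypothetical helper; dash_metadata is treated as opaque bytes.
    """
    header = ETH_HEADER.pack(dst_mac, src_mac, DASH_METADATA_ETHERTYPE)
    return header + dash_metadata + customer_packet


def decapsulate(frame: bytes, metadata_len: int) -> tuple[bytes, bytes]:
    """Split a frame back into (DASH metadata, customer packet).

    metadata_len is passed in explicitly here (an assumption of this sketch).
    """
    _dst, _src, ethertype = ETH_HEADER.unpack_from(frame)
    if ethertype != DASH_METADATA_ETHERTYPE:
        raise ValueError(f"unexpected EtherType 0x{ethertype:04X}")
    body = frame[ETH_HEADER.size:]
    return body[:metadata_len], body[metadata_len:]


if __name__ == "__main__":
    frame = encapsulate(b"\x00" * 6, b"\x02" + b"\x00" * 5,
                        b"\xaa" * 8, b"customer-bytes")
    metadata, packet = decapsulate(frame, metadata_len=8)
    print(metadata.hex(), packet)
```

Keeping the metadata opaque avoids guessing field widths that the flattened figures in this patch no longer show.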
@@ -194,8 +197,8 @@ The below sequence chart shows the detail steps of flow creation, flow state bec sequenceDiagram participant P as P4 Pipeline participant R as P4 Runtime - participant W as VPP worker - participant M as VPP master + participant W as DPAPP worker + participant M as DPAPP master autonumber P->>+W: DASH metadata (source PIPELINE) + customer packet W->>+W: Create flow in local flow table @@ -207,7 +210,7 @@ sequenceDiagram W->>P: DASH metadata (source DPAPP) + customer packet ``` -It is remarkable that VPP worker should not call DASH SAI API directly, otherwise DASH SAI may block VPP worker to handle other packets. Each flow has a packet queue. In step 3, it enqueues current packet firstly and then requests flow offload. In step 8, it dequeues the packet and then sends the packet back to P4 pipeline. +It is remarkable that DPAPP worker should not call DASH SAI API directly, otherwise DASH SAI may block DPAPP worker to handle other packets. Each flow has a packet queue. In step 3, it enqueues current packet firstly and then requests flow offload. In step 8, it dequeues the packet and then sends the packet back to P4 pipeline. ### 6.3. Flow resimulation @@ -216,7 +219,7 @@ sequenceDiagram participant C as DASH SAI Client participant P as P4 Pipeline participant R as P4 Runtime - participant W as VPP worker + participant W as DPAPP worker participant S as DASH SAI Server autonumber C->>+S: ENI_ATTR_FULL_FLOW_RESIMULATION_REQUESTED @@ -227,6 +230,8 @@ sequenceDiagram P->>+P: packet comes and starts flow resimulation due to eni.epoch > flow.epoch P->>+W: slow-path, update flow data ``` +Note: epoch could be an internal attribute of eni and flow, which is not +visible in public SAI. ### 6.4. 
HA flow Base on basic flow, HA flow adds an extra FLOW_SYNCED state, which involves diff --git a/documentation/dataplane/images/dash-bmv2-data-plane-app-arch.drawio.svg b/documentation/dataplane/images/dash-bmv2-data-plane-app-arch.drawio.svg index 0de61c614..c3c9e933e 100644 --- a/documentation/dataplane/images/dash-bmv2-data-plane-app-arch.drawio.svg +++ b/documentation/dataplane/images/dash-bmv2-data-plane-app-arch.drawio.svg @@ -1,4 +1,465 @@ - - - -
    [SVG markup elided; only figure labels are recoverable. Old version: DASH BMv2, P4 tables, P4 pipeline, P4 Runtime, DPU-Port1 ... DPU-PortN, CPU Port, Dataplane App, VPP port, vpp workers, Flow table, DASH SAI, vpp master, dashsai server, GRPC, veth pair, RPC, RD, RW, sairedis, dashsai client. New version: same labels, with "VPP port", "vpp workers", and "vpp master" renamed to "DPAPP port", "worker threads", and "master thread".]
    \ No newline at end of file From 781075e72c56e5615d7b424c127eff6f1fee4056 Mon Sep 17 00:00:00 2001 From: Junhua Zhai Date: Tue, 10 Sep 2024 01:26:23 +0000 Subject: [PATCH 5/5] Update flow data figure --- .../dataplane/dash-bmv2-data-plane-app.md | 26 +++++++++++-------- 1 file changed, 15 insertions(+), 11 deletions(-) diff --git a/documentation/dataplane/dash-bmv2-data-plane-app.md b/documentation/dataplane/dash-bmv2-data-plane-app.md index ff99ac205..33c0ff7ca 100644 --- a/documentation/dataplane/dash-bmv2-data-plane-app.md +++ b/documentation/dataplane/dash-bmv2-data-plane-app.md @@ -148,26 +148,28 @@ DASH metadata is encoded in protocol DASH_METADATA, whose message format is defi +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Flow data : : | - +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ - Source: Where is packet from, dash_packet_source_t - Type: Packet type under certain source - Subtype: Packet subtype under certain type + Source: Where is packet from, dash_packet_source_t + Type: Packet type under certain source + Subtype: Packet subtype under certain type Length: Total length of DASH header Flow key entry: eni_mac, 5-tuples Flow data: As next figure ``` - + ``` 0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ - |D| Version | + | Reserved |U| Sync State | Direction | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Version | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Action | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Meter Class | - +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | [overlay rewrite] : : | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ @@ -178,10 +180,12 @@ DASH 
    metadata is encoded in protocol DASH_METADATA, whose message format is defi : | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ - D: flow direction, 0, oubound; 1, inbound - Version: flow version - Action: flow data – action - Meter: Meter class + U: flag is-unidirectional, 0, false; 1, true + Sync State: flow sync state + Direction: flow direction, 0, outbound; 1, inbound + Version: flow version + Action: flow action + Meter Class: Meter class object ... ```