Commit

coordinator: auto mode is supported (#2178)
cyclinder authored Aug 17, 2023
1 parent 1bfa1fb commit 4750d2d
Showing 25 changed files with 184 additions and 117 deletions.
3 changes: 3 additions & 0 deletions api/v1/agent/models/coordinator_config.go

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

4 changes: 4 additions & 0 deletions api/v1/agent/openapi.yaml
@@ -328,6 +328,10 @@ definitions:
type: boolean
detectGateway:
type: boolean
podNICs:
type: array
items:
type: string
required:
- mode
- overlayPodCIDR
12 changes: 12 additions & 0 deletions api/v1/agent/server/embedded_spec.go

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

20 changes: 10 additions & 10 deletions charts/spiderpool/README.md
@@ -132,16 +132,16 @@ helm install spiderpool spiderpool/spiderpool --wait --namespace kube-system \

### coordinator parameters

| Name | Description | Value |
| ------------------------------ | ------------------------------------------------------------------------- | ---------- |
| `coordinator.enabled` | enable SpiderCoordinator | `true` |
| `coordinator.name` | the name of the default SpiderCoordinator CR | `default` |
| `coordinator.mode` | optional network mode, ["underlay", "overlay", "disabled"] | `underlay` |
| `coordinator.podCIDRType` | Pod CIDR type that should be collected, [ "cluster", "calico", "cilium" ] | `cluster` |
| `coordinator.detectGateway` | detect the reachability of the gateway | `false` |
| `coordinator.detectIPConflict` | detect IP address conflicts | `false` |
| `coordinator.tunePodRoutes` | tune Pod routes | `true` |
| `coordinator.hijackCIDR` | Additional subnets that need to be hijacked to the host forward | `[]` |
| Name | Description | Value |
| ------------------------------ | ------------------------------------------------------------------------- | --------- |
| `coordinator.enabled` | enable SpiderCoordinator | `true` |
| `coordinator.name` | the name of the default SpiderCoordinator CR | `default` |
| `coordinator.mode` | optional network mode, ["auto","underlay", "overlay", "disabled"] | `auto` |
| `coordinator.podCIDRType` | Pod CIDR type that should be collected, [ "cluster", "calico", "cilium" ] | `cluster` |
| `coordinator.detectGateway` | detect the reachability of the gateway | `false` |
| `coordinator.detectIPConflict` | detect IP address conflicts | `false` |
| `coordinator.tunePodRoutes` | tune Pod routes | `true` |
| `coordinator.hijackCIDR` | Additional subnets that need to be hijacked to the host forward | `[]` |


### multus parameters
@@ -53,6 +53,7 @@ spec:
type: integer
mode:
enum:
- auto
- underlay
- overlay
- disabled
@@ -62,6 +62,7 @@ spec:
type: integer
mode:
enum:
- auto
- underlay
- overlay
- disabled
4 changes: 2 additions & 2 deletions charts/spiderpool/values.yaml
@@ -90,8 +90,8 @@ coordinator:
## @param coordinator.name the name of the default SpiderCoordinator CR
name: "default"

## @param coordinator.mode optional network mode, ["underlay", "overlay", "disabled"]
mode: "underlay"
## @param coordinator.mode optional network mode, ["auto","underlay", "overlay", "disabled"]
mode: "auto"

## @param coordinator.podCIDRType Pod CIDR type that should be collected, [ "cluster", "calico", "cilium" ]
podCIDRType: "cluster"
1 change: 1 addition & 0 deletions cmd/coordinator/cmd/cni_types.go
@@ -36,6 +36,7 @@ var (
type Mode string

const (
ModeAuto Mode = "auto"
ModeUnderlay Mode = "underlay"
ModeOverlay Mode = "overlay"
ModeDisable Mode = "disable"
6 changes: 4 additions & 2 deletions cmd/coordinator/cmd/command_add.go
@@ -55,6 +55,7 @@ func CmdAdd(args *skel.CmdArgs) (err error) {
if err != nil {
return err
}

if conf.Mode == ModeDisable {
return types.PrintResult(conf.PrevResult, conf.CNIVersion)
}
@@ -98,6 +99,7 @@ func CmdAdd(args *skel.CmdArgs) (err error) {
currentInterface: args.IfName,
tuneMode: conf.Mode,
interfacePrefix: conf.MultusNicPrefix,
podNics: coordinatorConfig.PodNICs,
}
c.HijackCIDR = append(c.HijackCIDR, conf.ServiceCIDR...)
c.HijackCIDR = append(c.HijackCIDR, conf.HijackCIDR...)
@@ -110,14 +112,14 @@ func CmdAdd(args *skel.CmdArgs) (err error) {
defer c.netns.Close()

// check if it's first time invoke
err = c.coordinatorFirstInvoke(conf.PodDefaultCniNic)
err = c.coordinatorModeAndFirstInvoke(logger, conf.PodDefaultCniNic)
if err != nil {
logger.Error(err.Error())
return err
}

// get basic info
switch conf.Mode {
switch c.tuneMode {
case ModeUnderlay:
c.podVethName = defaultUnderlayVethName
c.hostVethName = getHostVethName(args.ContainerID)
15 changes: 7 additions & 8 deletions cmd/coordinator/cmd/command_del.go
@@ -84,17 +84,16 @@ func CmdDel(args *skel.CmdArgs) (err error) {
}
defer c.netns.Close()

if conf.Mode == ModeUnderlay {
hostVeth := getHostVethName(args.ContainerID)
vethLink, err := netlink.LinkByName(hostVeth)
if err != nil {
if _, ok := err.(netlink.LinkNotFoundError); ok {
logger.Sugar().Debug("Host veth has gone, nothing to do", zap.String("HostVeth", hostVeth))
return nil
}
hostVeth := getHostVethName(args.ContainerID)
vethLink, err := netlink.LinkByName(hostVeth)
if err != nil {
if _, ok := err.(netlink.LinkNotFoundError); ok {
logger.Sugar().Debug("Host veth has gone, nothing to do", zap.String("HostVeth", hostVeth))
} else {
logger.Sugar().Warn(fmt.Sprintf("failed to get host veth device %s: %v", hostVeth, err))
return fmt.Errorf("failed to get host veth device %s: %v", hostVeth, err)
}
} else {
if err = netlink.LinkDel(vethLink); err != nil {
logger.Sugar().Warn("failed to del hostVeth", zap.Error(err))
return fmt.Errorf("failed to del hostVeth %s: %w", hostVeth, err)
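Reviewer note: the hunk above drops the `ModeUnderlay` guard, so deletion always tries to remove the host veth and treats a missing device as a no-op. The sketch below restates that pattern in isolation. Everything in it is hypothetical: `errLinkNotFound` stands in for `netlink.LinkNotFoundError`, `linkStore` stands in for the host's link table, and the parts of `CmdDel` that follow the hunk are not modelled.

```go
package main

import (
	"errors"
	"fmt"
)

// errLinkNotFound is a hypothetical stand-in for netlink.LinkNotFoundError.
var errLinkNotFound = errors.New("link not found")

// linkStore is a hypothetical in-memory stand-in for the host's link table.
type linkStore map[string]bool

// lookup reports errLinkNotFound when the named device is absent.
func (s linkStore) lookup(name string) error {
	if !s[name] {
		return errLinkNotFound
	}
	return nil
}

// deleteHostVeth removes the device if it exists. A missing device is
// treated as already cleaned up; any other lookup error is returned.
func deleteHostVeth(s linkStore, name string) error {
	if err := s.lookup(name); err != nil {
		if errors.Is(err, errLinkNotFound) {
			return nil // host veth has gone, nothing to do
		}
		return fmt.Errorf("failed to get host veth device %s: %w", name, err)
	}
	delete(s, name)
	return nil
}

func main() {
	links := linkStore{"veth-abc": true}
	fmt.Println(deleteHostVeth(links, "veth-abc")) // device existed and is removed
	fmt.Println(deleteHostVeth(links, "veth-abc")) // second call is a no-op
	fmt.Println(len(links))
}
```

Making the delete idempotent matters here because, with auto mode, the configured mode no longer tells the DEL path whether a host veth was ever created.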
44 changes: 40 additions & 4 deletions cmd/coordinator/cmd/utils.go
@@ -26,20 +26,56 @@ type coordinator struct {
ipFamily, currentRuleTable, hostRuleTable int
tuneMode Mode
hostVethName, podVethName, currentInterface, interfacePrefix string
HijackCIDR []string
HijackCIDR, podNics []string
netns ns.NetNS
hostVethHwAddress, podVethHwAddress net.HardwareAddr
currentAddress []netlink.Addr
hostIPRouteForPod []net.IP
}

func (c *coordinator) autoModeToSpecificMode(mode Mode, podFirstInterface string) error {
if mode != ModeAuto {
return nil
}

if c.currentInterface == podFirstInterface {
c.firstInvoke = true
c.tuneMode = ModeUnderlay
return nil
}

// veth0 must be present in underlay mode
vethExist, err := networking.CheckInterfaceExist(c.netns, defaultUnderlayVethName)
if err != nil {
return fmt.Errorf("failed to check interface: %v exist: %v", defaultUnderlayVethName, err)
}

if vethExist {
c.tuneMode = ModeUnderlay
} else {
c.tuneMode = ModeOverlay
// If spiderpool assigns only one NIC to the pod, this is the first invoke
if len(c.podNics) == 1 {
c.firstInvoke = true
}
}

return nil
}

// firstInvoke check if coordinator is first called and do some checks:
// underlay mode only works with underlay mode, which can't work with overlay
// mode, and which can't be called in first cni invoked by using multus's
// annotations: v1.multus-cni.io/default-network
func (c *coordinator) coordinatorFirstInvoke(podFirstInterface string) error {
func (c *coordinator) coordinatorModeAndFirstInvoke(logger *zap.Logger, podFirstInterface string) error {
var err error
switch c.tuneMode {
case ModeAuto:
if err = c.autoModeToSpecificMode(ModeAuto, podFirstInterface); err != nil {
return err
}
logger.Sugar().Infof("Successfully auto detect mode, change mode from auto to %v", c.tuneMode)
return nil
case ModeUnderlay:
c.firstInvoke = c.currentInterface == podFirstInterface
// underlay mode can't work with calico/cilium(overlay)
@@ -70,8 +106,8 @@ func (c *coordinator) coordinatorFirstInvoke(podFirstInterface string) error {
return fmt.Errorf("when creating interface %s in overlay mode, it detects that the auxiliary interface %s of underlay mode exists. It seems that the previous interface work in underlay mode. ", c.currentInterface, defaultUnderlayVethName)
}

c.firstInvoke, err = networking.IsFirstModeOverlayInvoke(c.netns, c.interfacePrefix)
return err
c.firstInvoke = len(c.podNics) == 1
return nil
case ModeDisable:
return nil
}
22 changes: 19 additions & 3 deletions cmd/spiderpool-agent/cmd/coordinator.go
@@ -13,6 +13,7 @@ import (
"github.com/spidernet-io/spiderpool/pkg/coordinatormanager"
spiderpoolv2beta1 "github.com/spidernet-io/spiderpool/pkg/k8s/apis/spiderpool.spidernet.io/v2beta1"
corev1 "k8s.io/api/core/v1"
apierrors "k8s.io/apimachinery/pkg/api/errors"
)

var unixGetCoordinatorConfig = &_unixGetCoordinatorConfig{}
@@ -24,6 +25,7 @@ func (g *_unixGetCoordinatorConfig) Handle(params daemonset.GetCoordinatorConfig
ctx := params.HTTPRequest.Context()
crdClient := agentContext.CRDManager.GetClient()
podClient := agentContext.PodManager
epClient := agentContext.EndpointManager

var coordList spiderpoolv2beta1.SpiderCoordinatorList
if err := crdClient.List(ctx, &coordList); err != nil {
@@ -39,11 +41,25 @@ func (g *_unixGetCoordinatorConfig) Handle(params daemonset.GetCoordinatorConfig
return daemonset.NewGetCoordinatorConfigFailure().WithPayload(models.Error(fmt.Sprintf("spidercoordinator: %s no ready", coord.Name)))
}

var pod *corev1.Pod
var err error
var spNics []string
var se *spiderpoolv2beta1.SpiderEndpoint
// get spiderendpoint
se, err = epClient.GetEndpointByName(ctx, params.GetCoordinatorConfig.PodNamespace, params.GetCoordinatorConfig.PodName, constant.UseCache)
if err != nil && !apierrors.IsNotFound(err) {
return daemonset.NewGetCoordinatorConfigFailure().WithPayload(models.Error(fmt.Sprintf("failed to get spiderendpoint %s/%s", params.GetCoordinatorConfig.PodNamespace, params.GetCoordinatorConfig.PodName)))
}

if se != nil {
for _, spip := range se.Status.Current.IPs {
spNics = append(spNics, spip.NIC)
}
}

var pod *corev1.Pod
pod, err = podClient.GetPodByName(ctx, params.GetCoordinatorConfig.PodNamespace, params.GetCoordinatorConfig.PodName, constant.UseCache)
if err != nil {
return daemonset.NewGetCoordinatorConfigFailure().WithPayload(models.Error(fmt.Sprintf("failed to get coordinator config: pod %s/%s not found", params.GetCoordinatorConfig.PodNamespace, params.GetCoordinatorConfig.PodName)))
return daemonset.NewGetCoordinatorConfigFailure().WithPayload(models.Error(fmt.Sprintf("failed to get pod %s/%s", params.GetCoordinatorConfig.PodNamespace, params.GetCoordinatorConfig.PodName)))
}

var prefix string
@@ -73,7 +89,7 @@ func (g *_unixGetCoordinatorConfig) Handle(params daemonset.GetCoordinatorConfig
HostRPFilter: int64(*coord.Spec.HostRPFilter),
DetectGateway: *coord.Spec.DetectGateway,
DetectIPConflict: *coord.Spec.DetectIPConflict,
PodNICs: spNics,
}

return daemonset.NewGetCoordinatorConfigOK().WithPayload(config)
}
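Reviewer note: the handler above fills `PodNICs` from the SpiderEndpoint's current IP allocations and tolerates a missing endpoint. A minimal sketch of that collection step follows, assuming simplified stand-in types. `ipAllocation` and `endpoint` are hypothetical and are not the real `spiderpoolv2beta1` structs.

```go
package main

import "fmt"

// ipAllocation is a hypothetical, trimmed-down stand-in for one entry of
// the endpoint's current IP allocations; only the NIC name matters here.
type ipAllocation struct {
	NIC string
	IP  string
}

// endpoint is a hypothetical stand-in for the SpiderEndpoint status.
type endpoint struct {
	IPs []ipAllocation
}

// podNICs returns the NIC name of every allocation, in order. A nil
// endpoint (for example, one that was not found) yields no NICs.
func podNICs(ep *endpoint) []string {
	if ep == nil {
		return nil
	}
	nics := make([]string, 0, len(ep.IPs))
	for _, alloc := range ep.IPs {
		nics = append(nics, alloc.NIC)
	}
	return nics
}

func main() {
	ep := &endpoint{IPs: []ipAllocation{
		{NIC: "eth0", IP: "10.6.0.10"},
		{NIC: "net1", IP: "172.16.0.10"},
	}}
	nics := podNICs(ep)
	// The coordinator treats exactly one NIC as "first invoke" in overlay mode.
	fmt.Println(nics, len(nics) == 1)
	fmt.Println(len(podNICs(nil)))
}
```

The length of this list is what the coordinator side compares against 1 when deciding `firstInvoke` in overlay mode.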
11 changes: 5 additions & 6 deletions cmd/spiderpool-init/cmd/config.go
@@ -12,6 +12,7 @@ import (
"strconv"
"strings"

coordinatorcmd "github.com/spidernet-io/spiderpool/cmd/coordinator/cmd"
"github.com/spidernet-io/spiderpool/pkg/constant"
spiderpoolip "github.com/spidernet-io/spiderpool/pkg/ip"
spiderpoolv2beta1 "github.com/spidernet-io/spiderpool/pkg/k8s/apis/spiderpool.spidernet.io/v2beta1"
@@ -108,6 +109,9 @@ func parseENVAsDefault() InitDefaultConfig {
config.CoordinatorName = strings.ReplaceAll(os.Getenv(ENVDefaultCoordinatorName), "\"", "")
if len(config.CoordinatorName) != 0 {
config.CoordinatorMode = strings.ReplaceAll(os.Getenv(ENVDefaultCoordinatorTuneMode), "\"", "")
if config.CoordinatorMode == "" {
config.CoordinatorMode = string(coordinatorcmd.ModeAuto)
}
config.CoordinatorPodCIDRType = strings.ReplaceAll(os.Getenv(ENVDefaultCoordinatorPodCIDRType), "\"", "")

edg := strings.ReplaceAll(os.Getenv(ENVDefaultCoordinatorDetectGateway), "\"", "")
@@ -130,12 +134,7 @@ func parseENVAsDefault() InitDefaultConfig {
logger.Sugar().Fatalf("ENV %s %s: %v", ENVDefaultCoordinatorTunePodRoutes, etpr, err)
}
config.CoordinatorTunePodRoutes = tpr
switch config.CoordinatorMode {
case "underlay":
config.CoordinatorPodDefaultRouteNic = "eth0"
case "overlay":
config.CoordinatorPodDefaultRouteNic = "net1"
}
config.CoordinatorPodDefaultRouteNic = ""
config.CoordinatorPodMACPrefix = ""
v := os.Getenv(ENVDefaultCoordiantorHijackCIDR)
if len(v) > 0 {
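Reviewer note: the init change above strips quotes from the configured mode and falls back to `auto` when the result is empty. The sketch below isolates that defaulting rule. `coordinatorMode` is a hypothetical helper, and the raw value is passed in as a parameter because the actual environment variable name behind `ENVDefaultCoordinatorTuneMode` is not shown in this diff.

```go
package main

import (
	"fmt"
	"strings"
)

// modeAuto mirrors the "auto" mode string introduced by this commit.
const modeAuto = "auto"

// coordinatorMode is a hypothetical helper: it strips double quotes from
// the raw configured value and defaults to auto when nothing remains.
func coordinatorMode(raw string) string {
	mode := strings.ReplaceAll(raw, "\"", "")
	if mode == "" {
		return modeAuto
	}
	return mode
}

func main() {
	fmt.Println(coordinatorMode(""))            // unset value falls back to auto
	fmt.Println(coordinatorMode("\"overlay\"")) // quoted value is unquoted
}
```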
2 changes: 1 addition & 1 deletion docs/reference/crd-spidercoordinator.md
@@ -45,7 +45,7 @@ This is the Spidercoordinators spec for users to configure.
| Field | Description | Schema | Validation | Values | Default |
|--------------------|--------------------------------------------------------------|----------------------|------------|------------------------------|------------------------------|
| mode | The mode in which the coordinator runs. underlay: coordinator creates veth devices to solve the problem that CNIs such as macvlan cannot communicate with ClusterIP. overlay: fixes the problem that CNIs such as macvlan cannot access ClusterIP through the Calico network card attached to the pod, and coordinates policy routes between interfaces to ensure a consistent data path for request and reply packets | string | require | underlay,overlay | underlay |
| mode | The mode in which the coordinator runs. auto: automatically determines whether the mode is overlay or underlay. underlay: coordinator creates veth devices to solve the problem that CNIs such as macvlan cannot communicate with ClusterIP. overlay: fixes the problem that CNIs such as macvlan cannot access ClusterIP through the Calico network card attached to the pod, and coordinates policy routes between interfaces to ensure a consistent data path for request and reply packets | string | require | auto,underlay,overlay | auto |
| podCIDRType | The ways to fetch the CIDR of the cluster | string | require | cluster,calico,cilium,none | cluster |
| tunePodRoutes | tune pod's route while the pod is attached to multiple NICs | bool | optional | true,false | true |
| podDefaultRouteNIC | The NIC where the pod's default route resides | string | optional | "",eth0,net1... | underlay: eth0,overlay: net1 |
9 changes: 7 additions & 2 deletions docs/usage/coordinator-zh_CN.md
@@ -23,6 +23,9 @@ the route to ClusterIP, which makes it unreachable.

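### Configure coordinator to run in underlay mode

> By default, mode is auto (spec.mode in the spidercoordinator CR is auto). The coordinator checks whether the NIC of the current CNI call is `eth0`; if it is, the coordinator selects underlay mode.
> If the current NIC is not `eth0`, the coordinator checks whether a `veth0` NIC exists in the Pod; if it does, the coordinator selects underlay mode.

When your workloads are deployed on a "traditional network" or in an IaaS environment, Pod IP addresses may be allocated directly from the host's IP subnet. Application Pods can then use their own IP addresses directly for east-west and north-south communication.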
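The auto-mode rules described in the note above, together with `autoModeToSpecificMode` in `cmd/coordinator/cmd/utils.go`, can be restated as a small pure function. This is only a sketch under stated assumptions: `resolveAutoMode` is a hypothetical name, and `vethExists` stands in for the real check of `veth0` inside the Pod's network namespace.

```go
package main

import "fmt"

// Mode mirrors the coordinator's mode strings from this commit.
type Mode string

const (
	ModeUnderlay Mode = "underlay"
	ModeOverlay  Mode = "overlay"
)

// resolveAutoMode is a hypothetical, side-effect-free restatement of the
// auto-mode rules. It returns the resolved mode and whether this call is
// treated as the first invoke.
//   - current NIC is the Pod's first interface: underlay, first invoke
//   - otherwise, veth0 already exists in the Pod: underlay
//   - otherwise: overlay, first invoke only if spiderpool assigned one NIC
func resolveAutoMode(currentIface, firstIface string, vethExists bool, podNICs []string) (Mode, bool) {
	if currentIface == firstIface {
		return ModeUnderlay, true
	}
	if vethExists {
		return ModeUnderlay, false
	}
	return ModeOverlay, len(podNICs) == 1
}

func main() {
	fmt.Println(resolveAutoMode("eth0", "eth0", false, []string{"eth0"}))
	fmt.Println(resolveAutoMode("net1", "eth0", true, []string{"eth0", "net1"}))
	fmt.Println(resolveAutoMode("net1", "eth0", false, []string{"net1"}))
}
```

Keeping the decision free of netlink calls makes each branch easy to test; the real code performs the `veth0` lookup and stores the results on the coordinator struct instead of returning them.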

The advantages of this mode are:
@@ -69,7 +72,7 @@ spec:
}
```
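- mode: specifies that coordinator runs in underlay mode
- mode: specifies that coordinator runs in underlay mode. Alternatively, with the default auto mode, you only need to add the annotation `v1.multus-cni.io/default-network: kube-system/macvlan-underlay` to the Pod, and coordinator will determine that the mode is underlay.

When a Pod is created with macvlan-underlay, we enter the Pod and look at its routes and related information: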

@@ -100,6 +103,8 @@ default via 10.6.0.1 dev eth0

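In contrast to underlay mode, sometimes we do not care what the underlying network of the cluster's deployment environment is, and we want the cluster to run on most underlying networks. CNIs such as [Calico](https://github.com/projectcalico/calico) and [Cilium](https://github.com/cilium/cilium) are commonly used. Most of these plugins use tunneling technologies such as vxlan to build an overlay network plane and rely on NAT for north-south communication.

> By default, mode is auto (spec.mode in the spidercoordinator CR is auto). The coordinator checks whether the NIC of the current CNI call is something other than `eth0`. If it is not `eth0` and no `veth0` NIC exists in the Pod, the coordinator selects overlay mode.

The advantages of this mode are:

- IP addresses are plentiful, so address shortages are almost never a problem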
@@ -146,7 +151,7 @@ spec:
}
```

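- mode: specifies that coordinator runs in overlay mode
- mode: specifies that coordinator runs in overlay mode. Alternatively, with the default auto mode, you only need to add the annotation `k8s.v1.cni.cncf.io/networks: kube-system/macvlan-overlay` to the Pod, and coordinator will determine that the mode is overlay.

When a Pod is created with macvlan-overlay, we enter the Pod and look at its routes and related information: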

@@ -9,7 +9,7 @@ import (

// CoordinationSpec defines the desired state of SpiderCoordinator.
type CoordinatorSpec struct {
// +kubebuilder:validation:Enum=underlay;overlay;disabled
// +kubebuilder:validation:Enum=auto;underlay;overlay;disabled
// +kubebuilder:validation:Optional
Mode *string `json:"mode,omitempty"`
