- Introduction
- Architecture
- Mandatory features
- Optional capabilities
- Getting Data from CF
- Tests
- Common Gotchas
- Feedback
So you want to create your own CNI plugin with Cloud Foundry?
First, all CNI plugins are required to implement this set of features.
Cloud Foundry requires the networking stack to perform certain additional functions which are not currently standardized by CNI. These are spelled out in more detail later in this doc.
There are also associated tests to confirm the plugin implementation is correct.
If you want to integrate your own CNI plugin with Cloud Foundry, begin by reviewing the component diagrams on the architecture page. Note that your plugin would replace the components shown in red and take on their responsibilities.
In addition to the features listed in the CNI spec, the following features are required.
- NetOut
- NetIn
- Policy configuration
- MTU
- Your CNI plugin is a bosh release
Spec: Operators can configure ASGs at the CF or space level to allow traffic from apps and tasks to CIDR ranges.
Description: The networking layer provides IP addressing and connectivity for containers, and sets up firewall rules to allow egress traffic based on ASG configuration. For more information on ASGs, see these docs.
CF Information Needed: ASG information can be pulled from the config passed in from the garden external networker. See runtimeConfig.netOutRules under Getting Data From CF. The ASG information provided will only be for the ASGs that are currently applied to the app. If you want information about new ASGs that have been added through Cloud Controller, see Information from CAPI.
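For illustration, here is a minimal Go sketch (not the silk-release implementation) of how a plugin might translate netOutRules into iptables rule specs. The struct fields mirror the netOutRules JSON shown under Getting Data From CF; the protocol-number mapping and iptables flags are assumptions to verify against garden's NetOutRule definition and your own iptables layout:

// netout_sketch.go - an illustrative sketch, not the silk-release implementation.
// It translates runtimeConfig.netOutRules into iptables rule specs; a real plugin
// would append these to a per-container chain.
package main

import "fmt"

type IPRange struct {
	Start string `json:"start"`
	End   string `json:"end"`
}

type PortRange struct {
	Start int `json:"start"`
	End   int `json:"end"`
}

type NetOutRule struct {
	Protocol int         `json:"protocol"`
	Networks []IPRange   `json:"networks"`
	Ports    []PortRange `json:"ports"`
	Log      bool        `json:"log"`
}

// protocolName maps the numeric protocol encoding to an iptables protocol name.
// The mapping (1=tcp, 2=udp, 3=icmp) is an assumption to verify against garden's
// NetOutRule documentation.
func protocolName(p int) string {
	switch p {
	case 1:
		return "tcp"
	case 2:
		return "udp"
	case 3:
		return "icmp"
	default:
		return "all"
	}
}

func asgRules(containerIP string, rules []NetOutRule) [][]string {
	var specs [][]string
	for _, r := range rules {
		proto := protocolName(r.Protocol)
		for _, n := range r.Networks {
			for _, p := range r.Ports {
				specs = append(specs, []string{
					"-s", containerIP,
					"-p", proto,
					"-m", "iprange", "--dst-range", n.Start + "-" + n.End,
					"--dport", fmt.Sprintf("%d:%d", p.Start, p.End),
					"-j", "ACCEPT",
				})
			}
		}
		// ICMP rules, rules without port ranges, and per-rule logging are
		// omitted from this sketch.
	}
	return specs
}

func main() {
	rules := []NetOutRule{{
		Protocol: 1,
		Networks: []IPRange{{Start: "8.8.8.8", End: "9.9.9.9"}},
		Ports:    []PortRange{{Start: 53, End: 54}},
	}}
	for _, spec := range asgRules("10.255.1.2", rules) {
		fmt.Println(spec)
	}
}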
Spec: External entities can reach applications through the GoRouter.
Description: The networking layer sets up firewall rules to allow ingress traffic from the GoRouter, TCP router, and SSH proxy.
CF Information Needed: In order for the GoRouter, TCP router, and SSH proxy to be able to access your app, the ports listed in portMappings need to be exposed via DNAT. For example, the cni-wrapper-plugin in silk-release - see runtimeConfig.portMappings under Getting Data From CF - gets this data from the garden-cni job. These can also be retrieved from environment variables.
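As a rough illustration, the sketch below (again, not the actual cni-wrapper-plugin code) shows how a plugin might turn portMappings into DNAT rule specs. The externalIP and containerIP values, and the choice of plain PREROUTING-style DNAT rules, are assumptions for this example:

// netin_sketch.go - an illustrative sketch, not the actual cni-wrapper-plugin code.
// It turns runtimeConfig.portMappings into DNAT rule specs so that traffic sent to
// the cell's IP and host port is forwarded to the container port.
package main

import "fmt"

type PortMapping struct {
	HostPort      int `json:"host_port"`
	ContainerPort int `json:"container_port"`
}

// dnatRules renders one PREROUTING-style rule spec per mapping. externalIP is the
// cell address (compare instance_address in the example config under Getting Data
// From CF); containerIP is the address the plugin assigned to the container.
func dnatRules(externalIP, containerIP string, mappings []PortMapping) [][]string {
	var specs [][]string
	for _, m := range mappings {
		specs = append(specs, []string{
			"-d", externalIP,
			"-p", "tcp",
			"--dport", fmt.Sprintf("%d", m.HostPort),
			"-j", "DNAT",
			"--to-destination", fmt.Sprintf("%s:%d", containerIP, m.ContainerPort),
		})
	}
	return specs
}

func main() {
	mappings := []PortMapping{
		{HostPort: 60001, ContainerPort: 8080},
		{HostPort: 60002, ContainerPort: 2222},
	}
	for _, spec := range dnatRules("10.0.16.14", "10.255.1.2", mappings) {
		fmt.Println(spec)
	}
}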
Spec: App-to-app policies are enforced between app containers and the task containers for those apps.
Description: The networking layer sets up firewall rules to allow container-to-container traffic based on policy (v1 of policy API must be supported).
CF Information Needed: You need an agent running that polls the internal policy server; for example, the VXLAN Policy Agent in silk-release does this.
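The sketch below shows, in Go, the general shape of the policy data such an agent works with. The field names follow the internal policy API's JSON, but treat the exact names and the tag-based enforcement comment as assumptions to verify against the Policy Server internal API docs referenced later in this doc:

// policy_shape_sketch.go - a hedged sketch of the container-to-container policy
// data an enforcement agent works with; field names are assumptions based on the
// internal policy API's JSON and should be verified against its documentation.
package main

import (
	"encoding/json"
	"fmt"
)

type PolicyEnd struct {
	ID       string `json:"id"`  // app guid (compare metadata.policy_group_id)
	Tag      string `json:"tag"` // tag used to mark and match traffic
	Protocol string `json:"protocol,omitempty"`
	Ports    struct {
		Start int `json:"start"`
		End   int `json:"end"`
	} `json:"ports"`
}

type Policy struct {
	Source      PolicyEnd `json:"source"`
	Destination PolicyEnd `json:"destination"`
}

func main() {
	raw := `{"source":{"id":"app-a-guid","tag":"0001"},
	         "destination":{"id":"app-b-guid","tag":"0002","protocol":"tcp","ports":{"start":8080,"end":8080}}}`
	var p Policy
	if err := json.Unmarshal([]byte(raw), &p); err != nil {
		panic(err)
	}
	// An agent would typically mark egress traffic from the source app with its tag
	// and accept ingress traffic to the destination app when that mark is present.
	fmt.Printf("allow %s -> %s %s %d-%d\n",
		p.Source.ID, p.Destination.ID,
		p.Destination.Protocol, p.Destination.Ports.Start, p.Destination.Ports.End)
}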
Spec: Operators can override the MTU on the interface.
Description: CNI plugins should automatically detect the MTU settings on the host, and set the MTU on container network interfaces appropriately. For example, if the host MTU is 1500 bytes and the plugin encapsulates with 50 bytes of header, the plugin should ensure that the container MTU is no greater than 1450 bytes. This is to ensure there is no fragmentation. The built-in silk CNI plugin does this.
Operators may wish to override the MTU setting. It is recommended to expose the MTU as a Bosh property on your CNI job, as the cni job in silk-release does.
CF Information Needed: None.
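A minimal sketch of the MTU logic described above, assuming a host interface name and an approximate 50-byte encapsulation overhead (both placeholders):

// mtu_sketch.go - illustrative MTU selection: use an operator-provided override
// when set, otherwise detect the host interface MTU and subtract the
// encapsulation overhead so container packets are never fragmented.
package main

import (
	"fmt"
	"net"
)

// vxlanOverheadBytes is an assumed ~50-byte encapsulation header, matching the
// example in the text above; use your plugin's real overhead.
const vxlanOverheadBytes = 50

func containerMTU(hostInterface string, override int) (int, error) {
	if override > 0 {
		return override, nil // operator set the override property
	}
	iface, err := net.InterfaceByName(hostInterface)
	if err != nil {
		return 0, err
	}
	return iface.MTU - vxlanOverheadBytes, nil
}

func main() {
	// "eth0" and the zero override are placeholders; in this sketch 0 means
	// "auto-detect from the host".
	mtu, err := containerMTU("eth0", 0)
	if err != nil {
		fmt.Println("MTU detection failed:", err)
		return
	}
	fmt.Println("container MTU:", mtu)
}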
Your CNI plugin will need to be packaged as a BOSH release.
Add all packages and jobs required by your CNI plugin. At a minimum, you must provide a CNI binary program and a CNI config file. If your software requires a long-lived daemon to run on the Diego cell, we recommend you deploy a separate BOSH job for that.
Use the silk-release as inspiration.
Update the deployment manifest properties: the garden-cni job properties must be configured to point to your plugin's paths.
properties:
cf_networking:
cni_plugin_dir: /var/vcap/packages/YOUR_PACKAGE/bin # directory for CNI binaries
cni_config_dir: /var/vcap/jobs/YOUR_JOB/config/cni # directory for CNI config file(s)
The above properties are configured on the garden-cni job: cni_config_dir
and cni_plugin_dir
Your CNI plugin must at a minimum expose these properties:
The following features are optional for your CNI plugin:
- Per ASG logging
- Global ASG and container-to-container logging
- Bosh backup and restore (BBR)
Spec: Operators can configure "log": true in the ASG config, per ASG.
Description: The networking layer logs all accepted/denied packets for the ASG with "log": true
set.
CF Information Needed: ASG information can be pulled from the config passed in from the garden external networker. See runtimeConfig.netOutRules
under Getting Data From CF.
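As a hedged example, a plugin could honor the log flag by inserting an iptables LOG jump ahead of the matching ACCEPT rule; the prefix format below is hypothetical:

// asg_logging_sketch.go - illustrative only: when a netOut rule carries
// "log": true, a plugin could insert an iptables LOG jump ahead of the matching
// ACCEPT rule so that accepted packets for that ASG are logged.
package main

import "fmt"

func logRuleSpec(containerHandle string) []string {
	// The prefix format is hypothetical; note that the kernel truncates
	// --log-prefix to 29 characters, so long handles must be shortened.
	return []string{"-j", "LOG", "--log-prefix", "OK_" + containerHandle + " "}
}

func main() {
	fmt.Println(logRuleSpec("d5bbc5ed-886a-44e6"))
}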
Spec: Operators can enable global logging for ASGs or containers.
Description: The networking layer logs all accepted/denied ASG and container-to-container packets.
CF Information Needed: None. For example, the vxlan-policy-agent job directly exposes a Bosh property that operators can set to write C2C iptables logs, and the cni job exposes a Bosh property that operators can set to write ASG iptables logs.
Spec: Operators can back up and restore Bosh deployments.
Description: Add support for BBR if there is data that must be retained after a backup and restore operation.
CF Information Needed: None. For inspiration on conforming to BBR, see the bbr-cfnetworkingdb job and the backup script templates for the policy server job.
silk-release, on the other hand, is built to be resilient to data loss in the silk-controller.
Spec: Apps can connect to services using Bosh DNS.
Description: The networking layer allows containers to reach Bosh DNS on the cell at 169.254.0.2.
CF Information Needed: None. silk-release exposes this Bosh property, which an operator can set to 169.254.0.2 to allow containers to reach Bosh DNS on the cell. Setting this property causes the silk plugin to add DNS nameserver information in the response back to the garden external networker. This has the notable side effect of causing the container's /etc/resolv.conf to contain only the nameservers specified by this property, and nothing else. If this property is omitted, the /etc/resolv.conf in the container will closely resemble the host VM's /etc/resolv.conf.
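A hedged sketch of how a plugin might return nameserver information in its CNI result so that the garden external networker can populate the container's /etc/resolv.conf; the exact result fields your plugin must emit depend on the CNI version you declare:

// dns_result_sketch.go - a hedged sketch of returning nameserver information in
// the CNI result so that the garden external networker can write it into the
// container's /etc/resolv.conf. The nameserver mirrors the Bosh DNS address above;
// IPs, interfaces, and routes are omitted for brevity.
package main

import (
	"encoding/json"
	"fmt"
)

type cniDNS struct {
	Nameservers []string `json:"nameservers,omitempty"`
}

type cniResult struct {
	CNIVersion string `json:"cniVersion"`
	DNS        cniDNS `json:"dns"`
}

func main() {
	result := cniResult{
		CNIVersion: "0.3.1",
		DNS:        cniDNS{Nameservers: []string{"169.254.0.2"}},
	}
	out, _ := json.MarshalIndent(result, "", "  ")
	fmt.Println(string(out)) // a CNI plugin writes its result to stdout
}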
The config injection described below follows the CNI conventions document.
The garden-external-networker will invoke one or more CNI plugins, according to the CNI Spec. It will start with the CNI config files available in the cni_config_dir and also inject some dynamic information about the container. This is divided into two keys: the first, metadata, contains the Cloud Foundry app, space, and org that the container belongs to. The other key, runtimeConfig, holds information that CNI plugins may need in order to implement legacy networking features of Cloud Foundry. It is itself divided into two keys: portMappings, which can be translated into port forwarding rules to allow the GoRouter access to application containers, and netOutRules, which are egress whitelist rules used for implementing application security groups.
A reference implementation of these features can be seen in the cni-wrapper-plugin.
For example, at deploy time, Silk's CNI config is generated from this template, and is stored in a file on disk at /var/vcap/jobs/silk-cni/config/cni-wrapper-plugin.conf, which resembles:
{
"name": "cni-wrapper",
"type": "cni-wrapper-plugin",
"cniVersion": "0.3.1",
"datastore": "/var/vcap/data/container-metadata/store.json",
"iptables_lock_file": "/var/vcap/data/garden-cni/iptables.lock",
"overlay_network": "10.255.0.0/16",
"instance_address": "10.0.16.14",
"iptables_asg_logging": true,
"iptables_c2c_logging": true,
"ingress_tag": "ffff0000",
"dns_servers": [
],
"delegate": {
"cniVersion": "0.3.1",
"name": "silk",
"type": "silk-cni",
"daemonPort": 23954,
"dataDir": "/var/vcap/data/host-local",
"datastore": "/var/vcap/data/silk/store.json",
"mtu": 0
}
}
Then, when a container is created, the garden-external-networker
adds additional runtime-specific data, so that
the CNI plugin receives a final config object that resembles:
{
"name": "cni-wrapper",
"type": "cni-wrapper-plugin",
"cniVersion": "0.3.1",
"datastore": "/var/vcap/data/container-metadata/store.json",
"iptables_lock_file": "/var/vcap/data/garden-cni/iptables.lock",
"overlay_network": "10.255.0.0/16",
"instance_address": "10.0.16.14",
"iptables_asg_logging": true,
"iptables_c2c_logging": true,
"ingress_tag": "ffff0000",
"dns_servers": [
],
"delegate": {
"cniVersion": "0.3.1",
"name": "silk",
"type": "silk-cni",
"daemonPort": 23954,
"dataDir": "/var/vcap/data/host-local",
"datastore": "/var/vcap/data/silk/store.json",
"mtu": 0
},
"runtimeConfig": {
"portMappings": [{
"host_port": 60001,
"container_port": 8080
}, {
"host_port": 60002,
"container_port": 2222
}],
"netOutRules": [{
"protocol": 1,
"networks": [{
"start": "8.8.8.8",
"end": "9.9.9.9"
}],
"ports": [{
"start": 53,
"end": 54
}],
"log": true
}],
"metadata": {
"policy_group_id": "d5bbc5ed-886a-44e6-945d-67df1013fa16",
"app_id": "d5bbc5ed-886a-44e6-945d-67df1013fa16",
"space_id": "4246c57d-aefc-49cc-afe0-5f734e2656e8",
"org_id": "2ac41bbf-8eae-4f28-abab-51ca38dea3e4"
}
}
}
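For illustration, a plugin could decode this object from stdin with structs like the ones below. This is a sketch, not the cni-wrapper-plugin's actual types; the fields simply mirror the example above:

// stdin_config_sketch.go - a sketch, not the cni-wrapper-plugin's actual types,
// of decoding the final config object shown above. Per the CNI spec the config
// arrives on stdin.
package main

import (
	"encoding/json"
	"fmt"
	"os"
)

type WrapperConfig struct {
	Name          string `json:"name"`
	Type          string `json:"type"`
	CNIVersion    string `json:"cniVersion"`
	RuntimeConfig struct {
		PortMappings []struct {
			HostPort      int `json:"host_port"`
			ContainerPort int `json:"container_port"`
		} `json:"portMappings"`
		NetOutRules []json.RawMessage `json:"netOutRules"` // see the NetOut section
		Metadata    struct {
			PolicyGroupID string `json:"policy_group_id"`
			AppID         string `json:"app_id"`
			SpaceID       string `json:"space_id"`
			OrgID         string `json:"org_id"`
		} `json:"metadata"`
	} `json:"runtimeConfig"`
}

func main() {
	var conf WrapperConfig
	if err := json.NewDecoder(os.Stdin).Decode(&conf); err != nil {
		fmt.Fprintln(os.Stderr, "reading CNI config from stdin:", err)
		os.Exit(1)
	}
	// Diagnostics go to stderr because stdout is reserved for the CNI result.
	fmt.Fprintf(os.Stderr, "app %s in space %s: %d port mappings, %d netOut rules\n",
		conf.RuntimeConfig.Metadata.AppID, conf.RuntimeConfig.Metadata.SpaceID,
		len(conf.RuntimeConfig.PortMappings), len(conf.RuntimeConfig.NetOutRules))
}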
Furthermore, the CNI runtime data, provided as environment variables, sets the CNI ContainerID equal to the Garden container Handle. When Diego calls Garden, it sets that equal to the ActualLRP InstanceGuid. In this way, a 3rd-party system can relate data from CNI with data in the Diego BBS.
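A minimal illustration of reading that runtime data inside a plugin:

// cni_env_sketch.go - minimal illustration: CNI runtime data arrives as
// environment variables, and CNI_CONTAINERID carries the Garden container handle
// (which Diego sets to the ActualLRP InstanceGuid).
package main

import (
	"fmt"
	"os"
)

func main() {
	command := os.Getenv("CNI_COMMAND")         // ADD, DEL, ...
	containerID := os.Getenv("CNI_CONTAINERID") // Garden handle / ActualLRP InstanceGuid
	fmt.Fprintf(os.Stderr, "%s for container %s\n", command, containerID)
}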
If you are replacing the built-in "VXLAN Policy Agent" with your own policy enforcement implementation, you can use the Policy Server's internal API to retrieve policy information.
3rd party components replacing the VXLAN policy agent should implement the features necessary to allow application containers to access one another. The policies created by CF users are retrieved from the Internal Policy Server. Clients of this API will need to poll it to ensure that changes to the policies are honored.
3rd party integrators should expect the internal policy server component to be present in a standard CF deployment.
For how to use the Internal Policy Server API, read here.
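Below is a hedged sketch of such a poll loop in Go. The URL, port, certificate paths, and poll interval are placeholders for illustration only; take the real values from the Internal Policy Server API documentation and your deployment's BOSH properties:

// policy_poll_sketch.go - a hedged sketch of a replacement policy agent's poll
// loop. The internal policy server requires mutual TLS; the URL, port, cert
// paths, and poll interval below are placeholders, not the documented values.
package main

import (
	"crypto/tls"
	"fmt"
	"io"
	"net/http"
	"time"
)

func pollOnce(client *http.Client, url string) ([]byte, error) {
	resp, err := client.Get(url)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("unexpected status %d", resp.StatusCode)
	}
	return io.ReadAll(resp.Body)
}

func main() {
	// A real agent would also pin the policy server CA via tls.Config.RootCAs.
	cert, err := tls.LoadX509KeyPair("client.crt", "client.key")
	if err != nil {
		panic(err)
	}
	client := &http.Client{
		Timeout: 10 * time.Second,
		Transport: &http.Transport{
			TLSClientConfig: &tls.Config{Certificates: []tls.Certificate{cert}},
		},
	}

	url := "https://policy-server.example.internal:4003/networking/v1/internal/policies" // placeholder
	for {
		body, err := pollOnce(client, url)
		if err != nil {
			fmt.Println("poll failed:", err)
		} else {
			// Diff the fetched policies against the currently enforced rule set
			// and converge iptables (or your datapath) here.
			fmt.Printf("fetched %d bytes of policy JSON\n", len(body))
		}
		time.Sleep(30 * time.Second) // poll interval is a deployment choice
	}
}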
None of silk-release communicates directly with CAPI. Information about ASGs is passed in on container creation. If you want information about new ASGs that have been added through Cloud Controller, but that haven't been passed through in the config because the app has not been restarted, you can poll CAPI.
If you want information on org, space, app events for use by your CNI plugin, see the CF API docs.
None of silk-release communicates directly with the BBS. For inspiration on how to subscribe to the BBS event stream, see the route-emitter job and code.
Set the register_direct_instance_routes property on the route_emitter to true in order to register container IPs instead of port mappings.
A Cloud Foundry system that integrates a 3rd party networking component should be able to pass the following test suites:
- CF Networking Smoke Tests
- CF Networking Acceptance Tests
- CF Acceptance Tests (CATs)
- CF Routing Acceptance Tests (RATS)
- Optional - CF Disaster Recovery Acceptance Tests (DRATS)
Only the CF Networking Smoke Tests
are non-disruptive and may be run against a live, production environment. The other tests make potentially disruptive changes and should only be run against a non-production environment.
For local development, we recommend using cf-deployment
on BOSH-lite.
For guidance on these test suites, please reach out to our team in Slack (bottom of this page).
If you want to integrate using the default values for the cni_config_dir
and cni_plugin_dir
, your BOSH package for the CNI plugin must be named cni
and the BOSH job for the CNI plugin must be named cni
.
If you have any questions or feedback, please visit the #container-networking
channel on Cloud Foundry Slack.