When creating a benchmark which is composed of multiple functions, one can choose to compose the functions synchronously (the caller waits for the callee) or asynchronously (no waiting). The Knative programming model can support both of these through its Serving and Eventing modules, and so if one has a workload which they would like to bring up to use with vHive or stock Knative they will need to choose between the two. Both approaches are described in this document along with details on any extra Kubernetes (k8s) or Knative services/manifests which are necessary for implementing such a benchmark.
Note that this is not a step-by-step guide, but rather a general rundown of the core steps that one would need to take in implementing their workload. This overview consists of general guidelines which will apply to all implementations as well as sections dedicated specifically to the serving and eventing approach.
Apart from using the Serving or Eventing Knative component, the process of composing functions will also need support for a remote procedure call system such as gRPC, and for Docker. This document includes references to examples of both serving and eventing that use gRPC and Docker to implement a composition of functions which runs with Knative on a k8s cluster; guidance is therefore given for these systems specifically, though similar considerations should apply to any alternatives.
Remote Procedure Call support allows functions to communicate over a network. gRPC with protobuf is used in the serving example. Refer to the gRPC tutorial for details on usage of these systems.
First one should define a proto file which describes the services used to communicate across functions. Use this to generate code with protoc, making sure that the appropriate plugins for the used language are installed (e.g. our example needs the Golang plugins as described here). An example proto file and the generated code can be found here.
Within each function one will need to support the appropriate proto service by implementing the generated interface (i.e. implement a server or client using the generated proto code). Keep in mind that some functions, such as the producer in the serving example, will need to be a server of one proto service and a client of another proto service simultaneously. Refer to the gRPC tutorial for extra detail.
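For illustration, below is a minimal sketch of a function acting as the server of a generated service. The pb import path, the Greeter service, and its SayHello method are hypothetical stand-ins for whatever protoc generates from one's own proto file.

```go
package main

import (
	"context"
	"log"
	"net"

	"google.golang.org/grpc"

	// Hypothetical package generated by protoc; the real import path
	// depends on one's module and the proto file's go_package option.
	pb "example.com/myworkload/proto"
)

// server implements the interface generated for the proto service,
// e.g. pb.GreeterServer with a single SayHello RPC (illustrative names).
type server struct {
	pb.UnimplementedGreeterServer
}

func (s *server) SayHello(ctx context.Context, req *pb.HelloRequest) (*pb.HelloReply, error) {
	return &pb.HelloReply{Message: "Hello, " + req.GetName()}, nil
}

func main() {
	// The listening port must match the containerPort in the Knative manifest.
	lis, err := net.Listen("tcp", ":80")
	if err != nil {
		log.Fatalf("failed to listen: %v", err)
	}
	grpcServer := grpc.NewServer()
	pb.RegisterGreeterServer(grpcServer, &server{})
	if err := grpcServer.Serve(lis); err != nil {
		log.Fatalf("failed to serve: %v", err)
	}
}
```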
To deploy functions on Knative one will need to package them as Docker containers. Each function which will be deployed on a cluster will need to be a separate image.
In the provided serving and eventing examples the Dockerfile uses target_arg arguments to work in tandem with the Makefile to reduce repetition, but it is also fine to write simple separate Dockerfiles for each function. One should push the images of their functions to Docker Hub to make them accessible on their cluster.
Manifests are used to define the deployed services. The specifics of what goes into a manifest and how many manifests are needed depends on whether serving or eventing is used, and the details on both are given in their appropriate sections.
Since vHive functions use gRPC, for example in both the serving and eventing examples, one will need to include the h2c port translation in each relevant manifest.
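For reference, the h2c port translation is just a named port on the function's container in the Knative Service manifest, along these lines (the port number is illustrative and must match what the function listens on):

```yaml
ports:
  - name: h2c          # tells Knative to speak HTTP/2 cleartext, which gRPC requires
    containerPort: 80  # illustrative; must match the function's listening port
```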
vHive manifests must follow a specific structure, and they rely on hosting a guest image on a stub image in order to work. See this hello-world example for a typical vHive manifest. h2c port translation must be used, and one's function must be specified as a guest image environment variable. The guest port variable must match the containerPort associated with the h2c port, and the core container image must be crccheck/hello-world:latest. Because of these restrictions, environment variables cannot be used to interact with one's functions when using a vHive manifest, which is also why it is advised to avoid relying on environment variables in general throughout the process of bringing up a workload.
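Putting these restrictions together, a typical vHive manifest looks roughly like the sketch below. The names are illustrative, and the guest image/guest port variable names and exact structure should be taken from the linked hello-world example:

```yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: my-function
  namespace: default
spec:
  template:
    spec:
      containers:
        # The core container image must be the stub image.
        - image: crccheck/hello-world:latest
          ports:
            # h2c port translation; containerPort must match the guest port.
            - name: h2c
              containerPort: 50051
          env:
            # One's own function image and the port it listens on (illustrative values).
            - name: GUEST_IMAGE
              value: docker.io/myrepo/my-function:latest
            - name: GUEST_PORT
              value: "50051"
```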
Tracing allows gathering timing data in a cluster. In vHive, we export tracing data to Zipkin for visualization. To support tracing throughout functions, one should instrument their functions with calls to vHive's tracing module. One can see such instrumentation in the Serving and Eventing examples.
vHive provides a tracing module that can be used to instrument various functions. The module relies on OpenTelemetry, a standard backed by the CNCF community. In the future, vHive is going to provide tracing modules in a wide range of languages (external contributions are welcome).
One can enable tracing by following these steps (example is for Golang):
- Initialise the tracer:
  ```go
  shutdown, err := tracing.InitBasicTracer(*url, "my function")
  if err != nil {
      log.Warn(err)
  }
  defer shutdown()
  ```
  The url provided should point to the zipkin span collector service, e.g. http://localhost:9411/api/v2/spans if one hosts zipkin locally (e.g. as a Docker container). See the zipkin quickstart. The basic tracer can be used for most applications; in cases where one wants to provide additional attributes or wishes to specify a different sampling rate, they can use InitCustomTracer.
- If the function is a server, make an instrumented gRPC server. Example:
  ```go
  grpcServer := tracing.GetGRPCServerWithUnaryInterceptor()
  ```
- If the function is a client, use the instrumented gRPC dial method to connect to the server:
  ```go
  conn, err := tracing.DialGRPCWithUnaryInterceptor(addr, grpc.WithBlock(), grpc.WithInsecure())
  ```
- To enable tracing instrumentation, set the ENABLE_TRACING environment variable to true (missing values default to false) during deployment:
  ```yaml
  apiVersion: serving.knative.dev/v1
  kind: Service
  metadata:
    name: foo
    namespace: default
  spec:
    template:
      spec:
        containers:
          - image: docker.io/vhiveease/FOO:latest
            imagePullPolicy: Always
            env:
              - name: ENABLE_TRACING
                value: "true"
            ports:
              - name: h2c
                containerPort: 80
  ```
- Beware that tracing comes with a non-negligible overhead, so end-to-end measurements should be performed with tracing off.
The producer in the vHive serving example gives an example usage of this utility for both server and client behaviour.
Below one can see screenshots from a producer-consumer trace visualized with Zipkin.
New workloads should be included in the automatic CI for regular testing, as this is helpful both for code maintenance and in demonstrating how the workload should be deployed. Existing function end-to-end CI workflows can be referred to as an example in which the demo serving and eventing workloads are run both "locally" and on a Knative cluster.
When including logging within functions, please use logrus with the following format:

```go
import (
	ctrdlog "github.com/containerd/containerd/log"
	log "github.com/sirupsen/logrus"
)

// logrus text formatter with fixed-width, nanosecond-precision timestamps.
log.SetFormatter(&log.TextFormatter{
	TimestampFormat: ctrdlog.RFC3339NanoFixed,
	FullTimestamp:   true,
})
```
See this code snippet for an example.
To compose functions with serving we make use of the Knative Serving component. Each of the functions will effectively be a server for the functions that come before it, and a client of the functions that come after. For example, in a simple chain of functions A -> B -> C, B would be a client of function C and a server for function A.
The serving function composition example can be found here, and additional CI implementation which shows how this code is executed can be found here. This example implements a simple Client -> Producer -> Consumer function chain, whereby the client triggers the producer function to generate a random string, and the consumer consumes said string (by logging it to a file).
To deploy a workload with serving one will need to:
- Implement a remote procedure call system (e.g. gRPC)
- Dockerize the functions
- Write Knative manifests
As mentioned in the general guidelines, functions will communicate with RPC calls. In the serving example gRPC with protobuf is used to achieve this.
In serving, each "link" in a chain of functions needs to implement a protobuf service. For example, in a chain A -> B -> C the A-B link will be one service and the B-C link will be a second service. One should define a proto file for these services, and use it to generate code with protoc. An example can be seen here.
One should implement the appropriate server or client service from the generated proto code in their functions and remember that some functions such as the producer in the example will need to be both a server of one proto service and a client of another proto service simultaneously.
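As a sketch of the latter case, the fragment below shows a producer-style function that serves the A-B link while holding a client stub for the B-C link. The pb_ab and pb_bc packages, service names, and message fields are hypothetical and would come from one's own generated proto code.

```go
import (
	"context"

	"google.golang.org/grpc"

	// Hypothetical packages generated for the A-B and B-C proto services.
	pb_ab "example.com/myworkload/proto/ab"
	pb_bc "example.com/myworkload/proto/bc"
)

// producerServer serves the A-B link and is a client of the B-C link.
type producerServer struct {
	pb_ab.UnimplementedProducerServer
	consumer pb_bc.ConsumerClient
}

func newProducerServer(consumerAddr string) (*producerServer, error) {
	// Dial the next function in the chain (function C's service).
	conn, err := grpc.Dial(consumerAddr, grpc.WithInsecure(), grpc.WithBlock())
	if err != nil {
		return nil, err
	}
	return &producerServer{consumer: pb_bc.NewConsumerClient(conn)}, nil
}

func (s *producerServer) Produce(ctx context.Context, req *pb_ab.ProduceRequest) (*pb_ab.ProduceReply, error) {
	// Forward the payload down the chain before replying to the caller.
	if _, err := s.consumer.Consume(ctx, &pb_bc.ConsumeRequest{Payload: req.GetPayload()}); err != nil {
		return nil, err
	}
	return &pb_ab.ProduceReply{Ok: true}, nil
}
```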
One will need a Knative service definition for each of their functions. Refer to the Knative docs and see the example manifests for support.
We recommend following the vHive developers guide to set up a workload. When deploying functions from Knative manifests one can make sure that they are working with kn service list and kn service describe <name>.
If Knative is struggling to make revisions of pods (e.g. a service is labeled as unschedulable) then one might be using the wrong ports in their function. Double-check Knative manifests and function code. Port 80 should be used for serving by default, or the $PORT environment variable which will be set by Knative when deploying a function.
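As a rough sketch, the listening port can be resolved in Go along these lines, following the default-to-80 convention described above (function name is illustrative):

```go
import (
	"net"
	"os"
)

// listen binds to the port Knative expects: the PORT environment variable
// when it is set, otherwise port 80.
func listen() (net.Listener, error) {
	port := os.Getenv("PORT")
	if port == "" {
		port = "80"
	}
	return net.Listen("tcp", ":"+port)
}
```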
If some pods are stuck on pending (kubectl get pods -A) then one might have exhausted system resources. This can occur in situations where there are too many pods or containers running on the system (e.g. when working from within kind containers on cloudlab as recommended in the vHive developers guide), or when using default GitHub runners for automated workflows.
One can also make use of the Knative Eventing component to compose functions. There are two different eventing approaches in Knative, and we will use the Broker-Trigger model as advised---see Knative Eventing Primer for an introduction to Knative Eventing.
Whilst it is possible to expose the broker directly to the outside world, it makes more sense to have a service in front of it, for tasks like authentication, authorization, validation, and also to abstract the particular broker implementation away.
An example of function composition using eventing can be found here. This example implements a simple Client (grpcurl) -> Producer -> Consumer function chain, whereby the client triggers the producer function to generate an event, and the consumer consumes said event. The CI workflow for this example can be found here, showing how the example can be deployed.
In general, to deploy a workload with eventing one will need to:
- Implement a producer server that processes incoming requests and raises corresponding events
- Implement event consumers that handle the events that are of interest
- Dockerize the functions
- Write Knative manifests for one's services and supporting components (e.g. Triggers, SinkBindings, etc.)
Below we present Knative manifests of some components that we believe might be helpful to explain, namely:
- SinkBinding
- Trigger
Example:
```yaml
apiVersion: sources.knative.dev/v1
kind: SinkBinding
metadata:
  name: my-sinkbinding
  namespace: my-namespace-a
spec:
  subject:
    apiVersion: serving.knative.dev/v1
    kind: Service
    name: my-service
    namespace: my-namespace-b
  sink:
    ref:
      apiVersion: eventing.knative.dev/v1
      kind: Broker
      name: my-broker
      namespace: my-namespace-c
```
- A SinkBinding is a component that injects K_SINK and some other environment variables to configure services dynamically at runtime; such services can use the injected environment variables to address the Broker, the Channel, or even another Service (all called a "sink") to send CloudEvents (a sketch of such a sender is given after this list).
- A SinkBinding, the subject ("sender"), and the sink ("receiver") may exist in different namespaces.
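To illustrate how a service can use the injected variable, below is a minimal sketch of a sender written with the CloudEvents Go SDK. The event type, source, and payload are placeholders, and a real producer (such as the one in the eventing example) may structure this differently.

```go
package main

import (
	"context"
	"log"
	"os"

	cloudevents "github.com/cloudevents/sdk-go/v2"
)

func main() {
	// K_SINK is injected by the SinkBinding and points at the sink (e.g. the Broker).
	sink := os.Getenv("K_SINK")

	c, err := cloudevents.NewClientHTTP()
	if err != nil {
		log.Fatalf("failed to create CloudEvents client: %v", err)
	}

	event := cloudevents.NewEvent()
	event.SetID("example-id-1")             // illustrative values; a real producer
	event.SetType("my-cloudevent-type")     // would set these to match the
	event.SetSource("my-cloudevent-source") // attributes its Trigger filters on
	_ = event.SetData(cloudevents.ApplicationJSON, map[string]string{"msg": "hello"})

	ctx := cloudevents.ContextWithTarget(context.Background(), sink)
	if result := c.Send(ctx, event); cloudevents.IsUndelivered(result) {
		log.Fatalf("failed to send event: %v", result)
	}
}
```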
Example:
```yaml
apiVersion: eventing.knative.dev/v1
kind: Trigger
metadata:
  name: my-trigger
  namespace: my-namespace
spec:
  broker: my-broker
  filter:
    attributes:
      type: my-cloudevent-type
      source: my-cloudevent-source
      my-custom-extension: my-custom-value
  subscriber:
    ref:
      apiVersion: serving.knative.dev/v1
      kind: Service
      name: my-kservice
      namespace: my-namespace
```
- A Trigger is a link between a broker and a subscriber that relays the CloudEvents arriving at the broker to the subscriber after filtering them on certain attributes, commonly type and source but possibly also any other attribute extensions (a sketch of a subscriber follows this list).
- A Trigger must be in the same namespace as the broker it is attached to, but can relay CloudEvents to any addressable subscriber in any namespace.
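For completeness, below is a minimal sketch of a subscriber written with the CloudEvents Go SDK that such a Trigger could deliver events to; the handler and payload handling are illustrative.

```go
package main

import (
	"context"
	"log"

	cloudevents "github.com/cloudevents/sdk-go/v2"
)

// receive is called for every CloudEvent the Trigger delivers to this service.
func receive(ctx context.Context, event cloudevents.Event) {
	log.Printf("received event: type=%s source=%s data=%s",
		event.Type(), event.Source(), string(event.Data()))
}

func main() {
	c, err := cloudevents.NewClientHTTP()
	if err != nil {
		log.Fatalf("failed to create CloudEvents client: %v", err)
	}
	// By default the HTTP receiver listens on port 8080; this can be changed
	// with the SDK's HTTP options if the manifest expects a different port.
	if err := c.StartReceiver(context.Background(), receive); err != nil {
		log.Fatalf("receiver stopped: %v", err)
	}
}
```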
- While deploying a SinkBinding, it is best to wait until the sink ("receiver") is ready and thus has an address that the SinkBinding can use.
- Services that depend during initialization on the environment variables injected by the SinkBinding might fail (repeatedly) upon deployment until a relevant SinkBinding is applied; that is normal.
- While deploying a Trigger, it is best to wait until both the broker and the subscriber ("receiver") are ready.