[Feature] Add end to end tests to apiserver
Fixes #1388
z103cb committed Oct 5, 2023
1 parent 38e3527 commit 1de910a
Showing 19 changed files with 2,282 additions and 163 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/test-job.yaml
@@ -144,7 +144,7 @@ jobs:
working-directory: ${{env.working-directory}}

- name: Test
run: go test ./...
run: go test ./pkg/... ./cmd/... -race -parallel 4
working-directory: ${{env.working-directory}}

- name: Set up Docker
44 changes: 38 additions & 6 deletions apiserver/DEVELOPMENT.md
@@ -61,6 +61,35 @@ make build
make test
```

#### End to End Testing

Two `make` targets are provided to execute the end to end tests (integration between the Kuberay API server and the Kuberay operator):

* `make e2e-test` executes all the tests defined in the [test/e2e package](./test/e2e/). It uses the cluster defined in `~/.kube/config` to submit the workloads.
* `make local-e2e-test` creates a local kind cluster, deploys the nightly operator image and a freshly built Kuberay API server into the kind cluster, and shuts down the kind cluster upon successful execution of the end to end tests.

The `e2e` test targets use two variables to control which Ray image and which API server endpoint are used in the end to end tests:

* `E2E_API_SERVER_RAY_IMAGE` -- for the Ray docker image. Currently set to `rayproject/ray:2.7.0-py310`. On Apple silicon or arm64 development machines the `-aarch64` suffix is added.
* `E2E_API_SERVER_URL` -- for the base URL of the deployed Kuberay API server. The default value is: `http://localhost:31888`

The unit test and end to end test targets share the `GO_TEST_FLAGS` variable. Setting it to `-v` makes both unit and end to end tests print all output and debug messages. By default, those messages are shown only when a test fails.

The default values of the variables can be overridden using the `-e` make command line arguments.

Examples:

```bash
# To run end to end test using default cluster
make e2e-test

# To run end to end test in fresh cluster.
# Please note that:
# * the cluster created for this test is the same as the cluster created by make cluster.
# * if the end to end tests fail the cluster will still be up and will have to be explicitly shutdown by executing make clean-cluster
make local-e2e-test
```

#### Swagger UI updates

To update the swagger ui files deployed with the Kuberay API server, you'll need to:
@@ -117,7 +146,7 @@ make run

#### Access

Access the service at `localhost:8888` for http, and `locahost:8887` for the RPC port.
Access the service at `localhost:8888` for http, and `localhost:8887` for the RPC port.

### Kubernetes Deployment

@@ -160,9 +189,9 @@ As a convenience for local development the following `make` targets are provided
* `make cluster` -- creates a local kind cluster, using the configuration from `hack/kind-cluster-config.yaml`. It creates a port mapping allowing for the service running in the kind cluster to be accessed on `localhost:31888` for HTTP and `localhost:31887` for RPC.
* `make clean-cluster` -- deletes the local kind cluster created with `make cluster`
* `load-image` -- loads the docker image defined by the `IMG` make variable into the kind cluster. The default value for variable is: `kuberay/apiserver:latest`. The name of the image can be changed by using `make load-image -e IMG=<your image name and tag>`
* `operator-image` -- Build the operator image to be loaded in your kind cluster. The tag for the operator image is `kuberay/operator:latest`. This step is optional.
* `load-operator-image` -- Load the operator image to the kind cluster created with `create-kind-cluster`. The tag for the operator image is `kuberay/operator:latest`, and the tag can be overridden using `make load-operator-image -E OPERATOR_IMAGE_TAG=<operator tag>`. To use the nightly operator tag, set `OPERATOR_IMAGE_TAG` to `nightly`.
* `deploy-operator` -- Deploy operator into your cluster. The tag for the operator image is `kuberay/operator:latest`.
* `operator-image` -- Build the operator image to be loaded in your kind cluster. You must specify a value for the operator image tag: the default value is `nightly`, and a local image with that tag will be overridden if `make deploy-operator` is used later. This step is optional. Example: `make operator-image -e OPERATOR_IMAGE_TAG=latest`
* `load-operator-image` -- Load the operator image to the kind cluster created with `create-kind-cluster`. The tag for the operator image is `kuberay/operator:nightly`, and the tag can be overridden using `make load-operator-image -E OPERATOR_IMAGE_TAG=<operator tag>`.
* `deploy-operator` -- Deploy operator into your cluster. The tag for the operator image is `kuberay/operator:nightly`.
* `undeploy-operator` -- Undeploy operator from your cluster

When developing and testing with kind you might want to execute these targets together:
@@ -172,9 +201,12 @@ When developing and testing with kind you might want to execute these targets to
make docker-image cluster load-image deploy
#To create a new API server image, operator image and deploy them on a new cluster
make docker-image operator-image cluster load-image load-operator-image deploy deploy-operator
make docker-image operator-image cluster load-image load-operator-image deploy deploy-operator -e OPERATOR_IMAGE_TAG=latest
#To execute end to end tests with a locally built operator image and verbose output
make operator-image local-e2e-test -e OPERATOR_IMAGE_TAG=latest -e GO_TEST_FLAGS="-v"
```

#### Access API Server in the Cluster

Access the service at `localhost:31888` for http and `locahost:31887` for the RPC port.
Access the service at `localhost:31888` for http and `localhost:31887` for the RPC port.
25 changes: 21 additions & 4 deletions apiserver/Makefile
@@ -7,6 +7,13 @@ REPO_ROOT_BIN := $(REPO_ROOT)/bin
IMG_TAG ?=latest
IMG ?= kuberay/apiserver:$(IMG_TAG)

# Allow for additional test flags (-v, etc)
GO_TEST_FLAGS ?=
# Ray docker images to use for end to end tests
E2E_API_SERVER_RAY_IMAGE ?=rayproject/ray:2.7.0-py310
# Kuberay API Server base URL to use in end to end tests
E2E_API_SERVER_URL ?=http://localhost:31888

# Get the currently used golang install path (in GOPATH/bin, unless GOBIN is set)
ifeq (,$(shell go env GOBIN))
GOBIN=$(shell go env GOPATH)/bin
@@ -56,11 +63,18 @@ imports: goimports ## Run goimports against code.
$(GOIMPORTS) -l -w .

test: fmt vet fumpt imports lint ## Run unit tests.
go test ./... -race -coverprofile ray-kube-api-server-coverage.out
go test ./pkg/... ./cmd/... $(GO_TEST_FLAGS) -race -coverprofile ray-kube-api-server-coverage.out -parallel 4

lint: golangci-lint fmt vet fumpt imports ## Run the linter.
$(GOLANGCI_LINT) run --timeout=3m

.PHONY: e2e-test
e2e-test: ## Run end to end tests using a pre-existing cluster.
go test ./test/e2e/... $(GO_TEST_FLAGS) -timeout 30m -race -parallel 4 -count=1

.PHONY: local-e2e-test
local-e2e-test: docker-image cluster load-image load-operator-image deploy-operator deploy e2e-test clean-cluster ## Run end to end tests in a fresh kind cluster with all components deployed.

##@ Build

build: fmt vet fumpt imports lint ## Build api server binary.
@@ -70,10 +84,10 @@ run: fmt vet fumpt imports lint ## Run the api server from your host.
go run -race cmd/main.go -localSwaggerPath ${REPO_ROOT}/proto/swagger

docker-image: test ## Build image with the api server.
${ENGINE} build -t ${IMG} -f Dockerfile ..
$(ENGINE) build -t ${IMG} -f Dockerfile ..

docker-push: ## Push image with the api server.
${ENGINE} push ${IMG}
$(ENGINE) push ${IMG}

.PHONY: build-swagger
build-swagger: go-bindata
@@ -170,7 +184,7 @@ clean-dev-tools: ## Remove all development tools
##@ Testing Setup and Tools
KIND_CONFIG ?= hack/kind-cluster-config.yaml
KIND_CLUSTER_NAME ?= ray-api-server-cluster
OPERATOR_IMAGE_TAG ?= latest
OPERATOR_IMAGE_TAG ?= nightly
.PHONY: cluster
cluster: kind ## Start kind development cluster.
$(KIND) create cluster -n $(KIND_CLUSTER_NAME) --config $(KIND_CONFIG)
@@ -200,4 +214,7 @@ undeploy-operator: ## Undeploy operator via helm from the K8s cluster specified

.PHONY: load-operator-image
load-operator-image: ## Load the operator image to the kind cluster created with create-kind-cluster.
ifeq (${OPERATOR_IMAGE_TAG}, nightly)
$(ENGINE) pull kuberay/operator:$(OPERATOR_IMAGE_TAG)
endif
$(KIND) load docker-image kuberay/operator:$(OPERATOR_IMAGE_TAG) -n $(KIND_CLUSTER_NAME)
112 changes: 46 additions & 66 deletions apiserver/Volumes.md
@@ -7,96 +7,81 @@ API server allows to specify multiple types of volumes mounted to the Ray pods (
[config maps](https://kubernetes.io/docs/concepts/storage/volumes/#configmap),
[secrets](https://kubernetes.io/docs/concepts/storage/volumes/#secret),
and [empty dir](https://kubernetes.io/docs/concepts/storage/volumes/#emptydir).
Multiple volumes of different type can be mounted to both head and worker nodes, by defining a volume array for them

Multiple volumes of different type can be mounted to both head and worker nodes, by defining a volume array for them

## HostPath volumes

A hostPath volume mounts a file or directory from the host node's filesystem into your Pod. This is not something that
most Pods will need, but it offers a powerful escape hatch for some applications.
A hostPath volume mounts a file or directory from the host node's filesystem into your Pod. This is not something that most Pods will need, but it offers a powerful escape hatch for some applications.

For example, some uses for a hostPath are:

* running a container that needs access to Docker internals; use a hostPath of /var/lib/docker
* running cAdvisor in a container; use a hostPath of /sys
* allowing a Pod to specify whether a given hostPath should exist prior to the Pod running, whether it should be
created, and what it should exist as
* allowing a Pod to specify whether a given hostPath should exist prior to the Pod running, whether it should be created, and what it should exist as

The code below gives an example of hostPath volume definition:

````
```json
{
"name": "hostPath", # unique name
"source": "/tmp", # data location on host
"mountPath": "/tmp/hostPath", # mounting path
"volumeType": 1, # volume type - host path
"hostPathType": 0, # host path type - directory
"mountPropagationMode": 1 # mount propagation - host to container
"name": "hostPath", # unique name
"source": "/tmp", # data location on host
"mountPath": "/tmp/hostPath", # mounting path
"volumeType": 1, # volume type - host path
"hostPathType": 0, # host path type - directory
"mountPropagationMode": 1 # mount propagation - host to container
}
````
```

## PVC volumes

A Persistent Volume Claim (PVC) is a request for storage by a user. It is similar to a Pod. Pods consume node resources
and PVCs consume PV resources. Pods can request specific levels of resources (CPU and Memory). Claims can request
specific size and access modes (e.g., they can be mounted `ReadWriteOnce`, `ReadOnlyMany` or `ReadWriteMany`).
A Persistent Volume Claim (PVC) is a request for storage by a user. It is similar to a Pod. Pods consume node resources and PVCs consume PV resources. Pods can request specific levels of resources (CPU and Memory). Claims can request specific size and access modes (e.g., they can be mounted `ReadWriteOnce`, `ReadOnlyMany` or `ReadWriteMany`).

The caveat of using PVC volumes is that the same PVC is mounted to all nodes. As a result only PVCs with access
mode `ReadOnlyMany` can be used in this case.
The caveat of using PVC volumes is that the same PVC is mounted to all nodes. As a result only PVCs with access mode `ReadOnlyMany` can be used in this case.

The code below gives an example of PVC volume definition:

````
```json
{
"name": "pvc", # unique name
"mountPath": "/tmp/pvc", # mounting path
"volumeType": 0, # volume type - PVC
"mountPropagationMode": 2, # mount propagation mode - bidirectional
"readOnly": false # read only
"name": "pvc", # unique name
"mountPath": "/tmp/pvc", # mounting path
"volumeType": 0, # volume type - PVC
"mountPropagationMode": 2, # mount propagation mode - bidirectional
"readOnly": false # read only
}
````
```

## Ephemeral volumes

Some application need additional storage but don't care whether that data is stored persistently across restarts. For
example, caching services are often limited by memory size and can move infrequently used data into storage that is
slower than memory with little impact on overall performance. Ephemeral volumes are designed for these use cases.
Because volumes follow the Pod's lifetime and get created and deleted along with the Pod, Pods can be stopped and
restarted without being limited to where some persistent volume is available.
Some applications need additional storage but don't care whether that data is stored persistently across restarts. For example, caching services are often limited by memory size and can move infrequently used data into storage that is slower than memory with little impact on overall performance. Ephemeral volumes are designed for these use cases. Because volumes follow the Pod's lifetime and get created and deleted along with the Pod, Pods can be stopped and restarted without being limited to where some persistent volume is available.

Although there are several option of ephemeral volumes, here we are using generic ephemeral volumes, which can be
provided by all storage drivers that also support persistent volumes. Generic ephemeral volumes are similar to emptyDir
volumes in the sense that they provide a per-pod directory for scratch data that is usually empty after provisioning.
But they may also have additional features:
Although there are several options for ephemeral volumes, here we are using generic ephemeral volumes, which can be provided by all storage drivers that also support persistent volumes. Generic ephemeral volumes are similar to emptyDir volumes in the sense that they provide a per-pod directory for scratch data that is usually empty after provisioning. But they may also have additional features:

* Storage can be local or network-attached.
* Volumes can have a fixed size that Pods are not able to exceed.

The code below gives an example of ephemeral volume definition:

````
```json
{
"name": "ephemeral", # unique name
"mountPath": "/tmp/ephemeral" # mounting path,
"mountPropagationMode": 0, # mount propagation mode - None
"volumeType": 2, # volume type - ephemeral
"storage": "5Gi", # disk size
"storageClass": "default" # storage class - optional
"accessMode": 0 # access mode RWO - optional
"name": "ephemeral", # unique name
"mountPath": "/tmp/ephemeral", # mounting path
"mountPropagationMode": 0, # mount propagation mode - None
"volumeType": 2, # volume type - ephemeral
"storage": "5Gi", # disk size
"storageClass": "default", # storage class - optional
"accessMode": 0 # access mode RWO - optional
}
````
```

## Config map volumes

A ConfigMap provides a way to inject configuration data into pods. The data stored in a ConfigMap can be referenced in
a volume of type configMap and then consumed by containerized applications running in a pod.
A ConfigMap provides a way to inject configuration data into pods. The data stored in a ConfigMap can be referenced in a volume of type configMap and then consumed by containerized applications running in a pod.

When referencing a ConfigMap, you provide the name of the ConfigMap in the volume. You can customize the path to use
for a specific entry in the ConfigMap.
When referencing a ConfigMap, you provide the name of the ConfigMap in the volume. You can customize the path to use for a specific entry in the ConfigMap.

The code below gives an example of config map volume definition:

````
```json
{
"name":"code-sample", # Unique name
"mountPath":"/home/ray/samples", # mounting path
@@ -106,17 +91,15 @@ The code below gives an example of config map volume definition:
"sample_code.py":"sample_code.py"
}
}
````
```

## Secret volumes

A secret volume is used to pass sensitive information, such as passwords, to Pods. You can store secrets in the
Kubernetes API and mount them as files for use by pods without coupling to Kubernetes directly. Secret volumes are
backed by tmpfs (a RAM-backed filesystem) so they are never written to non-volatile storage.
A secret volume is used to pass sensitive information, such as passwords, to Pods. You can store secrets in the Kubernetes API and mount them as files for use by pods without coupling to Kubernetes directly. Secret volumes are backed by tmpfs (a RAM-backed filesystem) so they are never written to non-volatile storage.

The code below gives an example of secret volume definition:

````
```json
{
"name":"important-secret", # Unique name
"mountPath":"/home/ray/sensitive", # mounting path
@@ -126,22 +109,19 @@ The code below gives an example of secret volume definition:
"subPath": "password"
}
}
````
```

## Emptydir volumes

An emptyDir volume is first created when a Pod is assigned to a node, and exists as long as that Pod is running on
that node. As the name says, the emptyDir volume is initially empty. All containers in the Pod can read and write the
same files in the emptyDir volume, though that volume can be mounted at the same or different paths in each container.
When a Pod is removed from a node for any reason, the data in the emptyDir is deleted permanently.
An emptyDir volume is first created when a Pod is assigned to a node, and exists as long as that Pod is running on that node. As the name says, the emptyDir volume is initially empty. All containers in the Pod can read and write the same files in the emptyDir volume, though that volume can be mounted at the same or different paths in each container. When a Pod is removed from a node for any reason, the data in the emptyDir is deleted permanently.

The code below gives an example of empydir volume definition:
The code below gives an example of empty directory volume definition:

````
```json
{
"name": "emptyDir", # unique name
"mountPath": "/tmp/emptydir" # mounting path,
"volumeType": 5, # vlume type - ephemeral
"storage": "5Gi", # max storage size - optional
"name": "emptyDir", # unique name
"mountPath": "/tmp/emptydir", # mounting path
"volumeType": 5, # volume type - empty dir
"storage": "5Gi" # max storage size - optional
}
````
```
4 changes: 3 additions & 1 deletion apiserver/go.mod
@@ -20,12 +20,15 @@ require (
)

require (
github.com/dustinkirkland/golang-petname v0.0.0-20230626224747-e794b9370d49
github.com/elazarl/go-bindata-assetfs v1.0.1
github.com/grpc-ecosystem/go-grpc-middleware v1.3.0
github.com/grpc-ecosystem/go-grpc-prometheus v1.2.0
github.com/grpc-ecosystem/grpc-gateway/v2 v2.6.0
)

require github.com/pmezard/go-difflib v1.0.0 // indirect

require (
github.com/asaskevich/govalidator v0.0.0-20200428143746-21a406dcc535 // indirect
github.com/beorn7/perks v1.0.1 // indirect
@@ -48,7 +51,6 @@ require (
github.com/mitchellh/mapstructure v1.4.1 // indirect
github.com/modern-go/concurrent v0.0.0-20180306012644-bacd9c7ef1dd // indirect
github.com/modern-go/reflect2 v1.0.2 // indirect
github.com/pmezard/go-difflib v1.0.0 // indirect
github.com/prometheus/client_model v0.2.0 // indirect
github.com/prometheus/common v0.28.0 // indirect
github.com/prometheus/procfs v0.6.0 // indirect