Update the three_data_hall docs for the three data hall with unified image (#2188)

* Update the three_data_hall docs for the three data hall with unified image

* Ensure that the unified image will always be used for 3dh testing
johscheuer authored Jan 9, 2025
1 parent f68e301 commit ff924de
Showing 14 changed files with 476 additions and 195 deletions.
19 changes: 14 additions & 5 deletions config/tests/three_data_hall/Readme.md
@@ -2,21 +2,30 @@

This example requires that your Kubernetes cluster has nodes which are labeled with `topology.kubernetes.io/zone`.
The example requires at least 3 unique zones; for testing, these can be faked by adding the label to nodes manually.
If you want to use cloud-provider-specific zone label values, you can set the `AZ1`, `AZ2` and `AZ3` environment variables.
You can also set the `NAMESPACE` environment variable to deploy this setup in a specific namespace.
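For example, fake zone labels could be generated like this. The node names `node-1`..`node-3` are hypothetical placeholders (replace them with the output of `kubectl get nodes`), and the `echo` only prints each `kubectl label` command; drop it to actually apply the labels:

```bash
# Assign one fake availability zone per node. Node names are placeholders.
AZ1="${AZ1:-az1}"
AZ2="${AZ2:-az2}"
AZ3="${AZ3:-az3}"
i=1
for az in "$AZ1" "$AZ2" "$AZ3"; do
  # echo only prints the command; remove it to apply the label for real.
  echo kubectl label node "node-$i" "topology.kubernetes.io/zone=$az"
  i=$((i + 1))
done
```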

## Create the RBAC settings

This example uses the unified image, which reads data from the Kubernetes API, so the corresponding
RBAC setup has to be created first. If you use a namespace other than the default namespace, you have
to adjust the `fdb-kubernetes` `ClusterRoleBinding` to point to the right `ServiceAccount`.

```bash
kubectl apply -f ./config/tests/three_data_hall/unified_image_role.yaml
```
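For instance, when deploying into a namespace called `fdb-testing` (the name is an assumption for illustration), the `subjects` entry of the `ClusterRoleBinding` would be adjusted like this:

```yaml
# Hypothetical namespace for illustration; the binding must reference the
# ServiceAccount in the namespace where the cluster is actually deployed.
subjects:
  - kind: ServiceAccount
    name: fdb-kubernetes
    namespace: fdb-testing
```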


## Create the Three-Data-Hall cluster

-This will bring up a FDB cluster using the three-data-hall redundancy mode.
+This will bring up the three data hall cluster, managed by a single `FoundationDBCluster` resource:

```bash
-./create.bash
+kubectl apply -f ./config/tests/three_data_hall/cluster.yaml
```

## Delete

This will remove all created resources:

```bash
-./delete.bash
+kubectl delete -f ./config/tests/three_data_hall/cluster.yaml
```
56 changes: 56 additions & 0 deletions config/tests/three_data_hall/cluster.yaml
@@ -0,0 +1,56 @@
apiVersion: apps.foundationdb.org/v1beta2
kind: FoundationDBCluster
metadata:
  labels:
    cluster-group: test-cluster
  name: test-cluster
spec:
  # The unified image supports making use of node labels, so setting up a three data hall cluster
  # is easier with the unified image.
  imageType: unified
  version: 7.1.63
  faultDomain:
    key: kubernetes.io/hostname
  processCounts:
    stateless: -1
  databaseConfiguration:
    # Ensure that enough coordinators are available. The processes will be spread across the different zones.
    logs: 9
    storage: 9
    redundancy_mode: "three_data_hall"
  processes:
    general:
      customParameters:
        - "knob_disable_posix_kernel_aio=1"
        - "locality_data_hall=$NODE_LABEL_TOPOLOGY_KUBERNETES_IO_ZONE"
      volumeClaimTemplate:
        spec:
          resources:
            requests:
              storage: "16G"
      podTemplate:
        spec:
          securityContext:
            runAsUser: 4059
            runAsGroup: 4059
            fsGroup: 4059
          serviceAccount: fdb-kubernetes
          # Make sure that the pods are spread equally across the different availability zones.
          topologySpreadConstraints:
            - maxSkew: 1
              topologyKey: topology.kubernetes.io/zone
              whenUnsatisfiable: DoNotSchedule
              labelSelector:
                matchLabels:
                  foundationdb.org/fdb-cluster-name: test-cluster
          containers:
            - name: foundationdb
              env:
                # This feature allows the fdb-kubernetes-monitor to read the labels from the node
                # where it is running.
                - name: ENABLE_NODE_WATCH
                  value: "true"
              resources:
                requests:
                  cpu: 250m
                  memory: 128Mi
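The `$NODE_LABEL_TOPOLOGY_KUBERNETES_IO_ZONE` placeholder in `customParameters` appears to be derived from the node label key `topology.kubernetes.io/zone`. A minimal sketch of the assumed mapping (uppercase the key and replace every non-alphanumeric character with an underscore; the helper name is made up for illustration):

```bash
label_to_env_var() {
  # Uppercase the label key and replace '.', '/', '-' etc. with underscores.
  printf 'NODE_LABEL_%s\n' "$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]' | tr -c '[:alnum:]' '_')"
}

label_to_env_var "topology.kubernetes.io/zone"
# → NODE_LABEL_TOPOLOGY_KUBERNETES_IO_ZONE
```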
35 changes: 0 additions & 35 deletions config/tests/three_data_hall/create.bash

This file was deleted.

5 changes: 0 additions & 5 deletions config/tests/three_data_hall/delete.bash

This file was deleted.

41 changes: 0 additions & 41 deletions config/tests/three_data_hall/final.yaml

This file was deleted.

29 changes: 0 additions & 29 deletions config/tests/three_data_hall/functions.bash

This file was deleted.

41 changes: 0 additions & 41 deletions config/tests/three_data_hall/stage_1.yaml

This file was deleted.

58 changes: 58 additions & 0 deletions config/tests/three_data_hall/unified_image_role.yaml
@@ -0,0 +1,58 @@
apiVersion: v1
kind: ServiceAccount
metadata:
  name: fdb-kubernetes
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: fdb-kubernetes
rules:
  - apiGroups:
      - ""
    resources:
      - "pods"
    verbs:
      - "get"
      - "watch"
      - "update"
      - "patch"
      - "list"
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: fdb-kubernetes
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: fdb-kubernetes
subjects:
  - kind: ServiceAccount
    name: fdb-kubernetes
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: fdb-kubernetes
rules:
  - apiGroups:
      - ""
    resources:
      - "nodes"
    verbs:
      - "get"
      - "watch"
      - "list"
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: fdb-kubernetes
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: fdb-kubernetes
subjects:
  - kind: ServiceAccount
    name: fdb-kubernetes
5 changes: 0 additions & 5 deletions config/tests/unified_image/images.yaml
@@ -1,11 +1,6 @@
- op: add
  path: "/spec/imageType"
  value: unified
- op: add
  path: "/spec/mainContainer"
  value:
    imageConfigs:
      - tagSuffix: "-local"
- op: remove
  path: "/spec/processes/general/podTemplate/spec/initContainers/0"
- op: add
13 changes: 12 additions & 1 deletion controllers/generate_initial_cluster_file.go
@@ -129,8 +129,19 @@ func (g generateInitialClusterFile) reconcile(ctx context.Context, r *Foundation
		processLocality = append(processLocality, currentLocality)
	}

	limits := locality.GetHardLimits(cluster)
	// Only for the three data hall mode do we allow a less restrictive selection of the initial coordinators.
	// The reason is that we don't know the data_hall locality until the fdbserver processes are running,
	// as they report the data_hall locality themselves. So this is a workaround to allow an easy bring-up of a
	// three_data_hall cluster with the unified image. Once the processes are reporting and the cluster is
	// configured, the operator will choose 9 coordinators spread across the 3 data halls.
	if cluster.Spec.DatabaseConfiguration.RedundancyMode == fdbv1beta2.RedundancyModeThreeDataHall {
		count = 3
		delete(limits, fdbv1beta2.FDBLocalityDataHallKey)
	}

	coordinators, err := locality.ChooseDistributedProcesses(cluster, processLocality, count, locality.ProcessSelectionConstraint{
-		HardLimits: locality.GetHardLimits(cluster),
+		HardLimits: limits,
	})
	if err != nil {
		return &requeue{curError: err}