Commit 60ac110

Add locality failover example to the ambient multicluster docs (#16949)
* Add locality failover example to the ambient multicluster docs

  This example requires changes introduced in #58065 and #57963. These PRs enable waypoints to route requests to remote network via ambient E/W gateway (using double HBONE). It's all required to address #57537.

  With those changes in we can extend the initial multicluster setup example to show how to configure locality failover and along the way demonstrate how to deploy waypoints in multicluster ambient environments.

  Signed-off-by: Mikhail Krinkin <[email protected]>
* Fix output snippet lang from bash to plain to address lint warning

  Signed-off-by: Mikhail Krinkin <[email protected]>
* Fix lints

  Signed-off-by: Mikhail Krinkin <[email protected]>
* regenerate snips

  Signed-off-by: Mikhail Krinkin <[email protected]>
* Fix typo

  Signed-off-by: Mikhail Krinkin <[email protected]>
* Update content/en/docs/ambient/install/multicluster/_index.md

  Co-authored-by: Keith Mattix II <[email protected]>
* Update content/en/docs/ambient/install/multicluster/_index.md

  Co-authored-by: Keith Mattix II <[email protected]>
* fix lint warning

  Signed-off-by: Mikhail Krinkin <[email protected]>

---------

Signed-off-by: Mikhail Krinkin <[email protected]>
Co-authored-by: Keith Mattix II <[email protected]>
1 parent 0e049ed commit 60ac110

6 files changed, +396 -1 lines changed

content/en/docs/ambient/install/multicluster/_index.md

Lines changed: 8 additions & 0 deletions
@@ -59,6 +59,14 @@ the current state and limitations of this feature.
 
 **If a service's waypoint is marked as global, that service will also be global**
 - This can lead to unintended cross-cluster traffic if not managed carefully
+- The solution to this issue is tracked [here](https://github.com/istio/istio/issues/57710)
+
+#### Load Distribution on Remote Network
+
+**Traffic going to a remote network is not equally distributed between endpoints**
+- When failing over to a remote network, a single endpoint on a remote network may get a disproportionate number of requests
+  due to multiplexing of HTTP requests and connection pooling
+- The solution to this issue is tracked [here](https://github.com/istio/istio/issues/58039)
 
 #### Gateway Limitations

content/en/docs/ambient/install/multicluster/common.sh

Lines changed: 47 additions & 0 deletions
@@ -19,6 +19,7 @@
 _set_kube_vars
 
 source content/en/docs/ambient/install/multicluster/verify/snips.sh
+source content/en/docs/ambient/install/multicluster/failover/snips.sh
 
 # set_single_network_vars initializes all variables for a single network config.
 function set_single_network_vars
@@ -156,6 +157,52 @@ function verify_load_balancing
   _verify_contains snip_verifying_crosscluster_traffic_3 "$EXPECTED_RESPONSE_FROM_CLUSTER2"
 }
 
+function deploy_waypoints
+{
+  # Deploy waypoints in both clusters and wait until they are up and running
+  snip_deploy_waypoint_proxy_1
+  _wait_for_deployment sample waypoint "${CTX_CLUSTER1}"
+  _wait_for_deployment sample waypoint "${CTX_CLUSTER2}"
+
+  # Label HelloWorld service to use the newly deployed waypoints
+  snip_deploy_waypoint_proxy_4
+  # Mark waypoint service as global
+  snip_deploy_waypoint_proxy_5
+}
+
+function configure_locality_failover
+{
+  echo "Deploying locality failover configuration"
+  snip_configure_locality_failover_1
+  snip_configure_locality_failover_2
+}
+
+function verify_traffic_local
+{
+  local EXPECTED_RESPONSE_FROM_CLUSTER1="Hello version: v1, instance:"
+  local EXPECTED_RESPONSE_FROM_CLUSTER2="Hello version: v2, instance:"
+
+  echo "Verifying traffic stays in ${CTX_CLUSTER1}"
+  _verify_contains snip_verify_traffic_stays_in_local_cluster_1 "$EXPECTED_RESPONSE_FROM_CLUSTER1"
+
+  echo "Verifying traffic stays in ${CTX_CLUSTER2}"
+  _verify_contains snip_verify_traffic_stays_in_local_cluster_3 "$EXPECTED_RESPONSE_FROM_CLUSTER2"
+}
+
+function break_cluster1
+{
+  echo "Breaking ${CTX_CLUSTER1}"
+  snip_verify_failover_to_another_cluster_1
+}
+
+function verify_failover
+{
+  local EXPECTED_RESPONSE_FROM_CLUSTER2="Hello version: v2, instance:"
+
+  echo "Verifying that traffic from ${CTX_CLUSTER1} fails over to ${CTX_CLUSTER2}"
+  _verify_contains snip_verify_failover_to_another_cluster_2 "$EXPECTED_RESPONSE_FROM_CLUSTER2"
+}
+
 # For Helm multi-cluster installation steps
 
 function create_istio_system_ns
Lines changed: 192 additions & 0 deletions
@@ -0,0 +1,192 @@
---
title: Configure failover behavior in multicluster ambient installation
description: Configure outlier detection and failover behavior in an ambient multicluster mesh using waypoints.
weight: 70
keywords: [kubernetes,multicluster,ambient]
test: yes
owner: istio/wg-environments-maintainers
prev: /docs/ambient/install/multicluster/verify
---
Follow this guide to customize failover behavior in your ambient multicluster Istio installation using waypoint proxies.

Before proceeding, be sure to complete an ambient multicluster Istio installation following one of the
[multicluster installation guides](/docs/ambient/install/multicluster) and verify that the installation is working properly.

In this guide, we will build on top of the `HelloWorld` application used to verify the multicluster installation. We will
configure locality failover for the `HelloWorld` service to prefer endpoints in the cluster local to the client using a
`DestinationRule` and will deploy a waypoint proxy to enforce the configuration.

## Deploy waypoint proxy

To configure outlier detection and customize failover behavior for the service, we need a waypoint proxy. To begin,
deploy a waypoint proxy to each cluster in the mesh:

{{< text bash >}}
$ istioctl --context "${CTX_CLUSTER1}" waypoint apply --name waypoint --for service -n sample --wait
$ istioctl --context "${CTX_CLUSTER2}" waypoint apply --name waypoint --for service -n sample --wait
{{< /text >}}

Confirm the status of the waypoint proxy deployment on `cluster1`:

{{< text bash >}}
$ kubectl --context "${CTX_CLUSTER1}" get deployment waypoint --namespace sample
NAME       READY   UP-TO-DATE   AVAILABLE   AGE
waypoint   1/1     1            1           137m
{{< /text >}}

Confirm the status of the waypoint proxy deployment on `cluster2`:

{{< text bash >}}
$ kubectl --context "${CTX_CLUSTER2}" get deployment waypoint --namespace sample
NAME       READY   UP-TO-DATE   AVAILABLE   AGE
waypoint   1/1     1            1           138m
{{< /text >}}

Wait until all waypoint proxies are ready.
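
For scripted setups, the wait can be made explicit. The following is a minimal sketch, not part of the original guide: `wait_for_waypoint` is a hypothetical helper name, the 120s default timeout is an arbitrary choice, and it assumes the waypoint `Deployment` is named `waypoint` in the `sample` namespace, as in the commands above.

```shell
# Hypothetical helper: block until the waypoint Deployment has finished rolling out.
# $1: kubectl context
# $2: timeout (optional; the 120s default is an arbitrary choice)
wait_for_waypoint() {
  kubectl --context "$1" rollout status deployment/waypoint \
    --namespace sample --timeout="${2:-120s}"
}

# Usage (assumes CTX_CLUSTER1 and CTX_CLUSTER2 are set as in the installation guides):
#   wait_for_waypoint "${CTX_CLUSTER1}" && wait_for_waypoint "${CTX_CLUSTER2}"
```

`kubectl rollout status` exits with a non-zero status if the rollout does not complete within the timeout, so the helper can gate later steps in a script.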

Configure the `HelloWorld` service in each cluster to use the waypoint proxy:

{{< text bash >}}
$ kubectl --context "${CTX_CLUSTER1}" label svc helloworld -n sample istio.io/use-waypoint=waypoint
$ kubectl --context "${CTX_CLUSTER2}" label svc helloworld -n sample istio.io/use-waypoint=waypoint
{{< /text >}}

Finally, and this step is specific to multicluster deployments of waypoint proxies, mark the waypoint proxy service in each
cluster as global, just like you did earlier with the `HelloWorld` service:

{{< text bash >}}
$ kubectl --context "${CTX_CLUSTER1}" label svc waypoint -n sample istio.io/global=true
$ kubectl --context "${CTX_CLUSTER2}" label svc waypoint -n sample istio.io/global=true
{{< /text >}}
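
To double-check the labels applied in the last two steps, one option is to list both services with the labels as extra columns. This is a sketch under the assumption that the services are named `helloworld` and `waypoint` as above; `show_waypoint_labels` is a hypothetical helper name.

```shell
# Hypothetical helper: list the helloworld and waypoint Services in the sample
# namespace, showing the two labels used in this guide as extra columns.
# $1: kubectl context
show_waypoint_labels() {
  kubectl --context "$1" get svc helloworld waypoint -n sample \
    -L istio.io/use-waypoint -L istio.io/global
}

# Usage:
#   show_waypoint_labels "${CTX_CLUSTER1}"
#   show_waypoint_labels "${CTX_CLUSTER2}"
```

If the steps above were followed, the `waypoint` row would be expected to show `true` in the `istio.io/global` column, and the `helloworld` row would be expected to show `waypoint` in the `istio.io/use-waypoint` column.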

The `HelloWorld` service in both clusters is now configured to use waypoint proxies, but the waypoint proxies don't do
anything useful yet.

## Configure locality failover

To configure locality failover, create and apply a `DestinationRule` in `cluster1`:

{{< text bash >}}
$ kubectl --context "${CTX_CLUSTER1}" apply -n sample -f - <<EOF
apiVersion: networking.istio.io/v1
kind: DestinationRule
metadata:
  name: helloworld
spec:
  host: helloworld.sample.svc.cluster.local
  trafficPolicy:
    outlierDetection:
      consecutive5xxErrors: 1
      interval: 1s
      baseEjectionTime: 1m
    loadBalancer:
      simple: ROUND_ROBIN
      localityLbSetting:
        enabled: true
        failoverPriority:
        - topology.istio.io/cluster
EOF
{{< /text >}}

Apply the same `DestinationRule` in `cluster2` as well:

{{< text bash >}}
$ kubectl --context "${CTX_CLUSTER2}" apply -n sample -f - <<EOF
apiVersion: networking.istio.io/v1
kind: DestinationRule
metadata:
  name: helloworld
spec:
  host: helloworld.sample.svc.cluster.local
  trafficPolicy:
    outlierDetection:
      consecutive5xxErrors: 1
      interval: 1s
      baseEjectionTime: 1m
    loadBalancer:
      simple: ROUND_ROBIN
      localityLbSetting:
        enabled: true
        failoverPriority:
        - topology.istio.io/cluster
EOF
{{< /text >}}

This `DestinationRule` configures the following:

- [Outlier detection](/docs/reference/config/networking/destination-rule/#OutlierDetection) for the `HelloWorld` service.
  This tells waypoint proxies how to identify unhealthy endpoints for a service. It's required for failover
  to function properly.

- [Failover priority](/docs/reference/config/networking/destination-rule/#LocalityLoadBalancerSetting) that instructs
  the waypoint proxy how to prioritize endpoints when routing requests. In this example, the waypoint proxy will prefer
  endpoints in the same cluster over endpoints in other clusters.

With these policies in place, waypoint proxies will prefer endpoints in the same cluster as the waypoint proxy when they
are available and considered healthy based on the outlier detection configuration.
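
Because the same `DestinationRule` goes to every cluster, the manifest can also be kept in one file and applied in a loop. The sketch below assumes the YAML above has been saved to a local file; `apply_to_clusters` and the file name `helloworld-dr.yaml` are hypothetical and do not appear in the original guide.

```shell
# Hypothetical helper: apply one manifest file to the sample namespace of every
# listed cluster, stopping at the first failure.
# $1: path to a manifest file; remaining arguments: kubectl contexts
apply_to_clusters() {
  manifest="$1"
  shift
  for ctx in "$@"; do
    kubectl --context "$ctx" apply -n sample -f "$manifest" || return 1
  done
}

# Usage (helloworld-dr.yaml is an assumed file name holding the DestinationRule):
#   apply_to_clusters helloworld-dr.yaml "${CTX_CLUSTER1}" "${CTX_CLUSTER2}"
```

Keeping a single copy of the manifest avoids the two clusters drifting apart when the outlier detection settings are tuned later.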

## Verify traffic stays in local cluster

Send a request from the `curl` pod on `cluster1` to the `HelloWorld` service:

{{< text bash >}}
$ kubectl exec --context "${CTX_CLUSTER1}" -n sample -c curl \
    "$(kubectl get pod --context "${CTX_CLUSTER1}" -n sample -l \
    app=curl -o jsonpath='{.items[0].metadata.name}')" \
    -- curl -sS helloworld.sample:5000/hello
{{< /text >}}

Repeat this request several times and verify that the `HelloWorld` version is always `v1`, because the
traffic stays in `cluster1`:

{{< text plain >}}
Hello version: v1, instance: helloworld-v1-954745fd-z6qcn
Hello version: v1, instance: helloworld-v1-954745fd-z6qcn
...
{{< /text >}}
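
Repeating the request by hand gets tedious, so the check can be sketched as a small loop that tallies responses per version. Both function names below are hypothetical, the default of 10 requests is arbitrary, and the parsing assumes the `Hello version: <version>, instance: <pod>` response format shown above.

```shell
# Read HelloWorld responses on stdin and print "<version> <count>" per version.
count_versions() {
  awk '/^Hello version:/ { v = $3; sub(/,$/, "", v); n[v]++ }
       END { for (v in n) print v, n[v] }' | sort
}

# Send $2 requests (default 10) from the curl pod in the cluster with context $1.
send_requests() {
  pod="$(kubectl get pod --context "$1" -n sample -l app=curl \
    -o jsonpath='{.items[0].metadata.name}')" || return 1
  i=0
  while [ "$i" -lt "${2:-10}" ]; do
    kubectl exec --context "$1" -n sample -c curl "$pod" \
      -- curl -sS helloworld.sample:5000/hello
    i=$((i + 1))
  done
}

# Usage:
#   send_requests "${CTX_CLUSTER1}" 20 | count_versions
```

With locality failover working as described, the tally for `cluster1` would be expected to contain only a `v1` line.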

Similarly, send requests from the `curl` pod on `cluster2` several times:

{{< text bash >}}
$ kubectl exec --context "${CTX_CLUSTER2}" -n sample -c curl \
    "$(kubectl get pod --context "${CTX_CLUSTER2}" -n sample -l \
    app=curl -o jsonpath='{.items[0].metadata.name}')" \
    -- curl -sS helloworld.sample:5000/hello
{{< /text >}}

The version in each response should show that all requests are processed in `cluster2`:

{{< text plain >}}
Hello version: v2, instance: helloworld-v2-7b768b9bbd-7zftm
Hello version: v2, instance: helloworld-v2-7b768b9bbd-7zftm
...
{{< /text >}}

## Verify failover to another cluster

To verify that failover to a remote cluster works, simulate a `HelloWorld` service outage in `cluster1` by scaling down
the deployment:

{{< text bash >}}
$ kubectl --context "${CTX_CLUSTER1}" scale --replicas=0 deployment/helloworld-v1 -n sample
{{< /text >}}

Send a request from the `curl` pod on `cluster1` to the `HelloWorld` service again:

{{< text bash >}}
$ kubectl exec --context "${CTX_CLUSTER1}" -n sample -c curl \
    "$(kubectl get pod --context "${CTX_CLUSTER1}" -n sample -l \
    app=curl -o jsonpath='{.items[0].metadata.name}')" \
    -- curl -sS helloworld.sample:5000/hello
{{< /text >}}

This time you should see that the request is processed by the `HelloWorld` service in `cluster2` because there are no
available endpoints in `cluster1`:

{{< text plain >}}
Hello version: v2, instance: helloworld-v2-7b768b9bbd-7zftm
Hello version: v2, instance: helloworld-v2-7b768b9bbd-7zftm
...
{{< /text >}}
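
The first requests after the scale-down may fail or be slow while the endpoint change propagates, so an automated check may want to poll rather than send a single request. The following sketch shows one way to do that; `wait_for_version` is a hypothetical helper, and the default of 30 attempts one second apart is an arbitrary choice.

```shell
# Hypothetical helper: poll until a request from the curl pod in the cluster with
# context $1 is answered by HelloWorld version $2.
# $3: maximum number of attempts (default 30, an arbitrary choice), one second apart.
wait_for_version() {
  pod="$(kubectl get pod --context "$1" -n sample -l app=curl \
    -o jsonpath='{.items[0].metadata.name}')" || return 1
  attempt=0
  while [ "$attempt" -lt "${3:-30}" ]; do
    # A failed request counts as "not yet": keep polling.
    reply="$(kubectl exec --context "$1" -n sample -c curl "$pod" \
      -- curl -sS helloworld.sample:5000/hello 2>/dev/null)" || reply=""
    case "$reply" in
      "Hello version: $2,"*) return 0 ;;
    esac
    attempt=$((attempt + 1))
    sleep 1
  done
  return 1
}

# Usage:
#   wait_for_version "${CTX_CLUSTER1}" v2 && echo "failover observed"
```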

**Congratulations!** You successfully configured locality failover in an Istio ambient multicluster deployment!
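
The guide leaves `helloworld-v1` scaled to zero in `cluster1`. A possible way to undo the simulated outage is sketched below; `restore_helloworld_v1` is a hypothetical helper, and the default of one replica is an assumption about the sample's original replica count.

```shell
# Hypothetical helper: scale helloworld-v1 back up in the cluster with context $1
# and wait for the rollout to finish.
# $2: replica count (default 1, an assumption about the sample's original value)
restore_helloworld_v1() {
  kubectl --context "$1" scale --replicas="${2:-1}" deployment/helloworld-v1 -n sample &&
    kubectl --context "$1" rollout status deployment/helloworld-v1 -n sample --timeout=120s
}

# Usage:
#   restore_helloworld_v1 "${CTX_CLUSTER1}"
```

Afterwards, the requests from the local-traffic verification section can be repeated to check where `cluster1` traffic is served.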
