A 6-node Consul cluster (inspired by vagrant-consul-cluster) for configuring and testing Consul locally.
- macOS MacBook Pro (Intel-based architecture)
- 16GB of RAM or greater
- 128GB of free disk space
- macOS Monterey v12.1 or newer
- Reliable internet connection
- Knowledge of internal home network IP scheme (simulated WAN network)
- Knowledge of the MacBook's usable network adapters for bridging the VM network:
user@macbook:~$ networksetup -listallhardwareports
- Valid Consul Enterprise license if using Consul Enterprise
- Vagrant 1.9.1 or newer
- Vagrant Reload plugin installed (see the install command after this list)
- VirtualBox 5.1.x or newer
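The Vagrant Reload plugin can be installed via the standard Vagrant plugin mechanism (the plugin is typically published as vagrant-reload):
vagrant plugin install vagrant-reload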
The Vagrantfile is set up to create 6 hosts of various types, as described below.
OS: Ubuntu 18.04.3 LTS (bionic)
vCPUs: 6
vMem: 4096MB (4GB)
Apt Packages Installed:
curl, wget, software-properties-common, jq, unzip, traceroute, nmap, socat, iptables-persistent, dnsmasq, netcat
Networking:
Consul VMs:
iptables: Consul Required ports/protocols allowed
ipv4 ICMP redirects: disabled
ipv4 Routing: LAN/WAN default routes set to cluster Ubuntu VM Router
Vagrant Forwarded Ports:
consul-dc1-server-0: 8500 --> 8500 (Consul UI)
consul-dc2-server-0: 8500 --> 9500 (Consul UI)
consul-dc1-mesh-gw: 19000 --> 19001 (Envoy Admin UI)
consul-dc2-mesh-gw: 19000 --> 19002 (Envoy Admin UI)
Primary DC Bridged Network:
eth1: 20.0.0.0/16
eth2: 192.169.7.0/24
Secondary DC Bridged Network:
eth1: 30.0.0.0/16
eth2: 192.169.7.0/24
Cluster VM Router Bridged:
eth1: 20.0.0.0/16
eth2: 30.0.0.0/16
eth3: 192.169.7.0/24
ipv4 Forwarding: Enabled (eth1-eth3)
DNS:
dnsmasq: installed/configured
/etc/hosts: Consul VM local LAN and remote WAN IPs configured
Consul Version: 1.11.5+ent
Envoy Version: 1.20.2
Consul Daemon: systemd unit file installed/configured
Log Level: TRACE
UI: Enabled (consul-dc1/dc2-server-0 only)
Data Directory: /opt/consul/data
Gossip Encryption: Enabled
TLS Encryption: Enabled
ACLs: Enabled/Bootstrapped
Connect: Enabled
gRPC Port: Enabled/8502
Client ipv4: 0.0.0.0
Bind ipv4: 0.0.0.0
Advertise ipv4: eth1
Advertise ipv4 (WAN): eth2
Translate WAN Address: Enabled
Leave on Terminate: Enabled
Central Service Cfg: Enabled
Proxy Defaults (Global):
Mesh GW: Local
Checks Exposed: true
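For reference, a minimal consul.hcl sketch reflecting the agent settings listed above (illustrative only; the actual files, certificate names, and addresses are generated by this repository's provisioning scripts and may differ):
datacenter = "dc1"
data_dir   = "/opt/consul/data"
log_level  = "TRACE"

client_addr    = "0.0.0.0"
bind_addr      = "0.0.0.0"
advertise_addr     = "<eth1_lan_ip>"   # e.g., a 20.0.0.x address in DC1
advertise_addr_wan = "<eth2_wan_ip>"   # e.g., a 192.169.7.x address
translate_wan_addrs = true
leave_on_terminate  = true
enable_central_service_config = true

ui_config { enabled = true }           # consul-dc1/dc2-server-0 only

# Gossip and TLS encryption
encrypt = "<gossip_encryption_key>"
verify_incoming        = true
verify_outgoing        = true
verify_server_hostname = true
ca_file   = "/etc/consul.d/tls/consul-agent-ca.pem"
cert_file = "/etc/consul.d/tls/dc1-server-consul-0.pem"
key_file  = "/etc/consul.d/tls/dc1-server-consul-0-key.pem"

acl {
  enabled        = true
  default_policy = "allow"
}

connect { enabled = true }

ports {
  https = 8501
  grpc  = 8502
}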
Primary DC Server Roles
consul-dc1-server-0: Cluster Certificate Authority (Consul CA)
consul-dc1-server-1: Cluster Member Server
consul-dc1-mesh-gw: Cluster Mesh Gateway (Primary)
Secondary DC Server Roles
consul-dc2-server-0: Cluster Member Server
consul-dc2-server-1: Cluster Member Server
consul-dc2-mesh-gw: Cluster Mesh Gateway (Secondary)
/etc/vbox/networks.conf
By default, VirtualBox restricts host-only networking to a preset 192.168.65.0/24 scheme.
To allow the alternative networking configurations required by this repository, complete the following:
- Create the VirtualBox networks.conf file:
sudo mkdir -p /etc/vbox
sudo touch /etc/vbox/networks.conf
- Edit the networks.conf file (nano/vim) to include the required networks:
* 20.0.0.0/16 30.0.0.0/16 192.168.59.0/24 192.169.7.0/24
- Save the networks.conf file.
- Restart the VirtualBox application to apply the changes.
- Clone the vagrant-consul-cluster repository to a working directory of your choice:
git clone https://github.com/natemollica-nm/vagrant-consul-cluster
- Initialize the Vagrant environment:
vagrant init
- If applicable, edit the imported Vagrantfile variables:
CONSUL_VERSION="1.11.5+ent"
ENVOY_VERSION="1.20.2"
LAN_IP_DC1="20.0.0"
LAN_IP_DC2="30.0.0"
WAN_IP="192.169.7"
MAC_NETWORK_BRIDGE="en0: Wi-Fi"
Note: These variables configure the Consul version, Envoy version, and LAN/WAN specifics of your desired configuration. Ensure the Consul and Envoy versions used are compatible per HashiCorp's supported version matrix at https://www.consul.io/docs/connect/proxies/envoy
- (Optional) Append the anticipated cluster host IPs to your /etc/hosts file to speed up the provisioning process:
# LAN IPs
20.0.0.10 consul-dc1-server-0
20.0.0.20 consul-dc1-server-1
20.0.0.30 consul-dc1-server-2
20.0.0.40 consul-dc1-server-3
20.0.0.55 consul-dc1-mesh-gw
30.0.0.10 consul-dc2-server-0
30.0.0.20 consul-dc2-server-1
30.0.0.30 consul-dc2-server-2
30.0.0.40 consul-dc2-server-3
30.0.0.55 consul-dc2-mesh-gw
# WAN IPs
192.168.0.110 consul-dc1-server-0
192.168.0.120 consul-dc1-server-1
192.168.0.130 consul-dc1-server-2
192.168.0.140 consul-dc1-server-3
192.168.0.150 consul-dc1-mesh-gw
192.168.0.210 consul-dc2-server-0
192.168.0.220 consul-dc2-server-1
192.168.0.230 consul-dc2-server-2
192.168.0.240 consul-dc2-server-3
192.168.0.250 consul-dc2-mesh-gw
- Start the Consul cluster Ubuntu router and Primary DC provisioning process:
vagrant up consul-cluster-router consul-dc1-server-0 consul-dc1-server-1 consul-dc1-mesh-gw
- Monitor provisioning of the Primary DC until completion.
- Start the Consul cluster Secondary DC provisioning process:
vagrant up consul-dc2-server-0 consul-dc2-server-1 consul-dc2-mesh-gw
- Monitor provisioning of the Secondary DC until completion.
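At any point during provisioning you can check overall VM state; vagrant status lists each machine defined in the Vagrantfile and whether it is running:
vagrant status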
Note: The following commands are provided for convenience in setting the Consul environment variables required to perform the steps in this guide. Set the dc variable to the appropriate datacenter and initial_management_token_value to the cluster's bootstrapped ACL token before running them. The ACL initial bootstrap token can be located by running the command below and retrieving the initial bootstrap token UUID from the acl stanza:
cat /etc/consul.d/consul.hcl
dc="dc1"
export CONSUL_HTTP_SSL=true
export CONSUL_HTTP_ADDR="https://127.0.0.1:8501"
export CONSUL_CACERT="/etc/consul.d/tls/consul-agent-ca.pem"
export CONSUL_CLIENT_CERT="/etc/consul.d/tls/$dc-server-consul-0.pem"
export CONSUL_CLIENT_KEY="/etc/consul.d/tls/$dc-server-consul-0-key.pem"
export CONSUL_HTTP_TOKEN="<initial_management_token_value>"
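If you prefer to surface just the token value rather than reading the whole file, a simple grep works (assuming the tokens stanza uses the initial_management field shown later in this guide):
grep initial_management /etc/consul.d/consul.hcl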
The Consul rolling reboot procedure is used several times in this process. When required, perform the following steps to accomplish a rolling reboot of the desired servers while maintaining quorum. Run the procedure on one server at a time.
- Run the following and wait for the graceful leave return message:
consul leave
- Run the following to restart the Consul daemon:
sudo service consul start
- Ensure the cluster recognizes the server as a voting member (run as needed until Voter is true):
consul operator raft list-peers
- Proceed to the next server until all servers in the DC have been restarted.
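If you want to script the wait in step 3, a small loop like the sketch below can poll until the local server reappears as a voter (this assumes the Consul node name matches the VM hostname and that the Voter column prints true/false):
until consul operator raft list-peers | grep "$(hostname)" | grep -q true; do
  echo "waiting for $(hostname) to become a voter..."
  sleep 5
done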
**Note: Basic WAN federation is not strictly a requirement for establishing WAN federation via Mesh GWs. It is set up here to allow initial ACL replication to replicate the Mesh GW ACL tokens to both DCs. WAN federation, once established via Mesh GWs, will in fact disable basic WAN federation capabilities within Consul.**
- From consul-dc1-server-0, run:
consul join -wan "consul-dc2-server-0"
- From consul-dc2-server-0, run:
consul join -wan "consul-dc1-server-0"
- Verify DC1 and DC2 WAN Consul membership by running (on both DC1 and DC2):
consul members -wan
Note: This should be run from any server in DC1.
- If necessary, set the Consul environment variables outlined in the Consul Environmental Variables section above.
- Create the ACL token replication policy by running:
repl_policy="replication-policy.hcl"
replication_policy_rules=$( cat <<CONFIG
acl = "write"
operator = "write"
service_prefix "" {
  policy = "read"
  intentions = "read"
}
CONFIG
)
sudo touch $repl_policy && sudo chmod 0755 $repl_policy
echo -e "$replication_policy_rules" | sudo tee $repl_policy
consul acl policy create -name replication -rules @$repl_policy
- Create the ACL replication token by running:
consul acl token create -description "replication token" -policy-name replication
- Take note of the ACL replication token SecretID. On ALL DC2 servers, ensure the consul.hcl (/etc/consul.d/consul.hcl) configuration file has the following entries for ACL replication and Primary DC designation:
primary_datacenter = "dc1"
acl {
enabled = true
default_policy = "allow"
enable_token_replication = true
enable_token_persistence = true
tokens {
initial_management = "<initial_mgmt_token>"
replication = "<replication_token_secretID>"
}
}
- Perform a rolling reboot of the DC2 servers (as described in the Rolling Reboot Procedure above).
- Ensure the Consul API reports no errors/warnings regarding ACL replication by running the following from any DC2 server:
curl "http://localhost:8500/v1/acl/replication?pretty"
Note: This should be run from any server in DC1.
- If necessary, set the Consul environment variables outlined in the Consul Environmental Variables section above.
- Create the Mesh GW token policy by running:
mesh_gw_policy="mesh-gateway-policy.hcl"
mesh_gw_rules=$( cat <<CONFIG
service_prefix "mesh-gateway" {
policy = "write"
}
service_prefix "" {
policy = "read"
}
node_prefix "" {
policy = "read"
}
agent_prefix "" {
policy = "read"
}
CONFIG
)
sudo touch $mesh_gw_policy && sudo chmod 0755 $mesh_gw_policy
echo -e "$mesh_gw_rules" | sudo tee $mesh_gw_policy
consul acl policy create -name mesh-gateway -rules @$mesh_gw_policy
- Create DC1 Mesh GW Token by running:
consul acl token create -description "mesh-gateway primary datacenter token" -policy-name mesh-gateway
- Create DC2 Mesh GW Token by running:
consul acl token create -description "mesh-gateway secondary datacenter token" -policy-name mesh-gateway
- Take note of DC1 and DC2 Mesh GW Token SecretIDs.
- Ensure all DC1 servers have the following Connect stanza entry:
connect {
enabled = true
enable_mesh_gateway_wan_federation = true
}
- Perform a rolling reboot of the DC1 servers (as described in the Rolling Reboot Procedure above).
- Repeat steps 1 and 2 for DC2, ensuring the following primary_gateways and Connect stanza entries are set on all DC2 servers:
primary_gateways = ["consul-dc1-mesh-gw:8443"]
connect {
enabled = true
enable_mesh_gateway_wan_federation = true
}
Note: For learning or troubleshooting purposes, it may be beneficial to monitor the Envoy proxy log output after establishing the Mesh Gateway service within Consul. The steps outlined here establish a tail monitor on the Mesh GW servers prior to starting the associated Envoy proxy.
- Establish a secondary terminal session on the consul-dc1-mesh-gw and consul-dc2-mesh-gw servers.
- From consul-dc1-mesh-gw, run the following to set up Envoy proxy log monitoring to stdout:
touch envoy.out && sudo chmod 777 envoy.out
tail -f envoy.out
- Repeat step 2 for consul-dc2-mesh-gw.
- If necessary, set the Consul environment variables outlined in the Consul Environmental Variables section above.
- From consul-dc1-mesh-gw, run the following to establish DC1's primary Mesh Gateway service:
consul connect envoy -gateway=mesh \
-expose-servers -register \
-service "mesh-gateway-dc1" \
-address="20.0.0.55:8443" \
-wan-address="192.169.7.150:8443" \
-token="<primary_mesh_gw_token_secretID>" \
-ca-file="/etc/consul.d/tls/consul-agent-ca.pem" \
-client-cert="/etc/consul.d/tls/dc1-server-consul-0.pem" \
-client-key="/etc/consul.d/tls/dc1-server-consul-0-key.pem" \
-grpc-addr="https://127.0.0.1:8502" \
-admin-bind="0.0.0.0:19000" \
-tls-server-name="consul-dc1-server-0" \
-bind-address="mesh-gateway-dc1=0.0.0.0:8443" -- -l trace &> envoy.out
- Repeat steps 1 and 2 for consul-dc2-mesh-gw with the following:
consul connect envoy -gateway=mesh \
-expose-servers -register \
-service "mesh-gateway-dc2" \
-address="30.0.0.55:8443" \
-wan-address="192.169.7.250:8443" \
-token="<secondary_mesh_gw_token_secretID>" \
-ca-file="/etc/consul.d/tls/consul-agent-ca.pem" \
-client-cert="/etc/consul.d/tls/dc2-server-consul-0.pem" \
-client-key="/etc/consul.d/tls/dc2-server-consul-0-key.pem" \
-grpc-addr="https://127.0.0.1:8502" \
-admin-bind="0.0.0.0:19000" \
-tls-server-name="consul-dc2-server-0" \
-bind-address="mesh-gateway-dc2=0.0.0.0:8443" -- -l trace &> envoy.out
- To verify the health of the Envoy Mesh GW from the UI, visit http://127.0.0.1:8500/ui and locate the mesh-gateway-dc1 service to ensure its health checks are passing.
- To verify the health of the Envoy Mesh GW from the Consul HTTP API, run the following (ensure <dc> is replaced with the appropriate datacenter id):
curl "http://127.0.0.1:8500/v1/health/checks/mesh-gateway-<dc>?pretty"
- Perform step 1 or step 2 as desired for DC2's secondary Mesh GW (the DC2 UI is forwarded to http://127.0.0.1:9500/ui per the Vagrant forwarded ports above).
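Two additional quick checks, sketched under the assumptions that jq is available on the servers and that the Envoy admin listener is bound to 19000 as configured above (forwarded to the host as 19001/19002 per the Vagrant forwarded ports):
# Health check statuses for the gateway service (expect "passing"):
curl -s "http://127.0.0.1:8500/v1/health/checks/mesh-gateway-dc1" | jq '.[].Status'
# Envoy admin endpoint, run locally on the mesh gateway VM:
curl -s "http://127.0.0.1:19000/server_info" | jq '.state'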
- From any DC2 server run:
consul kv put -datacenter=dc1 -token="<initial_mgmt_token>" \
-ca-file="/etc/consul.d/tls/consul-agent-ca.pem" \
-client-cert="/etc/consul.d/tls/dc2-server-consul-0.pem" \
-client-key="/etc/consul.d/tls/dc2-server-consul-0-key.pem" \
from DCtwo
- From any DC1 server run:
consul kv put -datacenter=dc2 -token="<initial_mgmt_token>" \
-ca-file="/etc/consul.d/tls/consul-agent-ca.pem" \
-client-cert="/etc/consul.d/tls/dc1-server-consul-0.pem" \
-client-key="/etc/consul.d/tls/dc1-server-consul-0-key.pem" \
from DCone
- If steps 1 and 2 complete without error, WAN federation via Mesh Gateways has been established, and further testing can proceed as desired.
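As an optional cross-check (using the key name from the steps above), you can read the value back from the opposite datacenter to confirm the round trip traverses the mesh gateways, for example from any DC1 server:
consul kv get -datacenter=dc2 -token="<initial_mgmt_token>" \
-ca-file="/etc/consul.d/tls/consul-agent-ca.pem" \
-client-cert="/etc/consul.d/tls/dc1-server-consul-0.pem" \
-client-key="/etc/consul.d/tls/dc1-server-consul-0-key.pem" \
from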
- Destroy all Vagrant VirtualBox VMs:
vagrant destroy -f