Multi-cell adoption
Make Ansible and shell variables compute-cell aware.
Split EDPM nodes into compute cells by mapping them 1:1 to
dataplane node sets.

Separate the vars and secrets YAML values for the source and EDPM
nodes, so that the different cell naming schemes in OSP/TripleO and
RHOSO are not confused. This also allows adopting the source TripleO
nodes into EDPM nodes with different IPs but the same FQDNs (those
must remain unchanged, as a strict requirement).

Remove the edpm_computes vars that are no longer used after moving
the stopping of control-plane TripleO services into edpm-ansible.

Signed-off-by: Bohdan Dobrelia <[email protected]>
bogdando committed Jul 8, 2024
1 parent 551c9d9 commit cae15a5
Showing 17 changed files with 375 additions and 118 deletions.
98 changes: 95 additions & 3 deletions docs_dev/assemblies/development_environment.adoc
@@ -187,6 +187,79 @@ https://openstack-k8s-operators.github.io/data-plane-adoption/dev/#_reset_the_environment_to_pre_adoption_state_from_scratch

'''

== Deploying TripleO With Multiple Cells

A TripleO Standalone setup creates only a single Nova v2 cell, with combined controller and compute services on it.
To deploy multiple compute cells for adoption testing (without Ceph), create 5 VMs that meet the following requirements (a verification sketch follows the list):

* Named `edpm-compute-0` .. `edpm-compute-4`.
* Running RHEL 9.2, with RHOSP 17.1 repositories configured.
* The root user can log in via SSH without a password from the hypervisor host.
* A `zuul` user is created that can sudo without a password and log in via SSH without a password from the hypervisor host.
* The `zuul` user can log in to the `edpm-compute-1`, `edpm-compute-2`, `edpm-compute-3`, and `edpm-compute-4` nodes via SSH without a password from the `edpm-compute-0` node,
by using the generated `/home/zuul/.ssh/id_rsa` private key.
* Red Hat registry credentials are exported on the hypervisor host.
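
These requirements can be verified from the hypervisor host with a sketch like the following (it assumes the `192.168.122.10N` addressing used by the devsetup commands below, and it only passes once the preparation steps have been completed):

[,bash]
----
for n in 0 1 2 3 4; do
  # root and zuul can log in without a password from the hypervisor host
  ssh -o BatchMode=yes root@192.168.122.10${n} true
  ssh -o BatchMode=yes zuul@192.168.122.10${n} sudo -n true
done
# zuul on edpm-compute-0 can reach the other nodes with the generated key
ssh zuul@192.168.122.100 \
  'for n in 1 2 3 4; do ssh -o BatchMode=yes -i ~/.ssh/id_rsa zuul@192.168.122.10${n} hostname; done'
----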

Then invoke the `tripleo` target of `install_yamls`.

To do that, adjust as needed and use the following commands
(instead of the example `rhos-release` commands, use a tool of your choice):

[,bash]
----
export RH_REGISTRY_USER="<insert your registry.redhat.io user>"
export RH_REGISTRY_PWD="<insert your registry.redhat.io password>"
cd ~/install_yamls/devsetup
cat <<EOF > /tmp/osp17_repos
# rhos-release steps are only available from the internal Red Hat network; use a tool of your choice instead!
#sudo rhos-release -x
#sudo rhos-release 17.1
sudo dnf install -y git curl wget util-linux lvm2 podman python3-tripleoclient
EOF
export CENTOS_9_STREAM_URL=<insert url to rhel-guest-image-9.2.x86_64.qcow2>
export NTP_SERVER=<insert ntp server of your choice>
export MANILA_ENABLED=false
export EDPM_COMPUTE_CEPH_ENABLED=false
export EDPM_COMPUTE_CEPH_NOVA=false
export EDPM_COMPUTE_CELLS=3
export STANDALONE_EXTRA_CMD="bash -c 'echo \"$RH_REGISTRY_PWD\" > ~/authfile; chmod 0600 ~/authfile; sudo /bin/podman login registry.redhat.io -u \"$RH_REGISTRY_USER\" --password-stdin < ~/authfile'"
export REPO_SETUP_CMDS=/tmp/osp17_repos
export EDPM_TOTAL_NODES=1
export SKIP_TRIPLEO_REPOS=false
export IP_ADRESS_SUFFIX=100
export EDPM_COMPUTE_NETWORK_IP=192.168.122.1
export HOST_PRIMARY_RESOLV_CONF_ENTRY=192.168.122.1
export BASE_DISK_FILENAME="rhel-9-base.qcow2"
make edpm_compute EDPM_COMPUTE_SUFFIX=0 EDPM_COMPUTE_DISK_SIZE=10 EDPM_COMPUTE_RAM=9 EDPM_COMPUTE_VCPUS=2
make edpm_compute EDPM_COMPUTE_SUFFIX=1 EDPM_COMPUTE_DISK_SIZE=17 EDPM_COMPUTE_RAM=12 EDPM_COMPUTE_VCPUS=4
make edpm_compute EDPM_COMPUTE_SUFFIX=2 EDPM_COMPUTE_DISK_SIZE=14 EDPM_COMPUTE_RAM=12 EDPM_COMPUTE_VCPUS=4
make edpm_compute EDPM_COMPUTE_SUFFIX=3 EDPM_COMPUTE_DISK_SIZE=10 EDPM_COMPUTE_RAM=4 EDPM_COMPUTE_VCPUS=2
make edpm_compute EDPM_COMPUTE_SUFFIX=4 EDPM_COMPUTE_DISK_SIZE=16 EDPM_COMPUTE_RAM=12 EDPM_COMPUTE_VCPUS=4
for n in 0 1 2 3 4; do
  ssh -i ~/install_yamls/out/edpm/ansibleee-ssh-key-id_rsa \
    root@192.168.122.10${n} useradd --create-home --shell /bin/bash --groups root zuul
  ssh -i ~/install_yamls/out/edpm/ansibleee-ssh-key-id_rsa \
    root@192.168.122.10${n} mkdir -p /home/zuul/.ssh
  scp -i ~/install_yamls/out/edpm/ansibleee-ssh-key-id_rsa \
    ~/install_yamls/out/edpm/ansibleee-ssh-key-id_rsa root@192.168.122.10${n}:/home/zuul/.ssh/id_rsa
  ssh -i ~/install_yamls/out/edpm/ansibleee-ssh-key-id_rsa \
    root@192.168.122.10${n} cp /root/.ssh/authorized_keys /home/zuul/.ssh/authorized_keys
  ssh -i ~/install_yamls/out/edpm/ansibleee-ssh-key-id_rsa \
    root@192.168.122.10${n} chown -R zuul: /home/zuul/.ssh
  ssh -i ~/install_yamls/out/edpm/ansibleee-ssh-key-id_rsa \
    root@192.168.122.10${n} echo "zuul ALL=NOPASSWD:ALL" '>' /etc/sudoers.d/zuul
done
make tripleo_deploy
for n in 0 1 2 3 4; do make standalone_snapshot EDPM_COMPUTE_SUFFIX=$n; done
----

== Network routing

Route VLAN20 to get access to the MariaDB cluster:
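
As an illustration only (a sketch, not necessarily the exact commands for your environment; it assumes the default libvirt bridge `virbr0` and the `172.17.0.0/24` internal API network used elsewhere in this guide), VLAN20 access can be set up roughly like this:

[,bash]
----
# Illustration only; adjust the bridge name and the address to your environment.
sudo ip link add link virbr0 name vlan20 type vlan id 20  # VLAN 20 on top of the libvirt bridge
sudo ip addr add dev vlan20 172.17.0.222/24               # any free address on the internal API network
sudo ip link set up dev vlan20
----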
@@ -340,14 +413,26 @@ make openstack

== Performing the adoption procedure

To simplify the adoption procedure with additional cells, copy and rename the deployment passwords that you use in the
https://openstack-k8s-operators.github.io/data-plane-adoption/user/#deploying-backend-services_migrating-databases[backend
services deployment phase of the data plane adoption].

For a single-cell standalone TripleO deployment:
[,bash]
----
scp -i ~/install_yamls/out/edpm/ansibleee-ssh-key-id_rsa root@192.168.122.100:/root/tripleo-standalone-passwords.yaml ~/
mkdir -p ~/overcloud-deploy/overcloud
scp -i ~/install_yamls/out/edpm/ansibleee-ssh-key-id_rsa root@192.168.122.100:/root/tripleo-standalone-passwords.yaml ~/overcloud-deploy/overcloud/overcloud-passwords.yaml
----

Later in this guide, this passwords file is referenced as `TRIPLEO_PASSWORDS[default]`, where `default` is the cell name in TripleO terms (it becomes `cell1` after adoption).

For a multi-cell deployment, use these commands instead:
[,bash]
----
scp -i ~/install_yamls/out/edpm/ansibleee-ssh-key-id_rsa root@192.168.122.100:overcloud-deploy/overcloud/overcloud-passwords.yaml ~/
scp -i ~/install_yamls/out/edpm/ansibleee-ssh-key-id_rsa root@192.168.122.100:overcloud-deploy/cell1/cell1-passwords.yaml ~/
scp -i ~/install_yamls/out/edpm/ansibleee-ssh-key-id_rsa root@192.168.122.100:overcloud-deploy/cell2/cell2-passwords.yaml ~/
----
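
Later steps reference these copied files through a `TRIPLEO_PASSWORDS` associative array, one entry per source cell. A sketch (adjust the paths to wherever you copied the files):

[,bash]
----
declare -A TRIPLEO_PASSWORDS
TRIPLEO_PASSWORDS[default]="$HOME/overcloud-passwords.yaml"
TRIPLEO_PASSWORDS[cell1]="$HOME/cell1-passwords.yaml"
TRIPLEO_PASSWORDS[cell2]="$HOME/cell2-passwords.yaml"
----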

The development environment is now set up; you can go to the https://openstack-k8s-operators.github.io/data-plane-adoption/[Adoption
Expand Down Expand Up @@ -386,14 +471,21 @@ oc delete --wait=false pod mariadb-copy-data || true
oc delete secret osp-secret || true
----

Revert the standalone VM(s) to the snapshotted state

[,bash]
----
cd ~/install_yamls/devsetup
make standalone_revert
----

For a multi-cell deployment, use these commands instead:
[,bash]
----
cd ~/install_yamls/devsetup
for n in 0 1 2 3 4; do make standalone_revert EDPM_COMPUTE_SUFFIX=$n; done
----

Clean up and initialize the storage PVs in the CRC VM

[,bash]
8 changes: 7 additions & 1 deletion docs_dev/assemblies/tests.adoc
@@ -25,8 +25,14 @@ work out of the box. The comments in the YAML files will guide you
regarding the expected values. You may want to double check that
these variables suit your environment:
** `install_yamls_path`
** `tripleo_passwords`
** `controller*_ssh`
** `source_ovndb_ip`
** `tripleo_cells_passwords` (for each compute cell on the source cloud)
** `source_mariadb_ip` (for each compute cell on the source cloud)
** `source_galera_members` (for each compute cell on the source cloud)
** `source_node_hostname` (for each compute cell on the source cloud)
** `edpm_node_hostname` (for each compute cell on the destination)
** `edpm_node_ip` (for each compute cell on the destination)
** `edpm_privatekey_path`
** `timesync_ntp_servers`

@@ -229,18 +229,21 @@ EOF
----
endif::[]

. Deploy the `OpenStackDataPlaneNodeSet` CRs (for each Nova compute cell):
+
. If TLS Everywhere is enabled, change spec:tlsEnabled to true
. If using a custom DNS domain, modify spec:nodes:[NODE NAME]:hostName to use an FQDN for the node
. Use node set names like `openstack-cell1`, `openstack-cell2`
. Assign all nodes from the source cloud `default` cell into `openstack-cell1`
. Assign all nodes from the source cloud `cell1` into `openstack-cell2`, and so on
+
[source,yaml]
----
oc apply -f - <<EOF
apiVersion: dataplane.openstack.org/v1beta1
kind: OpenStackDataPlaneNodeSet
metadata:
  name: openstack-cell1
spec:
  tlsEnabled: false
  networkAttachments:
34 changes: 18 additions & 16 deletions docs_user/modules/proc_deploying-backend-services.adoc
@@ -44,7 +44,9 @@ ADMIN_PASSWORD=SomePassword
To use the existing {OpenStackShort} deployment password:
+
----
declare -A TRIPLEO_PASSWORDS
TRIPLEO_PASSWORDS[default]="$HOME/overcloud-deploy/overcloud/overcloud-passwords.yaml"
ADMIN_PASSWORD=$(cat ${TRIPLEO_PASSWORDS[default]} | grep ' AdminPassword:' | awk -F ': ' '{ print $2; }')
----
* Set service password variables to match the original deployment.
Database passwords can differ in the control plane environment, but
@@ -54,21 +56,21 @@ For example, in developer environments with {OpenStackPreviousInstaller} Standalone,
passwords can be extracted like this:
+
----
AODH_PASSWORD=$(cat ${TRIPLEO_PASSWORDS[default]} | grep ' AodhPassword:' | awk -F ': ' '{ print $2; }')
BARBICAN_PASSWORD=$(cat ${TRIPLEO_PASSWORDS[default]} | grep ' BarbicanPassword:' | awk -F ': ' '{ print $2; }')
CEILOMETER_METERING_SECRET=$(cat ${TRIPLEO_PASSWORDS[default]} | grep ' CeilometerMeteringSecret:' | awk -F ': ' '{ print $2; }')
CEILOMETER_PASSWORD=$(cat ${TRIPLEO_PASSWORDS[default]} | grep ' CeilometerPassword:' | awk -F ': ' '{ print $2; }')
CINDER_PASSWORD=$(cat ${TRIPLEO_PASSWORDS[default]} | grep ' CinderPassword:' | awk -F ': ' '{ print $2; }')
GLANCE_PASSWORD=$(cat ${TRIPLEO_PASSWORDS[default]} | grep ' GlancePassword:' | awk -F ': ' '{ print $2; }')
HEAT_AUTH_ENCRYPTION_KEY=$(cat ${TRIPLEO_PASSWORDS[default]} | grep ' HeatAuthEncryptionKey:' | awk -F ': ' '{ print $2; }')
HEAT_PASSWORD=$(cat ${TRIPLEO_PASSWORDS[default]} | grep ' HeatPassword:' | awk -F ': ' '{ print $2; }')
IRONIC_PASSWORD=$(cat ${TRIPLEO_PASSWORDS[default]} | grep ' IronicPassword:' | awk -F ': ' '{ print $2; }')
MANILA_PASSWORD=$(cat ${TRIPLEO_PASSWORDS[default]} | grep ' ManilaPassword:' | awk -F ': ' '{ print $2; }')
NEUTRON_PASSWORD=$(cat ${TRIPLEO_PASSWORDS[default]} | grep ' NeutronPassword:' | awk -F ': ' '{ print $2; }')
NOVA_PASSWORD=$(cat ${TRIPLEO_PASSWORDS[default]} | grep ' NovaPassword:' | awk -F ': ' '{ print $2; }')
OCTAVIA_PASSWORD=$(cat ${TRIPLEO_PASSWORDS[default]} | grep ' OctaviaPassword:' | awk -F ': ' '{ print $2; }')
PLACEMENT_PASSWORD=$(cat ${TRIPLEO_PASSWORDS[default]} | grep ' PlacementPassword:' | awk -F ': ' '{ print $2; }')
SWIFT_PASSWORD=$(cat ${TRIPLEO_PASSWORDS[default]} | grep ' SwiftPassword:' | awk -F ': ' '{ print $2; }')
----

.Procedure
@@ -26,6 +26,7 @@ here this time).
----
PODIFIED_MARIADB_IP=$(oc get svc --selector "mariadb/name=openstack" -ojsonpath='{.items[0].spec.clusterIP}')
PODIFIED_CELL1_MARIADB_IP=$(oc get svc --selector "mariadb/name=openstack-cell1" -ojsonpath='{.items[0].spec.clusterIP}')
PODIFIED_CELL2_MARIADB_IP=$(oc get svc --selector "mariadb/name=openstack-cell2" -ojsonpath='{.items[0].spec.clusterIP}')
PODIFIED_DB_ROOT_PASSWORD=$(oc get -o json secret/osp-secret | jq -r .data.DbRootPassword | base64 -d)
# The CHARACTER_SET and collation should match the source DB
@@ -42,14 +43,36 @@ ifeval::["{build}" == "downstream"]
STORAGE_CLASS=local-storage
MARIADB_IMAGE=registry.redhat.io/rhosp-dev-preview/openstack-mariadb-rhel9:18.0
endif::[]
declare -A TRIPLEO_PASSWORDS
TRIPLEO_PASSWORDS[default]="$HOME/overcloud-deploy/overcloud/overcloud-passwords.yaml"
TRIPLEO_PASSWORDS[cell1]="$HOME/cell1/cell1-passwords.yaml"
TRIPLEO_PASSWORDS[cell2]="$HOME/cell2/cell2-passwords.yaml"
# ...
# Replace with the MariaDB Galera cluster VIPs of your environment's main overcloud stack and cell stacks:
declare -A SOURCE_MARIADB_IP
SOURCE_MARIADB_IP[default]=172.17.0.90
SOURCE_MARIADB_IP[cell1]=172.17.0.91
SOURCE_MARIADB_IP[cell2]=172.17.0.92
# ...
# Replace with all members data for each MariaDB Galera cluster in the overcloud and additional cell stacks
declare -A SOURCE_GALERA_MEMBERS_DEFAULT
SOURCE_GALERA_MEMBERS_DEFAULT=(
["standalone.localdomain"]=172.17.0.100
# ...
)
declare -A SOURCE_GALERA_MEMBERS_CELL1
SOURCE_GALERA_MEMBERS_CELL1=(
# ...
)
# ...
declare -A SOURCE_DB_ROOT_PASSWORD
CELLS="default cell1 cell2"
for CELL in $CELLS; do
SOURCE_DB_ROOT_PASSWORD[$CELL]=$(cat ${TRIPLEO_PASSWORDS[$CELL]} | grep ' MysqlRootPassword:' | awk -F ': ' '{ print $2; }')
done
----
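
As a quick sanity check of the values gathered above (a sketch; it only verifies that each referenced passwords file yielded a non-empty root password):

----
for CELL in $CELLS; do
  if [ -z "${SOURCE_DB_ROOT_PASSWORD[$CELL]}" ]; then
    echo "WARNING: no MysqlRootPassword found for cell '$CELL' in ${TRIPLEO_PASSWORDS[$CELL]}"
  else
    echo "Cell '$CELL': root password loaded; Galera VIP ${SOURCE_MARIADB_IP[$CELL]}"
  fi
done
----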

* Prepare MariaDB copy directory and the adoption helper pod
@@ -116,17 +139,22 @@ oc wait --for condition=Ready pod/mariadb-copy-data --timeout=30s

.Procedure

. Check that the members of the source Galera database cluster(s) are online and synced:
+
----
for i in "${!SOURCE_GALERA_MEMBERS[@]}"; do
echo "Checking for the database node $i WSREP status Synced"
oc rsh mariadb-copy-data mysql \
-h "${SOURCE_GALERA_MEMBERS[$i]}" -uroot -p"$SOURCE_DB_ROOT_PASSWORD" \
-e "show global status like 'wsrep_local_state_comment'" | \
grep -qE "\bSynced\b"
for CELL in $CELLS; do
  MEMBERS=SOURCE_GALERA_MEMBERS_$(echo ${CELL}|tr '[:lower:]' '[:upper:]')[@]
  for i in "${!MEMBERS}"; do
    echo "Checking for the database node $i WSREP status Synced"
    oc rsh mariadb-copy-data mysql \
      -h "$i" -uroot -p"${SOURCE_DB_ROOT_PASSWORD[$CELL]}" \
      -e "show global status like 'wsrep_local_state_comment'" | \
      grep -qE "\bSynced\b"
  done
done
----
+
Each additional Nova v2 cell runs a dedicated Galera database cluster, so the check is performed for each of them.
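+
The per-cell loop above relies on Bash indirect expansion to pick the matching `SOURCE_GALERA_MEMBERS_*` array by name. A minimal standalone sketch of the same pattern, with hypothetical member entries:
+
----
declare -A SOURCE_GALERA_MEMBERS_DEFAULT=( ["standalone.localdomain"]=172.17.0.100 )
declare -A SOURCE_GALERA_MEMBERS_CELL1=( ["compute-cell1-0.localdomain"]=172.17.0.101 )

for CELL in default cell1; do
  # Build the array reference, e.g. SOURCE_GALERA_MEMBERS_DEFAULT[@]
  MEMBERS=SOURCE_GALERA_MEMBERS_$(echo "${CELL}" | tr '[:lower:]' '[:upper:]')[@]
  # "${!MEMBERS}" expands to the values of that array, i.e. the member IPs
  for ip in "${!MEMBERS}"; do
    echo "cell=${CELL} member=${ip}"
  done
done
----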

. Get the count of not-OK source databases:
+
@@ -159,7 +187,7 @@ services, except the compute agent, have no internal state, and its service
records can be safely deleted. You also need to rename the former `default` cell
to `cell1`.

. Create a dump of the original databases. Substitute the `SOURCE_MARIADB_IP` and `SOURCE_DB_ROOT_PASSWORD` values of the database cluster in the main overcloud Heat stack, and of the additional compute cell stacks:
+
----
oc rsh mariadb-copy-data << EOF