Multi cell database/mq adoption #746

Merged

Conversation

bogdando (Contributor) commented Nov 25, 2024

  1. Fix OSPDo specifics for single vs multi cells

Declare RUN_OVERRIDES before it is used.
Use env vars instead of docs generation conditions to reuse the same
code in tests:

  • Add MARIADB_RUN_OVERRIDES to cover all overrides and client annotations
  • Add missing definitions for rhoso/ospd namespace specific vars
  • Use env TRIPLEO_PASSWORDS for all cases as OSPDo still deploys
    tripleo
  • Define and use NAMESPACE (default openstack) instead of
    RHOSO18_NAMESPACE or OSPDO_NAMESPACE. Remove the unused rhoso18 ns value
    (it was used only in this guide).
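
The namespace default described above could be expressed with plain shell parameter expansion. This is a minimal sketch, not the PR's actual code: only the NAMESPACE variable, its openstack default, and the MARIADB_RUN_OVERRIDES name come from the description. The real contents of MARIADB_RUN_OVERRIDES are not shown here, so the sketch leaves it empty as a placeholder.

```shell
# Fall back to "openstack" unless the caller already exported NAMESPACE
# (for example, to an OSPdO-specific namespace).
NAMESPACE="${NAMESPACE:-openstack}"

# Declare the overrides before their first use. The empty default is a
# placeholder assumption, not the value used in the PR.
MARIADB_RUN_OVERRIDES="${MARIADB_RUN_OVERRIDES:-}"

echo "namespace=${NAMESPACE}"
```

With a single variable and a default, the same snippet can be reused in both the docs and the tests, and a different namespace only needs to be exported before the snippet runs.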
  1. Refactor comments in commands into native AsciiDoc

Illustrate how commands in scripts could have comments
that become (almost as is) native AsciiDoc footnotes.

When copying code into docs, only minimal adjustments are
needed, like adding a '$' prefix (or '>' for multiline commands).

Provide a static multi-cell config for databases and messaging
for the adoption guide and tests, comprising 3 cells.

  1. Keep renaming 'default' cell consistent for single and multi cells:

The default cell becomes cellX (or it can be imported as is, in the
multi-cell case only).
cell1 is mapped to the openstack-cell1 osdp node set.
cell2 is mapped to the openstack-cell2 osdp node set, and so on.
cellX (X=3 here) is mapped to openstack-cell3. Alternatively, the
default cell retains its name for the openstack-default osdpns
mapping.
Evaluate podified MariaDB passwords for cells from osp-secret
to align the tests with the documented commands. Remove the no longer
needed podified DB password variable.
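
The renaming rules above can be sketched as two small shell helpers. The function names renamed_cell and nodeset_for are hypothetical and chosen for illustration only. The sketch assumes the three-cell layout from the description, where the source default cell is renamed to cell3, and it does not cover the alternative where default keeps its name.

```shell
# Map a source (OSP/TripleO) cell name to its renamed target cell.
# Assumption: three cells in total, so "default" becomes cell3.
renamed_cell() {
  case "$1" in
    default) echo "cell3" ;;
    *)       echo "$1" ;;
  esac
}

# Derive the node set name for a target cell (hypothetical helper).
nodeset_for() {
  echo "openstack-$1"
}

# Print the full mapping: source cell, renamed cell, node set.
for src in default cell1 cell2; do
  dst="$(renamed_cell "$src")"
  echo "${src} -> ${dst} -> $(nodeset_for "$dst")"
done
```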

  1. Make ansible and shell variables compute cells aware.

  2. Rework vars and secrets YAML values for the source and edpm
    nodes so that the different cell naming schemes in OSP/TripleO
    and RHOSO are not confused.

  3. Remove cached fact for pulled OSP configuration as it can no longer
    be generated in a multi-cell setup, where related shell variables
    become bash arrays.

  4. Simplify ENV headers management by collecting them in a single place.

  5. Adjust storage/storageRequests values to better fit
    multi-cell test scenarios. Also provide values in docs and
    add a comment to adjust them as needed.

  6. Remove source_db_root_password as it is directly evaluated from
    tripleo passwords into an env var.

  7. Run mysql commands in individual pods.
    Finished pods take time to terminate; this avoids errors where
    subsequent mysql commands fail because the old and new pods use the
    same name.

  8. Rename nodesets to openstack-cell1, which is needed for adoption of
    remaining multi-cell aware services in a follow up.

  9. Make edpm_nodes input multi-cell aware.

Assume a single cell1 for now.

Remove the edpm_computes and computes env vars
from tests as they are not multi-cell aware and should no longer be
needed. The docs still use that env var; it will be removed in the
multi-cell adoption follow up, where we also cover EDPM multi-cell
adoption.

This is required as the rdo-jobs dependency introduces that
change for edpm_nodes and provides a common base for this and future
multi-cell follow ups.
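
The pod naming problem from item 7 can be illustrated with a small sketch. It only generates names and does not start any pods. The mariadb-client prefix, the counter suffix, and the next_pod_name helper are assumptions for illustration; the naming scheme actually used in the PR is not shown in this description.

```shell
# Build a pod name that is unique per cell and per invocation, so a new
# client pod never reuses the name of one that is still terminating.
# Assumption: the "mariadb-client" prefix and counter suffix are illustrative.
POD_SEQ=0
next_pod_name() {
  # Set a global instead of echoing, so the counter is not lost in a subshell.
  POD_SEQ=$((POD_SEQ + 1))
  POD_NAME="mariadb-client-$1-${POD_SEQ}"
}

for cell in cell1 cell2 cell3; do
  next_pod_name "$cell"
  echo "would run mysql for ${cell} in pod ${POD_NAME}"
done
```

A per-invocation suffix is one simple way to keep names distinct; waiting for the previous pod to be fully deleted before reusing a name would be another.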

  1. Unify org_namespace defaults and reference by env var

Closes: https://github.com/openstack-k8s-operators/data-plane-adoption/issues/184
Depends-On: https://review.rdoproject.org/r/c/rdo-jobs/+/56389
Required-By: https://review.rdoproject.org/r/c/rdo-jobs/+/56116

Jira: #OSPRH-6548

@bogdando bogdando requested a review from jistr November 25, 2024 12:48
@bogdando bogdando force-pushed the multi_cell_database branch from 3323bbf to a1207eb Compare November 25, 2024 12:49
@bogdando bogdando changed the title Multi cell database Multi cell database/mq adoption Nov 25, 2024
@bogdando
Contributor Author

bogdando commented Nov 25, 2024

@bogdando bogdando mentioned this pull request Nov 25, 2024
@bogdando bogdando added the check-before-merge/depends-on Don't forget to check depends-on before merging label Nov 25, 2024

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/ff2fa0079d2d4d0f9d107db215021a87

✔️ noop SUCCESS in 0s
adoption-standalone-to-crc-ceph FAILURE in 1h 34m 52s
adoption-standalone-to-crc-no-ceph FAILURE in 1h 40m 01s
✔️ adoption-docs-preview SUCCESS in 1m 17s

@bogdando bogdando force-pushed the multi_cell_database branch from 0ba8ce7 to cd5b8f9 Compare November 26, 2024 12:54

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/2bf5c26d9ce042488b038c87505a0f82

✔️ noop SUCCESS in 0s
adoption-standalone-to-crc-ceph FAILURE in 1h 38m 37s
adoption-standalone-to-crc-no-ceph FAILURE in 1h 44m 16s
✔️ adoption-docs-preview SUCCESS in 1m 31s

@bogdando bogdando force-pushed the multi_cell_database branch 3 times, most recently from c392e11 to 006e247 Compare November 27, 2024 13:21

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/ef997d9bd22a422fa0a81831c10ce76f

✔️ noop SUCCESS in 0s
adoption-standalone-to-crc-ceph POST_FAILURE in 1h 40m 25s
adoption-standalone-to-crc-no-ceph RETRY_LIMIT in 10m 21s
✔️ adoption-docs-preview SUCCESS in 1m 21s

@bogdando bogdando force-pushed the multi_cell_database branch from 006e247 to 55aec9f Compare November 29, 2024 15:36

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/221b8eaa480445db948ba454fc1dc6d9

✔️ noop SUCCESS in 0s
adoption-standalone-to-crc-ceph FAILURE in 1h 51m 13s
adoption-standalone-to-crc-no-ceph FAILURE in 53m 45s
adoption-docs-preview FAILURE in 1m 15s

@bogdando
Contributor Author

bogdando commented Dec 2, 2024

recheck

@bogdando bogdando force-pushed the multi_cell_database branch from 55aec9f to ee3defd Compare December 2, 2024 12:27

This change depends on a change that failed to merge.

Change https://review.rdoproject.org/r/c/rdo-jobs/+/53192 is needed.

@bogdando bogdando force-pushed the multi_cell_database branch from ee3defd to 03ce37c Compare December 4, 2024 13:04

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/28720608befb44c49d0f76f95466684b

✔️ noop SUCCESS in 0s
adoption-standalone-to-crc-ceph RETRY_LIMIT in 48m 27s
adoption-standalone-to-crc-no-ceph RETRY_LIMIT in 48m 06s
adoption-docs-preview FAILURE in 1m 15s

@bogdando bogdando force-pushed the multi_cell_database branch 3 times, most recently from 505e4e8 to 8d81b48 Compare December 6, 2024 13:59
@bogdando bogdando force-pushed the multi_cell_database branch from 8d81b48 to cdff544 Compare December 6, 2024 14:20

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/513df93417f943f8bc8b6d3849192150

✔️ noop SUCCESS in 0s
adoption-standalone-to-crc-ceph FAILURE in 1h 42m 09s
adoption-standalone-to-crc-no-ceph FAILURE in 1h 44m 20s
✔️ adoption-docs-preview SUCCESS in 1m 28s

bogdando and others added 4 commits February 10, 2025 14:34
Assume a single cell1 for now.

Remove the edpm_computes and computes env vars
from tests as they are not multi-cell aware and should no longer be
needed. The docs still use that env var; it will be removed in the
multi-cell adoption follow up, where we also cover EDPM multi-cell
adoption.

This is required as the rdo-jobs dependency introduces that
change for edpm_nodes and provides a common base for this and future
multi-cell follow ups.

Signed-off-by: Bohdan Dobrelia <[email protected]>
Signed-off-by: Bohdan Dobrelia <[email protected]>
@bogdando bogdando force-pushed the multi_cell_database branch from ed36f41 to b6545ed Compare February 10, 2025 13:34

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/09a8bc90c11c465a8651d8c59c6b2246

✔️ noop SUCCESS in 0s
adoption-standalone-to-crc-ceph FAILURE in 1h 38m 53s
adoption-standalone-to-crc-no-ceph FAILURE in 47m 31s
✔️ adoption-docs-preview SUCCESS in 1m 15s

@bogdando
Contributor Author

https://review.rdoproject.org/r/c/testproject/+/54199 has

periodic-adoption-multinode-to-crc-no-ceph https://softwarefactory-project.io/zuul/t/rdoproject.org/build/4560c2e175f0410591ddd23a3540c631 : FAILURE in 3h 42m 49s (swift migration post-adoption)
periodic-adoption-multinode-to-crc-ceph https://softwarefactory-project.io/zuul/t/rdoproject.org/build/f9109ebc2039469abaf4985eb686c097 : SUCCESS in 4h 28m 08s

@bogdando
Contributor Author

recheck "Trying to pull quay.io/freeipa/freeipa-server:fedora-41.."

@bogdando bogdando removed the check-before-merge/depends-on Don't forget to check depends-on before merging label Feb 11, 2025

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/dea0d03a0ebf48d6aff5c5faee49eb61

✔️ noop SUCCESS in 0s
adoption-standalone-to-crc-ceph FAILURE in 1h 43m 28s
adoption-standalone-to-crc-no-ceph FAILURE in 1h 46m 47s
✔️ adoption-docs-preview SUCCESS in 1m 46s

@bogdando bogdando added check-before-merge/depends-on Don't forget to check depends-on before merging do-not-merge/work-in-progress labels Feb 11, 2025
@bogdando
Contributor Author

standalone CI jobs need the same adjustments as I did for multinode

@bogdando
Contributor Author

recheck


Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://softwarefactory-project.io/zuul/t/rdoproject.org/buildset/50a66f96f54749d9843a3493e1b9e8ee

✔️ noop SUCCESS in 0s
adoption-standalone-to-crc-ceph FAILURE in 3h 16m 10s
✔️ adoption-standalone-to-crc-no-ceph SUCCESS in 3h 15m 10s
✔️ adoption-docs-preview SUCCESS in 1m 18s

@bogdando
Contributor Author

recheck Keystone Connection aborted

@bogdando
Contributor Author

please merge

@jistr
Contributor

jistr commented Feb 18, 2025

This is large and it makes the user-facing workflow around DB copy pretty complex, but my +2 stands as i don't think we can do considerably better with the current higher level approach of sticking to documented procedure only.

I would be interested in more context around the cell renaming (why is it necessary) but that may be best for a f2f call.

I would like to get a review from one more core dev and then we can merge.

@jistr
Contributor

jistr commented Feb 18, 2025

The history here is long but after unwrapping it in the Github UI i see this was already reviewed by me, Katie, Pini and Gibi. Going for the merge.

@jistr
Contributor

jistr commented Feb 18, 2025

/approve


openshift-ci bot commented Feb 18, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jistr

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@jistr jistr removed the check-before-merge/depends-on Don't forget to check depends-on before merging label Feb 18, 2025
@jistr
Contributor

jistr commented Feb 18, 2025

/lgtm

@openshift-ci openshift-ci bot added the lgtm label Feb 18, 2025
@openshift-merge-bot openshift-merge-bot bot merged commit 217f280 into openstack-k8s-operators:main Feb 18, 2025
6 checks passed
@bogdando bogdando deleted the multi_cell_database branch February 18, 2025 15:18