NO-ISSUE: Fix OKD podman deploy #6588

Merged
merged 1 commit into from
Jul 19, 2024

Conversation

mlorenzofr

This PR fixes the deploy of OKD using podman. More info in the issue #6562

List all the issues related to this PR

  • New Feature
  • Enhancement
  • Bug fix
  • Tests
  • Documentation
  • CI/CD

What environments does this code impact?

  • Automation (CI, tools, etc)
  • Cloud
  • Operator Managed Deployments
  • None

How was this code tested?

  • assisted-test-infra environment
  • dev-scripts environment
  • Reviewer's test appreciated
  • Waiting for CI to do a full test run
  • Manual (Elaborate on how it was tested)
  • No tests needed

Checklist

  • Title and description added to both, commit and PR.
  • Relevant issues have been associated (see CONTRIBUTING guide)
  • This change does not require a documentation update (docstring, docs, README, etc)
  • Does this change include unit-tests (note that code changes require unit-tests)

Reviewers Checklist

  • Are the title and description (in both PR and commit) meaningful and clear?
  • Is there a bug required (and linked) for this change?
  • Should this PR be backported?

@openshift-ci openshift-ci bot added the size/S Denotes a PR that changes 10-29 lines, ignoring generated files. label Jul 18, 2024
@rccrdpccl

/lgtm
/approve

@openshift-ci openshift-ci bot added lgtm Indicates that a PR is ready to be merged. approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Jul 18, 2024
@adriengentil adriengentil left a comment

/retitle NO-ISSUE: Fix OKD podman deploy

@openshift-ci openshift-ci bot changed the title Fix OKD podman deploy NO-ISSUE: Fix OKD podman deploy Jul 18, 2024
@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Jul 18, 2024
@openshift-ci-robot

@mlorenzofr: This pull request explicitly references no jira issue.


openshift-ci bot commented Jul 18, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: adriengentil, mlorenzofr, rccrdpccl

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [adriengentil,rccrdpccl]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

codecov bot commented Jul 18, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 68.58%. Comparing base (6440103) to head (e5b69c7).
Report is 5 commits behind head on master.

@@            Coverage Diff             @@
##           master    #6588      +/-   ##
==========================================
+ Coverage   68.57%   68.58%   +0.01%     
==========================================
  Files         247      246       -1     
  Lines       36705    36687      -18     
==========================================
- Hits        25169    25163       -6     
+ Misses       9294     9288       -6     
+ Partials     2242     2236       -6     

see 3 files with indirect coverage changes

@openshift-ci-robot

/retest-required

Remaining retests: 0 against base HEAD d4e710f and 2 for PR HEAD e5b69c7 in total

@openshift-ci-robot

/retest-required

Remaining retests: 0 against base HEAD 43ec783 and 1 for PR HEAD e5b69c7 in total

@rccrdpccl

/retest

@adriengentil

/retest

@openshift-ci-robot

/retest-required

Remaining retests: 0 against base HEAD 250a70e and 0 for PR HEAD e5b69c7 in total

openshift-ci bot commented Jul 19, 2024

@mlorenzofr: all tests passed!


@openshift-merge-bot openshift-merge-bot bot merged commit 704d2af into openshift:master Jul 19, 2024
22 checks passed
@openshift-bot

[ART PR BUILD NOTIFIER]

Distgit: ose-agent-installer-api-server
This PR has been included in build ose-agent-installer-api-server-container-v4.17.0-202407200311.p0.g704d2af.assembly.stream.el9.
All builds following this will include this PR.

@mlorenzofr mlorenzofr deleted the okd-podman-readme branch July 22, 2024 08:34
eifrach pushed a commit to eifrach/assisted-service that referenced this pull request Aug 4, 2024
danmanor added a commit to danmanor/assisted-service that referenced this pull request Sep 28, 2024
danmanor added a commit to danmanor/assisted-service that referenced this pull request Sep 28, 2024
openshift-merge-bot bot pushed a commit that referenced this pull request Oct 6, 2024
…35.0, Cherry pick commits for hotfix v2.35.1 (#6867)

* NO-ISSUE: Exclude vendor directory from snyk code analysis (#6572)

* MGMT-18333: Remove the python client from the image (#6579)

* Add certs to ingress when deploying in non-OCP clusters (#6564)

Resolves https://issues.redhat.com/browse/MGMT-18305

* Refactor package in generator_test.go to use _test suffix for implementing black-box testing methodology (#6578)

* KONFLUX-1611: Add labels, licenses & user to Dockerfile (#6484)

* NO-ISSUE: [master] Bump OCP versions: 4.16, 4.14, 4.13 (#6584)

Co-authored-by: danmanor <[email protected]>

* Remove unused extracter struct (#6586)

All the operator implementations take this as input, but none use it.

* MGMT-18155: handle state of day2 node in Done stage (#6570)

When installing a day2 node, in some scenarios, the process
can get stuck in the 'installing-pending-user-action' state
while the node is in the Done stage.
E.g. after fixing an invalid certificate that failed verification
when trying to reach the cluster using kubeconfig.

Thus, added 'InstallingPendingUserAction' as a SourceState
for the relevant TransitionRule (which should move state
to the final 'AddedToExistingCluster' DestinationState).
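
The rule change above can be pictured with a minimal Go sketch. The `transitionRule` type, its `next` method and `day2DoneRule` are hypothetical names for illustration, not the project's state machine API; the `installing` source state and the string value of `stateAddedToExistingCluster` are assumptions as well.

```go
package main

import "fmt"

// State names. "installing-pending-user-action" comes from the commit
// message; the other two values are assumptions made for this sketch.
const (
	stateInstalling                  = "installing"
	stateInstallingPendingUserAction = "installing-pending-user-action"
	stateAddedToExistingCluster      = "added-to-existing-cluster"
)

// transitionRule is a hypothetical, simplified rule: when the current state
// is one of SourceStates, the host may move to DestinationState.
type transitionRule struct {
	SourceStates     []string
	DestinationState string
}

// next returns the destination state and true when the rule applies to the
// current state; otherwise it returns the current state and false.
func (r transitionRule) next(current string) (string, bool) {
	for _, s := range r.SourceStates {
		if s == current {
			return r.DestinationState, true
		}
	}
	return current, false
}

// day2DoneRule sketches the rule after the fix: the pending-user-action
// state is listed as a source state, so a host in that state can finish.
func day2DoneRule() transitionRule {
	return transitionRule{
		SourceStates:     []string{stateInstalling, stateInstallingPendingUserAction},
		DestinationState: stateAddedToExistingCluster,
	}
}

func main() {
	dst, ok := day2DoneRule().next(stateInstallingPendingUserAction)
	fmt.Println(dst, ok)
}
```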

* fix: dev-requirements.txt to reduce vulnerabilities (#6576)

The following vulnerabilities are fixed by pinning transitive dependencies:
- https://snyk.io/vuln/SNYK-PYTHON-SETUPTOOLS-7448482

Co-authored-by: snyk-bot <[email protected]>

* NO-ISSUE: Fix python client generation (#6593)

The naming of the python package changed since the setuptools upgrade in
#6576.

* MGMT-18411: Add NTP sources to discovery environment (#6591)

Currently when the user provides additional NTP sources for the
installation they are used by the installed cluster, but not by the
discovery environment. That means that the system clock of the discovery
environment may be wrong if the default NTP servers can't be reached,
and as result it may fail to pull images because the system clock may be
before the validity date of the certificates of the registry server. To
avoid that this patch changes the ignition of the discovery environment
so that it will also use the NTP sources provided by the user.

Related: https://issues.redhat.com/browse/MGMT-18411

Signed-off-by: Juan Hernandez <[email protected]>

* Fix OKD podman deploy (#6588)

* Revert "Add certs to ingress when deploying in non-OCP clusters (#6564)" (#6590)

This reverts commit 77219ee.

When using BMO to deploy hosts the ironic server didn't trust the certs
used for the ingress. This failed the deployment.

Revert the cert addition while we investigate a different way to handle
this.

Related to https://issues.redhat.com/browse/MGMT-18418

* NO-ISSUE: Adjust permissions on /data directory (#6587)

Currently, the assisted service fails to start using podman with a
permission error:

```
Unable to create directory for file data /data/0e4d8f88-9eb9-441c-84e5-7f4569be78e0/manifests/openshift/50-masters-chrony-configuration.yaml: mkdir /data/0e4d8f88-9eb9-441c-84e5-7f4569be78e0: permission denied" pkg=Inventory
```

This change ensures that /data is writable by the UID specified in the
Dockerfile, and when deployed on OCP, we allow the group "root" to write
because the UID will be randomly selected.

https://docs.openshift.com/container-platform/4.16/openshift_images/create-images.html#use-uid_create-images

* Add -L option to curl command to follow redirects (#6594)

* NO-ISSUE: [master] Bump OCP versions: 4.15 (#6598)

Co-authored-by: danmanor <[email protected]>

* MGMT-17773: Enforce go modules tidy (#6600)

* MGMT-18448: Allow local cluster import to be disabled. (#6606)

We need to make sure that the local cluster import may be disabled.
The import should be switched off by default and only overridden with an annotation.

This PR implements this functionality.

If the local cluster import functionality is disabled then all local cluster entities controlled
by assisted will be left intact, we will simply skip the reconcile loop.

Customers who need to enable the feature may do so by placing an annotation on the AgentServiceConfig.

* MGMT-18446: Correct hostname max length validation (#6604)

https://issues.redhat.com/browse/MGMT-18446
Discovered that Kubelet fails when a hostname is longer
than 63 characters (rather than the 64 originally assumed). Updates
the validation to only allow hostnames with 63 characters
or fewer.
More details https://access.redhat.com/solutions/7068042
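
A minimal Go sketch of the length check described above. The function name `validateHostnameLength` and the error wording are hypothetical; only the 63-character limit comes from the commit message.

```go
package main

import (
	"fmt"
	"strings"
)

// maxHostnameLength is the limit stated in the commit message above.
const maxHostnameLength = 63

// validateHostnameLength is a hypothetical helper that rejects hostnames
// longer than maxHostnameLength characters.
func validateHostnameLength(hostname string) error {
	if len(hostname) > maxHostnameLength {
		return fmt.Errorf("hostname is %d characters long, the maximum is %d",
			len(hostname), maxHostnameLength)
	}
	return nil
}

func main() {
	// 63 characters pass, 64 are rejected.
	fmt.Println(validateHostnameLength(strings.Repeat("a", 63)))
	fmt.Println(validateHostnameLength(strings.Repeat("a", 64)))
}
```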

* NO-ISSUE: [master] Bump OCP versions: 4.17 (#6611)

Co-authored-by: danmanor <[email protected]>

* NO-ISSUE: Tidy golang dependencies before vendoring in base image (#6599)

* Revert "MGMT-18127: User name and password in a proxy url should be url encod…" (#6619)

This reverts commit 66a5b41.

* MGMT-18451: Enable debugging assisted-service on kind (#6613)

Co-authored-by: root <[email protected]>

* NO-ISSUE: [master] Bump OCP versions: 4.16 (#6621)

Co-authored-by: danmanor <[email protected]>

* MGMT-16242: Suggest OCP images based on availability for architecture (#6262)

When creating an infraenv, a validation is performed to ensure that the release images contain a compatible image for a given OpenShift version.
This is based on the MajorMinor version of the desired OpenShift version and the MajorMinor version of the release image OpenShift version.
This is performed within the context of a specific architecture.

The error message presented to the user when no suitable release image could be isolated is not helpful and needs to contain more clues on how the user might resolve the issue.

This PR will list available OpenShift versions supported in the release images for the user's architecture when this error occurs.

* NO-ISSUE: Use skipper to build images as debug image requires nmstate packages for building (#6629)

* AGENT-930: For the agent installer, parse the expiration time from the token and verify if the token is valid. (#6605)

* NO-ISSUE: [master] Bump OCP versions: 4.16, 4.15, 4.12 (#6638)

Co-authored-by: danmanor <[email protected]>

* MGMT-18505: Fix installation from a 4.17 hub with converged flow (#6639)

* This allows ironic inspector URL to be missing in ICC config secret.

In 4.17 this is expected to be missing as the inspector service has been
removed.

If this URL is provided the agent will attempt to contact the inspector
service when it shouldn't causing the install to fail.

In earlier versions the secret will not be present so the controller
will continue to provide the inspector service URL as before.

Resolves https://issues.redhat.com/browse/OCPBUGS-37472

* Update image-customization-controller to release-4.16 branch

This includes a patch which removes the default for the inspector URL.
This is required because when deploying from a 4.17 hub the inspector
URL will not be present in the information on the cluster and we don't
want that URL to be set in the ignition.

* MGMT-18231: Block iSCSI as an installation disk when its holder is multipath (#6612)

* NO-ISSUE: [master] Bump OCP versions: 4.14, 4.16, 4.13 (#6644)

Co-authored-by: danmanor <[email protected]>

* MGMT-18313: Replace golang base image as it is based on Centos Linux 7 (#6637)

* MGMT-17560: validation to support iSCSI boot volume (#6434)

* MGMT-17560: validation to support iSCSI boot volume

consume inventory directly in IsDiskEligible

* refactor AddressFamily

Move it to common package in order to avoid import cycle.

* use netip package and change error message

* use lo instead of funk

* make disk eligible if no default route is found

* fix tests

* NO-ISSUE: Print subsystem environment variables before test (#6654)

* NO-ISSUE: Add missing kube api flag on kube api subsystem test (#6655)

* NO-ISSUE: [master] Bump OCP versions: 4.16 (#6658)

Co-authored-by: danmanor <[email protected]>

* MGMT-17560: Append kargs for iSCSI boot (#6602)

* MGMT-17560: Append kargs for iSCSI boot

Set DHCP on the network interface used for the iSCSI network when
installing RHCOS on the disk. This is required to allow boot from an
iSCSI volume.

* check disk if nil

* use netip and lo

* refactor append kargs for storage

* add warning when no installation disk found

* add more tests

* NO-ISSUE: Fix debug image by adding required packages (#6659)

* NO-ISSUE: [master] Bump OCP versions: 4.14, 4.15 (#6663)

Co-authored-by: danmanor <[email protected]>

* MGMT-18560: Fix AutomatedCleaningMode behavior (#6662)

AutomatedCleaningMode should not be used when the
converged flow is disabled since it requires IPA
which is only enabled when the converged flow is enabled.

Caused by regression in PR
#5319
which relies on users to set the automatedCleaningMode
spec, but did not take into account the converged flow
being disabled.

* OCPBUGS-27238: Use both the OCP cluster trusted certs and user certs (#6649)

* Use both the OCP cluster trusted certs and user certs

Previously when a user provided mirror registry certs the
assisted-service pod would be deployed in such a way that those would be
the _only_ certs trusted by most commands running on the pod.

This would cause issues when, for example, the spoke cluster release
image is mirrored internally, but the hub cluster image is not.

This was the case in https://issues.redhat.com/browse/OCPBUGS-27238
where assisted-service failed to pull the hub cluster release image
because it didn't trust a certificate it otherwise should have.

To address this the infrastructure-operator creates a configmap which is
labeled such that the cluster network operator will inject the public
CA bundle into it as described in [1]. This content is then merged with
the user-provided content (if any is provided) into a third configmap
which is mounted into the assisted-service container.

[1] https://docs.openshift.com/container-platform/4.16/networking/configuring-a-custom-pki.html#certificate-injection-using-operators_configuring-a-custom-pki

* Doc the new mirror registry CA bundle behavior

* MGMT-18514: Calculate machine networks in external platform (#6661)

* MGMT-18514: Calculate node-ip in external platform

Since external platform relies on user managed networking, we need the
feature introduced in
#6257 to determine the
machine networks.

This change is useful when using iSCSI boot drive on Oracle Cloud
Infrastructure, in order to select the machine network that doesn't
belong to the iSCSI subnet.

* refactor

* MGMT-17560: Workaround missing DNS on iSCSI (#6603)

* MGMT-17560: Workaround missing DNS on iSCSI

When iSCSI boot is enabled, DNS configuration will be missing on first
boot, even though the information was advertised by DHCP:
https://issues.redhat.com/browse/OCPBUGS-26580.

The workaround consists of re-applying the network configuration on all
the network interfaces when we detect that `/sysroot` is mounted from an
iSCSI block volume.

* fix typo

* add workaround only if one machine uses iscsi boot drive

* check error message in unit tests

* refactor unit-tests to not repeat host creation

* NO-ISSUE: Change debug Dockerfile so it will not require prior actions (#6674)

* MGMT-18121: Configure networking when using ISCSI over OCI (#6665)

* MGMT-18121: OCI network configuration script

* MGMT-18122: Copy network configuration when ISCSI on OCI is in use

* add JSON example returned by vnic endpoint

* refactor static networking

* check if inventory is not nil

* fix path and trigger after NetworkManager

* --copy-network doesn't work

* add unit test

* fix typo

* Update internal/ignition/templates/iscsi-oci-configure-secondary-nic.service

Co-authored-by: Eran Ifrach <[email protected]>

* refactor constructHostInstallerArgs

* re-wrap comment

---------

Co-authored-by: Eran Ifrach <[email protected]>

* NO-ISSUE: [master] Bump OCP versions: 4.16, 4.17 (#6676)

Co-authored-by: danmanor <[email protected]>

* Wait for host to deprovision in BMAC (#6666)

Only BMAC should be aware of the BMH and its status. When deprovisioning
a node by deleting a BMH, make BMAC wait for the node to fully
deprovision before annotating and deleting the agent and removing the
finalizer.

This adds a new annotation for the agent resource
(`agent.agent-install/clean-spoke-on-delete`) which is used in place of the
bmac annotation to communicate to the agent controller that the node should
be removed.

This gets us a bit closer to https://issues.redhat.com/browse/MGMT-10006
by removing the BMH concept from the "remove a node" flow in the agent
controller. This will allow this logic to be more easily reused in the
late-binding/non-bmh case.

* Reapply "Add certs to ingress when deploying in non-OCP clusters (#6564)" (#6685)

This reverts commit 9964f08.

This re-adds the ingress certs now that the CAPI provider is using the
internal service IP for the BMH.

Resolves https://issues.redhat.com/browse/MGMT-18418

* MGMT-18378:  allow CNV on ARM Dev preview (#6645)

* [MGMT-18378] - making test more readable

* [MGMT-18378] - fix issue where LVM is not default for CNV

* [MGMT-18378] - enable CNV ARM as dev preview

* MGMT-1612: Allow a slight deviation from official host minimum memory (#6660)

The internal host validator tolerates a slight deviation from official host minimum memory, because some setups (e.g. VMs) might give just a bit less than requested and we shouldn't be too strict about it
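
A minimal Go sketch of a tolerance check like the one described above. The helper name `meetsMinimumMemory` and the tolerance value are assumptions for illustration; the commit message does not state how large the allowed deviation is.

```go
package main

import "fmt"

// meetsMinimumMemory reports whether availableMiB satisfies requiredMiB once
// a tolerance (in MiB) is subtracted from the requirement. This is a
// hypothetical helper, not the validator's real code.
func meetsMinimumMemory(availableMiB, requiredMiB, toleranceMiB int64) bool {
	return availableMiB >= requiredMiB-toleranceMiB
}

func main() {
	const requiredMiB = 16384 // 16 GiB, illustrative requirement
	const toleranceMiB = 100  // assumed tolerance, not the project's value

	// A VM that reports slightly less than 16 GiB still passes.
	fmt.Println(meetsMinimumMemory(16300, requiredMiB, toleranceMiB))
	// A host well below the requirement does not.
	fmt.Println(meetsMinimumMemory(16000, requiredMiB, toleranceMiB))
}
```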

* MGMT-18384 bump golang 1.21 (#6667)

* NO-ISSUE: add MCE to OLM operators dev docs (#6689)

* add MCE to OLM operators dev docs

* Update olm-operator-plugins.md

---------

Co-authored-by: Oved Ourfali <[email protected]>

* NO-ISSUE: Support both minikube and kind for deploying on k8s cluster (#6664)

* NO-ISSUE: Soft install timeout enhancement (#6694)

* MGMT-8115: Soft install timeout

This patch adds an enhancement proposal about replacing hard timeouts
with soft timeouts, so that users can manually fix issues and
installation can resume.

Signed-off-by: Juan Hernandez <[email protected]>

* Fix typos

Signed-off-by: Juan Hernandez <[email protected]>

* MGMT-8115: Add more details about the current implementation, and suggested implementation

---------

Signed-off-by: Juan Hernandez <[email protected]>
Co-authored-by: Juan Hernandez <[email protected]>

* NO-ISSUE: change ci operator to 4.17 (#6702)

* NO-ISSUE: [master] Bump OCP versions: 4.16, 4.13, 4.14, 4.15, 4.12, 4.17 (#6706)

Co-authored-by: danmanor <[email protected]>

* Enhancement: backup/restore support (#6683)

* NO-ISSUE: Add mlorenzofr to OWNERS_ALIASES file (#6710)

* MGMT-17805: Fix MCO reboot error for s390x (#6682)

Signed-off-by: Amadeus Podvratnik <[email protected]>
Co-authored-by: Amadeus Podvratnik <[email protected]>

* NO-ISSUE: Prepare base documents and targets for devel environments using kind (#6704)

* MGMT-18659: Ensure that we use the name of local ManagedCluster for local-cluster name (#6696)

When performing a local cluster import - the local cluster name is hardwired to "local-cluster", this is not flexible enough and needs to be changed.
We should be picking up the name from the ManagedCluster that is labelled as "local-cluster" - this should then be used wherever we use the present local-cluster name.
An additional check is performed to ensure that the clusterID referenced in the labels of ManagedCluster matches the clusterID in the labels found in the
clusterVersion of the hub on which the LocalClusterImport is running.

This should be sufficient at this stage as we do not need to handle scenarios such as a "rename", mainly because ACM will be enforcing some rules

From ACM-DDR-022:

To ensure backward compatibility, we will not directly allow renaming the local cluster from MCE or ACM installer at the first stage. The user can choose a customized local cluster name with the following step:
- Disable local cluster management in MCE or ACM when install
- Manually create a ManagedCluster resource with the label “local-cluster=true” on the hub

I have also added some entity cleanup that was missed in previous versions, mainly just making sure that unused secrets are deleted.

* NO-ISSUE: [master] Bump OCP versions: 4.12, 4.16, 4.17, 4.15 (#6713)

Co-authored-by: danmanor <[email protected]>

* MGMT-18545: Prefer multipath disks when selecting install disk (#6675)

Multipath disks now include the WWN (MGMT-17867) hint and are
included in the list of acceptable install disks. The operator-based
assisted-service will now prefer multipath disk selection if there
is an acceptable multipath disk.
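
A minimal Go sketch of the preference described above: pick an eligible multipath disk when one exists, otherwise fall back to the first eligible disk. The `disk` type, its fields and `selectInstallDisk` are hypothetical and much simpler than the project's inventory model.

```go
package main

import "fmt"

// disk is a hypothetical, reduced view of an inventory disk.
type disk struct {
	Name      string
	DriveType string // e.g. "Multipath", "HDD", "SSD" (illustrative values)
	Eligible  bool   // true when the disk is acceptable for installation
}

// selectInstallDisk prefers an eligible multipath disk and falls back to the
// first eligible disk of any other type. The bool is false when no disk is
// eligible.
func selectInstallDisk(disks []disk) (disk, bool) {
	var fallback *disk
	for i := range disks {
		d := &disks[i]
		if !d.Eligible {
			continue
		}
		if d.DriveType == "Multipath" {
			return *d, true
		}
		if fallback == nil {
			fallback = d
		}
	}
	if fallback != nil {
		return *fallback, true
	}
	return disk{}, false
}

func main() {
	disks := []disk{
		{Name: "sda", DriveType: "HDD", Eligible: true},
		{Name: "dm-0", DriveType: "Multipath", Eligible: true},
	}
	chosen, ok := selectInstallDisk(disks)
	fmt.Println(chosen.Name, ok)
}
```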

* MGMT-18702: Fix error due to missing local cluster name (#6721)

When the local cluster name could not be resolved, we inadvertently try to clean it up.
This issue addresses that by ensuring we do not attempt cleanup in the absence of a name derived from ManagedCluster

* NO-ISSUE: [master] Bump OCP versions: 4.16 (#6724)

Co-authored-by: danmanor <[email protected]>

* NO-ISSUE: workaround for LSO on 4.17 (#6733)

* MGMT-10006: Cleanup spoke after unbind (#6705)

* Add deprovision info to agent status

This will signal to the agent controller in future reconciles to clean
up the referenced node from the referenced cluster when unbind has
finished.

* Make agent cleanup opt-out instead of opt-in

If this is something we always want in late-binding it makes sense for
that to be the case in bound scenarios as well.

This also removes the need for BMAC to annotate the agent just to delete
it.

* Refactor clusterExists and removeSpokeResources so they don't require a bound agent

The agent won't be related to the cluster directly any more when these
functions need to be used post-unbind. Take the cluster reference or
client object as a parameter instead.

* Remove spoke nodes when unbinding

When an agent is fully unbound and booted back into the discovery image,
use the deprovision information in the agent status to connect to the
cluster the agent was removed from and clear out the node, machine, and
BMH resources.

Resolves https://issues.redhat.com/browse/MGMT-10006

* Add docs on resource removal and skip annotation

* MGMT-18645: Ensure CA bundle exists before assisted-service (#6734)

* MGMT-18645: Ensure CA bundle exists before assisted-service

A race condition exists where the assisted-service deployment mounts
the assisted-service trusted ca bundle config map before it contains
anything. This can cause the certificate validation to fail when
using mirror registries because the ca bundle is empty.

This adds a check to make sure the ca bundle config map contains a value
before creating the assisted-service deployment.
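
A minimal Go sketch of the readiness check described above, working on the config map's data rather than a live cluster. The helper `caBundleReady` is hypothetical, and the key name `ca-bundle.crt` is an assumption about how the bundle is stored.

```go
package main

import (
	"fmt"
	"strings"
)

// caBundleKey is the assumed key under which the CA bundle is stored.
const caBundleKey = "ca-bundle.crt"

// caBundleReady is a hypothetical helper: it reports whether the config map
// data holds a non-empty value for the CA bundle key. A caller would delay
// creating the deployment until this returns true.
func caBundleReady(data map[string]string) bool {
	return strings.TrimSpace(data[caBundleKey]) != ""
}

func main() {
	empty := map[string]string{}
	filled := map[string]string{caBundleKey: "-----BEGIN CERTIFICATE-----\n..."}
	fmt.Println(caBundleReady(empty), caBundleReady(filled))
}
```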

* MGMT-18645: Fix unit tests

* NO-ISSUE: fixed destroy target in running-test doc (#6741)

Replaced the missing `destroy-kind-cluster` target
with the new `destroy-hub-cluster` target.

* OCPBUGS-36577: Switch to github.com/docker/distribution/reference to mitigate CVE-2024-3727 (#6740)

The library github.com/containers/image/v5 has a vulnerability that is as yet unresolved.

Thankfully, it is possible to change the part of the library that we use

We can change github.com/containers/image/v5/docker/reference for github.com/docker/distribution/reference

* NO-ISSUE: [master] Bump OCP versions: 4.17, 4.13, 4.15 (#6737)

Co-authored-by: danmanor <[email protected]>

* MGMT-14634: Ensure that nil values and empty values for filenames are handled correctly. (#6731)

Presently, if an empty filename is passed or if the filename is nil, the code may panic.
This PR addresses that by adding suitable validations to ensure these panics do not occur.
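
A minimal Go sketch of the kind of guard described above. The function name `validateFileName` and the error messages are hypothetical; the point is only that a nil pointer and an empty string are both rejected before the value is dereferenced or used.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// validateFileName is a hypothetical guard: it returns an error instead of
// letting a nil or empty file name reach code that would panic on it.
func validateFileName(name *string) error {
	if name == nil {
		return errors.New("file name is required")
	}
	if strings.TrimSpace(*name) == "" {
		return errors.New("file name must not be empty")
	}
	return nil
}

func main() {
	empty := ""
	fmt.Println(validateFileName(nil))
	fmt.Println(validateFileName(&empty))
}
```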

* MGMT-18694, MGMT-18575, OCPBUGS-34849: Don't require mapping for names matching physical interfaces (#6715)

* Do not require interface mapping for physical network names

* add support for identifier field, e.g. mac-address

* Move logic to check that interface map is not empty into separate function

* Add an annotation to override ironic IP family to provide in dual-stack hubs (#6686)

Currently when a hub cluster is dual-stack the only way the
preprovisioning image controller can determine which callback IP family
to provide to the ironic agent is to use the cluster reference in the
infraenv. If there is no cluster reference (for late binding cases) then
the controller always provides the primary IP family of the hub cluster.

This prevents users from using late-binding to deploy ipv6 only hosts
from an ipv4-primary dual-stack hub (or vice versa).

This commit adds an annotation on the infraenv that can be used to
override the ip family used when discovering hosts with a particular
discovery image. `infraenv.agent-install.openshift.io/ip-family` can be
set to `v4`, `v6`, or `v4,v6` to indicate to the controller which family
should be used.

In newer OCP versions the ironic agent supports multiple callback URLs
in its config, but the infrastructure-operator still supports running on
hubs that don't have this change so this annotation is still required
for those situations.
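
A minimal Go sketch of how the annotation value could be interpreted. The annotation key and the accepted values `v4`, `v6` and `v4,v6` come from the commit message; `parseIPFamilies` is a hypothetical helper, and accepting the families in either order or with surrounding spaces is an assumption of this sketch.

```go
package main

import (
	"fmt"
	"strings"
)

// ipFamilyAnnotation is the annotation named in the commit message.
const ipFamilyAnnotation = "infraenv.agent-install.openshift.io/ip-family"

// parseIPFamilies is a hypothetical helper. It reports which IP families the
// annotation requests. When the annotation is absent both results are false,
// and the caller would fall back to its default behaviour.
func parseIPFamilies(annotations map[string]string) (v4, v6 bool, err error) {
	value, ok := annotations[ipFamilyAnnotation]
	if !ok {
		return false, false, nil
	}
	for _, family := range strings.Split(value, ",") {
		switch strings.TrimSpace(family) {
		case "v4":
			v4 = true
		case "v6":
			v6 = true
		default:
			return false, false, fmt.Errorf("unsupported IP family %q in annotation %s",
				family, ipFamilyAnnotation)
		}
	}
	return v4, v6, nil
}

func main() {
	v4, v6, err := parseIPFamilies(map[string]string{ipFamilyAnnotation: "v4,v6"})
	fmt.Println(v4, v6, err)
}
```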

Partially resolves https://issues.redhat.com/browse/MGMT-18510

* NO-ISSUE: [master] Bump OCP versions: 4.15, 4.16, 4.14 (#6746)

Co-authored-by: danmanor <[email protected]>

* MGMT-18636: Handle status annotation in provisioned BMHs (#6703)

This change is introducing a solution for reprovisioned BMHs
of restored spoke clusters.
The suggested logic ensures that a 'status' annotation is
always available for BMHs in state 'provisioned'.

This follows the BMO docs[1] for moving BMHs
between hubs: the status is restored from the annotation.
I.e., when the status is available, reprovisioning of BMHs
by Ironic is prevented.

See full flow in the enhancement[2], 'Solution for BMHs reconcile' section.

[1] BMO documentation:
https://github.com/metal3-io/metal3-docs/blob/6a656b3eb195c1b09ba35fcad4d011c6cb02dbc2/docs/user-guide/src/bmo/status_annotation.md

[2] Enhancement:
#6683

* OCPBUGS-32857: Bump to latest go-jose to mitigate CVE-2024-28180 (#6743)

The current version of go-jose is vulnerable to CVE-2024-28180
This PR mitigates that by bumping to the latest version of go-jose

Additionally, we have switched to using the github repository for the library as this appears to be better maintained

* MGMT-18466: moving to security updates only (#6751)

* NO-ISSUE: Bump github.com/go-openapi/swag in /client (#6758)

Bumps [github.com/go-openapi/swag](https://github.com/go-openapi/swag) from 0.22.3 to 0.23.0.
- [Commits](go-openapi/swag@v0.22.3...v0.23.0)

---
updated-dependencies:
- dependency-name: github.com/go-openapi/swag
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* NO-ISSUE: Bump github.com/rs/cors from 1.10.1 to 1.11.1 (#6756)

Bumps [github.com/rs/cors](https://github.com/rs/cors) from 1.10.1 to 1.11.1.
- [Commits](rs/cors@v1.10.1...v1.11.1)

---
updated-dependencies:
- dependency-name: github.com/rs/cors
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* NO-ISSUE: Bump github.com/go-openapi/spec from 0.20.7 to 0.21.0 (#6757)

Bumps [github.com/go-openapi/spec](https://github.com/go-openapi/spec) from 0.20.7 to 0.21.0.
- [Commits](go-openapi/spec@v0.20.7...v0.21.0)

---
updated-dependencies:
- dependency-name: github.com/go-openapi/spec
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* CORS-3664: Authentication tech debt for agent based installer (#6697)

* CORS-3664: Authentication tech debt for agent based installer

* The expiry parameter is now a pointer (*time.Time) instead of a variadic param

* NO-ISSUE: Bump github.com/pkg/xattr from 0.4.9 to 0.4.10 (#6754)

Bumps [github.com/pkg/xattr](https://github.com/pkg/xattr) from 0.4.9 to 0.4.10.
- [Release notes](https://github.com/pkg/xattr/releases)
- [Commits](pkg/xattr@v0.4.9...v0.4.10)

---
updated-dependencies:
- dependency-name: github.com/pkg/xattr
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* MGMT-18579: Inject nmpolicy captures into the provided YAML and place it with INI file under the host-specific path (#6695)

* Inject nmpolicy captures

Inject nmpolicy captures into the provided YAML and place it with INI
file under the host-specific path

* New script using nmstate

Script for the new flow using nmstate service, along with the service configurations to execute the script for both minimal and full ISO

* Modify minimal ISO flow

Modify the initrd in minimal ISO to include nmpolicy files along with the script and relevant service, while maintaining backward compatibility with the current flow for versions earlier than 4.13

* Modify full ISO flow

Modify static network flow for full ISO along with maintaining backward compatibility with the current flow for versions earlier than 4.13

* Transition keyfiles validations to YAML files

* NO-ISSUE: Bump github.com/go-openapi/loads from 0.21.1 to 0.22.0 (#6765)

Bumps [github.com/go-openapi/loads](https://github.com/go-openapi/loads) from 0.21.1 to 0.22.0.
- [Commits](go-openapi/loads@v0.21.1...v0.22.0)

---
updated-dependencies:
- dependency-name: github.com/go-openapi/loads
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* NO-ISSUE: Bump github.com/samber/lo from 1.39.0 to 1.47.0 (#6766)

Bumps [github.com/samber/lo](https://github.com/samber/lo) from 1.39.0 to 1.47.0.
- [Release notes](https://github.com/samber/lo/releases)
- [Commits](samber/lo@v1.39.0...v1.47.0)

---
updated-dependencies:
- dependency-name: github.com/samber/lo
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* NO-ISSUE: Fix failing dependencies (#6749)

* OCPBUGS-33824: Libraries bump to mitigate CVE-2024-27289 (#6742)

The goal of this PR is to ensure that the github.com/jackc/pgx library
is no longer on 4.16.0 which is vulnerable to CVE-2024-27289

If we bump to a very recent version of pgx, we have new constraints
placed on the text encoding that a connection may have.

Perhaps this warrants a bigger investigation in a separate issue as it
would be good to be able to upgrade to the latest and greatest postgres
driver if possible.

For now, we will use a replace directive to bump pgx to a non-vulnerable
version, 4.18.3

Co-authored-by: root <[email protected]>

* MGMT-18466: grouping all other updates (#6775)

* Remove generateConfiguration call and its associated error check (#6772)

Remove the execution of generateConfiguration and its associated error check in GenerateStaticNetworkConfigDataYAML because it is not needed at this stage of the code. The validation already handles this where it is required.

* NO-ISSUE: Fix typo in pgx reference (#6774)

There is a typo in the replace statement for github.com/jackc/pgx: it should start with github.com/jackc/pgx/v4.
Without this, recent library replacements do not take place.

This PR resolves that by fixing the reference

* MGMT-18635: Restore missing Host by Agent CR (#6730)

When applying an Agent CR with a missing associated host,
the Agent CR is currently being deleted by the reconciler.
This scenario would happen in restore-to-a-new-hub flows,
since the host is a DB-only entity.

Thus, this change introduces handling of a missing host
by restoring the object into DB according to properties
from the Agent CR.
For restoring the host's status, added 'state' annotation
to the agent which is set on reconcile.

See full flow and more details in the enhancement:
https://github.com/openshift/assisted-service/blob/master/docs/enhancements/backup-restore-support.md#solution-for-deleted-agent-crs

* NO-ISSUE: Bump github.com/moby/moby (#6777)

Bumps [github.com/moby/moby](https://github.com/moby/moby) from 26.0.0+incompatible to 27.2.1+incompatible.
- [Release notes](https://github.com/moby/moby/releases)
- [Commits](moby/moby@v26.0.0...v27.2.1)

---
updated-dependencies:
- dependency-name: github.com/moby/moby
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* NO-ISSUE: Bump github.com/go-gormigrate/gormigrate/v2 (#6778)

Bumps [github.com/go-gormigrate/gormigrate/v2](https://github.com/go-gormigrate/gormigrate) from 2.0.1 to 2.1.2.
- [Release notes](https://github.com/go-gormigrate/gormigrate/releases)
- [Changelog](https://github.com/go-gormigrate/gormigrate/blob/master/CHANGELOG.md)
- [Commits](go-gormigrate/gormigrate@v2.0.1...v2.1.2)

---
updated-dependencies:
- dependency-name: github.com/go-gormigrate/gormigrate/v2
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* MGMT-16437: drop platform_is_external column (#6776)

`platform_is_external` is a leftover from the development of the external
platform. Its usage was removed with https://github.com/openshift/assisted-service/pull/5787/files#diff-8b1949772e223a1da6a2049ada2733fa506410975b241cf86cf44c7a8665bc62L5394-L5399
more than half a year ago.

* MGMT-18618: Fix local rhsso deployment (#6779)

* NO-ISSUE: Add _Dedent_ function (#6781)

This patch adds a new `common.Dedent` function that removes all common
leading white space from strings. This is intended for situations where
it is convenient to include JSON or YAML documents inside Go code, but
it is cumbersome to have very long lines or concatenation of strings.

Signed-off-by: Juan Hernandez <[email protected]>
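
As a rough illustration of the idea behind the dedent commit above, the sketch below strips the common leading whitespace from every non-blank line of a string. This is not the actual `common.Dedent` implementation; the lowercase `dedent` and `commonPrefix` helpers are hypothetical, and details such as how blank lines are treated are assumptions.

```go
package main

import (
	"fmt"
	"strings"
)

// commonPrefix returns the longest prefix shared by a and b.
func commonPrefix(a, b string) string {
	n := 0
	for n < len(a) && n < len(b) && a[n] == b[n] {
		n++
	}
	return a[:n]
}

// dedent removes the leading whitespace shared by all non-blank lines.
// Blank (whitespace-only) lines are ignored when computing the shared
// indentation and are emitted as empty lines (an assumption of this sketch).
func dedent(s string) string {
	lines := strings.Split(s, "\n")
	prefix := ""
	first := true
	for _, line := range lines {
		if strings.TrimSpace(line) == "" {
			continue
		}
		indent := line[:len(line)-len(strings.TrimLeft(line, " \t"))]
		if first {
			prefix = indent
			first = false
			continue
		}
		prefix = commonPrefix(prefix, indent)
	}
	for i, line := range lines {
		if strings.TrimSpace(line) == "" {
			lines[i] = ""
			continue
		}
		lines[i] = strings.TrimPrefix(line, prefix)
	}
	return strings.Join(lines, "\n")
}

func main() {
	// A YAML document embedded in Go code with extra indentation.
	doc := dedent(`
        interfaces:
          - name: eth0
            state: up
    `)
	fmt.Print(doc)
}
```

The embedded document keeps its relative indentation, so nested YAML keys stay nested after the shared indentation is removed.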

* NO-ISSUE: Bump github.com/go-openapi/validate from 0.22.0 to 0.24.0 (#6767)

Bumps [github.com/go-openapi/validate](https://github.com/go-openapi/validate) from 0.22.0 to 0.24.0.
- [Commits](go-openapi/validate@v0.22.0...v0.24.0)

---
updated-dependencies:
- dependency-name: github.com/go-openapi/validate
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* NO-ISSUE: Fix cloud_hotfix_releases konflux build (#6624)

* MGMT-19006, MGMT-19004, MGMT-19007: remove default LVM for CNV deployments (#6800)

* MGMT-19001, MGMT-19005: Add dhcp and autoconf fields if missing in nmstate YAML + fix the nmstate flow bug on archs other than x86 (#6798)

* Add dhcp and autoconf fields if missing in nmstate YAML

* Fix the nmstate flow bug on architectures other than x86

* Log nmstate to journal

* MGMT-19029: Fix issue where validation failed on CNV with ARM host (#6845)

* Use old static network flow if OCP version is empty (#6848)

* fix failing kubeapi subsystem test invalid NMstate YAML (#6847)

* add feature flag min ocp version for nmstate service flow (#6842)
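
The version gating described in the nmstate commits above (old flow for versions earlier than 4.13, and old flow when the OCP version is empty) could be sketched as follows. The function name `useNmstateServiceFlow` and the hard-coded threshold are assumptions made for illustration; the commit title suggests the real minimum version is tied to a feature flag, and the actual parsing logic may differ.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// Assumed threshold, taken from the commit text ("versions earlier than 4.13").
const (
	minMajor = 4
	minMinor = 13
)

// leadingDigits returns the run of ASCII digits at the start of s.
func leadingDigits(s string) string {
	i := strings.IndexFunc(s, func(r rune) bool { return r < '0' || r > '9' })
	if i >= 0 {
		return s[:i]
	}
	return s
}

// useNmstateServiceFlow is a hypothetical helper: it reports whether the new
// nmstate service flow applies. An empty or unparsable version falls back to
// the old static network flow.
func useNmstateServiceFlow(ocpVersion string) bool {
	if ocpVersion == "" {
		return false
	}
	parts := strings.SplitN(ocpVersion, ".", 3)
	if len(parts) < 2 {
		return false
	}
	major, err := strconv.Atoi(parts[0])
	if err != nil {
		return false
	}
	minor, err := strconv.Atoi(leadingDigits(parts[1]))
	if err != nil {
		return false
	}
	return major > minMajor || (major == minMajor && minor >= minMinor)
}

func main() {
	for _, v := range []string{"", "4.12.5", "4.13", "4.18.0-rc.1"} {
		fmt.Printf("%q -> new flow: %v\n", v, useNmstateServiceFlow(v))
	}
}
```

Falling back to the old flow on an empty or malformed version keeps the existing behavior as the default, which matches the backward-compatibility goal stated in the commits.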

* MGMT-18955: Add 4.18 RHCOS / OCP images (#6786)

---------

Signed-off-by: Juan Hernandez <[email protected]>
Signed-off-by: Amadeus Podvratnik <[email protected]>
Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: Adrien Gentil <[email protected]>
Co-authored-by: Nick Carboni <[email protected]>
Co-authored-by: Linoy Hadad <[email protected]>
Co-authored-by: Matthieu Bernardin <[email protected]>
Co-authored-by: danmanor <[email protected]>
Co-authored-by: Daniel Erez <[email protected]>
Co-authored-by: Liat Gamliel <[email protected]>
Co-authored-by: snyk-bot <[email protected]>
Co-authored-by: Juan Hernández <[email protected]>
Co-authored-by: Manuel Lorenzo <[email protected]>
Co-authored-by: Liangquan Li <[email protected]>
Co-authored-by: Paul Maidment <[email protected]>
Co-authored-by: Crystal Chun <[email protected]>
Co-authored-by: Alona Paz <[email protected]>
Co-authored-by: root <[email protected]>
Co-authored-by: Pawan Pinjarkar <[email protected]>
Co-authored-by: Elior Erez <[email protected]>
Co-authored-by: Eran Ifrach <[email protected]>
Co-authored-by: Gilad Ravid <[email protected]>
Co-authored-by: Oved Ourfali <[email protected]>
Co-authored-by: Oved Ourfali <[email protected]>
Co-authored-by: Ori Amizur <[email protected]>
Co-authored-by: Manuel Lorenzo <[email protected]>
Co-authored-by: Amadeuds Podvratnik <[email protected]>
Co-authored-by: Amadeus Podvratnik <[email protected]>
Co-authored-by: Bob Fournier <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: root <[email protected]>
Co-authored-by: David Asulin <[email protected]>
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. lgtm Indicates that a PR is ready to be merged. size/S Denotes a PR that changes 10-29 lines, ignoring generated files.

5 participants