
ROX-20757 scanner multi arch builds #1574

Merged
merged 22 commits into from
Jul 31, 2024

Conversation

Stringy
Contributor

@Stringy Stringy commented Jul 19, 2024

Builds on #1573, adding multi-arch builds (s390x, ppc64le, arm64) to scanner pipeline


openshift-ci bot commented Jul 19, 2024

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@Stringy Stringy marked this pull request as ready for review July 22, 2024 08:05
@Stringy Stringy requested a review from a team as a code owner July 22, 2024 08:05
Contributor

@tommartensen tommartensen left a comment


I see the pipeline passed; what other validation can we have?
Run the images on IBM arch VMs?

@Stringy Stringy force-pushed the giles/konflux/ROX-20757-scanner-multi-arch-builds branch from fdd68be to 7b6fa0c Compare July 22, 2024 13:21
Base automatically changed from giles/konflux/ROX-25321-migrate-scanner-to-oci-artifacts to master July 22, 2024 15:02
@Stringy Stringy force-pushed the giles/konflux/ROX-20757-scanner-multi-arch-builds branch from c7e4692 to 1633da6 Compare July 22, 2024 15:04
@@ -250,15 +252,13 @@ spec:
operator: in
values: [ "true" ]

- name: apply-tags
- name: apply-tags-amd64
Contributor


In the case of scanner and most other pipelines (except for collector), we have the apply-tags task to satisfy the init task's image-url parameter. (We could have skipped that altogether because init does not do much, and does even less in our case, but we wanted to stay close to the pipeline template and remain future-proof.)

We don't need per-arch konflux-$(params.revision)-amd64 tags. The only one we need is konflux-$(params.revision) which is naturally placed on the manifest image which you get in build-image-manifest.

Would it work if you delete build-image-manifest-konflux, delete apply-tags-<arch> tasks, and add only one apply-tags that works on a result of build-image-manifest?
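The suggestion above can be sketched roughly as follows. This is a hypothetical fragment: the task and result names (apply-tags, build-image-manifest, IMAGE_URL, ADDITIONAL_TAGS) are assumptions based on this thread and the Konflux pipeline template, not the exact pipeline definition.

```yaml
# Hypothetical sketch: a single apply-tags task that consumes the
# manifest produced by build-image-manifest, instead of per-arch
# apply-tags-<arch> tasks. Names are illustrative.
- name: apply-tags
  runAfter:
    - build-image-manifest
  params:
    - name: IMAGE
      value: "$(tasks.build-image-manifest.results.IMAGE_URL)"
    - name: ADDITIONAL_TAGS
      value:
        - "konflux-$(params.revision)"
  taskRef:
    name: apply-tags
```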


I've tried this; apply-tags will not retag the manifest, only the image for the architecture that matches the host running the pipeline (amd64 in this case). Alternatively, we could keep the build-image-manifest-konflux step but have it use the stackrox images instead of applying the tags. WDYT?

Contributor


This is a weird limitation of apply-tags. Can you show me a pipelinerun please?

The only reason we have apply-tags in all pipelines but Collector is to satisfy the init task.

For satisfying the init task it should be sufficient to produce an image with a tag that's derived from information available at the pipeline level, for which we use git commit sha - $(params.output-image-repo):konflux-$(params.revision). Since the intent of init is to check if the image was previously built, we could either give amd64 one to it (i.e. keep only apply-tags for amd64 and tag the image as konflux-$(params.revision), no -amd64 suffix) or create a new manifest tagged as konflux-$(params.revision).

I think that manifest creation is a cheap operation, and it just feels cleaner (e.g. we could succeed building amd64 but fail P or Z) to check the presence of builds for all arches. Therefore, I would suggest trying to remove all apply-tags* tasks and keeping build-image-manifest-konflux, but feeding it the same images as build-image-manifest.
Effectively, use build-image-manifest-konflux as a way to add tags.

Would this work?

For Collector, we not only create konflux-$(params.revision) tag for init but we also add $(tasks.determine-image-tag.results.IMAGE_TAG)-latest and $(tasks.determine-image-tag.results.IMAGE_TAG)-slim for StackRox deployment to succeed because this matches the convention. If tagging via build-image-manifest* works, these will be two extra tasks in the pipeline producing manifests tagged with -slim and -latest.
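The "manifest creation as retagging" idea might look roughly like this. A hypothetical sketch: the per-arch task names, the IMAGES parameter, and the result names are assumptions for illustration, not the real pipeline definition.

```yaml
# Hypothetical sketch: build-image-manifest-konflux is fed the same
# per-arch images as build-image-manifest, so konflux-<revision> points
# at a full manifest list rather than a single-arch image.
- name: build-image-manifest-konflux
  params:
    - name: IMAGE
      value: "$(params.output-image-repo):konflux-$(params.revision)"
    - name: IMAGES  # same list build-image-manifest consumes (names assumed)
      value:
        - "$(tasks.build-container-amd64.results.IMAGE_URL)@$(tasks.build-container-amd64.results.IMAGE_DIGEST)"
        - "$(tasks.build-container-arm64.results.IMAGE_URL)@$(tasks.build-container-arm64.results.IMAGE_DIGEST)"
        - "$(tasks.build-container-s390x.results.IMAGE_URL)@$(tasks.build-container-s390x.results.IMAGE_DIGEST)"
        - "$(tasks.build-container-ppc64le.results.IMAGE_URL)@$(tasks.build-container-ppc64le.results.IMAGE_DIGEST)"
  taskRef:
    name: build-image-manifest
```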


Can you show me a pipelinerun please?

This was about 20 commits ago, doubt I can find the exact run but will try to run it again.

I agree that generating a proper manifest is cleaner than simply retagging a single arch.


@msugakov, sorry for the delay, here is a pipeline for scanner-v4-db where I used apply tags on the manifest: https://console.redhat.com/preview/application-pipeline/workspaces/rh-acs/applications/acs/pipelineruns/scanner-v4-db-build-468sx

If you check in quay.io, you'll see the konflux image sha matches with the amd64 image and not the manifest. This is the same behaviour you'll get from doing podman/docker tag <MANIFEST> <IMAGE>, so I believe this is by design. I'll try creating a manifest from the stackrox images next, see if that yields the result we want.

Contributor


There's podman manifest inspect; called on both tags from scanner-v4-db-build-jzrb4, it gives identical results.

$ diff -s <(podman manifest inspect quay.io/rhacs-eng/scanner-v4-db:4.6.x-107-g70adab264b-fast) <(podman manifest inspect quay.io/rhacs-eng/scanner-v4-db:konflux-70adab264b664d3f78ab3a84086bb4dbabb12963)
Files /proc/self/fd/11 and /proc/self/fd/15 are identical

Means we can use manifest generation as retagging, right?

Contributor Author


Would this mean we'd need separate manifest generation tasks in Collector for -latest and -slim?

Contributor Author


Actually I've just realised I can use a matrix for that. Ignore me 😄

Contributor


Yes, that's another matrix. The first one iterates over architectures, the second one iterates over manifest tags (yet each manifest includes the image for each arch).
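The second matrix could be sketched like this. A hypothetical fragment: matrix params fan out directly as task params (which is why IMAGE is the matrix param rather than being referenced from a normal param); the task names and IMAGES wiring are assumptions for illustration.

```yaml
# Hypothetical sketch: one manifest per extra tag (-latest, -slim),
# each built from the same per-arch images. Names are illustrative.
- name: build-image-manifest-extra-tags
  matrix:
    params:
      - name: IMAGE
        value:
          - "$(params.output-image-repo):$(tasks.determine-image-tag.results.IMAGE_TAG)-latest"
          - "$(params.output-image-repo):$(tasks.determine-image-tag.results.IMAGE_TAG)-slim"
  params:
    - name: IMAGES  # same per-arch list fed to build-image-manifest (assumed)
      value:
        - "$(tasks.build-container-amd64.results.IMAGE_URL)@$(tasks.build-container-amd64.results.IMAGE_DIGEST)"
        - "$(tasks.build-container-arm64.results.IMAGE_URL)@$(tasks.build-container-arm64.results.IMAGE_DIGEST)"
  taskRef:
    name: build-image-manifest
```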


Means we can use manifest generation as retagging, right?

That's my understanding, yes, I have that setup on the scanner-v4 and basic-component PRs already.

.tekton/scanner-component-pipeline.yaml (resolved)
.tekton/scanner-component-pipeline.yaml (resolved)
.tekton/scanner-component-pipeline.yaml (outdated, resolved)
.tekton/scanner-db-build.yaml (outdated, resolved)
@Stringy Stringy force-pushed the giles/konflux/ROX-20757-scanner-multi-arch-builds branch from 5fc75cb to f1ff855 Compare July 25, 2024 14:14
Contributor Author


Changes in this file are due to limitations with Tekton matrices:

  • "normal" params can't reference matrix params
  • matrix params can't be arrays
  • BUILD_ARGS must be an array, and the tag variable needs the arch suffix in order to differentiate between multi-arch images

As a result, I've used the TARGETARCH variable within the Dockerfile to construct the tag here instead. I've done the same for main (stackrox/stackrox#12054) and collector (stackrox/collector#1751)

@tommartensen @msugakov I'd be keen to hear your thoughts on this workaround
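The workaround described above can be sketched roughly as follows. A hypothetical fragment: the base image, build-arg names, and the way the tag is used are illustrative, not the actual konflux.Dockerfile. TARGETARCH is set automatically by BuildKit/buildah for each platform in a multi-platform build, so the arch suffix can be derived inside the Dockerfile instead of being threaded through matrix params.

```dockerfile
# Hypothetical sketch (names assumed): derive the arch-suffixed tag from
# the builder-provided TARGETARCH instead of a per-arch pipeline param.
FROM registry.access.redhat.com/ubi8/ubi-minimal AS final
ARG TARGETARCH
ARG BUILD_TAG
# On an arm64 build this becomes <BUILD_TAG>-arm64, on s390x <BUILD_TAG>-s390x, etc.
LABEL build-tag="${BUILD_TAG}-${TARGETARCH}"
```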

Contributor

@msugakov msugakov Jul 26, 2024


Commented in-place; I also wonder why you decided this may be needed.

.tekton/scanner-component-pipeline.yaml (outdated, resolved)
.tekton/scanner-component-pipeline.yaml (outdated, resolved)
.tekton/scanner-component-pipeline.yaml (resolved)
.tekton/scanner-component-pipeline.yaml (resolved)
.tekton/scanner-component-pipeline.yaml (outdated, resolved)
.tekton/scanner-component-pipeline.yaml (outdated, resolved)
Comment on lines 253 to 254
- name: build-container-multi-arch
matrix:
Contributor


Oh, it looks like there's a usability problem with matrix tasks.

https://console.redhat.com/preview/application-pipeline/workspaces/rh-acs/applications/acs/pipelineruns/scanner-build-hmj57 shows logs for just one run from the matrix (seemingly the first one defined, s390x) and does not allow switching to other runs. I'm going to report this.

Have you seen one of the matrix builds failing? Would the logs show the failing build, or do they always show just the first one?

Contributor Author


From what I've seen, it's inconsistent in its reporting and seemingly picks a run from the list at random. I've only observed this while the task is running, however.

Contributor


@Stringy @Molter73 bad news: the recommendation from the Konflux team is not to use matrix for now. See https://redhat-internal.slack.com/archives/C04PZ7H0VA8/p1721997731935729?thread_ts=1717081428.417939&cid=C04PZ7H0VA8

I'm very sorry for me nudging you in that direction and your effort spent on it.

image/scanner/rhel/konflux.Dockerfile (outdated, resolved)
image/scanner/rhel/konflux.Dockerfile (outdated, resolved)
Contributor

@msugakov msugakov Jul 26, 2024


Commented in-place; I also wonder why you decided this may be needed.

Contributor

@msugakov msugakov left a comment


EC seems unhappy, but I don't think this change is causing it.
Green light from me once you address these last findings.

.tekton/scanner-component-pipeline.yaml (outdated, resolved)
.tekton/scanner-component-pipeline.yaml (outdated, resolved)
.tekton/scanner-component-pipeline.yaml (outdated, resolved)
@msugakov
Contributor

@tommartensen Could you PTAL?

@Stringy
Contributor Author

Stringy commented Jul 30, 2024

/retest

@Stringy
Contributor Author

Stringy commented Jul 31, 2024

/retest

@Stringy Stringy merged commit f176989 into master Jul 31, 2024
34 of 38 checks passed
@Stringy Stringy deleted the giles/konflux/ROX-20757-scanner-multi-arch-builds branch July 31, 2024 09:25