Understand best practices for collaborative development workflows #41

gonuke · 2023-05-25T17:56:54Z

We have started using the multistage-docker-build-action for some collaborative development projects. We'd like to use it for PR testing as part of our continuous integration strategy. We have a valid Dockerfile with about 6 stages including two final stages related to building and then testing the software in question. (Prior stages are for installing dependencies and since some are long we want multiple stages for ultimate efficiency in building.)

Historically, we have had one workflow that uses this Dockerfile to build images with the dependencies, and then tests that these images are suitable for building/testing a known-to-work version of the software. These images then become the basis for PR-based testing of modifications to the software itself. This strategy was based on the challenges in setting up the GH action environment in all the ways that this action resolves! 🎉

It now seems attractive for every PR to be tested with the same Dockerfile used to build the full stack, because we can rely on pulling earlier stages from the GH container repo (GHCR) and having them in the cache to accelerate the build process for all the dependencies prior to testing the build of the software.

This does raise some questions, however:

Normally, the tests will be run as the PR author. This may require that the images being pushed to GHCR must be pushed to that author's/user's GHCR package space.

Is it possible for PR tests to push a package to the upstream repo's package space, depending on that author's permissions?
What if the author does not have push/write privileges for that repo?
If the images are pushed to the author's GHCR package space, are the success/failure results reported as usual to the PR dashboard?

This action retrieves images from GHCR to a local cache to accelerate docker builds.

If a PR has updates based on the review process, does each successive round of testing properly identify the correct version of the already pushed images from stages that were successful and have not changed in the most recent update?
Does this depend on whether the images are in the author's package space or the upstream repo's package space?

With answers to these, and perhaps related questions, we can update documentation to identify best practice (or any practice) for using this action for collaborative development projects.


gonuke · 2023-05-25T17:57:29Z

Tagging @bquan0 to investigate this further

bquan0 · 2023-05-26T00:44:42Z

From what I can tell so far, it's not possible to push a package from a forked repo to the upstream repo's package space. In this workflow run, I changed the repository space that the multistage action pushes to from bquan0/learn-maintenance to cnerg, since cnerg is the upstream repo owner. However, the workflow fails with the error "denied: permission_denied: The requested installation does not exist." Also, I don't think there is a default GitHub env variable that gives the upstream repo owner's name.

I also looked in package settings, but there doesn't seem to be an option to allow permissions for forked repos.
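On the upstream owner's name: for runs triggered by pull_request, the event payload (the JSON file that GITHUB_EVENT_PATH points to) describes the base repository, including its owner. Below is a minimal Python sketch under that assumption. The helpers upstream_owner and load_event are hypothetical names, not part of the action. For other events, such as a push to a fork, the sketch simply returns whatever fallback is passed in (for example the value of GITHUB_REPOSITORY_OWNER). Knowing the name does not by itself change the permission error described above.

```python
import json
import os
from typing import Optional


def upstream_owner(event: dict, fallback_owner: str) -> str:
    """Return the owner of the repository a PR targets, else fallback_owner."""
    pull_request = event.get("pull_request")
    if pull_request:
        # pull_request payloads describe the target (base) repository.
        return pull_request["base"]["repo"]["owner"]["login"]
    # Other events carry no base repository, so use the fallback.
    return fallback_owner


def load_event(path: Optional[str] = None) -> dict:
    """Load the event payload from GITHUB_EVENT_PATH, or {} if unavailable."""
    path = path or os.environ.get("GITHUB_EVENT_PATH")
    if not path or not os.path.exists(path):
        return {}
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)
```

In a workflow step this might be called as upstream_owner(load_event(), os.environ["GITHUB_REPOSITORY_OWNER"]).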

Firehed · 2023-06-05T21:33:02Z

Thanks for bringing up these questions!

The initial design was based around a centralized image repository for all contributors - I've used both GitHub's GHCR as well as Google's GCR in this sort of workflow. Authentication to the repository is consequently specific to the container registry - GitHub handles some of this automatically for its own registry, but GCR and others require additional steps/actions in order to authenticate.

I think some of this will depend on your development model as well - if all contributors are working on their own forks (more common with OSS/public repos), you'll probably end up with different strategies than if they're all direct contributors to your repository (more common for private repos). Notably, the availability of GHA secrets for fork-based workflows is pretty restricted, since you likely wouldn't want someone forking your repo to be able to extract those secrets.

This probably doesn't answer any of your questions yet, but I'm hoping we can discuss further to suss out some of the edge cases and hopefully find a path forward! Are you able to share more details about your workflow specifics?

gonuke · 2023-07-22T16:31:34Z

I'm digging into this a little more in this PR in our repo.

Our development model does have contributors working in their own forks and then issuing PRs from those forks. In that model, we have choices about when the action runs and under what terms. This allows workarounds for seeing whether builds are successful, but maybe not the ideal behavior.

If I set the repo to use github.repository_owner, and log in to GHCR with that same user, the PR tests try to push to the upstream GHCR and fail with

denied: installation not allowed to Write organization package

If I set the repo to use github.actor, and log in to GHCR with that same user, the PR tests try to push to the author's GHCR and fail with

denied: permission_denied: The requested installation does not exist.

I am not sure if this latter error is a GH/GHCR configuration problem, or expected behavior - the internet seems to offer conflicting opinions...??

One idea: a version of this that has a variable to toggle whether it pushes or not? In my use case, I might set it to push images when run as push and not push images when run as pull_request. I'd have to think through what that does for successfully reusing cached images over iterations of pull request comments....
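The toggle idea could be sketched roughly as follows in Python. This is only an illustration of the decision, not something the action provides: should_push and head_repo_name are hypothetical helpers, and the sketch assumes that runs triggered by push have write access to packages while pull requests from a different repository (forks) do not. The head repository name is read from the pull_request event payload.

```python
from typing import Optional


def head_repo_name(event: dict) -> Optional[str]:
    """Return 'owner/name' of the PR's head repository, or None if absent."""
    head_repo = (event.get("pull_request") or {}).get("head", {}).get("repo")
    return head_repo["full_name"] if head_repo else None


def should_push(event_name: str, head_repo: Optional[str], base_repo: str) -> bool:
    """Decide whether a run should attempt to push images.

    Assumption: push events can write packages; pull requests can only
    do so when the head branch lives in the base repository itself.
    """
    if event_name == "push":
        return True
    if event_name == "pull_request":
        return head_repo is not None and head_repo == base_repo
    return False
```

Here base_repo would be the value of GITHUB_REPOSITORY and event_name the value of GITHUB_EVENT_NAME. A run that does not push can still use earlier stages as cache if it is able to pull them, but any stage it rebuilds would have to be rebuilt again on the next run.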

gonuke · 2023-07-22T16:33:37Z

While we are at it... it might also be useful to have a variable to specify a tag name.
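If a tag name were configurable, it would need to be a valid Docker tag: letters, digits, underscores, periods and hyphens only, not starting with a period or hyphen, and at most 128 characters. A small Python sketch of turning an arbitrary name (such as a branch name) into a usable tag follows; to_docker_tag is a hypothetical helper and replacing invalid characters with a hyphen is just one possible choice.

```python
import re

MAX_TAG_LENGTH = 128  # Docker's limit on tag length


def to_docker_tag(name: str) -> str:
    """Convert an arbitrary name into a valid Docker tag."""
    # Replace every character that is not allowed in a tag.
    tag = re.sub(r"[^A-Za-z0-9_.-]", "-", name)
    # A tag may not start with a period or a hyphen.
    tag = tag.lstrip(".-")[:MAX_TAG_LENGTH]
    if not tag:
        raise ValueError(f"cannot derive a Docker tag from {name!r}")
    return tag
```

For example, a branch named feature/add-tests would map to the tag feature-add-tests.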

gonuke · 2023-08-06T20:39:43Z

I did some research (i.e., systematically tried a lot of combinations) today and learned that it is outside of normal practice to allow workflows initiated by public forked repositories to write packages. In summary, GH strongly encourages the use of the GITHUB_TOKEN, and when used from a public forked repository it never has better than read access to organization packages (see the permissions chart at https://docs.github.com/en/actions/security-guides/automatic-token-authentication).

This means that it will not be practical to use the multistage-docker-build-action to manage routine CI for pull requests, as far as I can tell.

Firehed added the question label · 2023-06-05

bquan0 mentioned this issue Aug 13, 2023

Add workflow to create virtual machine from PyNE docker image in ghcr.io pyne/pyne#1498

Merged
