[BUG] BuildStrategy: Cannot Use Context Dir as Working Directory #1573
Comments
Refinement - this is likely a limitation of how Tekton creates containers in a TaskRun. A guide for build strategy authors should call this out. |
@adambkaplan I would consider this a bug in our Git step implementation. The step could check for existing sub-directories and, if there are any, clone into a temporary directory and then move the content over. Alternatively, it could delete the sub-directories (would that break the parallel steps?) and, after the clone has finished, verify that they were recreated.
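A minimal sketch of the "clone into a temporary directory, then move content" idea, not the Shipwright Git step itself. The function name merge_staged and the staging layout are hypothetical. It assumes the clone has already landed in a staging directory and that any pre-created sub-directories in the target are writable by the current user.

```shell
# merge_staged STAGING TARGET  (hypothetical helper, for illustration only)
# Moves every top-level entry of STAGING, dotfiles included, into TARGET.
# If TARGET already holds a directory of the same name (for example one
# pre-created for a step's workingDir), the content is merged into it
# instead of replacing it.
merge_staged() {
  local staging target entry name
  staging=$1
  target=$2
  for entry in "$staging"/* "$staging"/.[!.]* "$staging"/..?*; do
    # Unmatched glob patterns stay literal; skip them.
    [ -e "$entry" ] || [ -L "$entry" ] || continue
    name=$(basename "$entry")
    if [ -d "$target/$name" ] && [ -d "$entry" ] && [ ! -L "$entry" ]; then
      merge_staged "$entry" "$target/$name"
      rmdir "$entry"
    else
      mv "$entry" "$target/$name"
    fi
  done
}

# Usage (paths are assumptions):
#   git clone --depth 1 -- "$url" /workspace/source/.tmp
#   merge_staged /workspace/source/.tmp /workspace/source
```

The merge only helps if the existing sub-directory accepts writes from the user running the step; if it does not, the mv into it fails just like the checkout does.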
This feels like a lot of extra work, and can be error/risk prone. I also think this encourages "bad" behavior of expecting additional content to exist alongside source code as part of a build process. Things like dependency caches IMO should be configurable and located outside of the source code directory. I'm personally fine keeping this as a known issue/limitation, as this really only impacts strategy authors/platform teams.
I think that would break the other steps in the build. IIRC Tekton has an "entrypoint" mechanism that starts all TaskRun containers at the same time, then effectively pauses/sleeps them to execute in the right order. |
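A simplified model of that ordering behavior, as a sketch only and not Tekton's actual entrypoint code. The helper run_step and the marker-file layout are hypothetical. Every "container" is started right away, but each one blocks until the marker file written by its predecessor appears.

```shell
# run_step WAIT_FILE POST_FILE CMD [ARGS...]  (hypothetical helper)
# Blocks until WAIT_FILE exists (skipped when WAIT_FILE is empty), runs CMD,
# then creates POST_FILE so that the next step may start.
# Simplification: POST_FILE is written even if CMD fails.
run_step() {
  local wait_file post_file
  wait_file=$1
  post_file=$2
  shift 2
  if [ -n "$wait_file" ]; then
    while [ ! -e "$wait_file" ]; do
      sleep 0.1
    done
  fi
  "$@"
  : > "$post_file"
}

# Demo: both steps are launched at once, deliberately in the wrong order.
signals=$(mktemp -d)
run_step "$signals/0" "$signals/1" echo "step 2: build" &
run_step "" "$signals/0" echo "step 1: clone" &
wait
```

The demo prints "step 1: clone" before "step 2: build" even though the second step was launched first. In this model every step's process already exists before the clone runs, which is consistent with a step's workingDir having to be present before any source has been cloned.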
I would say we document it but still try to resolve it. It is a not-so-nice limitation, and we have our own build strategies that work around it, like https://github.com/shipwright-io/build/blob/v0.13.0/samples/v1beta1/buildstrategy/ko/buildstrategy_ko_cr.yaml#L104. So, I can understand that build strategy authors run into this issue. What I would be interested in is your opinion on the Tekton behavior. Tekton could easily remove the workingDir from the container and start the step command in the workingDir from their entrypoint (or fail if the directory does not exist at that time).
Interesting. I can locally reproduce the Git behavior, but in Shipwright, this sometimes works:
$ cat <<EOF | kubectl create -f -
apiVersion: shipwright.io/v1beta1
kind: BuildStrategy
metadata:
name: source-context-working-dir
spec:
steps:
- name: noop
image: registry.access.redhat.com/ubi9/ubi-minimal
workingDir: $(params.shp-source-context)
command:
- ls
args:
- $(params.shp-source-context)
securityContext:
runAsUser: 1000
runAsGroup: 1000
EOF
$ shp build create source-context --source-url https://github.com/shipwright-io/sample-go --source-context-dir source-build --output-image dummy
Created build "source-context"
$ shp build run source-context --follow
Pod "source-context-5f8m2-kvg8b-pod" is in state "Pending"...
Pod "source-context-5f8m2-kvg8b-pod" is in state "Pending"...
Pod "source-context-5f8m2-kvg8b-pod" is in state "Pending"...
Pod "source-context-5f8m2-kvg8b-pod" is in state "Pending"...
Pod "source-context-5f8m2-kvg8b-pod" is in state "Pending"...
Pod "source-context-5f8m2-kvg8b-pod" is in state "Pending"...
Pod "source-context-5f8m2-kvg8b-pod" is in state "Pending"...
succeeded event for pod "source-context-5f8m2-kvg8b-pod" arrived before or in place of running event so dumping logs now
*** Pod "source-context-5f8m2-kvg8b-pod", container "step-source-default": ***
2024/11/16 19:44:39 Info: ssh (/usr/bin/ssh): OpenSSH_8.7p1, OpenSSL 3.0.7 1 Nov 2022
2024/11/16 19:44:39 Info: git (/usr/bin/git): git version 2.43.5
2024/11/16 19:44:39 Info: git-lfs (/usr/bin/git-lfs): git-lfs/3.4.1 (GitHub; linux arm64; go 1.21.13 (Red Hat 1.21.13-3.el9_4) X:strictfipsruntime)
2024/11/16 19:44:39 /usr/bin/git -c safe.directory=/workspace/source clone -h
2024/11/16 19:44:39 /usr/bin/git -c safe.directory=/workspace/source submodule -h
2024/11/16 19:44:39 /usr/bin/git -c safe.directory=/workspace/source clone --quiet --no-tags --single-branch --depth 1 -- https://github.com/shipwright-io/sample-go /workspace/source
2024/11/16 19:44:39 /usr/bin/git -c safe.directory=/workspace/source -C /workspace/source submodule update --init --recursive --depth 1
2024/11/16 19:44:39 /usr/bin/git -c safe.directory=/workspace/source -C /workspace/source rev-parse --abbrev-ref HEAD
2024/11/16 19:44:39 Successfully loaded https://github.com/shipwright-io/sample-go (main) into /workspace/source
2024/11/16 19:44:39 /usr/bin/git -c safe.directory=/workspace/source -C /workspace/source rev-parse --verify HEAD
2024/11/16 19:44:39 /usr/bin/git -c safe.directory=/workspace/source -C /workspace/source log -1 --pretty=format:%an
2024/11/16 19:44:39 /usr/bin/git -c safe.directory=/workspace/source -C /workspace/source show --no-patch --format=%ct
2024/11/16 19:44:39 /usr/bin/git -c safe.directory=/workspace/source -C /workspace/source rev-parse --abbrev-ref HEAD
*** Pod "source-context-5f8m2-kvg8b-pod", container "step-noop": ***
go.mod
main.go
Pod "source-context-5f8m2-kvg8b-pod" has succeeded!
The Pod contains the working-dir-initializer from Tekton, which creates source-build, so /workspace/source is not empty. The next run then failed:
$ shp build run source-context --follow
Pod "source-context-xphz7-dr42s-pod" is in state "Pending"...
Pod "source-context-xphz7-dr42s-pod" is in state "Pending"...
Pod "source-context-xphz7-dr42s-pod" is in state "Pending"...
Pod "source-context-xphz7-dr42s-pod" is in state "Pending"...
Pod "source-context-xphz7-dr42s-pod" is in state "Pending"...
Pod "source-context-xphz7-dr42s-pod" is in state "Pending"...
Pod "source-context-xphz7-dr42s-pod" in "Running" state, starting up log tail
[prepare] 2024/11/16 19:55:14 Entrypoint initialization
[source-default] 2024/11/16 19:55:16 Info: ssh (/usr/bin/ssh): OpenSSH_8.7p1, OpenSSL 3.0.7 1 Nov 2022
[source-default] 2024/11/16 19:55:16 Info: git (/usr/bin/git): git version 2.43.5
[source-default] 2024/11/16 19:55:16 Info: git-lfs (/usr/bin/git-lfs): git-lfs/3.4.1 (GitHub; linux arm64; go 1.21.13 (Red Hat 1.21.13-3.el9_4) X:strictfipsruntime)
[source-default] 2024/11/16 19:55:16 /usr/bin/git -c safe.directory=/workspace/source clone -h
[source-default] 2024/11/16 19:55:16 /usr/bin/git -c safe.directory=/workspace/source submodule -h
[source-default] 2024/11/16 19:55:16 /usr/bin/git -c safe.directory=/workspace/source clone --quiet --no-tags --single-branch --depth 1 -- https://github.com/shipwright-io/sample-go /workspace/source
[source-default] 2024/11/16 19:55:17 error: unable to create file source-build/go.mod: Permission denied
[source-default] error: unable to create file source-build/main.go: Permission denied
[source-default] fatal: unable to checkout working tree
[source-default] warning: Clone succeeded, but checkout failed.
[source-default] You can inspect what was checked out with 'git status'
[source-default] and retry with 'git restore --source=HEAD :/' (exit code 128)
[noop] 2024/11/16 19:55:18 Skipping step because a previous step failed
BuildRun "source-context-xphz7" has failed at step "step-source-default" because of GitError: error: unable to create file source-build/go.mod: Permission denied
error: unable to create file source-build/main.go: Permission denied
fatal: unable to checkout working tree
warning: Clone succeeded, but checkout failed.
and retry with 'git restore --source=HEAD : /' (exit code 128)
Step details:
{
"name": "step-source-default",
"state": {
"terminated": {
"exitCode": 128,
"reason": "Error",
"message": "[{\"key\":\"shp-error-message\",\"value\":\"error: unable to create file source-build/go.mod: Permission denied\\nerror: unable to create file source-build/main.go: Permission denied\\nfatal: unable to checkout working tree\\nwarning: Clone succeeded, but checkout failed.\\nand retry with 'git restore --source=HEAD : /' (exit code 128)\",\"type\":1},{\"key\":\"shp-error-reason\",\"value\":\"GitError\",\"type\":1},{\"key\":\"StartedAt\",\"value\":\"2024-11-16T19:55:16.560Z\",\"type\":3}]",
"startedAt": "2024-11-16T19:55:16Z",
"finishedAt": "2024-11-16T19:55:17Z",
"containerID": "containerd://4d2c77b2795590817e361399142fa2b2bebc71b02ea2213a9c0d530a55f829af"
}
},
"lastState": {},
"ready": false,
"restartCount": 0,
"image": "sha256:21f50544d04a70b16869df715ee818441b80042a1ce880b1f4c42b6722e20f3d",
"imageID": "registry.saschaschwarze.de/shipwright-io/git@sha256:89e2f3d3f0bcb2657c5c446ecf3fb54197d4640b83824ac7de000ed5db1cf59b",
"containerID": "containerd://4d2c77b2795590817e361399142fa2b2bebc71b02ea2213a9c0d530a55f829af",
"started": false
}
ERROR: buildrun pod "source-context-xphz7-dr42s-pod" has failed |
There is a weird timing issue. Sometimes the directory created by the working dir initializer is there, and sometimes it is not (yet). I was able to always reproduce the second behavior by sleeping for one second before doing the clone. I then tried to implement my idea from #1573 (comment): clone to a different directory and then move the content over. The code works perfectly, but Tekton is broken here. The working dir initializer creates directories as root and without write permissions for other users. That is just nonsense. Any Tekton workload that runs as non-root cannot work that way. Below is the output from a codebase where I had commented out the "restore" in case the target entry already exists (source-build in my case); I also changed the build strategy to list the source root. One can see that all cloned content is present, but source-build is owned by the wrong user to be usable.
$ shp build run source-context --follow
Pod "source-context-svr5k-c8gt7-pod" is in state "Pending"...
Pod "source-context-svr5k-c8gt7-pod" is in state "Pending"...
Pod "source-context-svr5k-c8gt7-pod" is in state "Pending"...
Pod "source-context-svr5k-c8gt7-pod" is in state "Pending"...
Pod "source-context-svr5k-c8gt7-pod" is in state "Pending"...
Pod "source-context-svr5k-c8gt7-pod" in "Running" state, starting up log tail
[prepare] 2024/11/16 20:44:49 Entrypoint initialization
[source-default] 2024/11/16 20:44:51 Info: ssh (/usr/bin/ssh): OpenSSH_8.7p1, OpenSSL 3.2.2 4 Jun 2024
[source-default] 2024/11/16 20:44:51 Info: git (/usr/bin/git): git version 2.43.5
[source-default] 2024/11/16 20:44:51 Info: git-lfs (/usr/bin/git-lfs): git-lfs/3.4.1 (GitHub; linux arm64; go 1.21.13 (Red Hat 1.21.13-3.el9_4) X:strictfipsruntime)
[source-default] 2024/11/16 20:44:51 /usr/bin/git -c safe.directory=/workspace/source clone -h
[source-default] 2024/11/16 20:44:51 /usr/bin/git -c safe.directory=/workspace/source submodule -h
[source-default] 2024/11/16 20:44:52 /usr/bin/git -c safe.directory=/workspace/source/.tmp clone --quiet --no-tags --single-branch --depth 1 -- https://github.com/shipwright-io/sample-go /workspace/source/.tmp
[source-default] 2024/11/16 20:44:52 /usr/bin/git -c safe.directory=/workspace/source/.tmp -C /workspace/source/.tmp submodule update --init --recursive --depth 1
[source-default] 2024/11/16 20:44:53 /usr/bin/git -c safe.directory=/workspace/source/.tmp -C /workspace/source/.tmp rev-parse --abbrev-ref HEAD
[source-default] 2024/11/16 20:44:53 Successfully loaded https://github.com/shipwright-io/sample-go (main) into /workspace/source/.tmp
[source-default] 2024/11/16 20:44:53 /usr/bin/git -c safe.directory=/workspace/source/.tmp -C /workspace/source/.tmp rev-parse --verify HEAD
[source-default] 2024/11/16 20:44:53 /usr/bin/git -c safe.directory=/workspace/source/.tmp -C /workspace/source/.tmp log -1 --pretty=format:%an
[source-default] 2024/11/16 20:44:53 /usr/bin/git -c safe.directory=/workspace/source/.tmp -C /workspace/source/.tmp show --no-patch --format=%ct
[source-default] 2024/11/16 20:44:53 /usr/bin/git -c safe.directory=/workspace/source/.tmp -C /workspace/source/.tmp rev-parse --abbrev-ref HEAD
[noop] total 56
[noop] drwxrwxrwx 8 root root 4096 Nov 16 20:44 .
[noop] drwxrwxrwx 3 root root 4096 Nov 16 20:44 ..
[noop] drwxr-xr-x 8 1000 1000 4096 Nov 16 20:44 .git
[noop] drwxr-xr-x 3 1000 1000 4096 Nov 16 20:44 .github
[noop] -rw-r--r-- 1 1000 1000 5 Nov 16 20:44 .shpignore
[noop] -rw-r--r-- 1 1000 1000 11357 Nov 16 20:44 LICENSE
[noop] -rw-r--r-- 1 1000 1000 291 Nov 16 20:44 OWNERS
[noop] -rw-r--r-- 1 1000 1000 1377 Nov 16 20:44 README.md
[noop] drwxr-xr-x 2 1000 1000 4096 Nov 16 20:44 docker-build
[noop] drwxr-xr-x 2 1000 1000 4096 Nov 16 20:44 docker-build-with-args
[noop] drwxr-xr-x 2 root root 4096 Nov 16 20:44 source-build
[noop] drwxr-xr-x 3 1000 1000 4096 Nov 16 20:44 source-build-with-package
Let me try to find out if one can customize the working dir initializer; otherwise I'll open an issue there.
Tekton creates the directory with 0755 permissions. Tekton runs workingdirinit as root in its default configuration. One can set the feature flag "set-security-context" to "true" to force it to run as non-root, but that is still not guaranteed to be the user that our build runs as. Related Tekton issue: tektoncd/pipeline#6842.
Idea: if the build strategy has a global securityContext with runAs, could we set this in the podTemplate security context?
This does not work, and I do not know why. I tested a Pod with securityContext.runAsUser set to 1000 and an initContainer whose securityContext comes from set-security-context=true, which has runAsNonRoot but no runAsUser. The directory is still created as root. Confused.
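To make the ownership problem from the directory listing concrete, here is a small sketch of the classic owner/group/other write check. can_write is a hypothetical helper, not part of Tekton or Shipwright, and it deliberately ignores root's override, supplementary groups, and ACLs.

```shell
# can_write MODE OWNER_UID OWNER_GID UID GID  (hypothetical helper)
# Returns 0 if a process running as UID:GID passes the basic write-permission
# check for an entry with the given octal MODE and ownership.
can_write() {
  local mode owner_uid owner_gid uid gid bit
  mode=$((0$1))          # leading 0 makes the shell read MODE as octal
  owner_uid=$2
  owner_gid=$3
  uid=$4
  gid=$5
  if [ "$uid" -eq "$owner_uid" ]; then
    bit=$((0200))        # owner write bit
  elif [ "$gid" -eq "$owner_gid" ]; then
    bit=$((0020))        # group write bit
  else
    bit=$((0002))        # other write bit
  fi
  [ $((mode & bit)) -ne 0 ]
}

# /workspace/source itself: drwxrwxrwx root root, build runs as 1000:1000
can_write 0777 0 0 1000 1000 && echo "/workspace/source: writable"
# source-build: drwxr-xr-x root root (0755), build runs as 1000:1000
can_write 0755 0 0 1000 1000 || echo "source-build: not writable"
```

Both messages are printed: the source root is world-writable, while a root-owned 0755 directory rejects writes from user 1000, which matches the "Permission denied" errors during checkout.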
Is there an existing issue for this?
Kubernetes Version
k8s: 1.28.7
Tekton Pipelines: 0.56.2
Shipwright Version
0.12.0
Current Behavior
When authoring a build strategy, builds risk failure if a build strategy step has its workingDir set to a sub-directory and source is cloned from git. Git expects the target directory of any clone action to be empty. Setting workingDir to a sub-directory of the source root (ex: contextDir) results in errors like the following:
Expected Behavior
Ideally build steps succeed if the directory is a subPath of the working directory. However, this may prove difficult due to the way Tekton, Kubernetes, and potentially the underlying container runtime operate (everything runs in a single TaskRun/Pod today).
Steps To Reproduce
workingDir set to a sub-path of $(params.shp-source-root)
Anything else?
This is perhaps something that we document as a known issue - ex: in a guide for Build Strategy authors.