Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

enable loopback adapter in hermetic unshare namespace #1469

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

brianwcook
Copy link
Contributor

@brianwcook brianwcook commented Sep 25, 2024

Bazel uses a client server architecture to execute even when performing network isolated builds.It works fine as long as there is any adapter, even a loopback addapter. The default unshare env has a loopback device [lo] but it is DOWN by default. This PR brings lo UP in the unshare environment so that hermetic Bazel builds will work.

It is dependent on the 'ip' command being present in the buildah image so another PR is open for that:
konflux-ci/buildah-container#90

Bazel uses a client server architecture to execute even when
performing network isolated builds.It works fine as long as there
is any adapter, even a loopback addapter. The default unshare env
has a loopback device [lo] but it is DOWN by default. This PR
brings lo UP in the unshare environment so that hermetic Bazel
builds will work.
--tls-verify=$TLSVERIFY --no-cache \
--ulimit nofile=4096:4096 \
-f "$dockerfile_path" -t $IMAGE .
COMMAND="ip link set lo up;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe a comment why it's needed would be useful here

Comment on lines +415 to +422
COMMAND="ip link set lo up;
buildah build \
$VOLUME_MOUNTS \
"${BUILDAH_ARGS[@]}" \
"${LABELS[@]}" \
--tls-verify=$TLSVERIFY --no-cache \
--ulimit nofile=4096:4096 \
-f "$dockerfile_path" -t $IMAGE ."
Copy link
Contributor

@chmeliik chmeliik Sep 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Quoting this properly to avoid bugs due to whitespace in BUILDAH_ARGS or LABELS will be... interesting

Something like this could potentially work

buildah_cmd_array=(
    buildah build
    $VOLUME_MOUNTS
    "${BUILDAH_ARGS[@]}"
    "${LABELS[@]}"
    ...
)
buildah_cmd=$(printf "%q " "${buildah_cmd_array[@]}")
unshare ... -- sh -c "ip link set lo up && $buildah_cmd"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fwiw I didn't change any quoting - are you saying it was not quoted properly before, but you want to fix it now?

Copy link
Contributor

@chmeliik chmeliik Sep 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previously, we were executing a command which wasn't quite quoted properly, but the BUILDAH_ARGS and LABELS - those most likely to contain whitespace - were quoted properly.

Now, we're executing sh -c "$huge_string", where the huge_string is not quoted properly compared to before

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok - well I struggled to come up with a way to ensure that

unshare ( a and b) was run
vs
unshare (a) then b

If there is any better way to accomplish it than what I added, I'm open to suggestions. It is easy to tell when things are working properly by adding a line to the dockerfile which lists network devices. If it is running with unshare, you only have 'lo' - but without unshare, you have a second eth adapter.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just so that we're on the same page about the bugs introduced by this patch (i.e. it's not just nitpicking)

This what would happen previously

#!/bin/bash

VOLUME_MOUNTS="--volume /tmp:/foo --volume /tmp:/bar"
BUILDAH_ARGS=(
    --pull=never
    --squash
    --build-arg="FOO=some build arg"
)
LABELS=(
    --label="com.example.org/description=This is a description."
)
TLSVERIFY=true
IMAGE=localhost/test:latest
dockerfile_path=Dockerfile

set -x

buildah build \
  $VOLUME_MOUNTS \
  "${BUILDAH_ARGS[@]}" \
  "${LABELS[@]}" \
  --tls-verify=$TLSVERIFY --no-cache \
  --ulimit nofile=4096:4096 \
  -f "$dockerfile_path" -t $IMAGE .
$ bash working.sh
+ buildah build --volume /tmp:/foo --volume /tmp:/bar --pull=never --squash '--build-arg=FOO=some build arg' '--label=com.example.org/description=This is a description.' --tls-verify=true --no-cache --ulimit nofile=4096:4096 -f Dockerfile -t localhost/test:latest .
STEP 1/4: FROM fedora-minimal:40
...

This what will happen now

#!/bin/bash

VOLUME_MOUNTS="--volume /tmp:/foo --volume /tmp:/bar"
BUILDAH_ARGS=(
    --pull=never
    --squash
    --build-arg="FOO=some build arg"
)
LABELS=(
    --label="com.example.org/description=This is a description."
)
TLSVERIFY=true
IMAGE=localhost/test:latest
dockerfile_path=Dockerfile

COMMAND="ip link set lo up;
    buildah build \
      $VOLUME_MOUNTS \
      "${BUILDAH_ARGS[@]}" \
      "${LABELS[@]}" \
      --tls-verify=$TLSVERIFY --no-cache \
      --ulimit nofile=4096:4096 \
      -f "$dockerfile_path" -t $IMAGE ."

set -x

sh -c "$COMMAND"
$ bash broken.sh
+ sh -c 'ip link set lo up;
    buildah build       --volume /tmp:/foo --volume /tmp:/bar       --pull=never --squash --build-arg=FOO=some build arg       --label=com.example.org/description=This is a description.       --tls-verify=true --no-cache       --ulimit nofile=4096:4096       -f Dockerfile -t localhost/test:latest .'

Error: accepts at most 1 arg(s), received 15

(Notice how the --build-arg and --label arguments are missing quotes)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll try to fix it and send a whole diff

Copy link
Contributor

@chmeliik chmeliik Sep 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also fixes some quoting bugs that were already present

fix.diff (apply with git apply after rebasing on main)
diff --git a/task/buildah/0.2/buildah.yaml b/task/buildah/0.2/buildah.yaml
index 8c8719b2..94aa81ad 100644
--- a/task/buildah/0.2/buildah.yaml
+++ b/task/buildah/0.2/buildah.yaml
@@ -256,10 +256,11 @@ spec:
       )
 
       BUILDAH_ARGS=()
+      UNSHARE_ARGS=()
 
       if [ "${HERMETIC}" == "true" ]; then
         BUILDAH_ARGS+=("--pull=never")
-        UNSHARE_ARGS="--net"
+        UNSHARE_ARGS+=("--net")
         for image in $BASE_IMAGES; do
           unshare -Ufp --keep-caps -r --map-users 1,1,65536 --map-groups 1,1,65536 -- buildah pull $image
         done
@@ -284,10 +285,12 @@ spec:
         BUILDAH_ARGS+=("--skip-unused-stages=false")
       fi
 
+      VOLUME_MOUNTS=()
+
       if [ -f "$(workspaces.source.path)/cachi2/cachi2.env" ]; then
         cp -r "$(workspaces.source.path)/cachi2" /tmp/
         chmod -R go+rwX /tmp/cachi2
-        VOLUME_MOUNTS="--volume /tmp/cachi2:/cachi2"
+        VOLUME_MOUNTS+=(--volume /tmp/cachi2:/cachi2)
         # Read in the whole file (https://unix.stackexchange.com/questions/533277), then
         # for each RUN ... line insert the cachi2.env command *after* any options like --mount
         sed -E -i \
@@ -314,7 +317,7 @@ spec:
       if [ -d "${YUM_REPOS_D_FETCHED}" ]; then
         chmod -R go+rwX ${YUM_REPOS_D_FETCHED}
         mount_point=$(realpath ${YUM_REPOS_D_FETCHED})
-        VOLUME_MOUNTS="${VOLUME_MOUNTS} --volume ${mount_point}:${YUM_REPOS_D_TARGET}"
+        VOLUME_MOUNTS+=(--volume "${mount_point}:${YUM_REPOS_D_TARGET}")
       fi
 
       LABELS=(
@@ -335,12 +338,12 @@ spec:
       if [ -e /activation-key/org ]; then
         cp -r --preserve=mode "$ACTIVATION_KEY_PATH" /tmp/activation-key
         mkdir /shared/rhsm-tmp
-        VOLUME_MOUNTS="${VOLUME_MOUNTS} --volume /tmp/activation-key:/activation-key -v /shared/rhsm-tmp:/etc/pki/entitlement:Z"
+        VOLUME_MOUNTS+=(--volume /tmp/activation-key:/activation-key -v /shared/rhsm-tmp:/etc/pki/entitlement:Z)
         echo "Adding activation key to the build"
 
       elif find /entitlement -name "*.pem" >> null; then
         cp -r --preserve=mode "$ENTITLEMENT_PATH" /tmp/entitlement
-        VOLUME_MOUNTS="${VOLUME_MOUNTS} --volume /tmp/entitlement:/etc/pki/entitlement"
+        VOLUME_MOUNTS+=(--volume /tmp/entitlement:/etc/pki/entitlement)
         echo "Adding the entitlement to the build"
       fi
 
@@ -357,13 +360,19 @@ spec:
       # Prevent ShellCheck from giving a warning because 'image' is defined and 'IMAGE' is not.
       declare IMAGE
 
-      unshare -Uf $UNSHARE_ARGS --keep-caps -r --map-users 1,1,65536 --map-groups 1,1,65536 -w ${SOURCE_CODE_DIR}/$CONTEXT -- buildah build \
-        $VOLUME_MOUNTS \
-        "${BUILDAH_ARGS[@]}" \
-        "${LABELS[@]}" \
-        --tls-verify=$TLSVERIFY --no-cache \
-        --ulimit nofile=4096:4096 \
-        -f "$dockerfile_copy" -t "$IMAGE" .
+      buildah_cmd_array=(
+          buildah build
+          "${VOLUME_MOUNTS[@]}"
+          "${BUILDAH_ARGS[@]}"
+          "${LABELS[@]}"
+          --tls-verify="$TLSVERIFY" --no-cache
+          --ulimit nofile=4096:4096
+          -f "$dockerfile_path" -t "$IMAGE" .
+      )
+      buildah_cmd=$(printf "%q " "${buildah_cmd_array[@]}")
+      command="ip link set lo up && $buildah_cmd"
+
+      unshare -Uf "${UNSHARE_ARGS[@]}" --keep-caps -r --map-users 1,1,65536 --map-groups 1,1,65536 -w "${SOURCE_CODE_DIR}/$CONTEXT" -- sh -c "$command"
 
       container=$(buildah from --pull-never "$IMAGE")
       buildah mount $container | tee /shared/container_path

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks! I wasn't trying to claim that there were no bugs. I really wish that we had a solid way to add per-task unit and integration tests that would clearly identify these kinds of issues. It is something I've been talking to @Dannyb48 about lately.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

I really wish that we had a solid way to add per-task unit and integration tests that would clearly identify these kinds of issues.

@tisutisu is already looking into that, and @lcarva had a proof of concept for it (which is being evaluated as part of the investigation) https://docs.google.com/document/d/1qGOhEa21UzgJUtPd43RZOhEyCrl9tJo5EwRZzFeRofg/edit

@chmeliik
Copy link
Contributor

Have you verified that this does not mess with the network isolation?

@brianwcook
Copy link
Contributor Author

Have you verified that this does not mess with the network isolation?

I did. I added debugging statements to ensure that during my testing only the 'lo' adapter was present (to ensure both commands were running in the unshare env) and I also included things in the Dockerfile which should fail in hermetic mode, and they did.

@openshift-merge-robot
Copy link

PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants