Skip to content
This repository has been archived by the owner on Dec 12, 2024. It is now read-only.

structure testing docker volumes are convoluted and potentially problematic #147

Open
balopat opened this issue Sep 6, 2017 · 4 comments

Comments

@balopat
Copy link
Contributor

balopat commented Sep 6, 2017

I think we might need a bit of a redesign here, couldn't solve it quickly, so I'll document the problems I see.

Currently in our build we have the following participants:

  • the docker-enabled host (in case of GCCB it's a GCE instance, in case of container-builder-local it's my workstation) starts the build steps
  • mvn step is the first (and only) step of our build currently, that spins up the "step-container"
  • that kicks off the maven process, which has an execution phase that runs the structure_test.sh script
  • structure_test.sh downloads ext_run.sh - which was seemingly designed to run from a docker-enabled host directly, not a container - and runs it
  • ext_run.sh spins up a "sibling-container" (as it shares now the daemon with the "step-container") from the built openjdk image

Volumes and directories:

GCCB:

  • host: /workspace contains the project directory (passed in)
  • step-container: /workspace is a volume mounted from the /workspace directory from host (NOTE: in the future this might change to a separate, isolated volume, that is the clone of the /workspace directory so that the steps don't mutate the host filesystem!!)
  • sibling-container:
    • /cfg - that is mounted with -v /workspace/openjdk8/target/classes/.cfg:/cfg in the case of openjdk8
    • /workspace - that is coming from the --workspace argument of ext_run.sh: --workspace /workspace/openjdk8/target/test-classes/openjdk-test-common-out/workspace for example

The problematic part is that both sibling-container volumes work currently because of the coincidental design overlap - host \workspace is directly mounted onto \workspace in the step-container. As soon as this symmetry breaks we have issues.
The reason is that the "sibling-container" is talking to the docker daemon on the host and as such resolves directory based docker volumes from the host.

To see where this reliance already causes problem, let's have a look at the same directories in the container-builder-local case:

  • host: /home/balopat/dev/proj/openjdk-runtime contains the project directory (passed in)
  • step-container: /workspace is a cloned volume mounted from the /home/balopat/dev/proj/openjdk-runtime directory (NOTE: this is intentionally designed like this and it might be the future on GCE too!)
  • sibling-container:
    • /cfg - that is mounted with -v /workspace/openjdk8/target/classes/.cfg:/cfg in the case of openjdk8 ==> ERROR, the directory is empty
    • /workspace - that is coming from the --workspace argument of ext_run.sh: --workspace /workspace/openjdk8/target/test-classes/openjdk-test-common-out/workspace for example ==> ERROR, the directory is empty

Structure tests fail with this setup.

The problem:

  • our current setup doesn't work directly with container-builder-local
  • in the future, if/when GCCB changes the way /workspace is mounted on steps (e.g. an isolated, cloned volume), then we'll fail on GCCB as well.

Not exactly sure about the best way to go. Couple of options:

  • we can modify ext_run.sh to prepare it for running from within a container and pass in explicitly the right volumes
  • we can potentially completely rewire the execution flow and rely more on GCCB/GCCB-local for each step instead of wrapping the whole flow into maven - this way we could use the runtimes-common structure tests as they were designed to be used - as a step in GCCB
@balopat
Copy link
Contributor Author

balopat commented Sep 6, 2017

@aslo @nkubala FYI, let me know what you guys think.

@nkubala
Copy link
Contributor

nkubala commented Sep 6, 2017

Haha, good timing on this one: I'm in the middle of a major redesign of the structure tests as we speak, which will pull the test execution outside of a container. Check out GoogleCloudPlatform/runtimes-common#332 for progress

@balopat
Copy link
Contributor Author

balopat commented Sep 7, 2017

awesome @nkubala, let's see how that shakes out, it looks promising!
Also it seems that based on our conversation the container-builder-local might have support in the future to define workspace as a directory vs the sandbox volume only - that would help as well in our case (b/65417396)

@aslo
Copy link
Contributor

aslo commented Sep 7, 2017

Great work on this analysis! I @nkubala 's change looks promising.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

3 participants