Add custom docker image support #1193

Open
wants to merge 4 commits into
base: master
from

Conversation

@devsnek devsnek commented Nov 24, 2020

Fixes #315

@devsnek devsnek force-pushed the custom-docker-images branch from 403e26b to 34ae07e Compare November 24, 2020 20:11
@devsnek
Author

devsnek commented Nov 24, 2020

cc @jyn514

if this needs tests, where should they go?

@jyn514
Member

jyn514 commented Nov 24, 2020

@devsnek there aren't currently tests for a full build: #822. If you've tested locally that it works, that's good enough, and I'll also test it manually myself before merging.

Member

@jyn514 jyn514 left a comment

Can you change this to use a git dependency of rustwide for now so it's possible to test that it works?

@@ -322,14 +330,18 @@ impl RustwideBuilder {

let local_storage = tempfile::Builder::new().prefix("docsrs-docs").tempdir()?;

let metadata = Metadata::from_crate_root(&build_dir.build_dir())?;
Member

Hmm, are you sure this is right? I think builds happen in a temporary directory, not in the source directory.

Author

I haven't tested it, but Build::host_source_dir returns self.dir.source_dir() so i just made that method public.

Member

You have build_dir here, not source_dir - is that intentional?

Member

Unfortunately this won't work. source_dir() is populated during the run() call (build.rs => prepare.rs), so it's going to be empty where you call it. What we could do instead is to make rustwide's Crate::copy_source_to method public, copy the source to a temporary directory, fetch the metadata from it, and then delete the temporary directory.
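A rough sketch of the copy, read, delete flow described above, using only the standard library. `copy_dir_all` and `with_source_copy` are hypothetical helpers standing in for rustwide's `Crate::copy_source_to` (which the comment says is not public yet), and the closure stands in for the metadata lookup; none of this is the actual rustwide or docs.rs code.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Recursively copies `src` into `dst`, creating `dst` if needed.
fn copy_dir_all(src: &Path, dst: &Path) -> io::Result<()> {
    fs::create_dir_all(dst)?;
    for entry in fs::read_dir(src)? {
        let entry = entry?;
        let target = dst.join(entry.file_name());
        if entry.file_type()?.is_dir() {
            copy_dir_all(&entry.path(), &target)?;
        } else {
            fs::copy(entry.path(), &target)?;
        }
    }
    Ok(())
}

/// Copies the crate source to `scratch`, runs `read` on the copy, and
/// removes `scratch` again whether or not `read` succeeded.
fn with_source_copy<T>(
    source: &Path,
    scratch: &Path,
    read: impl FnOnce(&Path) -> io::Result<T>,
) -> io::Result<T> {
    copy_dir_all(source, scratch)?;
    let result = read(scratch);
    fs::remove_dir_all(scratch)?;
    result
}
```

In docs.rs itself the scratch directory would more likely come from `tempfile`, as in the `local_storage` line of the diff, so cleanup would happen on drop.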

Author

how about a BuildDir::create_or_get_source_dir (or whatever name) which both docs.rs and run() can call so we don't waste cpu cycles moving everything twice

@jyn514 jyn514 added A-builds Area: Building the documentation for a crate S-blocked Status: marked as blocked ❌ on something else such as an RFC or other implementation work. labels Nov 24, 2020
@devsnek devsnek marked this pull request as ready for review November 24, 2020 22:51
@devsnek devsnek force-pushed the custom-docker-images branch from 4806164 to 0c7d7e4 Compare November 26, 2020 20:36
Comment on lines 286 to 287
pub fn docker_image(&self) -> Option<String> {
self.docker_image.clone()
Member

Let's return a borrow here, since it's not dynamic.

Suggested change
pub fn docker_image(&self) -> Option<String> {
self.docker_image.clone()
pub fn docker_image(&self) -> Option<&str> {
self.docker_image.as_deref()
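
For reference, a minimal standalone version of a borrowing accessor. The `Metadata` struct here is a cut-down, hypothetical stand-in with only the one field; `Option::as_deref` is what turns an `Option<String>` into an `Option<&str>`.

```rust
/// Hypothetical stand-in for the metadata struct under review.
struct Metadata {
    docker_image: Option<String>,
}

impl Metadata {
    /// Borrows the configured image name instead of cloning it.
    fn docker_image(&self) -> Option<&str> {
        // `as_deref` maps Option<String> to Option<&str>;
        // `as_ref` alone would yield Option<&String>.
        self.docker_image.as_deref()
    }
}
```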

@jyn514
Member

jyn514 commented Nov 27, 2020

One potential issue I see here is that we'll run out of disk space on the build server really quickly: right now we have 20 GB free (out of 100 GB), and that won't last if every project has its own Dockerfile. The current workaround is to run docker container prune && docker image prune -a about once a week, but if each docker image is a separate tag, I don't think they'll ever be removed.

@devsnek
Author

devsnek commented Nov 27, 2020

@jyn514 i don't know much about docker storage stuff but I was thinking you would probably want to delete the image right after the build.

@jyn514
Member

jyn514 commented Nov 27, 2020

Yup, that would work! Can you add logic for that here? I think docker image rm <image> would work.
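
A small sketch of what that cleanup call could look like with `std::process::Command`. The helper name `docker_image_rm` is made up for illustration; it only builds the command, so the caller decides when to run it and how to handle a failure.

```rust
use std::process::Command;

/// Builds (but does not run) the `docker image rm <image>` invocation.
fn docker_image_rm(image: &str) -> Command {
    let mut cmd = Command::new("docker");
    cmd.args(["image", "rm", image]);
    cmd
}
```

Calling `.status()` on the returned command would run it; checking the exit status matters here, since a failed removal is exactly the case that leaks disk space.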

@devsnek
Author

devsnek commented Nov 27, 2020

@jyn514 does the dummy crate need to be built in the custom docker image?

@jyn514
Member

jyn514 commented Nov 27, 2020

@devsnek I don't quite follow - the dummy crate doesn't configure a docker image, right? So it should be built in the default image, which is https://github.com/rust-lang/crates-build-env.

@@ -424,6 +436,11 @@ impl RustwideBuilder {
build_dir.purge()?;
krate.purge_from_cache(&self.workspace)?;
local_storage.close()?;
if let Some(image) = metadata.docker_image() {
std::process::Command::new("docker")
Member

Suggested change
std::process::Command::new("docker")
if image != "rustops/crates-build-env" {
std::process::Command::new("docker");
}

Otherwise someone could DoS the queue by publishing a bunch of crates with crates-build-env set.

Actually, I guess they could do that anyway, crates-build-env isn't special ... we just use it by default, so it will be redownloaded. But if you publish a bunch of crates in a row with the same image, it will redownload the image for each crate.

@pietroalbini what do you think, is that a threat model worth worrying about? It won't break the server, it will just make builds really slow.

Author

Some other ways to break this:

  • Make an enormous (like 100 GB) docker image
  • Make small images which can download quickly enough to hit docker's new rate limits

@devsnek
Author

devsnek commented Nov 27, 2020

ah yeah that makes sense, I was thinking the cargo.toml of the target crate would somehow be loaded for the dummy crate which makes no sense of course...

@pietroalbini
Member

So, my understanding is there are two problems to solve:

  • We need to ensure the images we download are not too large, to avoid exhausting disk space on the server.
  • We need to ensure we don't hit Docker Hub's rate limits.

The first issue can be fixed with an experimental command, docker manifest inspect, which lets us fetch information about a remote image before downloading it:

pietro@january: ~/tmp$ docker manifest inspect rustops/crates-build-env
{
        "schemaVersion": 2,
        "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
        "config": {
                "mediaType": "application/vnd.docker.container.image.v1+json",
                "size": 4686,
                "digest": "sha256:1a347871fe6e52601397c10649f49781e704fe732ca1194f762c06d62dd2a9fc"
        },
        "layers": [
                {
                        "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
                        "size": 28558714,
                        "digest": "sha256:6a5697faee43339ef8e33e3839060252392ad99325a48f7c9d7e93c22db4d4cf"
                },
                {
                        "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
                        "size": 847,
                        "digest": "sha256:ba13d3bc422b493440f97a8f148d245e1999cb616cb05876edc3ef29e79852f2"
                },
                {
                        "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
                        "size": 162,
                        "digest": "sha256:a254829d9e55168306fd80a49e02eb015551facee9c444d9dce3b26d19238b82"
                },
                {
                        "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
                        "size": 5176,
                        "digest": "sha256:956656aa3a91375bfd9eaf0c2e429fe88ffc8e0da6a415bf0621dfacd74a01a7"
                },
                {
                        "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
                        "size": 2065351348,
                        "digest": "sha256:6dc46121fa2b1fb5b1eb401c30a4b2272b13d9f411a049d0794118cad4e7471e"
                },
                {
                        "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
                        "size": 159,
                        "digest": "sha256:e619300b8e707992732abccb37e3d9950ccb23ab8d4de064d99c802668af5050"
                },
                {
                        "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
                        "size": 670,
                        "digest": "sha256:25fa93f27f0567e5ba7a7ddaed764bb269967125c7e99aece3986415cf0d7ca4"
                }
        ]
}

Then we can sum the size of each layer, and if the total is greater than the limit we abort the build with an error. Otherwise we fetch the image by its config.digest to ensure we're fetching the image we want. The problem is that calling this command consumes yet another Docker Hub rate limit point, which is far from great.
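
As a sketch of the size check, here is the sum over the seven layer sizes from the manifest output above, compared against a caller-supplied limit. `check_image_size` is a hypothetical helper, the limit is whatever value gets agreed on, and parsing the JSON is left out. Note that these are the sizes of the gzip-compressed layers, so the unpacked image on disk will be larger.

```rust
/// Compressed layer sizes in bytes, copied from the manifest output above.
const CRATES_BUILD_ENV_LAYERS: [u64; 7] =
    [28_558_714, 847, 162, 5_176, 2_065_351_348, 159, 670];

/// Returns the total size if it is within `limit`, or an error message otherwise.
fn check_image_size(layer_sizes: &[u64], limit: u64) -> Result<u64, String> {
    let total: u64 = layer_sizes.iter().sum();
    if total > limit {
        Err(format!(
            "image is {} bytes, over the {} byte limit",
            total, limit
        ))
    } else {
        Ok(total)
    }
}
```

For the default image the layers add up to 2,093,917,076 bytes, roughly 2.1 GB compressed.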

To "solve" the rate limiting problem, and to prevent slowing down the queue too much, we could gate the use of custom Docker images behind a false-by-default limit, which we lift only if a crate opens an issue and can't add the dependency to crates-build-env (like the nginx case we had). What do y'all think?

For the implementation side I'd move the size check over to Rustwide, doing a breaking change on the SandboxImage::remote() function to be fn remote(name: &str, size_limit: Option<usize>) -> ....

@@ -424,6 +436,11 @@ impl RustwideBuilder {
build_dir.purge()?;
krate.purge_from_cache(&self.workspace)?;
local_storage.close()?;
if let Some(image) = metadata.docker_image() {
std::process::Command::new("docker")
Member

I'd move this to rustwide's SandboxImage::purge_from_cache().

@jyn514
Member

jyn514 commented Nov 27, 2020

> To "solve" the rate limiting problem, and to prevent slowing down the queue too much, we could gate the use of custom Docker images behind a false-by-default limit, which we lift only if a crate opens an issue and can't add the dependency to crates-build-env (like the nginx case we had). What do y'all think?

I really like this idea, that gets us three things:

  • People aren't using a custom docker image when they don't need to, so crater still benefits from the new dependencies in crates-build-env most of the time
  • We're downloading fewer docker images overall, so it's less strain on our resources
  • We know who requested an override, so a) we can verify the docker image is a reasonable size, and b) if they change the image, we know exactly who to ask pointed questions.

@devsnek can you add a resource limit for docker images that's disabled by default? It would go in Limits, under docbuilder/limits.rs.
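
One possible shape for that gate, as a sketch only. The `Limits` struct below is a hypothetical cut-down version with a single made-up field `custom_docker_image`, not the real struct in docbuilder/limits.rs, and returning an error (rather than silently falling back to the default image) is an assumption about the desired behavior.

```rust
/// Image used when a crate does not ask for a custom one (name taken from the thread).
const DEFAULT_IMAGE: &str = "rustops/crates-build-env";

/// Hypothetical per-crate limits; only the field relevant here is shown.
#[derive(Default)]
struct Limits {
    /// Off unless an override has been granted for the crate.
    custom_docker_image: bool,
}

/// Picks the image to build in, rejecting custom images for crates
/// that have no override.
fn select_image<'a>(limits: &Limits, requested: Option<&'a str>) -> Result<&'a str, String> {
    match requested {
        None => Ok(DEFAULT_IMAGE),
        Some(image) if limits.custom_docker_image => Ok(image),
        Some(image) => Err(format!(
            "custom docker image {} is not enabled for this crate",
            image
        )),
    }
}
```

Deriving `Default` keeps the flag `false` for every crate that has no override.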

@pietroalbini
Member

@jyn514 do you also want to have size limits for images implemented?

@jyn514
Member

jyn514 commented Nov 27, 2020

Sure, size limits seem reasonable - maybe 5 GB to start?

Labels
A-builds Area: Building the documentation for a crate S-blocked Status: marked as blocked ❌ on something else such as an RFC or other implementation work.
Development

Successfully merging this pull request may close these issues.

Build in custom docker image
3 participants