Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FR]: Container image layers per-package #244

Open
alexeagle opened this issue Jan 16, 2024 · 8 comments
Open

[FR]: Container image layers per-package #244

alexeagle opened this issue Jan 16, 2024 · 8 comments
Labels
enhancement New feature or request

Comments

@alexeagle
Copy link
Member

What is the current behavior?

Currently our best-effort to make a layered image for Python is three layers: interpreter, site-packages, app:
https://github.com/aspect-build/bazel-examples/blob/main/oci_python_image/py_layer.bzl

However, when a package appears under multiple applications, the site-packages layer for each app will have a duplicate of that package, making the build outputs larger. When pytorch is one such package (it is many gigabytes for the GPU variant) then this problem leads to builds taking minutes just to fetch build inputs from a remote cache.

Describe the feature

We can probably do something with aspects to visit each third-party package and produce a tar file containing its site-packages entry as a distinct layer. Then when we build a container image for an application these tar files should just get re-used rather than being re-built.

@alexeagle alexeagle added the enhancement New feature or request label Jan 16, 2024
@github-actions github-actions bot added the untriaged Requires traige label Jan 16, 2024
@ewianda
Copy link

ewianda commented Jan 31, 2024

Hi @alexeagle I had been thinking about this problem, but never thought I could use aspect for this. So after seeing your suggestion, I went ahead implemented this feature. But the bummer was that I hit the max limit of number of layers which I believe is 127 for docker

3557592c9b6f: Loading layer [==================================================>]     432B/432B
c7d6f3eb36c2: Loading layer [==================================================>]     794B/794B
956257f1d3a3: Loading layer [==================================================>]  3.183kB/3.183kB
667da44de747: Loading layer [==================================================>]  1.343kB/1.343kB
88a0745cd9e2: Loading layer [==================================================>]   11.6kB/11.6kB
f3af998a1f5e: Loading layer [==================================================>]  3.543kB/3.543kB
9abe9f6a4fae: Loading layer [==================================================>]     906B/906B
3b5c52259645: Loading layer [==================================================>]  1.711MB/1.711MB
710de8cf73d3: Loading layer [===>                                               ]  32.77kB/497.4kB
max depth exceeded

Not sure if you had thoughts about how this will play out.

I can share my implementation may be in

https://github.com/aspect-build/bazel-examples/tree/main/oci_python_image

@siddharthab
Copy link
Contributor

siddharthab commented Feb 27, 2024

A layer per package would be too much for some docker fs implementations (I forget which ones). We typically keep separate image layers for large packages, and bundle all smaller packages together. Something like this:

HEAVY_PIP_PACKAGES = [("\\(torch\\|triton\\)/"), ("nvidia_"), ("xgboost/"), ("\\(jaxlib\\|jax\\)/"), ("scipy/"), ("numpy/")]
PY_INTERPRETER_REGEX = "\\.runfiles/rules_python~[^~/]\\+~python~python_[^/]\\+_x86_64-unknown-linux-gnu"
PIP_PACKAGES_REGEX = "\\.runfiles/rules_python~[^~/]\\+~pip~"
EXTERNAL_PACKAGES_REGEX = "\\.runfiles/[^/]\\+~"

LAYER_FILTER_CMDS = {
    "_app": "grep -v -e '{}' -e '{}' -e '{}' $< >$@".format(PY_INTERPRETER_REGEX, PIP_PACKAGES_REGEX, EXTERNAL_PACKAGES_REGEX),
    "_external": "(grep -e '{}' $< | grep -v -e '{}' -e '{}' >$@) || true".format(EXTERNAL_PACKAGES_REGEX, PIP_PACKAGES_REGEX, PY_INTERPRETER_REGEX),
    "_interpreter": "grep -e '{}' $< >$@".format(PY_INTERPRETER_REGEX),
    "_pip": "(grep -e '{}' $< | grep -v -e '~pip~pip_[^_]\\+_\\({}\\)' >$@) || true".format(PIP_PACKAGES_REGEX, "\\|".join(HEAVY_PIP_PACKAGES)),
} | {
    "_pip_heavy_{}".format(i): "(grep '\\.runfiles/rules_python~[^~/]\\+~pip~pip_[0-9]\\+_{}' $< >$@) || true".format(suffix)
    for i, suffix in enumerate(HEAVY_PIP_PACKAGES)
}

# Keep heavy, slow-changing, layers first.
ORDERED_LAYERS = ["_interpreter", "_external"] + ["_pip_heavy_{}".format(i) for i in range(len(HEAVY_PIP_PACKAGES))] + ["_pip", "_app"]

def py_image(...):
    ...
    for (suffix, cmd) in LAYER_FILTER_CMDS.items():
        native.genrule(
            name = manifest + suffix,
            srcs = [manifest],
            outs = [manifest + suffix + ".spec"],
            cmd = cmd,
            **kwargs
        )
        tar(
            name = layers + suffix,
            srcs = [binary],
            mtree = manifest + suffix,
            **kwargs
        )

    oci_image(
        name = name,
        base = base,
        entrypoint = ["/entrypoint.sh", entrypoint],
        workdir = "/" + entrypoint + ".runfiles/_main",
        tars = [entrypoint_layer] + [
            layers + suffix
            for suffix in ORDERED_LAYERS
        ],
        **kwargs
    )

@alexeagle
Copy link
Member Author

@ewianda thanks for that investigation. Since we can only have "a few" layers, I don't have a principled suggestion for this. Some special cases like pytorch seems like the only valid compromise, but it's not a very clean API to ask the user to tell us which packages are like that at the py_image terminal.
Maybe the best we can do for this issue is document the workaround. I hope as the ML folks get to a more steady state they can follow normal software engineering practices and not have a GB package?

@siddharthab
Copy link
Contributor

Yes, documentation and guidance are our best bets here. The code that I pasted above can be refactored to accept a list of heavy packages from the user to tease out into separate layers. Anything more (like setting a wheel size limit and automatically computing this list) will need support from other components in the ecosystem.

@mattem mattem removed the untriaged Requires traige label Apr 29, 2024
@betaboon
Copy link

betaboon commented May 6, 2024

the nix-ecosystem had a similar problem of limiting the layer-count when building images with layer-per-package.
IIRC the approach described in this blog post by graham is what ended up being implemented.
maybe that's of interested to some here.

@alexeagle
Copy link
Member Author

Nice, thanks @betaboon that's indeed very interesting. I dunno when we'll get to it... and it's not a rules_py issue, really more of a rules_oci issue that it should do something smart when there are "too many layers".
(Really it's another docker problem, docker build was so poorly thought out :(

@thesayyn
Copy link
Member

FWIW, we have a py_image_layer rule now that supports putting the packages into their own layers with layer_groups https://github.com/aspect-build/rules_py/blob/main/docs/py_image_layer.md

@thesayyn
Copy link
Member

A layer per package would be too much for some docker fs implementations

Exactly, this is essential what's stopping us from splitting all the packages into their own layer, however, it's now possible to put certain packages into their layers to optimize this process. It works well so far.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
Status: No status
Development

No branches or pull requests

6 participants