[FR]: Container image layers per-package #244

alexeagle · 2024-01-16T16:32:44Z

What is the current behavior?

Currently our best-effort to make a layered image for Python is three layers: interpreter, site-packages, app:
https://github.com/aspect-build/bazel-examples/blob/main/oci_python_image/py_layer.bzl

However, when a package appears under multiple applications, the site-packages layer for each app will have a duplicate of that package, making the build outputs larger. When pytorch is one such package (it is many gigabytes for the GPU variant) then this problem leads to builds taking minutes just to fetch build inputs from a remote cache.

Describe the feature

We can probably do something with aspects to visit each third-party package and produce a tar file containing its site-packages entry as a distinct layer. Then when we build a container image for an application these tar files should just get re-used rather than being re-built.

The text was updated successfully, but these errors were encountered:

ewianda · 2024-01-31T22:10:53Z

Hi @alexeagle I had been thinking about this problem, but never thought I could use aspect for this. So after seeing your suggestion, I went ahead implemented this feature. But the bummer was that I hit the max limit of number of layers which I believe is 127 for docker

3557592c9b6f: Loading layer [==================================================>]     432B/432B
c7d6f3eb36c2: Loading layer [==================================================>]     794B/794B
956257f1d3a3: Loading layer [==================================================>]  3.183kB/3.183kB
667da44de747: Loading layer [==================================================>]  1.343kB/1.343kB
88a0745cd9e2: Loading layer [==================================================>]   11.6kB/11.6kB
f3af998a1f5e: Loading layer [==================================================>]  3.543kB/3.543kB
9abe9f6a4fae: Loading layer [==================================================>]     906B/906B
3b5c52259645: Loading layer [==================================================>]  1.711MB/1.711MB
710de8cf73d3: Loading layer [===>                                               ]  32.77kB/497.4kB
max depth exceeded

Not sure if you had thoughts about how this will play out.

I can share my implementation may be in

https://github.com/aspect-build/bazel-examples/tree/main/oci_python_image

siddharthab · 2024-02-27T06:48:40Z

A layer per package would be too much for some docker fs implementations (I forget which ones). We typically keep separate image layers for large packages, and bundle all smaller packages together. Something like this:

HEAVY_PIP_PACKAGES = [("\\(torch\\|triton\\)/"), ("nvidia_"), ("xgboost/"), ("\\(jaxlib\\|jax\\)/"), ("scipy/"), ("numpy/")]
PY_INTERPRETER_REGEX = "\\.runfiles/rules_python~[^~/]\\+~python~python_[^/]\\+_x86_64-unknown-linux-gnu"
PIP_PACKAGES_REGEX = "\\.runfiles/rules_python~[^~/]\\+~pip~"
EXTERNAL_PACKAGES_REGEX = "\\.runfiles/[^/]\\+~"

LAYER_FILTER_CMDS = {
    "_app": "grep -v -e '{}' -e '{}' -e '{}' $< >$@".format(PY_INTERPRETER_REGEX, PIP_PACKAGES_REGEX, EXTERNAL_PACKAGES_REGEX),
    "_external": "(grep -e '{}' $< | grep -v -e '{}' -e '{}' >$@) || true".format(EXTERNAL_PACKAGES_REGEX, PIP_PACKAGES_REGEX, PY_INTERPRETER_REGEX),
    "_interpreter": "grep -e '{}' $< >$@".format(PY_INTERPRETER_REGEX),
    "_pip": "(grep -e '{}' $< | grep -v -e '~pip~pip_[^_]\\+_\\({}\\)' >$@) || true".format(PIP_PACKAGES_REGEX, "\\|".join(HEAVY_PIP_PACKAGES)),
} | {
    "_pip_heavy_{}".format(i): "(grep '\\.runfiles/rules_python~[^~/]\\+~pip~pip_[0-9]\\+_{}' $< >$@) || true".format(suffix)
    for i, suffix in enumerate(HEAVY_PIP_PACKAGES)
}

# Keep heavy, slow-changing, layers first.
ORDERED_LAYERS = ["_interpreter", "_external"] + ["_pip_heavy_{}".format(i) for i in range(len(HEAVY_PIP_PACKAGES))] + ["_pip", "_app"]

def py_image(...):
    ...
    for (suffix, cmd) in LAYER_FILTER_CMDS.items():
        native.genrule(
            name = manifest + suffix,
            srcs = [manifest],
            outs = [manifest + suffix + ".spec"],
            cmd = cmd,
            **kwargs
        )
        tar(
            name = layers + suffix,
            srcs = [binary],
            mtree = manifest + suffix,
            **kwargs
        )

    oci_image(
        name = name,
        base = base,
        entrypoint = ["/entrypoint.sh", entrypoint],
        workdir = "/" + entrypoint + ".runfiles/_main",
        tars = [entrypoint_layer] + [
            layers + suffix
            for suffix in ORDERED_LAYERS
        ],
        **kwargs
    )

alexeagle · 2024-02-27T15:00:28Z

@ewianda thanks for that investigation. Since we can only have "a few" layers, I don't have a principled suggestion for this. Some special cases like pytorch seems like the only valid compromise, but it's not a very clean API to ask the user to tell us which packages are like that at the py_image terminal.
Maybe the best we can do for this issue is document the workaround. I hope as the ML folks get to a more steady state they can follow normal software engineering practices and not have a GB package?

siddharthab · 2024-02-27T21:43:57Z

Yes, documentation and guidance are our best bets here. The code that I pasted above can be refactored to accept a list of heavy packages from the user to tease out into separate layers. Anything more (like setting a wheel size limit and automatically computing this list) will need support from other components in the ecosystem.

betaboon · 2024-05-06T10:09:00Z

the nix-ecosystem had a similar problem of limiting the layer-count when building images with layer-per-package.
IIRC the approach described in this blog post by graham is what ended up being implemented.
maybe that's of interested to some here.

alexeagle · 2024-06-16T20:02:20Z

Nice, thanks @betaboon that's indeed very interesting. I dunno when we'll get to it... and it's not a rules_py issue, really more of a rules_oci issue that it should do something smart when there are "too many layers".
(Really it's another docker problem, docker build was so poorly thought out :(

thesayyn · 2024-11-13T22:20:25Z

FWIW, we have a py_image_layer rule now that supports putting the packages into their own layers with layer_groups https://github.com/aspect-build/rules_py/blob/main/docs/py_image_layer.md

thesayyn · 2024-11-13T22:21:53Z

A layer per package would be too much for some docker fs implementations

Exactly, this is essential what's stopping us from splitting all the packages into their own layer, however, it's now possible to put certain packages into their layers to optimize this process. It works well so far.

alexeagle added the enhancement New feature or request label Jan 16, 2024

github-actions bot added the untriaged Requires traige label Jan 16, 2024

aspect-ghbot added this to Open Source Jan 16, 2024

mattem removed the untriaged Requires traige label Apr 29, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[FR]: Container image layers per-package #244

[FR]: Container image layers per-package #244

alexeagle commented Jan 16, 2024

ewianda commented Jan 31, 2024

siddharthab commented Feb 27, 2024 •

edited

Loading

alexeagle commented Feb 27, 2024

siddharthab commented Feb 27, 2024

betaboon commented May 6, 2024

alexeagle commented Jun 16, 2024

thesayyn commented Nov 13, 2024

thesayyn commented Nov 13, 2024

[FR]: Container image layers per-package #244

[FR]: Container image layers per-package #244

Comments

alexeagle commented Jan 16, 2024

What is the current behavior?

Describe the feature

ewianda commented Jan 31, 2024

siddharthab commented Feb 27, 2024 • edited Loading

alexeagle commented Feb 27, 2024

siddharthab commented Feb 27, 2024

betaboon commented May 6, 2024

alexeagle commented Jun 16, 2024

thesayyn commented Nov 13, 2024

thesayyn commented Nov 13, 2024

siddharthab commented Feb 27, 2024 •

edited

Loading