Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Decouple layer ordering from layer name with an explicit interface #322

Open
wants to merge 8 commits into
base: main
Choose a base branch
from

Conversation

schneems
Copy link

@schneems schneems commented Jan 13, 2025

@schneems schneems force-pushed the schneems/explicit-layer-ordering branch from 3ab9df2 to 890b675 Compare January 13, 2025 19:37
@buildpack-bot
Copy link
Member

Maintainers,

As you review this RFC please queue up issues to be created using the following commands:

/queue-issue <repo> "<title>" [labels]...
/unqueue-issue <uid>

Issues

(none)

@joeybrown-sf
Copy link
Contributor

Any layers with a duplicate load_order will be evaluated in alphabetic order (the same behavior as today). Any layers without a load_order will be loaded before layers with an explicit order. A load order of 0 will be loaded after layers without an explicit load_order. This default means that any explicitly ordered layers will have their PATH modifications prepended after the default layers, which has the effect of taking precedence.

I think it would be simpler to assign a load_order of 0 to layers that do not explicitly set one. That does mean that all layers with either no explicit load_order or those with an explicit load_order of 0 are treated the same and would be put into alphabetical order. But I think that makes the feature and implementation simpler. What do you think?

@schneems
Copy link
Author

schneems commented Jan 13, 2025

I think it would be simpler to assign a load_order of 0 to layers that do not explicitly set one. That does mean that all layers with either no explicit load_order or those with an explicit load_order of 0 are treated the same and would be put into alphabetical order. But I think that makes the feature and implementation simpler. What do you think?

Short: Sounds good. Making it zero would work if it would help out implementation. I don't have strong reservations or need to differentiate between the two values. Updated.

Long:
I don't have strong feelings about the defaults. Talking out loud a bit:

As I described it, if the spec says that it cannot be a negative number in the TOML then you could implement it by defaulting to -1 and then apply the same ordering logic for layers with a conflict. My one concern with defaulting to 0 would be if people are accidentally using 0 as an explicit ordering value (starting from 0 instead of 1). So if someone was using 0 to mean "after all un-ordered-layers, but the first of the ordered" it wouldn't behave as expected. Though it also seems like 0 would be a "common sense" default and setting it to zero would be the same as saying "apply the unordered logic, but before number 1.

If I was implementing to the spec in Rust and it says "this value must be a positive integer" then would like that I could model that in a struct directly instead of having one representation for "this is compliant with the spec" and then having to transform it to a signed integer for the comparison. I see that benefit.

I'm also realizing that maybe what you're hinting at is that instead of doing it in code, the lifecycle could insert it in the TOML files when it's missing after buildpack execution and then all the tooling could have strong guarantees it would always exist. That makes sense and seems valuable to be consistent. I'm for it

@schneems schneems force-pushed the schneems/explicit-layer-ordering branch from 7a3bdaf to c8881c4 Compare January 23, 2025 19:22
@schneems schneems force-pushed the schneems/explicit-layer-ordering branch from c8881c4 to 3dbe584 Compare January 23, 2025 19:23
We talked about this in the meeting. I'm not 100% certain I'm capturing the original suggestion and might update this with more details, but this is the general idea. I recall another benefit was mentioned for this scheme but didn't capture it and it's slipped my memory.

Signed-off-by: Schneems <[email protected]>
@schneems schneems force-pushed the schneems/explicit-layer-ordering branch from 47b272c to df2500c Compare January 23, 2025 19:31
@schneems
Copy link
Author

Updated after discussion from the working group meeting.

@jabrown85
Copy link
Contributor

The voting has begun and will end on Feb 13th.

```
build = true
launch = true
load_order = 1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After peeking at the code to refresh my memory - the launch.toml for each layer doesn't exist in the final run image.

launcher reads the buildpacks in the order they are written in metadata.toml when processing. It then reads the directories in alphanumeric order within each buildpack directory.

lifecycle would need to persist this information elsewhere, like config/metadata.toml so that launcher could iterate the directories in the specified order.

[[buildpacks]]
id = "<buildpack ID>"
version = "<buildpack version>"
api = "<buildpack API version>"
optional = false
ordered-layers = ["<ruby-layer-1>", "<ruby-layer-3>"]

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the launch.toml for each layer doesn't exist in the final run image.

Minor correction it's Layer Content Metadata that I'm proposing we write to https://github.com/buildpacks/spec/blob/main/buildpack.md#layer-content-metadata-toml. And you're right, it's not persisted to disk (TIL, or rather 4 days ago I learned).

I think that looking at what's there, it would make sense to put it in the config/metadata.toml as I'm assuming the launcher already has to load that into memory to do its thing? The alternative would be to introduce another file. At this stage, I'm not really opinionated on how the information persists to make it to the launcher.

Also to state outloud: Whatever we do, we should make sure that when the order changes (but nothing else)...it produces a different OCI SHA as it will have possibly different behavior. But also we need to be mindful about not busting more than we have to. If this config metadata file is in it's own dir, then I'm assuming it could be in it's own layer which would be tiny and let any other cached (OCI) layers keep living their happy and productive lives.

Co-authored-by: Jesse Brown <[email protected]>
Signed-off-by: Richard Schneeman <[email protected]>
While TOML supports integers, the current implementation details of the CNB project mean that it is converted to JSON and then back to TOML which loses type information https://github.com/buildpacks/lifecycle/issues/884. This issue may make comparing values for equivalence harder. This behavior prevents me from strongly suggesting that the `load_order` key should be a TOML Integer type.

# Alternatives
[alternatives]: #alternatives
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to see an alternative listed

Alternative: Order file in each layer

The individual buildpack author can emit <layer-dir>/CNB_LAUNCH_LAYER_ORDER_INDEX with a numeric value. From an implementation perspective, I think I like it better than launch.toml or a aggregate file at the buildpack directory. Being a file on disk means it doesn't have to be handled in intermediate files or show up in OCI image labels.

I don't know what implications it has on a buildpack author and if there would be any concerns around re-ordering layers and cache busting. I'd love for you weigh in @schneems

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To riff on that request: I would propose $CNB_LAYERS_DIR/<layer>.order and have the value be a positive integer between 0-u64::Max (inclusive) with allowance for whitespace on either side of the number. This fits with some other use cases https://github.com/buildpacks/spec/blob/main/buildpack.md#build.

For my use case it would be $ cat /layers/bundler.order # => 42. I like that this distinguishes that things put inside of the layer dir (i.e. /layers/bundler/<contents> are for the benefit of the application versus metadata and other configuration are for the interface.

Could optionally do <name>.order.integer or <name>.load-order.integer. Though it seems a little odd to write a whole file for only one value.

I don't know what implications it has on a buildpack author and if there would be any concerns around re-ordering layers and cache busting. I'd love for you weigh in @schneems

I think that when the layer is restored, it wouldn't restore the order information, in the same way that cache = true is not persisted on cache load from the toml. From a library author, it's two file writes instead of one, for the platform implementer, it's two file reads instead of one.

I'm pondering on which interface is easier or harder to mess up. I.e. if someone does $ echo 2 >> /layers/wronglayername.load-order.integer or $ echo 2 /layers/<correctlayername>/load-order.integer (by accident) it would be easy to detect if the "wronglayername" doesn't exist, but hard if it does (if the order is input into the incorrect layer name). It would be hard to detect a mis-use like <correctlayername>/load-order.integer as CNB would think it's simply a file they want in that directory.

The last thing to consider is discovery of the buildpack author. If the examples show <layer-name>.toml with a load-order= in it, then It will be easy to see. If it's in a completely different file, I'm not sure we'll bother to update the examples. In general, I like my interfaces to not require me to remember to do two things (even if in this case, I still have to remember to add the load order to the file even if it's toml).

One possible modification (to any scheme) would be to make an order required i.e. error if load-order=0 is not found then fail to build or at least emit a warning. That seems invasive, maybe we wait to see if people get tripped up by forgetting it and consider adding a warning/error later.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was actually suggesting that <layer-dir>/CNB_LAUNCH_LAYER_ORDER_INDEX as layers/bundler/CNB_LAUNCH_LAYER_ORDER_INDEX. I wasn't clear enough in my comment, sorry.

Your comment above mentions you like the inside/outside layer distinction. What is your opinion of per-layer file representation? I like that it travels with the layer and it doesn't bloat the metadata of the image or have the buildpack author having to do 2 things. A cat echo 2 > /layers/bundler/CNB_LOAD_ORDER is a pretty simple interface that doesn't require the layer name matching or anything like that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@schneems what do you think about ^?

I'm not a heavy -1 on <layer>.toml but it ends up places I don't like (Image Label, config/metadata.toml). I quite like the simplicity of each directory having something in it that we can use for ordering at launch time. Having order stored outside the directory requires a list of ordered layers in a single location which feels more complicated and error prone (casing of dir names, length of dir names in toml array, what to do with items that don't have a match, etc).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that when the layer is restored, it wouldn't restore the order information, in the same way that cache = true is not persisted on cache load from the toml.

Is this the desired behavior?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not a heavy -1 on .toml but it ends up places I don't like (Image Label, config/metadata.toml)

I agree, but this seems less bad to me than co-mingling "application stuff" (layer contents) with "lifecycle stuff" (layer order).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that when the layer is restored, it wouldn't restore the order information, in the same way that cache = true is not persisted on cache load from the toml.
Is this the desired behavior?

I think so. Otherwise it would be weird if someone remembered to write the cache/launch/build state, but forgot to update the order number and suddenly you've got something in an unexpected order.

I guess technically in that scenario (talking through it) you could make the argument that erasing the order would also make it default to 0 which could be considered "unexpected." I would guess that it's more expected that "not touching layer numbering" == "layer is now ordered as zero". As opposed to "not touching layer numbering" == "inherit from last known layer numbering." Basically: I think it's easier to reason about if the options are "Explicit or default to zero" rather than "explicit or default to inheritance." I'm open to a counter example though.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jabrown85 I added a section with this alternative, can you review and resolve this conversation? b4f1f91

Copy link
Member

@AidanDelaney AidanDelaney left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My understanding of the export phase is limited. So I'm wondering if we need to add more spec complexity to it. In order to do that I'd like some clarifying examples of where this spec change would actually impact buildpack authors.


- At "build" time, multiple buildpacks can run simultaneously. If a layer is created with `build=true` in its TOML, it will be loaded and visible to buildpacks that run after it.
- At "launch" time. When an image is finalized, if a layer is created with `launch=true`, it is TOML and will be loaded when the image is launched (such as `docker run`).
- Inside of the same buildpack. A large buildpack may generate several layers that depend on each other. For example the `heroku/ruby` buildpack first downloads a ruby executable and places it in a `ruby` layer. It then executes code such as `gem install` and `bundle install` that depend on that ruby executable. It's a buildpack author's responsibility to make sure any environment variable modifications needed at `build` or `launch` time are also tracked and re-exported (or passed) to code running in the same buildpack.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be sufficient to make buildpack authors aware of the "internal buildpack" alphanumeric names? That way the buildpack authors could choose ye-olde-school 01layer and 02layer as names like rc.d files? By making buildpacks authors aware of the ordering could we avoid this RFC change which places a burden of understanding on all buildpack authors?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a small bit of text that makes that solution not possible for all scenarios:

  • The current ordering mechanism requires prefixing a directory to change order. Some binaries cannot be relocated without being rebuilt.

Basically, some binaries are not relocatable so what you're suggesting isn't possible for every buildpack author.

From the ergonomics side: I experimented with tooling to auto-re-order layers as they're written heroku/buildpacks-ruby#384. The result is much more complex than I would have guessed before implementation. I think this is approach is suboptimal for buildpacks written in languages like Rust or Go and is outside the reach for a buildpacks written in bash.

# Definitions
[definitions]: #definitions

It's worth noting the current behavior. [Buildpacks](https://github.com/buildpacks/spec/blob/main/buildpack.md) produce layers on disk. These layers are loaded in order to produce a runnable image by the [platform's lifecycle](https://github.com/buildpacks/spec/blob/main/platform.md). The order in which the layers are loaded can have differing effects on the final image outcome. For example, when a layer prepends a value to the `PATH` environment variable, the last layer to be loaded will "win" if multiple layers prepend a path with the same executable name.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAIK the current implementation https://github.com/buildpacks/lifecycle/blob/2c0f9bd6cda7a25f5510b1b0c86e8a21273d0890/phase/exporter.go#L42 which exports the layers in order of the list of Buildpacks provided to the exporter. So the order of buildpack layers is explicit. As you point out below, this RFC is specifying an explicit order for the layers provided by a single buildpack.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Trying to respond to all comments. I'm not sure what you're trying to communicate here.

So the order of buildpack layers is explicit.

I think you might be responding to my wording choice in the RFC title "Explicit". From the platform perspective the order is specified/defined i.e. it's deterministic and consistent. I was thinking of the zen of python when I wrote it i.e. "Explicit is better than implicit." But implicit isn't quite the right word for the current behavior. It's more that it's ambiguous whether the effect of layer ordering on the name was intentional or not.

Maybe the title should be changed to "Decouple layer ordering from layer name with an explicit interface"? I think that's more precise, I updated the title of the PR.


## Problem statement

- Today a single buildpack can write multiple layers.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or it can write multiple slices. Slices allow explicit ordering through a list of glob patterns (https://buildpacks.io/docs/for-buildpack-authors/how-to/write-buildpacks/create-slice-layers/)

Copy link
Member

@natalieparellano natalieparellano Feb 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But that's only for the application directory, no? (Also, ordering of layers within the image config is different from the order in which the lifecycle/launcher processes env files ...although it's interesting to think about what could happen if we combined these concepts)


- Today a single buildpack can write multiple layers.
- For launch or the next buildpack in build, each layer is evaluated in alphabetic order via the spec.
- This order can differ from the order that layers were written by the buildpack which is surprising and can result in difficult to debug problem.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you provide a concrete example? I understand the problem that layer A of buildpack X could provide /bin/foo and layer B of buildpack X could provide /bin/foo and whichever alias of that command is available at runtime depends on the order of export of the layers of buildpack X. I'm wondering in what concrete situations this has been a problem?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering in what concrete situations this has been a problem?

In Richard's case, he encountered this issue with Ruby apps (iirc due to the Ruby app pattern of having bin-scripts in the app source?).

For me, I ran into this separately with Python. This issue affected our CNB since we need to use a venv (virtual environment), and venvs contain a bin/python script that must shadow the actual python binary on PATH in order for the venv to work as expected.

In an earlier CNB design (prior to switching to venvs), I had the Python binary installed in a layer named python and the app dependencies installed into a layer named dependencies (both layers in the same CNB; we've mostly avoided using micro-buildpacks for a number of reasons). I had to change the layer name to venv to make the layer envs ordered as expected, when I much preferred the naming dependencies.

Whilst I could have switched to eg 01_python and 02_dependencies, this:
(a) would cause all cached layers to be invalidated (both for that specific time, and any other time I want to add layers, if it shuffles the numbering)
(b) means things like app stack traces now expose ~messy implementation details
(c) could break customer apps if they had become dependent on certain layers having certain paths (which ideally they wouldn't, but ...)

I also think there is a possible CNB maintainer discoverability issue - whilst the existing spec does state the layer ordering aspect, it may be easy for someone to refactor a CNB and not realise the impact it had, vs an explicit ordering in the layer config.

- Today a single buildpack can write multiple layers.
- For launch or the next buildpack in build, each layer is evaluated in alphabetic order via the spec.
- This order can differ from the order that layers were written by the buildpack which is surprising and can result in difficult to debug problem.
- The current ordering mechanism requires prefixing a directory to change order. Some binaries cannot be relocated without being rebuilt.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't yet understand this. AFAIK we're saying that a binary at $CNB_LAYERS_DIR/<layer>/bin/foo can't be written to $CNB_LAYERS_DIR/01<layer>/bin/foo?

Copy link
Member

@hone hone Feb 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@AidanDelaney This is the most compelling argument for me. I think you can mitigate with potentially better documentation/awareness for everything else. A concrete example here is that when compiling binaries, you often (sometimes for performance reasons) need to pass --prefix to the configure script. This hardcodes the path in binary itself. If the buildpack author needs to change the order via naming, this prefix configuration will be invalid which may cause the binary to be unable to run. This would mean the buildpack author would need to recompile any binaries in that path.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also commented here #322 (comment)


> - What use cases does it support?

- Re-ordering layers without having to move files on disk.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't yet understand why we need to re-order layers by moving files on disk? For example, the packteo php-composer buildpack creates 3 layers (https://github.com/paketo-buildpacks/php-composer/blob/428c4ce4577621703065c5f5a4f5e99a7c7b4751/packages/contributor.go#L80). If the buildpack wants to explicitly order these layers then it can, as you point out, name them 01composer, 02php-composer-packages and 03php-composer-cache?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't yet understand why we need to re-order layers by moving files on disk?

You suggested prefixing layer names with numbers:

A consequence of that is, you need to move files on disk. For example:

  • A buildpack has two layers 01_expensive_compiled_binary and 02_results_of_command

People deploy with this buildpack. The buildpack author then realizes they need to add another layer and it needs to come first

  • The buildpack now has three layers 01_first, 02_expensive_compiled_binary and 03_results_of_command.

If you want to preserve the contents of the cache of expensive_compiled_binary and results_of_command from prior deploys (which, lets say I do) then the only way to implement the prefix layer name ordering is by moving files on disk from 01_expensive_compiled_binary (from the first deploy) to 02_expensive_compiled_binary (to this current deploy).

Basically, I'm using "move files on disk" as a shorhand to reference the alternative you're suggesting.

## Problem statement

- Today a single buildpack can write multiple layers.
- For launch or the next buildpack in build, each layer is evaluated in alphabetic order via the spec.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the following more precise?

Suggested change
- For launch or the next buildpack in build, each layer is evaluated in alphabetic order via the spec.
- The collection of layers provided by a buildpack is ordered by the directory order of the filesystem on which they are located

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that's (strictly) correct. The lifecycle's ordering could diverge from the order of disks on the filesystem based on how "alphabetic order" is implemented.

The lifecycle/platform spec says

The lifecycle SHALL apply buildpack-provided modifications to the environment as outlined in the Buildpack Interface Specification.

From that link it says:

The lifecycle MUST order all paths provided by a given buildpack alphabetically ascending.

There's no "order of filesystem," in the spec. All ordering is done by the lifecycle, though I do understand that wording was intended to mimic what someone would view as a natural order of files on disk. Perhaps some wording to that effect might help the reader?


- At "build" time, multiple buildpacks can run simultaneously. If a layer is created with `build=true` in its TOML, it will be loaded and visible to buildpacks that run after it.
- At "launch" time. When an image is finalized, if a layer is created with `launch=true`, it is TOML and will be loaded when the image is launched (such as `docker run`).
- Inside of the same buildpack. A large buildpack may generate several layers that depend on each other. For example the `heroku/ruby` buildpack first downloads a ruby executable and places it in a `ruby` layer. It then executes code such as `gem install` and `bundle install` that depend on that ruby executable. It's a buildpack author's responsibility to make sure any environment variable modifications needed at `build` or `launch` time are also tracked and re-exported (or passed) to code running in the same buildpack.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a buildpack author's responsibility to make sure any environment variable modifications needed at build or launch time are also tracked and re-exported (or passed) to code running in the same buildpack.

Could you elaborate on this part? I don't quite follow

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure! Here's an explicit example, it's a bit verbose, but I hope it illustrates what I'm talking about:

Copy link
Author

@schneems schneems Mar 6, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Possibly to add: If my original wording was unclear and you were able to follow my example: It might be helpful to have you re-explain this concept in your own words, that wording might make sense to a newcomer (I'm pretty close to this specific concept already so it's hard for me to spot the gaps in my language).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the example. I guess if I were to restate it I would say, that the lifecycle does some special magic to prepare the environment for each buildpack, and buildpacks that both write and read environment variables to accomplish their work need to replicate the special magic of the lifecycle. Having an explicit ordering makes this all clearer because we can expect lifecycle(build), lifecycle(launch), and buildpack env setting to work "the same" if a buildpack creates layers in the order in which it expects them to be loaded.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if my language is any clearer here, but it makes sense in my head

schneems added 2 commits March 6, 2025 08:55
Signed-off-by: Schneems <[email protected]>
@schneems schneems changed the title Introduce explicit layer ordering RFC Decouple layer ordering from layer name with an explicit interface Mar 6, 2025
@schneems
Copy link
Author

schneems commented Mar 6, 2025

I've responded to all comment threads and acted on all requests for changes. LMK if any other clarifications are needed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants