Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added system stash command #1

Merged
merged 1 commit into from
Jul 23, 2021
Merged

Added system stash command #1

merged 1 commit into from
Jul 23, 2021

Conversation

schaefi
Copy link
Contributor

@schaefi schaefi commented Jun 16, 2021

The system stash command allows to create a container image
from a given root tree. The result can be considered a cache
file for the stackbuild command.

@schaefi schaefi requested a review from davidcassany June 16, 2021 15:59
@schaefi schaefi self-assigned this Jun 16, 2021
@schaefi
Copy link
Contributor Author

schaefi commented Jun 16, 2021

@davidcassany this is a very simple implementation of the stash command to turn a given root tree into a docker container. I'd like to discuss this code and have some questions.

  • If the stash command is called for the purpose to create a root tree cache, I don't think we will ever need a derived container reference, or am I mistaken ?

  • I intentionally did not read an in-tree XML description from the build of the root-tree as I think the information about the author, tag and container name is specific to the creation of the cache and not necessarily connected with the kiwi build that created the root tree. Thoughts ?

  • I think it would be good to optionally register the image in a local registry via podman. I would leave the remote push outside of the plugin code and in the hands of the user though

Feedback welcome

Thanks

@schaefi schaefi requested a review from Conan-Kudo June 16, 2021 16:05
@Conan-Kudo
Copy link
Member

I'll take a look at this a bit later (working on a bunch of things atm), but @iammattcoleman, can you look at this?

@schaefi
Copy link
Contributor Author

schaefi commented Jun 16, 2021

Thanks, I think if people looking at it who might not have the information from the initial conversation it is helpful to share some background information here:

We would like to create two new sub-commands in a plugin called stackbuild

  • stackbuild
  • stash

The idea is to allow a kiwi image build based on the contents of a container which serves as the root-tree to begin with. One use case would be to rebuild an image from a cache. In terms of commands this would look like the following

$ kiwi-ng system build ... --target-dir /tmp/myimage
$ kiwi-ng system stash ... --root /tmp/myimage/build/image-root --target-dir /tmp/caches

... some time later

$ kiwi-ng system stackbuild --container-image cache_name -- ... --target-dir /tmp/myimage-rebuild

stackbuild should be allowed to use any container from some registry, such that the use case is not limited to just caches. The stash command as implemented here just exists to allow the creation of a container from some given root tree. I like to extend the stash command to also do podman import ... such that the local registry has this stash image, but this is not yet implemented as I'd like to have a conversation about the current state and if all that makes sense

Thanks

@davidcassany
Copy link

* If the stash command is called for the purpose to create a root tree cache, I don't think we will ever need a derived container reference, or am I mistaken ?

This is indeed a good question. My idea is that you (as a user) don't need a derived container reference, but you (as a developer) certainly need to take into account caches that might be built on top of another one. So you don't need the reference because in the cli because this is already part of the root tree, hence if any, it already fixed. I'd follow a similar logic as we do for containers build in stackbuild command. The result of a root tree of a stackbuild command should include, together with the XML description, the original container it is based on, this way if after the stackbuild command the user calls stash it has the chance to append on top of the original container an extra layer including only the changes applied during the stackbuild.

Copy link

@davidcassany davidcassany left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the implementation, just some few details.

Regarding the default values I would make them based on the XML values, however this is just a minor detail, I have no strong opinion. I think I could live with almost any default criteria and this current implementation is damn simple, which is a great plus.

However, regarding the container build logic please consider the insights I commented, I believe that with very little changes it can be way more flexible to allow using a local container storage as a target destination, a remote registry, etc.

The same about the derived image concept. If the root tree itself was already based on a pre-existing container we should layer on top of that. We should keep the imported root in <root>/image as an OCI archive and then compute a new layer for the current root-tree on top of that (this is already workflow for derived containers).

kiwi_stackbuild_plugin/tasks/system_stash.py Outdated Show resolved Hide resolved
kiwi_stackbuild_plugin/tasks/system_stash.py Outdated Show resolved Hide resolved
kiwi_stackbuild_plugin/tasks/system_stash.py Outdated Show resolved Hide resolved
kiwi_stackbuild_plugin/tasks/system_stash.py Outdated Show resolved Hide resolved
a former system prepare or build call
--target-dir=<directory>
the target directory to store the container files
--tag=<name>

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably we could use the date as a default and remove the date from the container name.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did this partially like you suggested. I think we can discuss especially the tagging here.

@schaefi
Copy link
Contributor Author

schaefi commented Jul 19, 2021

@davidcassany finally I found the time to refactor the code for the stash command and put on the feedback I got from your side. Also added tests and worked with the stash command to see if it actually makes sense. From my perspective it does as it is right now but probably there are still some questions to be clarified

@schaefi
Copy link
Contributor Author

schaefi commented Jul 19, 2021

This is indeed a good question. My idea is that you (as a user) don't need a derived container reference, but you (as a developer) certainly need to take into account caches that might be built on top of another one

yep and done that way now, every new stash will add a layer on top of an existing one, if present

@schaefi schaefi requested a review from davidcassany July 19, 2021 10:08
@schaefi
Copy link
Contributor Author

schaefi commented Jul 19, 2021

I also thought a list command would be useful to list the available stashes

@schaefi
Copy link
Contributor Author

schaefi commented Jul 19, 2021

@davidcassany And after that I finally reached the state where you have started to work on the XML merging strategy. The way the stash container is created will currently include the image XML configuration from the root-path to stash. Now if one uses this stash for a build of an image I was questioning myself:

  • Treat the stash embedded image configuration as the primary config
    This means you basically build the stash image with modifications and ignore the build image config

  • Treat the build image config as the primary config
    This means the stash config gets overwritten

  • Try some merging strategy between the stash config and the build image config
    At this point I could imagine a full can of worms and unexpected results

At the moment I think the stash should not include the kiwi image description, such that it can be used generically for any build process. This of course can lead to inconsistent systems, e.g selecting a leap stash to build a TW image. I would expect the build process based on incompatible stash roots to fail at the package manager level. From my perspective this is a responsibility of the user to create and re-use stash containers in a workable way. As there are so many pitfalls with pre-defined root environments which we don't care for in kiwi when using "--allow-existing-root" I think we also don't need to try to be clever when the root comes from a container stash.

All this just my thoughts and I'm happy if we can have a conversation about it

Thanks much

@davidcassany
Copy link

davidcassany commented Jul 19, 2021

The way the stash container is created will currently include the image XML configuration from the root-path to stash.

I think this is the right way, so the image data is self contained. So it can be used for reproducibility use cases as I do not have to worry about keeping the original XML (imagine they are part of an evolving git repository, I don't want to be forced to keep the stashed image and its generator commit together). I do not remember if runtime choices are kept, mostly the profile that created the root-tree (probably the ideal case would be to keep the profiled XML), I would not care about specific runtime flags changing repos or stuff like that, those cases are out of scope IMHO.

Now if one uses this stash for a build of an image I was questioning myself:

* Treat the stash embedded image configuration as the primary config
  This means you basically build the stash image with modifications and ignore the build image config

I am not sure I understand what you mean. But I'd never give priority to the stashed XML compared some XML provided by the user.

* Treat the build image config as the primary config
  This means the stash config gets overwritten

I believe this makes sense, if I provide an XML, I want KIWI to make use of it, regardless of whatever there was in the stashed image.

* Try some merging strategy between the stash config and the build image config
  At this point I could imagine a full can of worms and unexpected results

LOL 😆 Yes, I would not consider a merging two XMLs that are compliant with the our schema. This is too complex, profiling could become a real mess. I quickly elaborated on my idea of merging XML below, which follows a different approach. Not that I really believe this is something we need, just as a mental exercise. 😄

At the moment I think the stash should not include the kiwi image description, such that it can be used generically for any build process. This of course can lead to inconsistent systems, e.g selecting a leap stash to build a TW image. I would expect the build process based on incompatible stash roots to fail at the package manager level. From my perspective this is a responsibility of the user to create and re-use stash containers in a workable way. As there are so many pitfalls with pre-defined root environments which we don't care for in kiwi when using "--allow-existing-root" I think we also don't need to try to be clever when the root comes from a container stash.

I slightly disagree, while I agree we should not try to be smart I'd expect a slightly different behavior. Let me consider few cases:

  • Building on top of a stashed image coming from anywhere, no checks, no stashed XML. Clearly the user has to provide an XML otherwise there is no way KIWI can do anything.

  • Building on top of a stashed image created by KIWI and the user provides an XML. In this case the XML provided by the user takes precedence and all the project stored within the stashed image is wiped and import phase (XML, config.sh, etc.)

  • Building on top of a stashed image created by KIWI and the user does not provide an XML. In that case the in image XML is used at the image gets rebuild. Here is the rebuild concept.

These are the use cases I'd consider. For that I'd say the rule is:

  1. If user provides an XML, the stashed XML, if any, is completely ignored.
  2. If user does not provide an XML only stashed images including an XML are possible, otherwise it simply fails after fetching the stashed image.

All this just my thoughts and I'm happy if we can have a conversation about it

Sure lets have a conversation if there are still grey areas. JFYI next Friday I start a three weeks vacation.

My approach on merging strategy

My approach was to define a new schema (based on the artifacts of the current one, so no redefinition of elements) that is less restrictive (does not require description, neither preferences and does not consider profiles). This schema purpose would only be to define descriptions that are not complete, they are only meant to be merged on top of a real valid schema. In fact, the stashed image could be a required attribute of <image> element. So the idea is that the plugin parses using this new type of schema, collects data and applies the data into real fully valid schema found inside the image. Some thing like an structured way of modifying an XMLState instance.

My idea was pretty simple, forget about profiles and distinguish elements by two categories: (i) drop-in elements, (ii) mergeable. And only care about top level elements. For instance <packages> is a drop-in element, the merging strategy for them is dumb, I just add the whole section as there were multiple <packages> sections. Then, <preferences> would be a mergeable section, so that requires specific logic to define what merging means. I got up to the conclusion only preferences and description sections are mergeable and also thought that by removing profiles from the picture turns the idea of merging some <preferences> into something doable. I did not consider going deeper in the three, so no specfic logic merging <type>, keep the stashed one or replace with a new provided one.

My idea was to allow XMLs that could look like:

<image stashed="opensuse/jeos-live" name="appliance_on_top_of_jeos">
  <packages type="image"> <!-- This would be computed as a full new packages section added in the stashed XML -->
     <!-- a list of additional packages -->
  </packages>
</image>

or

<image stashed="opensuse/jeos-live">
   <preferences>
      <!-- This results into a replacement of the whole built `<type>`-->
      <type image="iso"....> <!-- some different type section -->
   </preferences>
</image>

So all of these I think it is reasonable as long as one considers:

  • the <image stashed="..."> XML is not a KIWI compliant XML, it is mapped against a new schema that is slightly different compared the original.
  • the stashed XML is already profiled (it does not include profiles, it is the actual digested XML used for the build)
  • the merging strategy is simple and only up to top level elements and it always results into a full KIWI XML that is validated against the KIWI schema.

Copy link

@davidcassany davidcassany left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@schaefi nice coding 👍 😎

kiwi_stackbuild_plugin/tasks/system_stash.py Show resolved Hide resolved
@schaefi
Copy link
Contributor Author

schaefi commented Jul 20, 2021

@davidcassany Thanks much for your feedback 👍

  • Building on top of a stashed image created by KIWI and the user provides an XML. In this case the XML provided by the user takes precedence and all the project stored within the stashed image is wiped and import phase (XML, config.sh, etc.)

  • Building on top of a stashed image created by KIWI and the user does not provide an XML. In that case the in image XML is used at the image gets rebuild. Here is the rebuild concept.

This makes perfect sense and changes my mind. You are right let's do it that way when building based on a stash image

@schaefi
Copy link
Contributor Author

schaefi commented Jul 20, 2021

My approach on merging strategy

Thanks for the details on the idea. It feels more like an extension strategy rather than a merging strategy ;)
From my perspective this would be nice and can lead to really small descriptions if the stash it is based on provides a good base description to apply the "merging strategy". I also like your coding with the schema extension in kiwi_stackbuild_plugin/schema.rnc. If ok with you I create an issue adding your details provided here and if we go forward we should make sure that nothing of the existing concept code in that area gets lost

Would that be ok ?

@davidcassany
Copy link

It feels more like an extension strategy rather than a merging strategy ;)

Yes, extension might be better suited :)

A an issue for that would nice. Thanks

@schaefi schaefi mentioned this pull request Jul 20, 2021
The system stash command allows to create a container image
from a given root tree. The result can be considered a cache
file for the stackbuild command.
@schaefi schaefi force-pushed the add_stash_command branch from deddb1c to 097d223 Compare July 20, 2021 09:18
@schaefi
Copy link
Contributor Author

schaefi commented Jul 20, 2021

@davidcassany ok I think I have arranged all the information into issues. I did a squash/rebase and think this PR would be ready for a merge. My next step would then be looking in your stackbuild implementation and adapt with regards to the functionality now provided by stash, adding tests and come up with a next PR.

If there is anything we might postpone for later I would create a feature branch and push it there. So making sure that nothing of your work gets lost.

Would that work for you ?

Thanks

@davidcassany
Copy link

Would that work for you ?

Absolutely fine

@schaefi schaefi merged commit de6a49f into master Jul 23, 2021
@schaefi schaefi deleted the add_stash_command branch July 23, 2021 14:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants