Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

docs: Overhaul user data formats documentation #5551

Merged
merged 10 commits into from
Jul 29, 2024

Conversation

TheRealFalcon
Copy link
Member

Proposed Commit Message

docs: Overhaul user data formats documentation

Fixes GH-4739

Additional Context

I kept (and updated) the part handler piece as I think it should stay here until #4649 is implemented.

I started documenting cloud-config-jsonp, but decided against it and filed #5549 instead. I don't think we should be using it or recommending it to anybody. I also updated the vendor data page accordingly to remove the suggestion there.

There's unfortunately some significant overlap with #5546 . I stole some of the language there because a full-on cherry-pick would have been too hard to merge. If the commit there can be simplified, I'm happy to cherry-pick it and rebase mine on top to give Alberto some credit for his additions.

I'm not married to the format I chose, so if a reviewer has any better ideas, feel free to suggest.

Test Steps

Merge type

  • Squash merge using "Proposed Commit Message"
  • Rebase and merge unique commits. Requires commit messages per-commit each referencing the pull request number (#<PR_NUM>)

@github-actions github-actions bot added the documentation This Pull Request changes documentation label Jul 25, 2024
Copy link
Contributor

@s-makin s-makin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall I really love this change. I think it feels more standardised and definitely a lot more straightforward, and the consistent format for each section in the format page makes it easier to parse the information. Just have one small suggestion, but otherwise lgtm :)

doc/rtd/explanation/format.rst Outdated Show resolved Hide resolved
doc/rtd/explanation/format.rst Show resolved Hide resolved
Copy link
Member

@holmanb holmanb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really like this work! The organization and standardization is much improved. It is much easier to read and adds lots of context.

I left some suggestions and questions inline.


.. code-block::
A user data script is a single shell script to be executed once per instance.
User data scripts are run relatively late in the boot process, after most
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

relatively late in the boot process, after most other cloud-init modules have run.

This is helpful, but still vague. Would it make sense to make this a non-relative statement? I think that it would help if we just tell the user which stage it runs during (with a link to the stage in the boot order page).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do think that our stages go long enough that it makes sense to place these relative to a specific module. I don't think it's confusing info if the user doesn't need it. Let me know if you think it makes sense.

doc/rtd/explanation/format.rst Outdated Show resolved Hide resolved

MIME multi-part archive
=======================

This list of rules is applied to each part of this multi-part file.
| **Header:** Content-Type: multipart/mixed;
| **Content-Type:** multipart/mixed
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm surprised to see this here. Can mime itself contain mime data?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Heh. The header is the content-type declaration...so...it's a bit weird.

=======================

| **Header** n/a
| **Content-Type** n/a
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but given that it's a binary file, it can't be specified anywhere.

doc/rtd/explanation/format.rst Outdated Show resolved Hide resolved
doc/rtd/explanation/format.rst Outdated Show resolved Hide resolved
doc/examples/part-handler.txt Outdated Show resolved Hide resolved
doc/examples/part-handler.txt Show resolved Hide resolved

Begins with: ``#include`` or ``Content-Type: text/x-include-url`` when using
a MIME archive.
* ``type``: The content type of the MIME part
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This format doesn't use a MIME part, does it? Maybe something like this instead?

type: The Content-Type identifier for the type of user data in content
content: The user data configuration

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It translates fairly directly to a mime part that gets generated from the config. I can pull the MIME details out of this section, but IMO it'll make some of the other fields harder to understand.

doc/rtd/explanation/format.rst Outdated Show resolved Hide resolved
Copy link
Contributor

@s-makin s-makin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 from me :)

@aciba90 aciba90 mentioned this pull request Jul 26, 2024
2 tasks
Copy link
Contributor

@aciba90 aciba90 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Many thanks for this effort. It looks great!


* It is run very early in boot, even before the ``cc_bootcmd`` module
* It is run on every boot
* The environment variable ``INSTANCE_ID`` is set to the current instance ID
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

INSTANCE_ID is also provided for cc_bootcmd, could you please remove this item?

@TheRealFalcon
Copy link
Member Author

Not all comments have been addressed, but I wanted to push what I had so far.

--------------
A user data script is a single shell script to be executed once per instance.
User data scripts are run relatively late in the boot process, during the
'modules:final' stage as part of the "cc_scripts_user" module.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know that `modules:final" text that is commonly seen in the log, but I think it would be easier to standardize on the more generic "final" stage, since the "modules" entrypoint is really just an implementation detail that the user shouldn't need to know about.

Also, could we link to the respective stages on the boot order page both here and in the cloud boothook section?

@TheRealFalcon
Copy link
Member Author

I believe the only fix left is to move the part handler stuff. Since that requires a rebase and merge, I figured it'd be best to push first.

@TheRealFalcon
Copy link
Member Author

Most recent commit adds the part handler split. Tbh, I don't love how it's split, but I do think it makes sense to keep information in both places. What do people think of having the example in both locations?

I believe with this commit I have addressed all of the comments.

@@ -211,6 +212,7 @@ scaleway
seedurl
serverurl
setup-keymap
shellscript
Copy link
Member Author

@TheRealFalcon TheRealFalcon Jul 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't love this because in most places this would be a legitimate misspelling, but I couldn't find any way to ignore a spelling for a single line.

Copy link
Member

@holmanb holmanb Jul 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure that I understand, did you mean to say "... would not be a legitimate mispelling"?

I think Chad submitted a PR in the last day or two that ignored spelling for a single line.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure that I understand, did you mean to say "... would not be a legitimate mispelling"?

No. I think that most of the time we want "shell script" and so I think adding this has the potential to silence legitimate misspellings.

Copy link
Member

@holmanb holmanb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@TheRealFalcon I think this is really close. I left a couple more comments.

doc/rtd/explanation/format.rst Outdated Show resolved Hide resolved
doc/rtd/explanation/format.rst Outdated Show resolved Hide resolved
doc/rtd/explanation/format.rst Outdated Show resolved Hide resolved
doc/rtd/explanation/format.rst Outdated Show resolved Hide resolved
doc/rtd/explanation/format.rst Outdated Show resolved Hide resolved
doc/rtd/explanation/format.rst Outdated Show resolved Hide resolved
Content found to be gzip compressed will be uncompressed.
The uncompressed data will then be used as if it were not compressed.
This is typically useful because user data is limited to ~16384 [#]_ bytes.
.. _user_data_formats-mime_archive:

MIME multi-part archive
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How would you feel about ordering the "formats that deal with other user data formats" in order of increasing complexity in the list above and also in the sections starting here?

If that seems reasonable, I would suggest:

Include file
Jinja template
cloud config archive
Mime multi-part archive
Part handler

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but I want to put the mime one first as cloud config archive is framed as an alternative to the mime type.

doc/rtd/explanation/format.rst Outdated Show resolved Hide resolved
doc/rtd/explanation/format.rst Outdated Show resolved Hide resolved
@holmanb holmanb self-assigned this Jul 29, 2024
@TheRealFalcon TheRealFalcon merged commit f9ab856 into canonical:main Jul 29, 2024
24 checks passed
@TheRealFalcon TheRealFalcon deleted the userdata-formats branch July 29, 2024 19:35
holmanb pushed a commit to holmanb/cloud-init that referenced this pull request Aug 2, 2024
holmanb pushed a commit that referenced this pull request Aug 6, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation This Pull Request changes documentation
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants