docs: Overhaul user data formats documentation #5551

TheRealFalcon · 2024-07-25T14:26:18Z

Proposed Commit Message

docs: Overhaul user data formats documentation

Fixes GH-4739

Additional Context

I kept (and updated) the part handler piece as I think it should stay here until #4649 is implemented.

I started documenting cloud-config-jsonp, but decided against it and filed #5549 instead. I don't think we should be using it or recommending it to anybody. I also updated the vendor data page accordingly to remove the suggestion there.

There's unfortunately some significant overlap with #5546. I stole some of the language there because a full-on cherry-pick would have been too hard to merge. If the commit there can be simplified, I'm happy to cherry-pick it and rebase mine on top to give Alberto some credit for his additions.

I'm not married to the format I chose, so if a reviewer has any better ideas, feel free to suggest.

Test Steps

Merge type

Squash merge using "Proposed Commit Message"
Rebase and merge unique commits. Requires commit messages per-commit each referencing the pull request number (#<PR_NUM>)

s-makin

Overall I really love this change. I think it feels more standardised and definitely a lot more straightforward, and the consistent format for each section in the format page makes it easier to parse the information. Just have one small suggestion, but otherwise lgtm :)

doc/rtd/explanation/format.rst

holmanb

I really like this work! The organization and standardization are much improved. It is much easier to read and adds lots of context.

I left some suggestions and questions inline.

holmanb · 2024-07-25T17:53:39Z

doc/rtd/explanation/format.rst


-.. code-block::
+A user data script is a single shell script to be executed once per instance.
+User data scripts are run relatively late in the boot process, after most


relatively late in the boot process, after most other cloud-init modules have run.

This is helpful, but still vague. Would it make sense to make this a non-relative statement? I think that it would help if we just tell the user which stage it runs during (with a link to the stage in the boot order page).

I do think that our stages go long enough that it makes sense to place these relative to a specific module. I don't think it's confusing info if the user doesn't need it. Let me know if you think it makes sense.

holmanb · 2024-07-25T18:13:47Z

doc/rtd/explanation/format.rst


 MIME multi-part archive
 =======================

-This list of rules is applied to each part of this multi-part file.
+| **Header:** Content-Type: multipart/mixed;
+| **Content-Type:** multipart/mixed


I'm surprised to see this here. Can mime itself contain mime data?

Heh. The header is the content-type declaration...so...it's a bit weird.
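
To make the point above concrete, here is a minimal sketch using only Python's standard `email` package. The `RAW` string, its `BOUNDARY` marker, and the two parts are made up for illustration; they are not taken from the cloud-init docs. It shows that the outer message's first header is itself the `multipart/mixed` content-type declaration, while each nested part carries its own `Content-Type`.

```python
import email

# Hypothetical user data: an outer multipart/mixed wrapper with two text parts.
RAW = """\
Content-Type: multipart/mixed; boundary="BOUNDARY"
MIME-Version: 1.0

--BOUNDARY
Content-Type: text/cloud-config; charset="us-ascii"

#cloud-config
packages:
  - htop

--BOUNDARY
Content-Type: text/x-shellscript; charset="us-ascii"

#!/bin/sh
echo hello

--BOUNDARY--
"""

message = email.message_from_string(RAW)

# The outer container reports multipart/mixed; the leaves report their own types.
outer_type = message.get_content_type()
part_types = [
    part.get_content_type()
    for part in message.walk()
    if not part.is_multipart()
]
```

Here the "header" of the archive and its content type are the same line, which is why listing both in the docs looks redundant.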

holmanb · 2024-07-25T18:16:45Z

doc/rtd/explanation/format.rst

+=======================
+
+| **Header** n/a
+| **Content-Type** n/a


Isn't there a content type?

Yes, but given that it's a binary file, it can't be specified anywhere.
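
Assuming the section under discussion is the gzip-compressed one, the sketch below illustrates why there is no textual header to document: the only marker is the two-byte gzip magic number at the start of the binary stream. The helper name `maybe_decompress` is hypothetical and this is not cloud-init's actual detection code, only an illustration with the standard `gzip` module.

```python
import gzip

# Every gzip stream starts with these two bytes (RFC 1952).
GZIP_MAGIC = b"\x1f\x8b"


def maybe_decompress(blob: bytes) -> bytes:
    """Return the uncompressed payload if blob looks like gzip, else blob unchanged."""
    if blob.startswith(GZIP_MAGIC):
        return gzip.decompress(blob)
    return blob


# Illustrative user data; compressing it removes the readable "#cloud-config" header.
plain = b"#cloud-config\npackages:\n  - htop\n"
compressed = gzip.compress(plain)
restored = maybe_decompress(compressed)
```

After decompression, the payload is handled like any other user data, so its own header (`#cloud-config` here) becomes visible again.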

doc/rtd/explanation/format.rst

doc/examples/part-handler.txt

holmanb · 2024-07-25T19:08:48Z

doc/rtd/explanation/format.rst


-Begins with: ``#include`` or ``Content-Type: text/x-include-url``  when using
-a MIME archive.
+* ``type``: The content type of the MIME part


This format doesn't use a MIME part, does it? Maybe something like this instead?

type: The Content-Type identifier for the type of user data in content
content: The user data configuration

It translates fairly directly to a mime part that gets generated from the config. I can pull the MIME details out of this section, but IMO it'll make some of the other fields harder to understand.
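
As a rough illustration of that translation, the sketch below turns a list of `type`/`content` entries into a MIME message with Python's standard `email.mime` classes. The function name `entries_to_mime` and the sample entries are hypothetical, and restricting entries to `text/*` types is a simplifying assumption; cloud-init's real conversion may differ.

```python
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

# Hypothetical archive entries shaped like the "type" and "content" fields above.
archive_entries = [
    {"type": "text/cloud-config", "content": "#cloud-config\npackages:\n  - htop\n"},
    {"type": "text/x-shellscript", "content": "#!/bin/sh\necho hello\n"},
]


def entries_to_mime(entries):
    """Build a multipart/mixed message with one text part per archive entry."""
    outer = MIMEMultipart()  # the subtype defaults to "mixed"
    for entry in entries:
        maintype, _, subtype = entry["type"].partition("/")
        if maintype != "text" or not subtype:
            # Assumption for this sketch: only text/* entries are supported.
            raise ValueError(f"unsupported type: {entry['type']!r}")
        outer.attach(MIMEText(entry["content"], subtype))
    return outer


mime_message = entries_to_mime(archive_entries)
generated_types = [part.get_content_type() for part in mime_message.get_payload()]
```

Each entry's `type` ends up as the `Content-Type` of the generated part, which is the relationship the `type` field description is trying to convey.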

doc/rtd/explanation/format.rst

s-makin

+1 from me :)

aciba90

Many thanks for this effort. It looks great!

aciba90 · 2024-07-26T09:40:30Z

doc/rtd/explanation/format.rst

+
+* It is run very early in boot, even before the ``cc_bootcmd`` module
+* It is run on every boot
+* The environment variable ``INSTANCE_ID`` is set to the current instance ID


INSTANCE_ID is also provided for cc_bootcmd, could you please remove this item?

TheRealFalcon · 2024-07-26T19:43:36Z

Not all comments have been addressed, but I wanted to push what I had so far.

holmanb · 2024-07-26T19:51:41Z

doc/rtd/explanation/format.rst

--------------
+A user data script is a single shell script to be executed once per instance.
+User data scripts are run relatively late in the boot process, during the
+'modules:final' stage as part of the "cc_scripts_user" module.


I know that the "modules:final" text is commonly seen in the log, but I think it would be easier to standardize on the more generic "final" stage, since the "modules" entrypoint is really just an implementation detail that the user shouldn't need to know about.

Also, could we link to the respective stages on the boot order page both here and in the cloud boothook section?

TheRealFalcon · 2024-07-26T20:45:14Z

I believe the only fix left is to move the part handler stuff. Since that requires a rebase and merge, I figured it'd be best to push first.

TheRealFalcon · 2024-07-26T21:12:51Z

Most recent commit adds the part handler split. Tbh, I don't love how it's split, but I do think it makes sense to keep information in both places. What do people think of having the example in both locations?

I believe with this commit I have addressed all of the comments.

TheRealFalcon · 2024-07-26T21:34:36Z

doc/rtd/spelling_word_list.txt

@@ -211,6 +212,7 @@ scaleway
 seedurl
 serverurl
 setup-keymap
+shellscript


I don't love this because in most places this would be a legitimate misspelling, but I couldn't find any way to ignore a spelling for a single line.

Not sure that I understand, did you mean to say "... would not be a legitimate misspelling"?

I think Chad submitted a PR in the last day or two that ignored spelling for a single line.

Not sure that I understand, did you mean to say "... would not be a legitimate misspelling"?

No. I think that most of the time we want "shell script" and so I think adding this has the potential to silence legitimate misspellings.

holmanb

@TheRealFalcon I think this is really close. I left a couple more comments.

holmanb · 2024-07-26T23:40:18Z

doc/rtd/explanation/format.rst

-Content found to be gzip compressed will be uncompressed.
-The uncompressed data will then be used as if it were not compressed.
-This is typically useful because user data is limited to ~16384 [#]_ bytes.
+.. _user_data_formats-mime_archive:

 MIME multi-part archive


How would you feel about ordering the "formats that deal with other user data formats" in order of increasing complexity in the list above and also in the sections starting here?

If that seems reasonable, I would suggest:

1. Include file
2. Jinja template
3. Cloud config archive
4. MIME multi-part archive
5. Part handler

Yes, but I want to put the mime one first as cloud config archive is framed as an alternative to the mime type.

github-actions bot added the documentation This Pull Request changes documentation label Jul 25, 2024

s-makin reviewed Jul 25, 2024

holmanb mentioned this pull request Jul 25, 2024

#cloud-config-jsonp is almost completely useless #5549

Open

holmanb requested changes Jul 25, 2024

holmanb reviewed Jul 25, 2024

TheRealFalcon force-pushed the userdata-formats branch from b06010a to 80e8a13 Compare July 25, 2024 19:16

s-makin approved these changes Jul 26, 2024

aciba90 mentioned this pull request Jul 26, 2024

Doc drop in 3rd party modules #5548

Merged

2 tasks

aciba90 reviewed Jul 26, 2024

holmanb reviewed Jul 26, 2024

TheRealFalcon added 8 commits July 26, 2024 15:46

docs: Overhaul user data formats documentation

9dd70e1

update spelling list

7eacb07

Use bold rather than subheadings

3285054

hopefully less confusing header/content-type

fc3c056

comments

15f4c1c

bring back section headers

48533e9

more comments

0dcf106

split the part handler stuff

884d1a7

TheRealFalcon force-pushed the userdata-formats branch from a59274d to 884d1a7 Compare July 26, 2024 21:10

TheRealFalcon requested a review from holmanb July 26, 2024 21:12

fix lint

b70e716

TheRealFalcon commented Jul 26, 2024

holmanb reviewed Jul 26, 2024

comments

22667de

TheRealFalcon requested a review from holmanb July 29, 2024 17:41

holmanb approved these changes Jul 29, 2024

holmanb self-assigned this Jul 29, 2024

TheRealFalcon merged commit f9ab856 into canonical:main Jul 29, 2024
24 checks passed

TheRealFalcon deleted the userdata-formats branch July 29, 2024 19:35

holmanb pushed a commit to holmanb/cloud-init that referenced this pull request Aug 2, 2024

docs: Overhaul user data formats documentation (canonical#5551)

72c342d

Fixes canonicalGH-4739

holmanb pushed a commit that referenced this pull request Aug 6, 2024

docs: Overhaul user data formats documentation (#5551)

79174b6

Fixes GH-4739
