Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update NoCloud docs #5521

Merged
merged 9 commits into from
Jul 26, 2024
Merged

Conversation

holmanb
Copy link
Member

@holmanb holmanb commented Jul 17, 2024

Proposed Commit Message

doc(NoCloud): Categorize the different configuration types

Formally document providing runtime configuration in system configuration.
Introduce names to identify previously unnamed NoCloud concepts.
Add more structure - discrete sections for:
- runtime configuration types
- discovery configuration
- configuration sources

Additional context

Related to #5515

I'm not opposed to changing the new names if better ones are proposed.

I think that it is important to name and group different classes of configuration to create a more structured model for users (and developers) to think about cloud-init.

@holmanb holmanb requested review from aciba90 and s-makin July 17, 2024 23:37
@holmanb holmanb force-pushed the holmanb/nocloud-config-detect-docs branch 2 times, most recently from dc967ce to 8e4f63e Compare July 18, 2024 06:37
Add more structure to the docs.
Introduce vocabulary to describe core (yet unnamed) cloud-init concepts.
@holmanb holmanb force-pushed the holmanb/nocloud-config-detect-docs branch from 8e4f63e to d30d6a4 Compare July 18, 2024 06:41
@holmanb holmanb changed the title Holmanb/nocloud config detect docs Update NoCloud docs Jul 18, 2024
Comment on lines 19 to 23
Cloud-init discovers four types of configuration at runtime. The source of
these configuration types is configurable with a meta-configuration. This
meta-configuration can be delivered to cloud-init in different ways, but is
different from the configurations that cloud-init uses to configure the
instance at runtime.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found this quite confusing - even after reading it a few times, I'm still a bit lost. I think what it's saying here is:
"There are four configuration types that cloud-init uses to configure the instance at runtime. The source of each type can be set using a meta-configuration, which can be delivered to cloud-init in two ways."
(let me know if I've understood that wrongly)

Copy link
Contributor

@aciba90 aciba90 Jul 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. We are introducing yet another new term meta-configuration, which in https://docs.cloud-init.io/en/latest/explanation/configuration.html#base-configuration is called base-configuration. For the sake of simplicity, could we standardize on only one name?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"There are four configuration types that cloud-init uses to configure the instance at runtime. The source of each type can be set using a meta-configuration, which can be delivered to cloud-init in two ways."
(let me know if I've understood that wrongly)

+1 yes that is the correct understanding - I'm happy to use the wording you've suggested if that is preferred.

I agree. We are introducing yet another new term meta-configuration, which in https://docs.cloud-init.io/en/latest/explanation/configuration.html#base-configuration is called base-configuration.

I wouldn't call these two the same thing. Base configuration maps to what I call "system configuration" in this page. This new term was introduced to describe something that both the "line configuration" and "base configuration" are capable of providing - that is, the source of each type.

For the sake of simplicity, could we standardize on only one name?

I'm happy to cross-link and standardize (either base configuration or system configuration works for me) with the page you linked.

@s-makin @aciba90 thoughts?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

user data, vendor data, metadata, and network config already have their own pages, so I'm not sure we need to re-explain them. Can we instead just link to them and explain where they need to go?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think linking to them makes sense, but the links should come with context, otherwise readers might not know which type they're interested in. I'd be happy for us to leave the subsections as they are to provide this info at the point where users might be looking for it, but with the links added to show where to find more detailed info.

Re: standardizing on terms, I'm always in favour of simplifying and using the minimum amount of jargon possible. I don't have strong opinions on which term is better, so happy to leave that to your discretion, but I agree with Alberto that we should pick one and use that everywhere. When you say "line configuration", I don't know if I know what that means and haven't seen that term used elsewhere. Do you mean passing config via the command line? If so, I'm not sure that "line configuration" describes it in a satisfactory way because it feels like it could mean a few different things.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When you say "line configuration", I don't know if I know what that means and haven't seen that term used elsewhere.

It is a new term. Please let me know if there is something missing from the current written definition.

In cloud-init we have a way of handling a configuration that gets passed either on the kernel command line or SMBIOS. Both sources of line configuration use the same format, but nobody ever bothered to give this configuration format a name, so any time we want to reference it, we have to call it "the configuration format that gets accepted by BIOS/DMI and kernel command line". Using this mouthful of a phrase over and over again is tiring to parse, and if a user is unfamiliar with one of the two modes of providing line configuration then it becomes even more labour intensive to grok the passage that the phrase gets used in.

Do you mean passing config via the command line?

That is one of the ways that a line configuration can be passed, yes, but it is broader than that.

If so, I'm not sure that "line configuration" describes it in a satisfactory way because it feels like it could mean a few different things.

The intent is to have a term that could mean multiple things because a line configuration is a configuration format that can also be provided as a serial number through smbios/DMI.

Copy link
Contributor

@s-makin s-makin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I thought I was adding my comments in a single review, turns out not >< anyway, added a few comments etc

different from the configurations that cloud-init uses to configure the
instance at runtime.

user-data
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For symmetry with vendor-data:

Suggested change
user-data
user-data (optional)
--------------------

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that this symmetry would be nice, however it that's not how it works. This implementation choice was a little odd from a UI perspective, since user-data can actually be empty.

Maybe it would be best to standardize this interface and allow all of them to be optional?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

meta data and user data are sometimes optional depending on the source. It's maybe not worth documenting separately, but e.g., when using the kernel cmdline, I don't think you need either.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

meta data and user data are sometimes optional depending on the source.

Thanks for the clarification @TheRealFalcon, and my apologies @aciba90 you were right too - we were describing the behavior of different sources. I guess I missed that they behave differently for different sources. That's.... not at all intuitive. It looks like wasn't always that way either - in c125394 user-data and meta-data changed from being optional to being required for labeled filesystems and for the local filesystem. Unfortunately the commit message doesn't even mention this behavior change so we have no justification for why these two sources behave differently.

when using the kernel cmdline, I don't think you need either.

Confirmed. On further examination of the code it appears that all four types are optional for the following sources: kernel command line, DMI, system configuration at the top level[1], system configuration nested under the NoCloud key[2], HTTP, FTP.

Unless we think that there is a good reason to require I'm tempted to standardize this behavior so that we can have a more intuitive and user-friendly user-interface. Currently a local datasource that doesn't provide a meta-data file or a user-data file is already on a failure path, so just silently removing this superficial requirement would probably be safe to include in SRU. While this still would technically be required for users of nested system configuration in order for ds-identify to identify the system (in the case that no override is provided ds=nocloud isn't in the line configuration and datasource_list has >1 datasources included), at least cloud-init's Python code wouldn't be imposing a false requirement.

@aciba90 @TheRealFalcon Thoughts?

[1] note, however, that ds-identify nocloud detection is completely broken and always has been, however these configurations will still by applied by cloud-init if None is included in the datasource list (it usually is unless only a single datasource is specified)
[2] however ds-identify nocloud detection fails unless user-data and meta-data are both provided

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm tempted to standardize this behavior so that we can have a more intuitive and user-friendly user-interface.

Just to confirm, you want to try to standardize on making user-data and meta-data optional for every possible ways of configuring NoCloud, right? If that's the case, I think it is worth the implementation and SRU effort.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've filed #5553 to track this.

@blackboxsw blackboxsw added the documentation This Pull Request changes documentation label Jul 24, 2024
Copy link
Contributor

@aciba90 aciba90 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Many thanks for overhauling this page. I really like the separation of configuration sources and meta configurations.

I left some extra inline comment / requests.

In addition to that, regarding user-data, meta-data, vendor-data, network-config. We sometimes refer to them with the - in between the both words e.g. user-data, other times all together e.g., userdata, and other times with a space in between e.g. user data. I know this happens all over the place and it is not exclusive of this document page but, do you think it is worth trying to adhere to only one of those naming styles? I am thinking about users trying to look for those terms.

doc/rtd/reference/datasources/nocloud.rst Outdated Show resolved Hide resolved
doc/rtd/reference/datasources/nocloud.rst Show resolved Hide resolved
Copy link
Member

@TheRealFalcon TheRealFalcon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left some comments inline. Additionally,

We lost the warning at the top of the page.

User data placed under /etc/cloud/ will not be recognized as a source of configuration data by the NoCloud datasource. While it may be acted upon by cloud-init, using DataSourceNone should be preferred.

Was that intentional?

Also, not something you added, but since you're overhauling the page, can we get a Network Configuration section at the bottom? Example meta-data flows right into talking about network config without a new header anywhere. It makes it hard to link to that part of the page.

See the :ref:`network configuration<network_config>` documentation for
information on network configuration formats.

Meta-configuration
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you think about calling this "Discovery configuration"? Meta seems a bit generic to me and might cause confusion when referring to meta data.

Long-term, I don't think it makes sense for this section to live here. Neither form of configuration listed here is specific to nocloud. It also feels a little redundant when reading the Configuration Sources section after this one, but for now I'm not sure we have a better place for it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you think about calling this "Discovery configuration"? Meta seems a bit generic to me and might cause confusion when referring to meta data.

Nice suggestion, thanks!

Copy link
Member Author

@holmanb holmanb Jul 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Long-term, I don't think it makes sense for this section to live here. Neither form of configuration listed here is specific to NoCloud. It also feels a little redundant when reading the Configuration Sources section after this one, but for now I'm not sure that we have a better place for it.

Agreed. Perhaps we could create a new page that generically describes cloud-init's "identify" stage - which this is effectively a configuration for. Then we could link to the new page from the boot stages page.

Aside: I also think that the Example: Creating a disk section probably deserves its own "How to" page that can be linked from this page.

doc/rtd/reference/datasources/nocloud.rst Outdated Show resolved Hide resolved
@holmanb
Copy link
Member Author

holmanb commented Jul 25, 2024

Many thanks for overhauling this page. I really like the separation of configuration sources and meta configurations.

Thanks!

I left some extra inline comment / requests.

I think that I've addressed them all.

In addition to that, regarding user-data, meta-data, vendor-data, network-config. We sometimes refer to them with the - in between the both words e.g. user-data, other times all together e.g., userdata, and other times with a space in between e.g. user data. I know this happens all over the place and it is not exclusive of this document page but, do you think it is worth trying to adhere to only one of those naming styles? I am thinking about users trying to look for those terms.

+1 I've updated these terms to follow the style guide, but I do have misgivings about the guide as it is. I filed a separate issue for those.

@holmanb
Copy link
Member Author

holmanb commented Jul 25, 2024

I left some comments inline. Additionally,

I think that I've addressed them all

We lost the warning at the top of the page.

User data placed under /etc/cloud/ will not be recognized as a source of configuration data by the NoCloud datasource. While it may be acted upon by cloud-init, using DataSourceNone should be preferred.

Was that intentional?

Yes, I forgot to mention that.

I don't think that the warning serves much purpose after this change, since the docs now describe how to include user-data directly in the base configuration for NoCloud - so what reason is there to prefer DataSourceNone? If the user reads this page, they will know that placing user data under /etc/cloud isn't how NoCloud expects to receive its configuration.

Also, not something you added, but since you're overhauling the page, can we get a Network Configuration section at the bottom? Example meta-data flows right into talking about network config without a new header anywhere. It makes it hard to link to that part of the page.

Let me know if the change I've made makes sense - I just renamed to match the example meta-data format.

@holmanb
Copy link
Member Author

holmanb commented Jul 25, 2024

Thanks for the reviews @TheRealFalcon @aciba90! I think I've addressed all comments. Requesting re-review.

@TheRealFalcon
Copy link
Member

If the user reads this page, they will know that placing user data under /etc/cloud isn't how NoCloud expects to receive its configuration.

Yeah, this is much better. I'd still prefer putting some kind of warning or note when talking about system configuration to look to the Source 1 example of doing things rather than trying to put cloud-config directly into files in under /etc/cloud. I'm guessing that people will probably still think "I have something called cloud-config, so it should go in the cloud.cfg", but if you disagree I think it's ok as-is.

Let me know if the change I've made makes sense - I just renamed to match the example meta-data format.

Sorry, I meant before line 439. The page switches from talking about meta data to talking about network config, but never introduces a new section about it.

@TheRealFalcon
Copy link
Member

This isn't your change, so not necessary to change here, but the aliases emphasis strength makes it hard to understand:
image
I initially read it as seedfrom alias: s (without the emphasis) which is kind of hard to parse. Also, what does cloud-config preferred mean?

@holmanb
Copy link
Member Author

holmanb commented Jul 25, 2024

Also, what does cloud-config preferred mean?

Providing multiple ways of accomplishing the same thing is not ideal. Since we're discouraging use of this, we should provide an alternative. That's what this was trying to accomplish.

@holmanb
Copy link
Member Author

holmanb commented Jul 25, 2024

Sorry, I meant before line 439. The page switches from talking about meta data to talking about network config, but never introduces a new section about it.

Gotcha, I've added that. Also I removed the bits about /etc/network/interfaces files because they are not relevant to NoCloud and debian is using netplan in cloud images.

Copy link
Contributor

@aciba90 aciba90 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, many thanks!

@TheRealFalcon
Copy link
Member

TheRealFalcon commented Jul 26, 2024

Also I removed the bits about /etc/network/interfaces files because they are not relevant to NoCloud

It is specific to NoCloud though...
https://github.com/canonical/cloud-init/blob/main/cloudinit/sources/DataSourceNoCloud.py#L224

and debian is using netplan in cloud images.

I'm not sure why Debian is relevant here.

@holmanb
Copy link
Member Author

holmanb commented Jul 26, 2024

and debian is using netplan in cloud images.

I'm not sure why Debian is relevant here.

Debian is relevant because it is one of the three distros that were called out as eni users (Alpine, Debian, and Ubuntu). Since Debian no longer uses eni on cloud images, that leaves only Alpine users that might be using this format on their stock cloud images. Since a shrinking number of distributions are using this as their default distro format, this isn't as relevant as it used to be.

It is specific to NoCloud though...

What I'm proposing is a step in the direction of deprecating ENI as an input configuration format. This format has been called "legacy" for years and is only supported by NoCloud and ConfigDrive - although the docs (erroneously) claim NoCloud and OpenStack.

Do we want to support 3 user-facing input formats indefinitely? I would prefer if we only supported 2 of these formats (or 1 really). I think energy would be better spent on cloud-init's networking stack if we focused users (and developer effort) onto fewer formats. Un-documenting this legacy format is a step in that direction, and if we have agreement on this direction I'm happy to remove the other reference to the eni format from the docs as well (in a followup PR).

@blackboxsw
Copy link
Collaborator

and debian is using netplan in cloud images.

I'm not sure why Debian is relevant here.

Debian is relevant because it is one of the three distros that were called out as eni users (Alpine, Debian, and Ubuntu). Since Debian no longer uses eni on cloud images, that leaves only Alpine users that might be using this format on their stock cloud images. Since a shrinking number of distributions are using this as their default distro format, this isn't as relevant as it used to be.

It is specific to NoCloud though...

What I'm proposing is a step in the direction of deprecating ENI as an input configuration format. This format has been called "legacy" for years and is only supported by NoCloud and ConfigDrive - although the docs (erroneously) claim NoCloud and OpenStack.

Do we want to support 3 user-facing input formats indefinitely? I would prefer if we only supported 2 of these formats (or 1 really). I think energy would be better spent on cloud-init's networking stack if we focused users (and developer effort) onto fewer formats. Un-documenting this legacy format is a step in that direction, and if we have agreement on this direction I'm happy to remove the other reference to the eni format from the docs as well (in a followup PR).

I think we agree with the sentiment that we want to support less formats and spend energy maintaining and growing network v1 and v2. We also discussed that the best path forward would be to allow our code to represent a given feature as deprecated when it encounters that format, allow the deprecation message to suggest the desirable alternative format and provide a reasonable deprecation window before removing the supporting code. Omitting the documentation of deprecated features when we have active deprecation() messaging in our code makes sense.

@holmanb holmanb merged commit 81ef45e into canonical:main Jul 26, 2024
24 checks passed
holmanb added a commit to holmanb/cloud-init that referenced this pull request Aug 2, 2024
…#5521)

Formally document providing runtime configuration in system configuration.
Introduce names to identify previously unnamed NoCloud concepts.
Add more structure - discrete sections for:
- runtime configuration types
- discovery configuration
- configuration sources
holmanb added a commit that referenced this pull request Aug 6, 2024
Formally document providing runtime configuration in system configuration.
Introduce names to identify previously unnamed NoCloud concepts.
Add more structure - discrete sections for:
- runtime configuration types
- discovery configuration
- configuration sources
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation This Pull Request changes documentation
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants