-
Notifications
You must be signed in to change notification settings - Fork 95
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add diagrams of provisioning methods #3069
Conversation
f9b27c7
to
dde6274
Compare
dde6274
to
07da0d2
Compare
Installer-based provisioning with PXE and HTTP boot are both ready for comments. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of adding the images (which end up in every single guide), can't we use https://docs.asciidoctor.org/diagram-extension/latest/diagram_types/plantuml/ to render them when needed? Especially since we won't use them downstream I'd much prefer that.
@ekohl I can't include them at the moment (see PR Description), but noted. For now, I'd like to focus on getting feedback on the diagrams and use this PR for this purpose. We don't have to merge it. |
7e5be73
to
704c567
Compare
Perhaps a new top level directory is a better place? These kinds of references that aren't part of the docs themselves but still influence the docs |
I don't really want to make things more complicated. I'll add the source files in the existing |
704c567
to
6ac4468
Compare
I've hidden the Puppet stuff from the diagram previews to simplify them. (This can be enabled in All diagrams are now ready for comments. |
8fe95a5
to
e6099b4
Compare
guides/common/images/provisioning-installer-bootdisk-fullhost.png
Outdated
Show resolved
Hide resolved
guides/image-sources/provisioning-installer-bootdisk-fullhost.plantuml
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Impressive work, this will greatly benefit everyone using Foreman.
@Lennonka I would suggest sharing it on community.theforeman.org, the content is huge, and more eyes see more things. I don't know each workflow in detail myself, would feel more confident to get more opinions from the community.
Host -> DHCP : requests the reserved IP | ||
note over Host : boots from the USB/CD/DVD drive | ||
note over Host : OS installer loads | ||
User -> Host : eliminates the USB/CD/DVD drive\n(too soon?) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we don't know the exact step where it happens, do we have to mention it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, they have to do it at some point, otherwise they would be stuck in a loop :D
So we might as well suggest when the earliest good time is.
As it is a necessary step in the process, I would prefer to keep it.
guides/common/images/provisioning-installer-bootdisk-fullhost.png
Outdated
Show resolved
Hide resolved
e6099b4
to
8845e35
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What you're describing is Foreman's orchestration, which is complex. I only looked at one workflow, but I think they're not really correct. For example, I get the impression you think templates are pre-rendered. This is not the case and they're actually all rendered live as requested.
participant "Puppet\nserver" as Puppet | ||
!endif | ||
|
||
note over Foreman : has an image\nwith cloud-init flag |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When I look at the UI then the flag is called User Data
, not cloud-init.
group Template [cloud-init] | ||
Foreman -> Proxy : renders the cloud-init script | ||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Internally Foreman will render the user data template as the first step:
https://github.com/theforeman/foreman/blob/e80d5558767bb713c13264e821122baea577af76/app/models/concerns/orchestration/compute.rb#L63-L66
This rendered template is then sent to the compute resource directly and the whole Proxy is not involved at all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So this is correct then.
Proxy -> DNS : forwards DNS records | ||
!if ($puppet) | ||
Foreman -> Proxy : creates Puppet sign request (autosign enabled) | ||
Proxy -> Puppet : forwards Puppet sign request |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No proxying is happening. The Smart Proxy (depending on implementation) sets up the Puppetserver so it will accept the Certificate Signing Request (CSR) that the client will later present.
In the default Smart Proxy implementation that means Foreman Proxy rewrites autosign.conf
but there's also a different one based on providing a JWT in the CSR.
https://theforeman.org/manuals/3.11/index.html#4.3.7PuppetCA documents both puppetca_hostname_whitelisting
(default) and puppetca_token_whitelisting
on a fairly low level. There's no documentation for this in the new docs nor any documentation on how to set up the more secure token-based implementation. (And on a related note, I noticed I forgot to drop puppetca_puppet_cert
documentation so I opened theforeman/theforeman.org#2168.)
Foreman -> Proxy : renders the cloud-init script | ||
end | ||
Foreman -> Cloud : creates new instance\nwith cloud-init info | ||
Cloud -> Foreman : reports IP address |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Technically it can report IPv4 and IPv6. If both are reported, that also means the DNS record will get both A
and AAAA
records:
https://github.com/theforeman/foreman/blob/e80d5558767bb713c13264e821122baea577af76/app/models/concerns/orchestration/compute.rb#L69-L72
After this will store the instance details that the compute resource provided:
https://github.com/theforeman/foreman/blob/e80d5558767bb713c13264e821122baea577af76/app/models/concerns/orchestration/compute.rb#L73-L74
Not all compute resources provide IPs and some provide MAC addresses. In that case it will also try to reserve an IP:
https://github.com/theforeman/foreman/blob/e80d5558767bb713c13264e821122baea577af76/app/models/concerns/orchestration/compute.rb#L75-L78
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Technically it can report IPv4 and IPv6.
That is not the level of detail we want here.
After this will store the instance details that the compute resource provided
Not all compute resources provide IPs and some provide MAC addresses. In that case it will also try to reserve an IP
I don't understand what it means for the diagram.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is not the level of detail we want here.
It's about singular vs plural.
Cloud -> Foreman : reports IP address | |
Cloud -> Foreman : reports IP addresses |
I don't understand what it means for the diagram.
Probably not for this specific diagram.
group Template [cloud-init] | ||
Foreman -> Proxy : renders the cloud-init script | ||
end | ||
Foreman -> Cloud : creates new instance\nwith cloud-init info |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So this?
Foreman -> Cloud : creates new instance\nwith cloud-init info | |
Foreman -> Cloud : sets up the compute instance\nwith cloud-init info |
Foreman -> Proxy : creates DNS records | ||
Proxy -> DNS : forwards DNS records |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not entirely correct. First the code reference: https://github.com/theforeman/foreman/blob/e80d5558767bb713c13264e821122baea577af76/app/models/concerns/orchestration/dns.rb#L63-L71
It will create all feasible DNS record types. That can mean A
, AAAA
and PTR
(for both v4 and v6), depending on the known IP addresses. So the first part is simplified, but correct.
The second part is IMHO confusing. The Proxy tells the DNS server to create the DNS records. A Foreman Proxy, especially the DNS module, implements the facade pattern. It abstracts away the actual implementation away from Foreman. IMHO it is forwarding the DNS record creation, not the records themselves.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So what wording do you suggest precisely?
Foreman -> Proxy : creates Puppet sign request (autosign enabled) | ||
Proxy -> Puppet : forwards Puppet sign request | ||
!endif | ||
Foreman -> Cloud : starts the instance |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm assuming that means that it's correct.
== Cloud-init script == | ||
|
||
!include prov-initial-configuration.iuml | ||
Host -> Foreman : calls home\n(disables build mode) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the part that can go through the Proxy's template feature. The use of the Foreman Proxy templates module is optional. If that isn't used then cloud-init requests it directly from Foreman.
Host -> Puppet : sends initial facts | ||
Puppet -> Foreman : forwards initial facts | ||
Host -> Puppet : compiles catalog | ||
note over Host : host is configured |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd say this is different, depending on how much detail you want to include.
If the host bootstrapped, we can assume a certificate. Our default templates do not include this, so it first starts with the SSL bootstrap:
- Host -> Puppet : send CSR
- Puppet -> Host : send certificate (if allowed to)
Then a regular Puppet run continues:
- Host -> Puppet : send facts
- Host -> Puppet : request catalog
- Puppet -> Foreman : forward facts
- Puppet -> Foreman : request ENC
- Puppet -> Host : respond with requested catalog
- Host runs catalog
- Host -> Puppet : send report
- Puppet -> Foreman : forward report
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll remake the SSL bootstrap as part of initial configuration, i.e. prov-initial-configuration.iuml
.
Shocking.
Saying that they are not correct without providing correct information is not helpful.
I am aware of that and different template kinds are requested at different times, which I'm trying to convey in the diagrams:
Thanks for clarifying some stuff, though. |
Ah, now I understand the confusion. They are templates and they are rendered, but I was looking at it from a Smart Proxy modules perspective. We have one module called Then we also have PXE templates. These are managed by the To clarify how the correct proxy is related: it's via the subnet on the provisioning interface. On the subnet you can select a proxy to be used for templates (if not, it's retrieved directly from Foreman). You can select a different proxy to be used for TFTP. In practice I'd say it's uncommon for them to point to different proxies. A more common scenario is that you use one feature but not the other and this is the simplest UI to achieve that. Finish templates are yet another thing. That is the SSH provisioning method, which I believe is typically used on image based provisioning without cloud-init. The code for this is: https://github.com/theforeman/foreman/blob/e80d5558767bb713c13264e821122baea577af76/app/models/concerns/orchestration/ssh_provision.rb#L20-L28 (note the priority is 2000, so it happens way after a host is actually created). I very much think this is the effort I wanted to get started when we had the provisioning discussion in Brno back in May. Digging into this and describing it was what I tried to steer towards. That is why I refer to the code a lot: I'm also looking things up myself. Note that for now I've only looked at If you looked at the code, you may also have noticed I only linked to I'll look at your other comments tomorrow, but wanted to get this out of my head before I went to sleep. |
@ekohl Please, remember that UML is about abstraction. I want to make these diagrams as simplified as possible to make them easy to understand for first-time users. Savvy? |
Very cool tool! |
For now I've only had time to do research and haven't had time to distill it into user-friendly diagrams. I work bottom-up, so I start with all the details. As a next step I'd like to build a table with all the possible orchestration actions and put them in a table with:
That will help me verify the diagrams for factual correctness. Right now we're in the middle of branching preparations and due to breakages in nightly I'm behind, so I won't have time to do so this week. |
@ekohl This has been available for comments since July 20th, some pieces even way before that. And now at the last minute when I need to merge it, you are telling me that you have to do reverse engineering, but you don't really have time to do it before the deadline? |
I started a review because you explicitly asked for it in https://community.theforeman.org/t/rfc-diagrams-of-provisioning-workflows/39045, even though I was very busy with unblocking the Foreman 3.12 branching that happens this week. Then I had questions about half the lines in the first diagram I looked at so I took the time to investigate what they should be. I did that because you didn't provide any sources or references how you built these diagrams. After reviewing one diagram I came to the conclusion that I don't trust your research and factual correctness. Just because you are impatient doesn't mean we should just merge it. I don't consider this time sensitive because it's all well established functionality and we haven't had these diagrams for years. I have a really hard time accepting that we need to merge this when there are known concerns. |
Well, it's true that we have time before it gets actually published. I'll explain it to my managers. |
Sharing my research here. First I did some small investigation where the classes are used. At least for core. For now this is just a list of tasks. I'd like this to be converted into a complete UML diagram, similar to what we have for Kafo (https://github.com/theforeman/kafo/blob/master/doc/kafo_run.uml & https://github.com/theforeman/kafo/blob/master/doc/kafo_run.png). The intended audience for that is developers, but it can be a great source for user documentation. Small rake task I wrote that generate everything belowThis is ugly and a hack. Ideally this is extended a bit with additional plugins, such as Katello. A more dynamic way to find the classes (perhaps using task methods: :environment do
classes = [
Orchestration::Compute,
Orchestration::Puppetca,
Orchestration::SSHProvision,
Orchestration::Realm,
Orchestration::DHCP,
Orchestration::DNS,
Orchestration::TFTP,
Orchestration::ExternalIPAM,
]
Row = Struct.new('Row', :priority, :action, :name, :code)
table = []
puts <<~HEADER
# Models with orchestration
## Host::Managed
* `Orchestration::Compute`
* `Orchestration::Puppetca`
* `Orchestration::SSHProvision`
* `Orchestration::Realm`
## Nic::Managed
* `Orchestration::DHCP`
* `Orchestration::DNS`
* `Orchestration::TFTP`
* `Orchestration::ExternalIPAM`
# Overview
<details>
<summary>Extracted code</summary>
HEADER
classes.each do |cls|
methods = cls.instance_methods + cls.private_instance_methods
methods.select { |method| method.end_with?('_create') }.each do |method|
code = "#{cls.name}##{method.name}"
puts "## #{code}"
puts
puts "```ruby"
puts cls.instance_method(method).source
puts "```"
puts
cls.instance_method(method).source.scan(/queue\.create\(.+?\)$/m) do |parameters|
name = parameters.match(/(:name\s*=>|name:)\s*_\(["'](?<name>.+)["']\)/)[:name]
priority = parameters.match(/(:priority\s*=>|priority:)\s*(?<priority>[\d_]+)/)[:priority]
action = parameters.match(/(:action\s*=>|action:)\s*\[self, :(?<action>[A-Za-z_]+)/)[:action]
table << Row.new(priority, action, name, code)
end
end
end
puts "</details>"
puts
puts "| priority | action | name | code |"
puts "|---|---|---|---|"
table.sort_by { |row| row.priority.to_i }.each do |row|
puts "| #{row.priority} | `#{row.action}` | #{row.name} | `#{row.code}` |"
end
end Models with orchestrationHost::Managed
Nic::Managed
OverviewExtracted codeOrchestration::Compute#queue_compute_create def queue_compute_create
if find_image.try(:user_data)
queue.create(:name => _("Render user data template for %s") % self, :priority => 2,
:action => [self, :setUserData])
end
queue.create(:name => _("Set up compute instance %s") % self, :priority => 3,
:action => [self, :setCompute])
if compute_provides?(:ip) || compute_provides?(:ip6)
queue.create(:name => _("Acquire IP addresses for %s") % self, :priority => 4,
:action => [self, :setComputeIP])
end
queue.create(:name => _("Query instance details for %s") % self, :priority => 5,
:action => [self, :setComputeDetails])
if compute_provides?(:mac) && (mac_based_ipam?(:subnet) || mac_based_ipam?(:subnet6))
queue.create(:name => _("Set IP addresses for %s") % self, :priority => 6,
:action => [self, :setComputeIPAM])
end
if compute_attributes && compute_attributes[:start] == '1'
queue.create(:name => _("Power up compute instance %s") % self, :priority => 1000,
:action => [self, :setComputePowerUp])
end
end Orchestration::Puppetca#queue_puppetca_create def queue_puppetca_create
post_queue.create(:name => _("Cleanup PuppetCA certificates for %s") % self, :priority => 51,
:action => [self, :delCertificate])
post_queue.create(:name => _("Enable PuppetCA autosigning for %s") % self, :priority => 55,
:action => [self, :setAutosign])
end Orchestration::SSHProvision#queue_ssh_provision_create def queue_ssh_provision_create
post_queue.create(:name => _("Prepare post installation script for %s") % self, :priority => 2000,
:action => [self, :setSSHProvisionScript])
post_queue.create(:name => _("Wait for %s to come online") % self, :priority => 2001,
:action => [self, :setSSHWaitForResponse])
post_queue.create(:name => _("Configure instance %s via SSH") % self, :priority => 2003,
:action => [self, :setSSHProvision])
end Orchestration::Realm#queue_realm_create def queue_realm_create
queue.create(:name => _("Create realm entry for %s") % self, :priority => 1,
:action => [self, :set_realm])
end Orchestration::DHCP#queue_dhcp_create def queue_dhcp_create
logger.debug "Scheduling new DHCP reservations for #{self}"
queue.create(id: generate_dhcp_task_id("create"), name: _("Create DHCP Settings for %s") % self, priority: 10, action: [self, :set_dhcp]) if dhcp?
end Orchestration::DNS#queue_dns_create def queue_dns_create
logger.debug "Scheduling new DNS entries"
DnsInterface::RECORD_TYPES.each do |record_type|
if dns_feasible?(record_type)
queue.create(:name => _("Create %{type} for %{host}") % {:host => self, :type => dns_class(record_type).human}, :priority => 10,
:action => [self, :set_dns_record, record_type])
end
end
end Orchestration::TFTP#queue_tftp_create def queue_tftp_create
host.operatingsystem.template_kinds.each do |kind|
queue.create(:name => _("Deploy TFTP %{kind} config for %{host}") % {:kind => kind, :host => self}, :priority => 20, :action => [self, :setTFTP, kind])
end
return unless build
queue.create(:name => _("Fetch TFTP boot files for %s") % self, :priority => 25, :action => [self, :setTFTPBootFiles])
end Orchestration::ExternalIPAM#queue_external_ipam_create def queue_external_ipam_create
return unless (external_ipam?(subnet) || external_ipam?(subnet6)) && errors.empty?
logger.debug "Scheduling new IP reservation(s) in external IPAM for #{self}"
queue.create(id: generate_external_ipam_task_id("create", subnet.network_type), name: _("Creating IPv4 in External IPAM for %s") % self, priority: 10, action: [self, :set_add_external_ip, {:ip => ip, :subnet => subnet}]) if ip.present? && subnet.present? && external_ipam?(subnet)
queue.create(id: generate_external_ipam_task_id("create", subnet6.network_type), name: _("Creating IPv6 in External IPAM for %s") % self, priority: 10, action: [self, :set_add_external_ip, {:ip => ip6, :subnet => subnet6}]) if ip6.present? && subnet6.present? && external_ipam?(subnet6)
true
end
|
@ekohl I'm not sure how I'm supposed to use that table. It seems to me that the table introduces too much detail for what I'm trying to achieve and it would make the diagrams even more complicated than they already are. Also, the table doesn't tell me what triggers those actions. |
Feel free to suggest further modifications in a subsequent PR. Thank you for cooperation! |
I have serious concerns over the fact that this PR seems to have been merged without an approval and with an unresolved request for changes. (For the record, I also searched through Jira and Slack to see if I could find an ack of some sorts in downstream-only resources but didn’t find anything convincing.) I’m worried that merging PRs like this one would set a precedent that I would be highly uncomfortable with. From my point of view, the following conventions were not adhered to before this PR got merged:
I am aware of one precedent when this team merged an update without following the conventions and without finalizing a technical review, but that merge was at least preceded by a team conversation about whether everyone was okay with this approach (#555 (comment)) . And as a matter of fact, that strategy did not work out so well: because the content was not intended for publication, it wasn't anyone's priority, no one took the time to work on it, and then it was deleted a couple of years later: 5e75a54 Because of all of the above, I would like to propose reverting these changes until @Lennonka can secure a transparent approval and address the lingering concerns. Additionally, to prevent a similar situation in the future, we could consider updating our merge process to, for example, prevent anyone from merging a PR that has not received proper technical and peer approvals (by requesting at least one formal ack on a PR). I’d like to know what others would think about this, please do let me know (here or elsewhere, publicly or privately). One final note: The fact that I'm sharing my concerns and proposing the above steps to mitigate the situation does not mean I don't appreciate @Lennonka's effort. I do appreciate it. I think this is a significant and impressive initiative that deserves proper attention and due diligence. |
New diagrams to describe provisioning methods in PlantUML (SVG export available).
The diagrams will not be included in the docs for now, but I will include them once I'll document the provisioning methods properly. This classification of provisioning methods is incompatible with the current docs.
Classification of provisioning methods
Installer-based provisioning with network boot
Installer-based provisioning with boot disk
Image-based provisioning
In some cases I had to guess a lot, so don't take my word as gospel and raise your concerns if you think it's wrong.
Please cherry-pick my commits into: N/A (hoping to finish it for 3.12)