perf(set_passwords): Run module in Network stage #5395
Cloud-init blocks login until Config stage completes[1] to prevent users from connecting to the instance via ssh prior to user configuration[2]. To enable faster ssh, cloud-init can simply move the set_passwords module sooner in boot. This follows precedent of previous races of this kind[3].

The grub_dpkg module sometimes takes a long time to run, so this should improve time to ssh in that case and some others.

This will make both chpasswd and passwd run earlier in boot than before. This should be safe, since chpasswd and passwd both use PAM, which to my knowledge just requires r/w access to /etc/.

This effectively reverts b3c9b6a and moves the set_passwords module into Network stage.

[1] canonical#2111
[2] https://bugs.launchpad.net/ubuntu/+source/cloud-init/+bug/2013403
[3] https://bugs.launchpad.net/ubuntu/+source/cloud-init/+bug/781101
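For readers unfamiliar with how stage membership is expressed, the reordering described above can be sketched as moving one entry between the per-stage module lists in cloud.cfg (`cloud_init_modules` for the Network stage, `cloud_config_modules` for the Config stage). This is only an illustration, not the actual diff: the helper `move_module` is hypothetical, the module lists are abbreviated, and appending set_passwords to the end of the Network-stage list is an assumption.

```python
import copy


def move_module(cfg, name, src="cloud_config_modules", dst="cloud_init_modules"):
    """Return a copy of cfg with module `name` moved from list `src` to list `dst`.

    Hypothetical helper for illustration only. It handles plain string
    entries and ignores the [module, frequency] list form.
    """
    new_cfg = copy.deepcopy(cfg)
    if name not in new_cfg.get(src, []):
        raise ValueError(f"{name} is not listed under {src}")
    new_cfg[src].remove(name)
    # Assumption: the module is appended, so it runs after users_groups and ssh.
    new_cfg.setdefault(dst, []).append(name)
    return new_cfg


# Abbreviated, illustrative module lists (not the full shipped cloud.cfg).
before = {
    "cloud_init_modules": ["write_files", "users_groups", "ssh"],
    "cloud_config_modules": ["snap", "set_passwords", "grub_dpkg"],
}

after = move_module(before, "set_passwords")
print(after["cloud_init_modules"])
print(after["cloud_config_modules"])
```

With this ordering, set_passwords no longer waits behind snap or grub_dpkg, which is the point of the change.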
LGTM, but since this is reordering services we should get a review from an SRU reviewer.
@rbasak would you please review? I'm not sure whether this would be SRU-able. This is a change in (service ordering) behavior, but the change is gated by a configuration setting change which the user may choose to react to.
LGTM, thanks.
This change would improve boot performance across all clouds, and as such, I would vote to SRU it.
IMHO this is a feature change, the feature being "performance". Feature changes are explicitly permitted under the current exception, and users are expected to be protected by the existing QA process. Wearing my SRU hat, then, I think this is fine. It's certainly worth thinking about how to ensure its correctness and minimise regression risk though. Given that reordering modules carries more risk of unexpected interactions than normal, it probably also falls under the "major changes should be called out in the SRU template" requirement from the exception. I suppose there's a risk that an existing user is relying on the window before the password change occurs, during which the old credentials still work, but I think that's a race condition anyway and not reasonable for us to maintain behaviour for. Perhaps that's also worth mentioning in the SRU documentation.
I agree but am happy that you thought about and documented this! If you haven't already it's worth checking that there's no expected interaction with
I forgot to mention that therefore my conclusion is that most of the regression risk is in the specifics within cloud-init (e.g. how modules interact or what's available to modules in which stage, etc.) rather than in the interaction with the system.
I agree with @aciba90 -- this will be a significant performance improvement that should be SRU'd and I can't think of a situation where one would be relying on cloud-init running set_passwords later.
Thanks for the review @rbasak.
Both before and after this change, cloud-init's use of
Results below:
in another terminal confirm that login is possible
now set pam_nologin
and confirm that pam_nologin is set:
now that we've confirmed no_login is set, try modifying shadow via pam:
All appears to work as expected with
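The "confirm that pam_nologin is set" step can be approximated with a small check like the one below. This is not the command used in the test above, whose output did not survive on this page. It is a sketch that assumes the flag in question is one of the files pam_nologin consults (`/etc/nologin` or `/run/nologin`); the function name `nologin_flags` and its `root` parameter are hypothetical and exist only so the check can be pointed at a scratch directory.

```python
from pathlib import Path

# Files whose presence makes pam_nologin refuse non-root logins.
# /run/nologin is the systemd-era location of /var/run/nologin.
NOLOGIN_PATHS = ("etc/nologin", "run/nologin")


def nologin_flags(root="/"):
    """Return the nologin flag files that currently exist under `root`."""
    root_path = Path(root)
    return [
        str(root_path / rel)
        for rel in NOLOGIN_PATHS
        if (root_path / rel).exists()
    ]


if __name__ == "__main__":
    flags = nologin_flags()
    if flags:
        print("non-root logins are blocked by:", ", ".join(flags))
    else:
        print("no nologin flag file present")
```

An empty list means pam_nologin would not block a login; a non-empty list names the file doing the blocking.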
LGTM!
I also just exercised locale settings (C, C.utf8, ja_JP.utf8, etc.) with a password provided, using passwords with and without unicode characters, to ensure there was no side effect when set-passwords was ordered before
With something like this landing in tip of main, we'll also have to update our quilt patches in stable downstream, which currently revert the upstream patch that changed this behavior, to cope with this changeset. Before we decide to SRU this ordering change, let's also CC @enr0n to ensure he is aware of changes to systemd ordering, in case something interacts with ongoing foundations work.
Thanks for the heads up. There is no ongoing work that I'm aware of which would be affected by this.
@blackboxsw what updates are you thinking of besides a patch apply conflict?
@holmanb I was only thinking that we previously carried a couple of downstream patches that we may need to audit if we decide to SRU this boot-speed feature to stable releases. We still have a couple of outstanding systemd ordering reverts in place to "retain original behavior", and those would likely impact any boot-speed results we hope to see on those stable releases. The patches I'm thinking about, under consideration in the jammy branch, would be
It's possible that there are minor quilt patch conflicts with the above patches that can easily be resolved. It's not necessary that we drop these patches, but it is something we should evaluate if we are pushing towards better boot performance on stable releases.
+1 on the general approach here. We can sort downstream stable behavior related to quilt patches separately when those pull requests are under review.
Cloud-init blocks login until Config stage completes[1] to prevent users from connecting to the instance via ssh prior to user configuration[2]. To enable faster ssh, cloud-init can simply move the set_passwords module sooner in boot. This follows precedent of previous races of this kind[3]; it effectively reverts b3c9b6a while moving the set_passwords module into Network stage.

The snap and grub_dpkg modules take a significant amount of time due to the runtime of external commands. This should improve time to ssh for all users of those modules.

This will make both chpasswd and passwd run earlier in boot than before. This should be safe, since chpasswd and passwd both use PAM, which to my knowledge just requires r/w access to /etc/.

Probably the biggest noticeable effect will be for snapd users, who will no longer have to wait an extra ~13s for snapd to start before they can ssh into the instance.

graphical.target @27.796s
└─multi-user.target @27.796s
  └─snapd.seeded.service @14.658s +13.135s
    └─basic.target @14.207s
      └─sockets.target @14.197s
        └─snapd.socket @14.121s +61ms
          └─sysinit.target @14.006s
            └─cloud-init.service @11.135s +2.506s
              └─systemd-networkd-wait-online.service @9.670s +1.448s
                └─systemd-networkd.service @9.575s +71ms
                  └─network-pre.target @9.561s
                    └─cloud-init-local.service @5.564s +3.983s
                      └─systemd-remount-fs.service @1.581s +150ms
                        └─systemd-fsck-root.service @1.299s +195ms
                          └─systemd-journald.socket @1.011s
                            └─-.mount @880ms
                              └─-.slice @880ms

[1] #2111
[2] https://bugs.launchpad.net/ubuntu/+source/cloud-init/+bug/2013403
[3] https://bugs.launchpad.net/ubuntu/+source/cloud-init/+bug/781101
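To see where the ~13s figure comes from, the critical chain quoted in the commit message can be parsed and ranked by the `+duration` each unit contributes. This is a rough sketch, not part of the change. The helper names `to_seconds` and `slowest_units` are made up for illustration, the sample is a subset of the chain above, and the parser assumes times are written only as `<n>s` or `<n>ms`, as they are in that output.

```python
import re

# A unit name, its activation time (@...), and an optional start-up duration (+...).
LINE_RE = re.compile(
    r"(?P<unit>\S+)\s+@(?P<start>[\d.]+m?s)(?:\s+\+(?P<dur>[\d.]+m?s))?"
)


def to_seconds(value):
    """Convert '13.135s' or '61ms' to seconds as a float."""
    if value.endswith("ms"):
        return float(value[:-2]) / 1000
    return float(value[:-1])


def slowest_units(chain_text, top=3):
    """Return (unit, seconds) pairs for the units with the largest durations."""
    durations = []
    for raw in chain_text.splitlines():
        line = raw.strip().lstrip("└─ ")  # drop the tree-drawing prefix
        match = LINE_RE.match(line)
        if match and match.group("dur"):
            durations.append((match.group("unit"), to_seconds(match.group("dur"))))
    return sorted(durations, key=lambda item: item[1], reverse=True)[:top]


# Subset of the chain quoted above.
SAMPLE = """\
graphical.target @27.796s
└─multi-user.target @27.796s
  └─snapd.seeded.service @14.658s +13.135s
    └─snapd.socket @14.121s +61ms
      └─cloud-init.service @11.135s +2.506s
        └─cloud-init-local.service @5.564s +3.983s
"""

for unit, seconds in slowest_units(SAMPLE):
    print(f"{unit}: {seconds:.3f}s")
```

In the sample, snapd.seeded.service dominates the chain, which matches the wait described in the commit message.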