Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

nixos/systemd-boot-builder: Allow characters like before #334473

Conversation

ElvishJerricco
Copy link
Contributor

Description of changes

Reverts #333952

Modify the regex in systemd-boot-builder.py to match specialisation names correctly.

Similar to #333857, but this regex works with boot counting, I've reverted the faulty mitigation that was merged in the meantime, and I've added a test.

Things done

  • Built on platform(s)
    • x86_64-linux
    • aarch64-linux
    • x86_64-darwin
    • aarch64-darwin
  • For non-Linux: Is sandboxing enabled in nix.conf? (See Nix manual)
    • sandbox = relaxed
    • sandbox = true
  • Tested, as applicable:
  • Tested compilation of all packages that depend on this change using nix-shell -p nixpkgs-review --run "nixpkgs-review rev HEAD". Note: all changes have to be committed, also see nixpkgs-review usage
  • Tested basic functionality of all binary files (usually in ./result/bin/)
  • 24.11 Release Notes (or backporting 23.11 and 24.05 Release notes)
    • (Package updates) Added a release notes entry if the change is major or breaking
    • (Module updates) Added a release notes entry if the change is significant
    • (Module addition) Added a release notes entry if adding a new NixOS module
  • Fits CONTRIBUTING.md.

Add a 👍 reaction to pull requests you find important.

@github-actions github-actions bot added 6.topic: nixos Issues or PRs affecting NixOS modules, or package usability issues specific to NixOS 8.has: module (update) This PR changes an existing module in `nixos/` labels Aug 13, 2024
@r-vdp
Copy link
Contributor

r-vdp commented Aug 13, 2024

Should we still assert that specialisation names don't contain + then?

@ElvishJerricco
Copy link
Contributor Author

Hm.... Well, in theory we could modify the regex to something more complicated like (.+?)(\+\d(-\d)?)? to match the name followed by an optional boot count like +2-1, so that + can be in the specialisation name, because the boot loader specification technically says that's allowed. But the actual implementation in systemd-boot would bug out if you had such a boot entry file name; the linked code would see the first +, fail to parse a boot count, and miss the real boot count later in the name.

@r-vdp
Copy link
Contributor

r-vdp commented Aug 13, 2024

Yeah, so I meant, should we not have an assertion in nixos then that fails if a specialisation name has a +, just to avoid anyone ending up in such a situation?

@ElvishJerricco
Copy link
Contributor Author

Well, systemd-boot would only bug out if it should be doing boot counting, but there's a + in the name. So that limits the scope. So maybe we go with the more complicated regex, and have the assertion only if boot counting is enabled. Everything will continue working the way it did before boot counting if you don't have it enabled, including + in the name, because the regex would accept names the same way as before. Though I guess the regex will have to only include the (\+\d(-\d)?)? part if boot counting is enabled, or else we'd break on specialisation names like foo+3.

@ElvishJerricco
Copy link
Contributor Author

Actually, no, regardless of any of this, you just can't have a specialisation name that ends in a valid boot count. That will screw up systemd-boot. So that must actually be a bug we've had all along. So yea, we can use a better regex, and we can assert that with systemd-boot enabled, you're not allowed to have a specialisation whose name ends in a valid boot count.

@ofborg ofborg bot added 10.rebuild-darwin: 0 This PR does not cause any packages to rebuild on Darwin 10.rebuild-linux: 1-10 labels Aug 13, 2024
@@ -65,7 +65,7 @@ def from_path(cls: Type["Entry"], path: Path) -> "Entry":
# Matching nixos*-generation-$number*.conf
rex_generation = re.compile(r"^nixos.*-generation-([0-9]+).*\.conf$")
# Matching nixos*-generation-$number-specialisation-$specialisation_name*.conf
rex_specialisation = re.compile(r"^nixos.*-generation-([0-9]+)-specialisation-([a-zA-Z0-9_]+).*\.conf$")
rex_specialisation = re.compile(r"^nixos.*-generation-([0-9]+)-specialisation-([^+]+).*\.conf$")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let’s be as restrictive as possible here, rather than loosening the grammar. We’ve already gotten bitten by having allowed too much in the past; allowing literally any character other than + is probably setting ourselves up for regret. How about just [a-zA-Z0-9_-]+? (And I guess we should probably match against the counter format as well?)

You said something about specializations starting with generation breaking things, too? Can we check for that?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I was wrong about the generation thing.

We can't be particularly restrictive here, because the whole reason we have the bug in the first place is that specialisation names used to be unrestricted. To maintain compatibility, we have to be as relaxed as possible.

Now, as I said above, I've realized that we've always had problems if your specialisation name ended in a valid boot count, so I'm currently making a change to assert against that. And the other problem is that, because of some faulty systemd code, you can't have a + in your specialisation name if you're using boot counting. So since boot counting is a new feature, I don't mind adding an assertion in that case against having + in the name at all.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, but we’ve learned that we cannot be as permissive as we were and need to break compatibility. Keeping compatibility with arbitrary strings is no longer an option. To avoid further misery, we should be as strict as possible under the constraints of existing reality. We’ve found that disallowing dashes is too far, but I don’t think that means we need to support completely arbitrary strings. After all, anyone who decided + would be a nice character to include in their specialization name already has a problem.

So I think that, rather than clinging to backwards compatibility with something that was way too permissive, we pick the most restrictive thing we can that doesn’t break everyone’s setup. I’d be surprised if there are many people at all with specializations that don’t match [a-zA-Z0-9_-]+, given Nix’s identifier syntax.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Like, there’s no blocker to loosening the regex some more if we find out that it’s really too strict. But as we’re learning now, making things stricter is hard. We have to do it now, so we should make the most of it to minimize future regret.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@emilazy I very strongly disagree. The point of this series of PRs is to fix a bug. It is not to improve the acceptable inputs. What you're describing does not fix the bug. It perpetuates the bug.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can consider adding restrictions at a later time. But I think it needs consideration and thoughtfulness, especially with regards to compatibility and migration. Given that this is a channel blocker, I think it's important to do the most obvious thing quickly, which is the maximally compatible thing.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, I mean, arguably not being able to put boot counts is also a “bug” that started as soon as systemd-boot started looking at that, right? We’re already choosing to perpetuate that bug by fixing that and breaking compatibility. To avoid perpetuating it we’d have to patch systemd-boot or something. So I feel like we’ve already crossed that rubicon and ought to take steps to avoid having to cross it again.

But, how about just using the much more strict regex only if boot counting is on, since it’s going to break naming compatibility and need separate handling anyway?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure but that bug has been around a long time and no one noticed; it's not a brand new regression that we need to fix.

I don't particularly see the point in using a more strict regex if boot counting is on. But admittedly, IMO the real solution to all these problems is to stop using file names as data, and to migrate to a more reasonable separate state file for tracking this stuff.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, with this change we already literally are using a more strict regex if boot counting is on, it’s just only strict about one specific thing we regretted :)

But anyway I don’t want to block the channel further, just voicing my concerns about this coming back to bite us again later.

@ElvishJerricco ElvishJerricco force-pushed the systemd-boot-specialisation-regex branch from 571fe2e to 05e1dd3 Compare August 13, 2024 23:59
@ElvishJerricco
Copy link
Contributor Author

Ok, new changes. No longer reverting the assertion. Just making it much more specific, to account for A) A bug we've apparently always had and never known about, and B) Incompatible behavior with the new boot counting feature.

The regex now matches anything for the specialisation, excluding for any boot count suffix. The aforementioned assertion, along with the fact that a name with a boot count suffix would have errored in the past, should make this both safe and maximally compatible.

@ElvishJerricco
Copy link
Contributor Author

Closed in favor of #334526

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
6.topic: nixos Issues or PRs affecting NixOS modules, or package usability issues specific to NixOS 8.has: module (update) This PR changes an existing module in `nixos/` 10.rebuild-darwin: 0 This PR does not cause any packages to rebuild on Darwin 10.rebuild-linux: 1-10
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants