Remove impractical statement about backports #9188

Closed
wants to merge 1 commit into from

edolstra
Member

Motivation

It has never been our policy to do backports for all intermediate versions. It goes against a rapid release scheme: if we release every few weeks, it's impractical to have to do backports against a gazillion releases between the latest stable version used by NixOS and the current Nix release.

Context

Priorities

Add 👍 to pull requests you find important.

@grahamc
Member

grahamc commented Oct 19, 2023

I think it'd be great if there was tighter collaboration with the NixOS maintenance teams on this, too. For example, setting up an agreed schedule to migrate new Nix releases available on nixpkgs's master branch into the unstable version, and moving it to stable a week or two later. From there, that can be reverted based on triage of problems people are seeing in the field.

One "trap" in this current system is there's no real commitment or agreement about how long these versions are "supported" and people are accidentally bound on both sides of this. It'd be great to increase communication and reduce the friction here.

For clarity, Nixpkgs master has the following releases:

  • 2.3.16
  • 2.10.3
  • 2.11.1
  • 2.12.1
  • 2.13.6
  • 2.14.1
  • 2.15.2
  • 2.16.1
  • 2.17.1 (marked as "stable")
  • 2.18.1
    ... with 2.4, 2.5, 2.6, 2.7, 2.8, and 2.9 removed.

while NixOS 23.05 has:

  • 2.3.16
  • 2.10.3
  • 2.11.1
  • 2.12.1
  • 2.13.6 (marked as "stable")
  • 2.14.1
  • 2.15.2
  • 2.16.1
  • 2.17.1
    ... with 2.4, 2.5, 2.6, 2.7, 2.8, and 2.9 removed.

This is a lot of backports to make. Why do so many versions stick around for so long? How do we reduce the number of versions that are considered "stable", and get focused on moving forward with quality and robustness?

@roberth
Member

roberth commented Oct 19, 2023

Why do so many versions stick around for so long?

  • Lack of a stable interface to link against
  • Keeping options open to compensate for mediocre testing
    • sporadic efforts to improve, but nothing radical
  • Quick-but-manual bisecting by having a bunch of built versions in the cache
  • Testing the lib against multiple versions of Nix, notably including the Nixpkgs minimum acceptable Nix version for evaluation, currently 2.3.

@grahamc
Member

grahamc commented Oct 19, 2023

For each of those, it seems to me that depending on Nix as a flake would serve most of those needs better, including the cache -- since the cached versions are already prebuilt in c.n.o for releases. It doesn't seem to me that having them available in the tip of nixpkgs (master/stable) is necessary to tick those boxes?

@fricklerhandwerk
Contributor

I agree with @roberth: the reason Nixpkgs maintainers have to rely on multiple Nix releases is likely lacking quality assurance on our end.

While we are certainly responsible for fixing that, I suggest deferring the question of how to deal with that reality within Nixpkgs and NixOS to the Nixpkgs and NixOS maintainers, as it won't change quickly.

Improving the testing situation and stabilising programmable interfaces are two of our too many priorities. In my opinion these should be the only ones until further notice. Both are expensive and unappealing, therefore neither likely to get done by volunteers nor easy to get funding for.

@grahamc
Member

grahamc commented Oct 20, 2023 via email

@Ericson2314
Member

I asked during the meeting: is making backports even hard? I think it takes about 1 minute per backport. @edolstra seems to say that it's actually not the backporting that is hard, but the releasing afterwards. That makes a lot more sense, and IMO clarifies what we should do --- make releasing easier.


Backports never skip releases.
If a feature is backported to version `x.y`, it must also be available in version `x.(y+1)`.
This ensures that upgrading from an older version with backports is still safe and no backported functionality will go missing.
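
The rule quoted above can be checked mechanically. The following is a minimal sketch, not part of the Nix tooling: the function name `skipped_releases`, the way releases are represented (an ordered list of release names), and the example data are all assumptions made for illustration.

```python
def skipped_releases(releases, has_feature):
    """Return the releases that violate the "backports never skip releases" rule.

    releases:    release names ordered from oldest to newest (hypothetical input).
    has_feature: set of release names that contain the change, either
                 because it was backported or because it landed there first.
    """
    # Positions of every release that already carries the change.
    indices = [i for i, name in enumerate(releases) if name in has_feature]
    if not indices:
        return []
    # Every release newer than the oldest one carrying the change must
    # carry it too; anything missing from that tail was skipped.
    return [name for name in releases[min(indices):] if name not in has_feature]


# Illustrative release series only; not a statement about real branches.
releases = ["2.13", "2.14", "2.15", "2.16", "2.17", "2.18"]
print(skipped_releases(releases, {"2.13", "2.18"}))
print(skipped_releases(releases, {"2.16", "2.17", "2.18"}))
```

An empty result means that upgrading from any release carrying the change never loses it, which is the guarantee the quoted text describes.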
Contributor

If we end up keeping the current de facto situation that releases between the one used in NixOS and the latest won't have backported patches, we should at least state that explicitly here.

I still think it would be suboptimal, and things just working as one would expect would be way better UX than some random place in the manual saying "tough luck", but at least people will have a way to be sure about it.

@roberth
Member

roberth commented Oct 23, 2023

I've done two series of backports today. The circumstances were pretty bad.

  • One of the backport series was for updating the CI configuration, which hadn't been done consistently before (ie the impractical statement wasn't applied, making further backports impractical). This won't be a problem for subsequent backports, especially when the impractical statement is applied.

  • We're close to 2.13 EOL (because of upcoming NixOS EOL end of year). Number of active release branches is almost as high as it gets.

  • 2.13 should have been 2.15 instead anyway.

More observations:

  • A positive effect of backporting to multiple releases is that it splits up the rebasing work. Each release is a checkpoint of sorts that can be committed, tested on CI, merged independently.

  • If you apply multiple labels at once, you get O(n²) comments on the thread.

    bro

    ci: bump install-nix-action, don't fail fast #8534 (comment)

  • The backport action often fails because of release notes.

  • I've spent very little time doing actual rebase work. Most of it was spent on multitasking, adding labels, checking the PR list.

    • I might have tried too hard to avoid setting multiple labels at once.
    • I could have created more manual backports in one go, instead of waiting for CI. (Though that way I knew I wouldn't have to backtrack, so maybe this is an ok approach while multi-tasking (multi-tasking bad anyway?))
  • It seems that NixOS will release with 2.18; not 2.19. The 2.18 release would have been a good release to make a lot of low-risk "churn" happen.

Conclusion

  • branches are in a better state now
  • release notes tooling will help with backports
  • complete release automation will help with releases (Eelco)
  • we can expect the situation to improve
  • heavy churning season is in March to early April, Sep to early Oct (regardless of the impractical statement question)

@nixos-discourse

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/2023-10-20-nix-team-meeting-minutes-96/34557/1

@fricklerhandwerk
Contributor

Discussed in Nix team meeting:

  • We know we need to support the latest release
  • We know we need to support the version in Nixpkgs stable
  • How about the versions in between?
    • @regnat: Supporting them adds a bunch of overhead
    • @roberth: Because Nix has no stable C++ API, it is sometimes needed to stick to an older Nix version for projects that use it
  • Decision:
    • Support the range of versions from NixOS stable to the latest one
    • @roberth will act as a watchdog to make sure we're following that rule
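
To make the decision concrete, here is a small sketch that computes the supported range from a list of packaged versions. It uses the Nixpkgs master list quoted earlier in this thread and assumes that "NixOS stable" means the version marked "stable" in NixOS 23.05 (2.13.6) and that "latest" means 2.18.1. The helper names `parse` and `supported_range` are hypothetical.

```python
def parse(version):
    """Turn "2.13.6" into (2, 13, 6) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def supported_range(available, nixos_stable, latest):
    """Return the versions from NixOS stable up to the latest one, inclusive."""
    low, high = parse(nixos_stable), parse(latest)
    return [v for v in sorted(available, key=parse) if low <= parse(v) <= high]


# Versions listed for Nixpkgs master earlier in this conversation.
nixpkgs_master = [
    "2.3.16", "2.10.3", "2.11.1", "2.12.1", "2.13.6",
    "2.14.1", "2.15.2", "2.16.1", "2.17.1", "2.18.1",
]

# Assumption: NixOS stable is 23.05, whose "stable" Nix is 2.13.6.
print(supported_range(nixpkgs_master, "2.13.6", "2.18.1"))
```

Under those assumptions the range covers six versions, 2.13.6 through 2.18.1, so a backport made under this rule would target each of them. Comparing parsed tuples rather than strings matters here: as plain strings, "2.3.16" would sort after "2.10.3".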

@nixos-discourse

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/2023-11-10-nix-team-meeting-minutes-102/35379/1

@lf-
Member

lf- commented Nov 13, 2023

Why do so many versions stick around for so long?

I don't have a direct answer to that question, but I can point out a pattern of Nix versions being unstable on their initial release, which, when wearing my distro maintainer hat, I would say leads to a desire to be very cautious about updating to newer releases without also testing them pretty extensively. Given this pattern plus the lack of C++ API compatibility across releases, I think that the current NixOS policy of maintaining the set of Nix versions that was there at release until the release goes EOL is pretty reasonable, in view of keeping people's systems working on the versions they know work correctly.

The reason I've noticed these breakages, incidentally, is because I get pinged with most Nix releases updating in nixpkgs due to the Nix C++ API changing (for a pretty simple use case) and causing a nix-doc build failure and subsequent nix-doc release with another #ifdef. I don't blame the Nix maintainers here, changes are necessary for evolution, yet, a secondary actually-stable API cannot come soon enough.

To illustrate, here is a likely-incomplete list of Nix versions that have had broken initial releases due to inadequate testing upstream in about the past year, which led to being reverted downstream in nixpkgs:

When a release breaks in such a way that downstream catches it, it creates a need for emergency releases of dependent projects such as nixpkgs or the installer to revert the updated Nix version, and it erodes trust in new releases being good enough to ship quickly.

My suggestion to the Nix team is to work on making testing more extensive and to spend more compute on testing releases. It is worthwhile compiling a list of commonly used CI tools and adding them to tests. It is likely worthwhile writing some tooling to check code coverage on evaluating/building things from nixpkgs with hydra or similar, and hunt for tests that add coverage that is not hit in the Nix test suite.

Operationally, #7830 regressed since apparently the tests weren't run(?). I suspect this process issue has been fixed since.

I agree with the comments above about improving the release process rather than stopping doing backports.


To be clear, Nix is not uniquely bad on this. ghc regresses a lot of things pretty often, and has had also very many seriously bad releases due to inadequate testing (including not running test suites which exist but weren't wired into CI), at least a few of which wound up getting caught by Hydra building the known universe and finding that various packages' tests were failing. This is a hard domain, and I greatly appreciate the work done by the Nix team in improving processes over the past year.

However, without putting in the work to get a rate of regression closer to Rust's, for example, where there are .1 releases, but they mostly are not seriously regressing, I don't think it is reasonable to expect that downstream consumers will trust running the latest version of Nix as quickly as downstream consumers tend to adopt the latest stable Rust release.

@roberth
Member

roberth commented Dec 1, 2023

  • heavy churning season is in March to early April, Sep to early Oct (regardless of the impractical statement question)

Actually shortly after the NixOS release is another opportunity, because we get to backport the churn (if it's low risk!) at the lower cost of one or two releases worth of backports.

@fricklerhandwerk
Contributor

Closing as we now implement the procedure as described.
