The debuginfo Rube Goldberg machine #3188

pmatilai · 2024-06-28T10:12:52Z

pmatilai
Jun 28, 2024
Maintainer

Working on an old piece of code like rpm is much like city infrastructure renewals: you try to expect the unexpected and plan accordingly, but every now and then you'll still get surprised when you break the asphalt: "what are all these pipes, they don't exist in any drawing?". And consequently, work gets delayed to sort it all out. Several times.

Perhaps the main headline feature in rpm 4.20 is the declarative buildsystem support in spec files. This was a feature I first dreamed up around 2012, which alone suggests there was quite a bit of plumbing to sort out before it could happen. One of the more critical supporting features was the ability to append and prepend to existing spec sections. To support that, the previously very special %prep section with its built-in %setup and %patch pseudo-macros needed to be turned into a normal scriptlet; to do that, the pseudo-macros needed to be turned into real macros; and to do that, the macro engine needed a rather thorough rework, itself spread over several years and countless changes like #1406 and #1434. The first concrete step towards declarative builds was the introduction of %autosetup in 2012, complemented with %patchlist in 2019. And so on; those are just the tip of the iceberg. It was a lot of work spread over more than a decade, but these factors were reasonably well known ahead of time. But this is all just backdrop to the thing that did catch us by surprise, right on the finishing line. If you do things with rpm, you have probably encountered it a few times: debuginfo packages.

That story begins somewhere around 2002. I wasn't deeply involved with rpm at that time, so the early part is based on info gathered and deduced from commits to rpm and redhat-rpm-config and various fragments I've heard or read over the years, and so may contain inaccuracies. But AIUI, the toolchain people at Red Hat were tasked with making debugging of released binary builds meaningful. I don't know whether the task was specifically to achieve this without major changes to rpm itself, but that's how it took place: it was practically all implemented with macro voodoo plus a helper script and a binary, none of which needed to be inside rpm. Much of it ended up in the rpm repository sooner or later, but it didn't need to be there. It's no mean feat, really, but it also did require some quite, uh, creative solutions.

Of course, in the intervening 22 years a lot happened. In particular, around 2017 Mark Wielaard practically rewrote the underlying debugedit tool and introduced some in-rpm code for better integration, and Michael Schroeder and Richard Biener added support for debuginfo sub-packages, which also needed in-rpm code. And then in 2021 debugedit and the helper script were split into an external project because people outside the rpm ecosystem got interested in them. This was a great relief to us rpm maintainers: debugedit deals with deep ELF format internals, and we never really knew what to do with it anyhow. In all that flux, the one thing that didn't change is the one thing that was almost certainly intended as a temporary hack only: the way debuginfo packages are actually enabled. It also never entered the rpm codebase at all. And that's what we ran into head-on, 22 years later, basically on the eve of the 4.20 alpha release.

In broad strokes, debuginfo packages live as template macros which are used to generate the spec preamble for them, and then a script invoked from the %install post template runs to edit and collect the files. This is all done quite neatly within the generic spec scriptlet template infrastructure, except for one thing: how do you inject something into nearly every single spec preamble without actually modifying them? I believe the brilliant-awful macro hack was originally by Elliot Lee in redhat-rpm-config, for accomplishing something else. The people adding debuginfo support saw the trick and ran away with it. Since 2002, redhat-rpm-config has contained this macro definition:

%install %{?_enable_debug_packages:%{?buildsubdir:%{debug_package}}}\
%%install\
%{nil}

This is the entry to our little Rube Goldberg machine. There's an incredible amount of powerful magic embedded in those three lines.

You need to be familiar with the rpm spec syntax and macros to properly follow this, but %install marks the beginning of the shell scriptlet where the packager tells rpm which content to put in the resulting binary package. So, %install is just a section opener string, hard-coded inside the spec parser for that purpose, and doesn't "do" anything by itself. However, the above macro turns this innocent section marker into something else entirely. In English: if the %_enable_debug_packages macro is defined, and if %setup was used in the spec %prep section (%buildsubdir is a side-effect of that), expand the contents of the %{debug_package} macro here, and then add back the %install section marker as if nothing happened.
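For readers less fluent in macro syntax, here is the same construct with the pieces spelled out, as a sketch of a macros file fragment. The %demo_* names are hypothetical stand-ins invented for this illustration; none of them exist in rpm or redhat-rpm-config.

```spec
# Hypothetical names, for illustration only:
#   demo_enabled  stands in for _enable_debug_packages
#   demo_subdir   stands in for buildsubdir
#   demo_payload  stands in for debug_package
#
# %{?name:text} expands to "text" if the macro "name" is defined,
# and to nothing otherwise. Nesting two of them means both macros
# must be defined before the payload is expanded.
#
# %% yields a literal percent sign, so the second line of the body
# puts the plain section marker back into the output.
# %{nil} expands to nothing and merely terminates the body.
%demo_section %{?demo_enabled:%{?demo_subdir:%{demo_payload}}}\
%%demo_section\
%{nil}
```

Whichever way the conditionals go, the literal section marker is emitted at the end, which is why a spec without debuginfo enabled never notices the override.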

%debug_package is defined something like this (this and the following snippets are trimmed for brevity):

%debug_package \
%ifnarch noarch\
%global __debug_package 1\
%_debuginfo_template\
%endif\
%{nil}

%_debuginfo_template is the spec preamble definition of a debuginfo package, something like this:

%_debuginfo_template \
%package debuginfo\
Summary: Debug information for package %{name}\
%description debuginfo\
This package provides debug information for package %{name}.\
%files debuginfo -f debugfiles.list\
%{nil}

So the %install macro override emits all that %package definition into the spec preamble section inside an %ifnarch noarch conditional to prevent it from firing on arch-independent packages (which aren't expected to contain ELF files), and then emits the original %install to let you proceed with whatever it is your package does in there. It's really quite clever, but at the same time, awful. It gets weirder from there though. Notice how there's a %global __debug_package 1 inside the %ifnarch block? You'd think that it doesn't get defined on noarch packages, but it does. The macro engine doesn't know anything about %if and the like: to it, %ifnarch is just something that looks like a macro but is undefined, so it falls through untouched while any real macros around it, %global included, are expanded. The multiline result then gets passed to the spec parser, which processes the %ifs and the other content.
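As a rough illustration of those two passes, based only on the trimmed snippets above: for a hypothetical package named foo, the text that the spec parser receives in place of the original %install line would look something like this, whitespace aside.

```spec
%ifnarch noarch
%package debuginfo
Summary: Debug information for package foo
%description debuginfo
This package provides debug information for package foo.
%files debuginfo -f debugfiles.list
%endif
%install
```

The %global line is nowhere to be seen because the macro engine has already consumed it, defining %__debug_package as a side effect regardless of the architecture. Only the %ifnarch and %endif lines are left for the spec parser to evaluate, so only the package definition between them is actually conditional.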

At the other end of %install, the rest of the magic is embedded inside the %__spec_install_post template macro, something like the following, which gets appended to the end of %install behind the scenes during the actual build of a package:

%__spec_install_post\
%{?__debug_package:%{__debug_install_post}}\
%{__arch_install_post}\
%{__os_install_post}\
%{nil}

Note how it tests for the %__debug_package definition to avoid triggering on noarch packages. But we just concluded above that it always gets defined! Yet somehow debuginfo packages are not generated for noarch packages, so it must work somehow? Well, it doesn't. The %__spec_install_post section actually fires for noarch packages, but it silently falls through as there's nothing for it to do on a normal noarch package. It can, however, leave behind tell-tale debugfiles.list and similar files in the build directory if you go looking. So how does it not fail with errors then? The catch is that the %ifnarch noarch block in the %debug_package macro does work for the spec preamble part, so the debuginfo package is never created, rpm doesn't go looking for it, and the *.list files end up just being some junk in the directory that rpm doesn't care about.

That's why I call it a Rube Goldberg machine: the complications may not be intentional, but it sure is complicated and precarious.

Now, what does this all have to do with our declarative buildsystems? Well, the related append and prepend options mean %install can occur multiple times in the spec with -a or -p options, and you can probably see how that wouldn't go too well with this. One may think, couldn't you just turn the %install macro override into a parametric macro which only emits the debug stuff when no arguments are passed, and look away for another twenty years? Well, maybe, but the madness has to stop somewhere.
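To make the clash concrete, here is a minimal sketch of a spec fragment using the new features. The choice of build system and the cleanup command are hypothetical, picked only for illustration, and the rest of the preamble is omitted.

```spec
# Fragment only: Name, Version and the rest of the preamble omitted.
BuildSystem: autotools

# Append an extra step to the install scriptlet that the
# build system generates on its own.
%install -a
rm -f %{buildroot}%{_libdir}/*.la
```

With the old override in place, that section line would pass through the very same macro, which was written on the assumption that %install appears exactly once and without options.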

The real rub was that because this %install override only exists in distros and not in rpm upstream, we only really ran into it when it was far too late in the release process to start reworking something like that. Technically, I knew it existed but had blissfully forgotten, and certainly didn't remember it or realize the implications when adding the append/prepend modes. In any case, this blocked the use of our headline feature, so we scrambled for a few weeks to get the debuginfo enablement logic properly and fully upstreamed. The existing frail machinery, together with tens of thousands of packages built on top of and sometimes around it, each in their own sometimes peculiar ways, was always a terrifying thing to modify, and doubly so under time pressure.

The end result in 4.20 utilizes some of the new dynamic spec generation features Florian Festi has been working on. It was no walk in the park though: it took us several weeks of experimenting over multiple pull requests to get it to the point where it currently is. Among other fun, there was a bug which caused %_target_cpu and various other macros and variables to disagree with the rest of the spec during scriptlets and dynamically generated spec content, on specs with BuildArch, when an explicit --target was not passed, causing surprises such as getting debuginfo packages when you don't expect them. And when I finally made rpm automatically reload the right platform configuration if --target is not specified to address that, we discovered that mock always passes something like --target $(uname -m) to rpmbuild, even for noarch packages. Except inside koji, where it always passed --target noarch for them. And so on. We also intended to enable debuginfo packages for packages without %setup, but that turned out to cause too much breakage. The nice thing about the new setup is that debuginfo packages are only ever emitted during an actual build, so they don't pollute spec queries with speculative, incomplete stuff.

What is present in 4.20 resembles the old madness way too much for my liking, but various details of the old implementation have leaked to thousands of specs in ways that make changing them impossible or nearly so. At least all the machinery is now upstream under our eyes where we can hopefully simplify and streamline it gradually over time.

To those who made it this far: this is hopefully the start of an ongoing series of blog posts about rpm development, "tales from the trenches" and whatnot.
