Fix crash on continue in missing package dialog #5937

M4rtinK · 2024-10-15T00:27:45Z

Due to the way exceptions propagate from the Anaconda installation tasks, a non-critical error exception ended up interrupting the loop that iterates over the important payload installation tasks.

Because the exception was not handled "deep enough" but instead "bubbled up" to a top level error handler, the loop apparently got interrupted & the remaining tasks were skipped.

This resulted in an unrelated crash ("kernel version list no available"), because all the important installation tasks were skipped, packages were not installed and installation related data was not populated.

The end result was that a non-critical error (such as a missing package) would trigger a dialog asking the user to quit or continue - but clicking "continue" would result in a weird crash.

So move the error handler check closer to the task execution, to prevent the loop from being interrupted. That way the loop will resume its iteration if "continue" is clicked in the UI.

Also convert the non-critical error to a fatal one if it gets raised (that is, if the user decides *not* to continue after the non-critical error). This is necessary, as otherwise the top level error handler would get triggered, asking the user again to quit or continue.

NOTE: Longer term we really should clean this up & have all installation tasks gathered, ordered and executed from a single place. Then all the error handling could be in a single place, making things much simpler.

Resolves: INSTALLER-4045
Related: rhbz#2238045
Related: RHEL-57699
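
To illustrate the change described above, here is a minimal sketch of the two loop shapes. It is not the actual Anaconda code: `NonCriticalError`, `FatalError`, `run_tasks_old`, `run_tasks_new`, `ask_continue` and the task names are all hypothetical stand-ins, and the real task and error handler classes look different.

```python
class NonCriticalError(Exception):
    """Hypothetical stand-in for a recoverable problem (e.g. a missing package)."""


class FatalError(Exception):
    """Hypothetical stand-in for an error that must abort the installation."""


def run_tasks_old(tasks):
    """Old shape: nothing is handled inside the loop."""
    completed = []
    for name, task in tasks:
        # Any exception raised here leaves the loop; a handler further up
        # can still ask the user, but it can no longer resume the iteration.
        task()
        completed.append(name)
    return completed


def run_tasks_new(tasks, ask_continue):
    """New shape: the error handler is consulted right next to the task call.

    ask_continue is a hypothetical callback that returns True if the user
    clicked "continue" and False if the user chose to quit.
    """
    completed = []
    for name, task in tasks:
        try:
            task()
        except NonCriticalError as e:
            if ask_continue(e):
                # The user wants to go on: skip this task, keep iterating.
                continue
            # The user chose to quit: escalate to a fatal error so that an
            # outer handler does not ask the same question a second time.
            raise FatalError(str(e)) from e
        completed.append(name)
    return completed


def missing_package():
    raise NonCriticalError("no-such-package is not available")


def noop():
    pass


# Hypothetical task list, only meant to mirror the scenario in this PR.
TASKS = [
    ("resolve packages", missing_package),
    ("install packages", noop),
    ("collect kernel versions", noop),
]

if __name__ == "__main__":
    print(run_tasks_new(TASKS, ask_continue=lambda e: True))
```

With the old shape the first failing task ends the whole loop, so the later tasks never run no matter what the user answers. With the new shape, answering "continue" lets the remaining tasks run, and answering "quit" surfaces as a single fatal error.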

M4rtinK · 2024-10-15T00:35:23Z

It took quite a bit of time to track down what the heck was actually happening, why & what to do about it - ideally without having to change some of the tricky Anaconda innards :P Thankfully I seem to have arrived at a fairly minimal solution that seems to fix the issue without visibly breaking something else, as far as I can tell.

This is what it looks like - first the user will see a continue yes/no dialog - same as before, nothing new here:

Clicking "Yes" should no longer crash. Clicking "No" will now show this:

While this is a side effect of making sure the user is not asked again to continue after pressing "No" (as the top level error handler is still active) it actually IMHO looks a lot better than the previous messy exception dialog that was shown after pressing the "No" button:

Feedback welcome - if you have an idea to do this better or managed to hit some edge case this has introduced, let me know! :)

M4rtinK · 2024-10-15T00:37:47Z

NOTE, to easily reproduce the original issue, just create a kickstart like this:

rootpw anaconda
keyboard us
lang en_US.UTF-8
timezone America/New_York

%packages
no-such-package
%end

zerombr
clearpart --all --initlabel
autopart

And run the installation with it (for example by injecting it into an updated boot.iso via the local development scripts).

M4rtinK · 2024-10-15T00:42:35Z

I'll open a RHEL 10 version of this PR once we are sure this is what we want to do to fix this issue (which is IMHO quite likely :) ).

M4rtinK · 2024-10-15T10:42:09Z

/kickstart-test --skip-testtypes whatever

M4rtinK · 2024-10-15T12:20:01Z

/build-image --boot.iso

github-actions · 2024-10-15T12:32:26Z

Images built based on commit 73b412f:

boot.iso: success

Download the images from the bottom of the job status page.

rvykydal

Looks good to me.
I think the dialog shown on "No" is fine, especially given how simple the patch is. (And how hard it was to find :) )

rvykydal · 2024-10-16T09:06:53Z

/kickstart-test --skip-testtypes whatever

The failed tests all seem to be known flakes. When I run scripts/classify-failures (its latest version from rhinstaller/kickstart-tests#1315) on a folder containing only the failures, I get:

[1261] https://github.com/rhinstaller/kickstart-tests/issues/1261
#: 1
./kstest-bindtomac-onboot-activate-httpks.2024_10_15-10_51_48.o69heokj/virt-install.log
11:04:11,746 WARNING anaconda:anaconda: display: Wayland startup failed: systemd exited with status 1
-------------------------------------------------------------------------------
[1296] https://github.com/rhinstaller/kickstart-tests/issues/1296
#: 1
./kstest-initial-setup-gui.2024_10_15-11_17_59.mbzbb1ch/virt-install.log
11:30:57,815 ERR anaconda:Exception ignored in atexit callback <function shutdown at 0x7f41b34c3740>:
-------------------------------------------------------------------------------
[1312] https://github.com/rhinstaller/kickstart-tests/issues/1312
#: 1
./kstest-bond-vlan-pre.2024_10_15-12_33_37.5jsg0i5g/virt-install.log
12:36:02,462 WARNING org.fedoraproject.Anaconda.Modules.Storage:gi.overrides.BlockDev.MpathError: Process reported exit code 1: Job for multipathd.service failed.

I.e. known issues; I guess the ftp one is a flake as well.

I think we need to support the --retry option in the workflow (or run it by default). ... #5941

jstodola

It looks good to me from the user point of view.

M4rtinK · 2024-10-16T11:43:40Z

/kickstart-test --testtype smoke

opoplawski · 2024-10-21T03:38:10Z

Thank you very much for fixing this.

M4rtinK added the port to RHEL10 and f42 labels Oct 15, 2024

M4rtinK temporarily deployed to gh-cockpituous October 15, 2024 00:27 — with GitHub Actions Inactive

M4rtinK changed the title ~~Fix crash on continue after a missing package non-critical error~~ Fix crash on continue in missing package dialog Oct 15, 2024

rvykydal approved these changes Oct 16, 2024

jstodola approved these changes Oct 16, 2024

M4rtinK merged commit fea233c into rhinstaller:master Oct 16, 2024
22 of 24 checks passed

M4rtinK mentioned this pull request Oct 16, 2024

Fix crash on continue after a missing package non-critical error #5942

Merged

M4rtinK removed the port to RHEL10 label Oct 16, 2024
