-
Notifications
You must be signed in to change notification settings - Fork 146
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix the fallback mechanism when menu entries fail to boot #195
Open
chewi
wants to merge
281
commits into
rhboot:fedora-42
Choose a base branch
from
chewi:execute-return-code
base: fedora-42
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This patch adds grub-get-kernel-settings, which reads the system kernel installation configuration from /etc/sysconfig/kernel, and outputs ${GRUB_...} variables suitable for evaluation by grub-mkconfig. Those variables are then used by 10_linux to choose whether or not to create debug stanzas. Resolves: rhbz#1226325 [rharwood: migrate man page to h2m]
This needs to be hooked up to --program-transform=, but I haven't had time. Signed-off-by: Peter Jones <[email protected]>
Since our bugs tell us that the xnu boot entries really just don't work most of the time, and they create piles of extra boot entries, because they can't quite figure out 32-vs-64 and other stuff like that. It's rediculous, and we should just boot their bootloader through the chainloader instead. So this patch does that. Resolves: rhbz#893179 Signed-off-by: Peter Jones <[email protected]>
This patch adds the ability to specify a different root on a btrfs filesystem too boot from other than the default one. btrfs-list-snapshots <dev> will list the subvolumes available on the filesystem. set btrfs_subvol=<path> and set btrfs_subvolid=<subvolid> will specify which subvolume to use and any pathnames provided with either of those variables set will start using that root. If the subvolume or subvolume id doesn't exist, then an error case will result. It is possible to boot into a separate GRUB instance by exporting the variable and loading the config file from the subvolume. Signed-off-by: Jeff Mahoney <[email protected]>
current gcc complains: grub-core/fs/btrfs.c: In function ‘grub_cmd_btrfs_info’: grub-core/fs/btrfs.c:2745:7: error: the comparison will always evaluate as ‘true’ for the address of ‘label’ will never be NULL [-Werror=address] 2745 | if (data->sblock.label) | ^~~~ grub-core/fs/btrfs.c:92:8: note: ‘label’ declared here 92 | char label[0x100]; | ^~~~~ cc1: all warnings being treated as errors Obviously this check should be on the first data byte instead of the symbol itself. Signed-off-by: Peter Jones <[email protected]>
We should export btrfs_subvol and btrfs_subvolid to have both visible to subsidiary configuration files loaded using configfile. Signed-off-by: Michael Chang <[email protected]>
Signed-off-by: Michael Chang <[email protected]> Signed-off-by: Robbie Harwood <[email protected]>
Signed-off-by: Michael Chang <[email protected]> Signed-off-by: Robbie Harwood <[email protected]>
Signed-off-by: Michael Chang <[email protected]>
Signed-off-by: Michael Chang <[email protected]> Signed-off-by: Robbie Harwood <[email protected]>
Signed-off-by: Michael Chang <[email protected]> Signed-off-by: Robbie Harwood <[email protected]>
Signed-off-by: Peter Jones <[email protected]>
This uses grub_efi_allocate_pool(), grub_efi_free_pool(), and grub_efi_free_pages() instead of open-coded efi_call_N() calls, so we get more reasonable type checking. Signed-off-by: Peter Jones <[email protected]>
This avoids syntax checkers getting confused about if it's llx or lx. Signed-off-by: Peter Jones <[email protected]>
This should never be trying this, and since we've consolidated the grubenv to always be on /boot/efi/EFI/fedora/, this code causes it to always make the wrong decision. Resolves: rhbz#1484474 Signed-off-by: Peter Jones <[email protected]>
Signed-off-by: Peter Jones <[email protected]>
GRUB now has BootLoaderSpec support, the user can choose to use this by setting GRUB_ENABLE_BLSCFG to true in /etc/default/grub. On this setup, the boot menu entries are not added to the grub.cfg, instead BLS config files are parsed by blscfg command and the entries created dynamically. A 10_linux_bls grub.d snippet to generate menu entries from BLS files is also added that can be used on platforms where the bootloader doesn't have BLS support and only can parse a normal grub configuration file. Portions of the 10_linux_bls were taken from the ostree-grub-generator script that's included in the OSTree project. Fixes to support multi-devices and generate a BLS section even if no kernels are found in the boot directory were proposed by Yclept Nemo and Tom Gundersen respectively. Signed-off-by: Peter Jones <[email protected]> [javierm: remove outdated URL for BLS document] Signed-off-by: Javier Martinez Canillas <[email protected]> [[email protected]: skip machine ID check when updating entries] Signed-off-by: Ian Wienand <[email protected]> [rharwood: use sort(1), commit message composits, drop man pages] Signed-off-by: Robbie Harwood <[email protected]>
The emu platform doesn't have a grub_backtrace() implementation, so this causes a build error. Don't attempt to call this when building grub-emu. Signed-off-by: Javier Martinez Canillas <[email protected]>
Signed-off-by: Peter Jones <[email protected]> Signed-off-by: Javier Martinez Canillas <[email protected]> [jhlavac: Use ${etcdefaultgrub} instead of /etc/default/grub] Signed-off-by: Jan Hlavac <[email protected]> [rharwood: skip on ostree installations, migrate man to h2m] Signed-off-by: Robbie Harwood <[email protected]>
Signed-off-by: Peter Jones <[email protected]>
Signed-off-by: Peter Jones <[email protected]>
Signed-off-by: Peter Jones <[email protected]>
Commit 318ee04 ("make better backtraces") reworked the backtrace logic but the changes lead to the following build error on the grub-emu platform: grub_emu_lite-symlist.o:(.data+0xf08): undefined reference to `start' collect2: error: ld returned 1 exit status make[3]: *** [Makefile:25959: grub-emu-lite] Error 1 make[3]: *** Waiting for unfinished jobs.... cat kernel_syms.input | grep -v '^#' | sed -n \ -e '/EXPORT_FUNC *([a-zA-Z0-9_]*)/{s/.*EXPORT_FUNC *(\([a-zA-Z0-9_]*\)).*/defined kernel '""'\1/;p;}' \ -e '/EXPORT_VAR *([a-zA-Z0-9_]*)/{s/.*EXPORT_VAR *(\([a-zA-Z0-9_]*\)).*/defined kernel '""'\1/;p;}' \ | sort -u >kernel_syms.lst The problem is that start and _start symbols are exported unconditionally, but these aren't defined for grub-emu since is an emultaed platform so it doesn't have a startup logic. Don't attempt to export those for grub-emu. Signed-off-by: Javier Martinez Canillas <[email protected]>
On mustang, our memory map looks like: Type Physical start - end #Pages Size Attributes reserved 0000004000000000-00000040001fffff 00000200 2MiB UC WC WT WB conv-mem 0000004000200000-0000004393ffffff 00393e00 14654MiB UC WC WT WB ldr-code 0000004394000000-00000043f7ffffff 00064000 1600MiB UC WC WT WB BS-data 00000043f8000000-00000043f801ffff 00000020 128KiB UC WC WT WB conv-mem 00000043f8020000-00000043fa15bfff 0000213c 34032KiB UC WC WT WB ldr-code 00000043fa15c000-00000043fa2a1fff 00000146 1304KiB UC WC WT WB ldr-data 00000043fa2a2000-00000043fa3e8fff 00000147 1308KiB UC WC WT WB conv-mem 00000043fa3e9000-00000043fa3e9fff 00000001 4KiB UC WC WT WB ldr-data 00000043fa3ea000-00000043fa3eafff 00000001 4KiB UC WC WT WB ldr-code 00000043fa3eb000-00000043fa4affff 000000c5 788KiB UC WC WT WB BS-code 00000043fa4b0000-00000043fa59ffff 000000f0 960KiB UC WC WT WB RT-code 00000043fa5a0000-00000043fa5affff 00000010 64KiB RT UC WC WT WB RT-data 00000043fa5b0000-00000043fa5bffff 00000010 64KiB RT UC WC WT WB RT-code 00000043fa5c0000-00000043fa5cffff 00000010 64KiB RT UC WC WT WB ldr-data 00000043fa5d0000-00000043fa5d0fff 00000001 4KiB UC WC WT WB BS-code 00000043fa5d1000-00000043fa5ddfff 0000000d 52KiB UC WC WT WB reserved 00000043fa5de000-00000043fa60ffff 00000032 200KiB UC WC WT WB ACPI-rec 00000043fa610000-00000043fa6affff 000000a0 640KiB UC WC WT WB ACPI-nvs 00000043fa6b0000-00000043fa6bffff 00000010 64KiB UC WC WT WB ACPI-rec 00000043fa6c0000-00000043fa70ffff 00000050 320KiB UC WC WT WB RT-code 00000043fa710000-00000043fa72ffff 00000020 128KiB RT UC WC WT WB RT-data 00000043fa730000-00000043fa78ffff 00000060 384KiB RT UC WC WT WB RT-code 00000043fa790000-00000043fa79ffff 00000010 64KiB RT UC WC WT WB RT-data 00000043fa7a0000-00000043fa99ffff 00000200 2MiB RT UC WC WT WB RT-code 00000043fa9a0000-00000043fa9affff 00000010 64KiB RT UC WC WT WB RT-data 00000043fa9b0000-00000043fa9cffff 00000020 128KiB RT UC WC WT WB BS-code 00000043fa9d0000-00000043fa9d9fff 0000000a 40KiB UC WC WT WB reserved 00000043fa9da000-00000043fa9dbfff 00000002 8KiB UC WC WT WB conv-mem 00000043fa9dc000-00000043fc29dfff 000018c2 25352KiB UC WC WT WB BS-data 00000043fc29e000-00000043fc78afff 000004ed 5044KiB UC WC WT WB conv-mem 00000043fc78b000-00000043fca01fff 00000277 2524KiB UC WC WT WB BS-data 00000043fca02000-00000043fcea3fff 000004a2 4744KiB UC WC WT WB conv-mem 00000043fcea4000-00000043fcea4fff 00000001 4KiB UC WC WT WB BS-data 00000043fcea5000-00000043fd192fff 000002ee 3000KiB UC WC WT WB conv-mem 00000043fd193000-00000043fd2b0fff 0000011e 1144KiB UC WC WT WB BS-data 00000043fd2b1000-00000043ff80ffff 0000255f 38268KiB UC WC WT WB BS-code 00000043ff810000-00000043ff99ffff 00000190 1600KiB UC WC WT WB RT-code 00000043ff9a0000-00000043ff9affff 00000010 64KiB RT UC WC WT WB conv-mem 00000043ff9b0000-00000043ff9bffff 00000010 64KiB UC WC WT WB RT-data 00000043ff9c0000-00000043ff9effff 00000030 192KiB RT UC WC WT WB conv-mem 00000043ff9f0000-00000043ffa05fff 00000016 88KiB UC WC WT WB BS-data 00000043ffa06000-00000043ffffffff 000005fa 6120KiB UC WC WT WB MMIO 0000000010510000-0000000010510fff 00000001 4KiB RT MMIO 0000000010548000-0000000010549fff 00000002 8KiB RT MMIO 0000000017000000-0000000017001fff 00000002 8KiB RT MMIO 000000001c025000-000000001c025fff 00000001 4KiB RT This patch adds a requirement when we're trying to find the base of ram, that the memory we choose is actually /allocatable/ conventional memory, not merely write-combining. On this machine that means we wind up with an allocation around 0x4392XXXXXX, which is a reasonable address. This also changes grub_efi_allocate_pages_real() so that if 0 is allocated, it tries to allocate again starting with the same max address it did the first time, rather than interposing GRUB_EFI_MAX_USABLE_ADDRESS there, so that any per-platform constraints on its given address are maintained. Signed-off-by: Peter Jones <[email protected]>
We are currently allocating just enough memory for the file size, which means that the kernel BSS is in limbo (and not even zeroed). We are also not honoring the alignment specified in the image PE header. This makes us use the PE optional header in which the kernel puts the actual size it needs, including BSS, and make sure we clear it, and honors the specified alignment for the image. Signed-off-by: Benjamin Herrenschmidt <[email protected]> [pjones: arm: check for the PE magic for the compiled arch] Signed-off-by: Peter Jones <[email protected]> Signed-off-by: Robbie Harwood <[email protected]>
In order to properly validate a loaded kernel's support for being loaded without a writable stack or executable, we need to be able to properly parse arbitrary PE headers. Currently, pe32.h is written in such a way that the MS-DOS header that tells us where to find the PE header in the binary can't be accessed. Further, for some reason it calls the DOS MZ magic "GRUB_PE32_MAGIC". This patch adds the structure for the DOS header, renames the DOS magic define, and adds defines for the actual PE magic. Signed-off-by: Peter Jones <[email protected]>
The aarch64 loader doesn't use efi bootservices, and therefor it has a very minimal loader which makes a lot of assumptions about the kernel layout. With the ZBOOT changes, the layout has changed a bit and we not should really be parsing the PE sections to determine how much data to copy, otherwise the BSS won't be setup properly. This code still makes a lot of assumptions about the the kernel layout, so its far from ideal, but it works. Resolves: rhbz#2125069 Signed-off-by: Jeremy Linton <[email protected]>
Currently, the kernel pages are allocated with type EFI_LOADER_DATA. While the vast majority of systems will happily execute code from those pages (i.e. don't care about memory protection), the Microsoft Surface Pro X stalls, as this memory is not designated as "executable". Therefore, allocate the kernel pages as EFI_LOADER_CODE to request memory that is actually executable. Signed-off-by: Maximilian Luz <[email protected]>
Commit de735a4 added a grub_env_set where the prefix contains the arch name in the pathname. This create issues when trying to load modules using this prefix as the pathname contains a "doubled" arch name in it (ie .../powerpc-ieee1275/powerpc-ieee1275/). Signed-off-by: Nicolas Frayer <[email protected]>
The directory extent list does not have to be a continuous list of data blocks. When GRUB tries to read a non-existant member of the list, grub_xfs_read_file() will return a block of zero'ed memory. Checking for a zero'ed magic number is sufficient to skip this non-existant data block. Prior to commit 07318ee (fs/xfs: Fix XFS directory extent parsing) this was handled as a subtle side effect of reading the (non-existant) tail data structure. Since the block was zero'ed the computation of the number of directory entries in the block would return 0 as well. Fixes: 07318ee (fs/xfs: Fix XFS directory extent parsing) Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=2254370 Signed-off-by: Jon DeVree <[email protected]> Reviewed-By: Vladimir Serbinenko <[email protected]> Reviewed-by: Daniel Kiper <[email protected]>
Add -Wno-incompatible-pointer-types to ignore warnings for incompatible types Signed-off-by: Nicolas Frayer <[email protected]>
The initial fix implemented a new flag that forces the grub cfg stub to be located on the same disk as grub. This created several issues such as RAID machines not being able to boot as their partition names under grub were different from the partition where grub is located. It also simply means that any machines with the /boot partition located on a disk other than the one containing grub won't boot. This commit denies booting if the grub cfg stub is located on a USB drive with a duplicated UUID (UUID being the same as the partition containing the actual grub cfg stub) Signed-off-by: Nicolas Frayer <[email protected]>
Signed-off-by: Leo Sandoval <[email protected]>
For NX, our kernel loaders need to set write and execute page permissions on allocated pages and the stack. This patch adds those calls. Signed-off-by: Peter Jones <[email protected]> [rharwood: fix stack_attrs undefined, fix aarch64 callsites] Signed-off-by: Robbie Harwood <[email protected]>
These can be used to register a different implementation later, for example, when shim provides a protocol with those functions. Signed-off-by: Mate Kukri <[email protected]>
…quired. Signed-off-by: Mate Kukri <[email protected]>
For those manual assembly files created where no '.note.GNU-stack' section is explicitly added, linker defaults it as executable and this is the reason that RHEL CI rpminspect & annocheck tests are failing. The proposed change sets the corresponding GNU-stack sections otherwise CI detects the following security vulnerability $ annocheck annocheck --ignore-unknown --verbose --profile=el9 *.rpm 2>&1 | grep FAIL | grep stack (standard input):(standard input):Hardened: ./usr/lib/grub/x86_64-efi/kernel.exec: FAIL: gnu-stack test because .note.GNU-stack section has execute permission (standard input):(standard input):Hardened: ./usr/lib/grub/x86_64-efi/kernel.img: FAIL: gnu-stack test because .note.GNU-stack section has execute permission Signed-off-by: Leo Sandoval <[email protected]>
Stricker permissions are required on the grub.cfg file, resulting in at most 0600 owner's file permissions. This resolves conflicting requirement permissions on grub2-pc package's grub2.cfg file. Signed-off-by: Leo Sandoval <[email protected]>
This patch adds support for Radix, Xive and Radix_gtse in Options vector5 which is required for KVM LPARs. KVM LPARs ONLY support Radix and not the Hash. Not enabling Radix on any PowerVM KVM LPARs will result in boot failure. Signed-off-by: Avnish Chouhan <[email protected]> Reviewed-by: Daniel Kiper <[email protected]>
/boot/efi/EFI/$os_name/grub.cfg contains a grub cfg stub that should not be overwritten by grub2-mkconfig. Ensure that we prevent this from happening. Signed-off-by: Marta Lewandowska <[email protected]> Signed-off-by: Nicolas Frayer <[email protected]>
Remove mountpoint when checking whether or not the grub cfg stub exists and add -s to the test. This should cover scenarios where the ESP doesn't have a seperate partition but still uses a grub cfg stub Signed-off-by: Nicolas Frayer <[email protected]>
Signed-off-by: Peter Jones <[email protected]> Signed-off-by: Leo Sandoval <[email protected]>
Signed-off-by: Leo Sandoval <[email protected]>
These two commands may fail interrumping the normal boot process, so placing these inside test expressions is a safer approach. Resolves: #2305291 Suggested-by: Kan-Ru Chen: <[email protected]> Signed-off-by: Leo Sandoval <[email protected]>
Call grub_efi_check_nx_required() passing it nx_required to assign the correct value Signed-off-by: Nicolas Frayer <[email protected]>
… boot Commit 972aa68 ("Make a "gdb" dprintf that tells us load addresses.") added some debug prints to help with running gdb against grub. Besides adding a new grub_dl_print_gdb_info () function which uses `grub_qdprintf ("gdb", ...);` it also adds a new grub_efi_print_gdb_info () call to grub_efi_init (). grub_efi_print_gdb_info () is intended for the gdbinfo command and uses a non debug grub_printf () call leading to grub now always printing this message during boot breaking flicker-free boot. Add a new "debug" parameter to grub_efi_print_gdb_info () and use `grub_qdprintf ("gdb", ...);` when this is set to silence the printing done from grub_efi_init () when debugging is not enabled. Fixes: 972aa68 ("Make a "gdb" dprintf that tells us load addresses.") Signed-off-by: Hans de Goede <[email protected]>
The calculation of the size of the table was incorrect (copy/pasta from grub_acpi_rsdt_find_table() I assume...). The entries are 64-bit long. This causes us to access beyond the end of the table which is causing crashes during boot on some systems. Typically this is causing a crash on VMWare when using UEFI and enabling serial autodetection, as grub_acpi_find_table (GRUB_ACPI_SPCR_SIGNATURE); Will goes past the end of the table (the SPCR table doesn't exits) Signed-off-by: Benjamin Herrenschmidt <[email protected]>
When querying about a partition UUID, we're not checking for get_device_uuid() return value, which can possibly result in dereferencing a NULL pointer. Signed-off-by: Nicolas Frayer <[email protected]> Co-authored-by: Chuong Tran <[email protected]>
grub_script_execute_sourcecode() parses and executes code one line at a time, updating the return code each time because only the last line determines the final status. However, trailing new lines were also executed, masking any failure on the previous line. Fix this by only trying to execute the command when there is actually one present. This has presumably never been noticed because this code is not used by regular functions, only in special cases like eval and menu entries. The latter generally don't return at all, having booted an OS. When failing to boot, upstream GRUB triggers the fallback mechanism regardless of the return code. We noticed the problem while using Red Hat's patches, which change this behaviour to take account of the return code. In that case, a failure takes you back to the menu rather than triggering a fallback. Signed-off-by: James Le Cuirot <[email protected]>
Don't rely on grub_errno here because grub_script_execute_new_scope() calls grub_print_error(), which always resets grub_errno back to GRUB_ERR_NONE. It may also get reset by grub_wait_after_message(). This problem was observed when a "bad signature" error resulted in the menu being redisplayed rather than the fallback mechanism being triggered, although another change was also needed to fix it. This only happens with Red Hat's patches because upstream GRUB triggers the fallback mechanism regardless of the return code. Signed-off-by: James Le Cuirot <[email protected]>
2 tasks
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
@martinezjavier's c3fcfe5 commit accidentally broke the fallback mechanism, although it also exposed another bug in upstream GRUB.
We cannot rely on
grub_errno
ingrub_menu_execute_entry()
becausegrub_script_execute_new_scope()
callsgrub_print_error()
, which always resetsgrub_errno
back toGRUB_ERR_NONE
. It may also get reset bygrub_wait_after_message()
.grub_script_execute_sourcecode()
parses and executes code one line at a time, updating the return code each time because only the last line determines the final status. However, trailing new lines were also executed, masking any failure on the previous line. Fix this by only trying to execute the command when there is actually one present. A new test case is included.See the commit messages for further details.
I can send the first change upstream, although it's unlikely to cause issues without your change, hence why it went unnoticed for so long. I gather you are starting to upstream some of your changes, so perhaps I can leave it with you?