Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kpatch-build: support CONFIG_LTO_CLANG_THIN #1325

Closed
wants to merge 3 commits into from

Conversation

liu-song-6
Copy link
Contributor

Support CONFIG_LTO_CLANG_THIN with ld.lld --lto-obj-path option.

With CONFIG_LTO_CLANG_THIN, .o files are LLVM IR binary, so CDO doesn't work on .o file. To solve this issue, we CDO the thinlto files generated by the --lto-obj-path option. Clang LTO generates the thinlto files after cross file inline, so they are good candidates for CDO. See [1] for more discussions about this.

To achieve this, we need:

  1. kpatch-build to update kernel Makefile(s) so it generates thinlto files;
  2. kpatch-build and kpatch-cc to save the thinlto file properly;
  3. kpatch-build to feed these thinlto files to CDO;
  4. The user need to supply vmlinux.o, from which we generate the symtab file. We need this because GLOBAL symbols may be marked as LOCAL in LTO vmlinux;
  5. A small workaround in CDO that ignores changes in init.text for vmlinux.o.thinlto.*

[1] #1320

Signed-off-by: Song Liu [email protected]

* such as:
* __initstub__kmod_syscall__728_5326_bpf_syscall_sysctl_init7
* while the original function is very similar, like:
* __initstub__kmod_syscall__728_5324_bpf_syscall_sysctl_init7
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit confused here, are they "weird new functions" or is it the same function with a slight rename?

If the latter, it sounds like maybe kpatch_mangled_strcmp() should be updated to recognize this renaming pattern. That way the functions get correlated correctly.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I played with kpatch_mangled_strcmp() a little bit, but didn't solve all the problem. Let me dig more into it.

kpatch-build/kpatch-build Show resolved Hide resolved
kpatch-build/kpatch-build Outdated Show resolved Hide resolved
kpatch-build/kpatch-build Outdated Show resolved Hide resolved
sed -i "s/--thinlto-cache-dir=\$(extmod_prefix).thinlto-cache/--lto-obj-path=vmlinux.o.thinlto.o/g" "$KERNEL_SRCDIR"/Makefile

cp -f "$KERNEL_SRCDIR/scripts/Makefile.build" "$TEMPDIR/Makefile.build" || die
sed -i "s/\$(ld_flags)/\$(ld_flags) --lto-obj-path=\[email protected]/g" "$KERNEL_SRCDIR"/scripts/Makefile.build
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These changes to Makefile and Makefile.build need short comments describing what they're doing and why

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, updating kpatch_mangled_strcmp() solves the problem.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bump

@@ -1107,6 +1142,7 @@ declare -a MAKEVARS
if [[ -n "$CONFIG_CC_IS_CLANG" ]]; then
MAKEVARS+=("CC=${KPATCH_CC_PREFIX}${CLANG}")
MAKEVARS+=("HOSTCC=clang")
MAKEVARS+=("LLVM=1")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is only a one-line change but it has a big impact and belongs in a separate commit with proper justification. Also I wonder how it interacts with the CONFIG_LD_IS_LLD check.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It appears to me that I need to pass in LLVM=1 to the make command. Otherwise, the make will fail. I have the same change for PGO support.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is possible for the user to specify CC=clang and LD=ld.lld separately. And LLVM=1 will enable a set of utilities. From llvm.rst:

LLVM has substitutes for GNU binutils utilities. They can be enabled individually.
The full list of supported make variables::
        make CC=clang LD=ld.lld AR=llvm-ar NM=llvm-nm STRIP=llvm-strip \
          OBJCOPY=llvm-objcopy OBJDUMP=llvm-objdump READELF=llvm-readelf \
          HOSTCC=clang HOSTCXX=clang++ HOSTAR=llvm-ar HOSTLD=ld.lld
To simplify the above command, Kbuild supports the ``LLVM`` variable::
        make LLVM=1

I changed the code to use CONFIG_AS_IS_LLVM to detect the use of LLVM=1. I think this should cover most use cases.

kpatch-build/kpatch-build Outdated Show resolved Hide resolved
DIFF_OBJS="$TEMPDIR/thinlto_objs"
else
DIFF_OBJS="$TEMPDIR/changed_objs"
fi
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does lto need its own special file name?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Initially, I thought we may still use changed_objs for something else. But I think we didn't use it at the moment.

OTOH, "changed_objs" is not accurate for thin LTO, as not all these files are changed.

kpatch-build/kpatch-build Outdated Show resolved Hide resolved
*/
if (!strncmp(childobj, "vmlinux.o", 9) &&
!strncmp(sec->name, ".init.text", 10))
sec->ignore = 1;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also this (or whatever the change eventually is) should probably be in its own commit.

It is common for the user to use LLVM=1 to enable all LLVM untilities. Use
CONFIG_AS_IS_LLVM to detect such cases, and add LLVM=1 to MAKEVARS.

Signed-off-by: Song Liu <[email protected]>
@liu-song-6
Copy link
Contributor Author

@jpoimboe Does this version look reasonable?

# Detech such case with CONFIG_AS_IS_LLVM flag.
if [[ -n "$CONFIG_AS_IS_LLVM" ]]; then
MAKEVARS+=("LLVM=1")
fi
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This still doesn't seem quite right, but maybe nobody will notice ;-)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I couldn't think of a better test for LLVM=1.


return *s1 == *s2;
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure how this works. To match strcmp() semantics it needs to return 0 if they match.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's a simplified version (untested)

static int __kpatch_skip_digits_strcmp(char *s1, char *s2)
{
	while (*s1 == *s2) {
		if (!*s1)
			return 0;
		s1++;
		s2++;

		while (isdigit(*s1))
			s1++;
		while (isdigit(*s2))
			s2++;
	}

	return 1;
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm.. I am also confused with 1/0 return value. Let me double check..

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's a simplified version (untested)

static int __kpatch_skip_digits_strcmp(char *s1, char *s2)
{
	while (*s1 == *s2) {
		if (!*s1)
			return 0;
		s1++;
		s2++;

		while (isdigit(*s1))
			s1++;
		while (isdigit(*s2))
			s2++;
	}

	return 1;
}

This is really clean. Thanks!


# With LTO, many GLOBAL symbols are converted to LOCAL in vmlinux,
# which confuses the klp reloc logic. Require vmlinux.o for LTO.
[[ -e "$VMLINUX" ]] || die "For kernel with CONFIG_LTO_CLANG, please supply vmlinux.o via -v|--vmlinux option."
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At this point I think we already know "$VMLINUX" exists so maybe we can skip this first check.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will remove.

sed -i "s/--thinlto-cache-dir=\$(extmod_prefix).thinlto-cache/--lto-obj-path=vmlinux.o.thinlto.o/g" "$KERNEL_SRCDIR"/Makefile

cp -f "$KERNEL_SRCDIR/scripts/Makefile.build" "$TEMPDIR/Makefile.build" || die
sed -i "s/\$(ld_flags)/\$(ld_flags) --lto-obj-path=\[email protected]/g" "$KERNEL_SRCDIR"/scripts/Makefile.build
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bump

find_kobj "$i"
else
KOBJFILE=${i/.o.thinlto.o*/}
fi
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bump

KOBJFILE_PATH="${TEMPDIR}/module/$KOBJFILE.ko"
SYMTAB="${KOBJFILE_PATH}.symtab"
SYMVERS_FILE="$BUILDDIR/Module.symvers"
fi
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bump

The kernel use a special __initcall_stub() with CONFIG_LTO_CLANG to avoid
name collisions. Handle it in kpatch_mangled_strcmp() by ignoring all
digits for symbols start with __initstub__kmod_syscall__.

Signed-off-by: Song Liu <[email protected]>
Support CONFIG_LTO_CLANG_THIN with ld.lld --lto-obj-path option.

With CONFIG_LTO_CLANG_THIN, .o files are LLVM IR binary, so CDO doesn't
work on .o file. To solve this issue, we CDO the thinlto files generated
by the --lto-obj-path option. Clang LTO generates the thinlto files
after cross file inline, so they are good candidates for CDO. See [1] for
more discussions about this.

To achieve this, we need:

  1. kpatch-build to update kernel Makefile(s) so it generates thinlto
     files;
  2. kpatch-build and kpatch-cc to save the thinlto file properly;
  3. kpatch-build to feed these thinlto files to CDO;
  4. The user need to supply vmlinux.o, from which we generate the symtab
     file. We need this because GLOBAL symbols may be marked as LOCAL in
     LTO vmlinux;

[1] dynup#1320

Signed-off-by: Song Liu <[email protected]>
@liu-song-6
Copy link
Contributor Author

@jpoimboe Could you please share your comments on the latest version? Thanks in advance!

@jpoimboe
Copy link
Member

jpoimboe commented Mar 1, 2023

I tried out the patches against a 6.2 kernel and got the following errors:

Extracting new and modified ELF sections
Binary files orig/vmlinux.o.thinlto.o2 and patched/vmlinux.o.thinlto.o2 differ
Binary files orig/vmlinux.o.thinlto.o223 and patched/vmlinux.o.thinlto.o223 differ
create-diff-object: ERROR: vmlinux.o.thinlto.o223: kpatch_bundle_symbols: 244: symbol trace_event_raw_event_task_newtask at offset 16 within section .text.trace_event_raw_event_task_newtask, expected 0
Binary files orig/vmlinux.o.thinlto.o227 and patched/vmlinux.o.thinlto.o227 differ
create-diff-object: ERROR: vmlinux.o.thinlto.o227: kpatch_bundle_symbols: 244: symbol delayed_put_task_struct at offset 16 within section .text.delayed_put_task_struct, expected 0
Binary files orig/vmlinux.o.thinlto.o232 and patched/vmlinux.o.thinlto.o232 differ
create-diff-object: ERROR: vmlinux.o.thinlto.o232: kpatch_bundle_symbols: 244: symbol __ptrace_may_access at offset 16 within section .text.__ptrace_may_access, expected 0
Binary files orig/vmlinux.o.thinlto.o234 and patched/vmlinux.o.thinlto.o234 differ
create-diff-object: ERROR: vmlinux.o.thinlto.o234: kpatch_bundle_symbols: 244: symbol trace_event_raw_event_signal_generate at offset 16 within section .text.trace_event_raw_event_signal_generate, expected 0
Binary files orig/vmlinux.o.thinlto.o307 and patched/vmlinux.o.thinlto.o307 differ
create-diff-object: ERROR: vmlinux.o.thinlto.o307: kpatch_bundle_symbols: 244: symbol __klp_shadow_get_or_alloc at offset 16 within section .text.__klp_shadow_get_or_alloc, expected 0
Binary files orig/vmlinux.o.thinlto.o469 and patched/vmlinux.o.thinlto.o469 differ
create-diff-object: ERROR: vmlinux.o.thinlto.o469: kpatch_bundle_symbols: 244: symbol trace_event_raw_event_vm_unmapped_area at offset 16 within section .text.trace_event_raw_event_vm_unmapped_area, expected 0
Binary files orig/vmlinux.o.thinlto.o510 and patched/vmlinux.o.thinlto.o510 differ
create-diff-object: ERROR: vmlinux.o.thinlto.o510: kpatch_bundle_symbols: 244: symbol get_arg_page at offset 16 within section .text.get_arg_page, expected 0
Binary files orig/vmlinux.o.thinlto.o593 and patched/vmlinux.o.thinlto.o593 differ
create-diff-object: ERROR: vmlinux.o.thinlto.o593: kpatch_bundle_symbols: 244: symbol do_task_stat at offset 16 within section .text.do_task_stat, expected 0
Binary files orig/vmlinux.o.thinlto.o809 and patched/vmlinux.o.thinlto.o809 differ
create-diff-object: ERROR: vmlinux.o.thinlto.o809: kpatch_bundle_symbols: 244: symbol selinux_set_mnt_opts at offset 16 within section .text.selinux_set_mnt_opts, expected 0
ERROR: 9 error(s) encountered. Check /home/jpoimboe/.kpatch/build.log for more details.

The cause for the errors is the new CONFIG_CALL_DEPTH_TRACKING feature which enables CONFIG_PREFIX_SYMBOLS, which relies on objtool to create a __pfx_<func> symbol for the 16 bytes of NOPS preceding each function.

The problem is that objtool didn't run on the vmlinux.o.thinlto.o* objects.

Objtool has become an integral part of the x86 toolchain and removing it from patch module objects could create all kinds of surprises. So I'm thinking this approach may not work and we may have to add vmlinux.o support to CDO after all.

@liu-song-6
Copy link
Contributor Author

The problem is that objtool didn't run on the vmlinux.o.thinlto.o* objects.

Hmm... This does sound like a big problem. :(

Objtool has become an integral part of the x86 toolchain and removing it from patch module objects could create all kinds of surprises. So I'm thinking this approach may not work and we may have to add vmlinux.o support to CDO after all.

How should we enable CDOing vmlinux.o? @jpoimboe IIRC, you plan to work on this soon?

@jpoimboe
Copy link
Member

jpoimboe commented Mar 1, 2023

How should we enable CDOing vmlinux.o? @jpoimboe IIRC, you plan to work on this soon?

I could certainly do it, though I'm probably not free for at least a few weeks.

@liu-song-6
Copy link
Contributor Author

How should we enable CDOing vmlinux.o? @jpoimboe IIRC, you plan to work on this soon?

I could certainly do it, though I'm probably not free for at least a few weeks.

At the moment, I am hoping this version will work for our 5.19 based kernel. If it doesn't work, we would probably need CDOing vmlinux.o soon..

@jpoimboe
Copy link
Member

jpoimboe commented Mar 2, 2023

At the moment, I am hoping this version will work for our 5.19 based kernel.

If you rely on the ORC unwinder, static calls or return thunks then it may have problems...

@liu-song-6
Copy link
Contributor Author

At the moment, I am hoping this version will work for our 5.19 based kernel.

If you rely on the ORC unwinder, static calls or return thunks then it may have problems...

Hmm... we do use most of these. Let me see how bad it would be.. Maybe we can run objtool on these thinlto files before CDOing?

@jpoimboe
Copy link
Member

jpoimboe commented Mar 2, 2023

At the moment, I am hoping this version will work for our 5.19 based kernel.

If you rely on the ORC unwinder, static calls or return thunks then it may have problems...

Hmm... we do use most of these. Let me see how bad it would be.. Maybe we can run objtool on these thinlto files before CDOing?

Yes, maybe (but to be clear I'd rather avoid this type of Makefile hacking for kpatch upstream)

@liu-song-6
Copy link
Contributor Author

Yes, maybe (but to be clear I'd rather avoid this type of Makefile hacking for kpatch upstream)

Agreed. CDOing vmlinux.o is a much cleaner approach.

We already use some non-upstream kpatch commits for PGO. If this hack works for short term, we will do the same.

@liu-song-6
Copy link
Contributor Author

I am testing this change that run objtool on thinlto files: liu-song-6@1203c9f .

I understand we don't want this in upstream, but just in case someone would like to test/use it before we can CDO vmlinux.o.

@github-actions
Copy link

This PR has been open for 60 days with no activity and no assignee. It will be closed in 7 days unless a comment is added.

@github-actions github-actions bot added the stale label Aug 20, 2023
@github-actions
Copy link

This PR was closed because it was inactive for 7 days after being marked stale.

@github-actions github-actions bot closed this Aug 27, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants