Use staticx for Aarch64 exe builds #355

Merged
merged 10 commits into from
May 2, 2022

@Jongy Jongy commented May 1, 2022

Closes: #342

Description

The problem starts with #342 - we needed to solve the glibc issue after upgrading the base image to CentOS 8.

staticx is the easiest solution because that's what we already do for x86_64. However, simply enabling staticx with this patch:

diff --git a/pyi.Dockerfile b/pyi.Dockerfile
index 8f855fe..952226b 100644
--- a/pyi.Dockerfile
+++ b/pyi.Dockerfile
@@ -211,7 +211,7 @@ COPY ./scripts/list_needed_libs.sh ./scripts/list_needed_libs.sh
 # and make staticx pack them as well.
 # using scl here to get the proper LD_LIBRARY_PATH set
 # TODO: use staticx for aarch64 as well; currently it doesn't generate correct binaries when run over Docker emulation.
-RUN if [ $(uname -m) != "aarch64" ]; then source scl_source enable devtoolset-8 llvm-toolset-7 && libs=$(./scripts/list_needed_libs.sh) && staticx $libs dist/gprofiler dist/gprofiler; fi
+RUN libs=$(./scripts/list_needed_libs.sh) && staticx $libs dist/gprofiler dist/gprofiler
 
 FROM scratch AS export-stage

Results in this binary:

ubuntu@ip-172-31-21-72:~/gprofiler$ ./build/aarch64/gprofiler  --version
gprofiler: dl-call-libc-early-init.c:37: _dl_call_libc_early_init: Assertion `sym != NULL' failed.
Aborted (core dumped)

Not good.

gdb shows this trace:

gprofiler: dl-call-libc-early-init.c:37: _dl_call_libc_early_init: Assertion `sym != NULL' failed.

Program received signal SIGABRT, Aborted.
0x000000000040917c in raise ()
(gdb) bt
#0  0x000000000040917c in raise ()
#1  0x0000000000400380 in abort ()
#2  0x0000000000404f28 in __assert_fail_base ()
#3  0x0000000000404f90 in __assert_fail ()
#4  0x000000000045b9e0 in _dl_call_libc_early_init ()
#5  0x0000000000459b34 in dl_open_worker ()
#6  0x00000000004346c4 in _dl_catch_exception ()
#7  0x00000000004594c0 in _dl_open ()
#8  0x0000000000433e7c in do_dlopen ()
#9  0x00000000004346c4 in _dl_catch_exception ()
#10 0x0000000000434790 in _dl_catch_error ()
#11 0x0000000000433ecc in dlerror_run ()
#12 0x0000000000434368 in __libc_dlopen_mode ()
#13 0x00000000004305cc in __nss_lookup_function ()
#14 0x0000000000430664 in __nss_lookup ()
#15 0x000000000042b0fc in getpwnam_r ()
#16 0x000000000042adb4 in getpwnam ()
#17 0x00000000004027ac in th_get_uid ()
#18 0x00000000004019ec in tar_extract_file ()
#19 0x0000000000401d00 in tar_extract_all ()
#20 0x0000000000400f90 in extract_archive ()
#21 0x0000000000400510 in main ()
(gdb) 

Tar extraction calls getpwnam (via th_get_uid), which goes through NSS and ends up dlopen-ing DSOs (frames #12 to #16)... sigh.
We don't need this functionality.

So, this PR:

  1. Adds a patch for the latest staticx version that disables the use of getpwnam and getgrnam.
  2. Builds staticx from source with the patch for Aarch64.
  3. Runs staticx for Aarch64 as well.

Known issues

The Aarch64 exe build doesn't pass anymore when it runs on an x86_64 host (under emulation).

 => ERROR [build-stage 53/53] RUN if [ $(uname -m) != "aarch64" ]; then source scl_source enable devtoolset-8 llvm-toolset-7 ; fi; libs=$(./scripts/list_needed_libs.sh) && staticx $libs dist/gprof  0.7s
------
 > [build-stage 53/53] RUN if [ $(uname -m) != "aarch64" ]; then source scl_source enable devtoolset-8 llvm-toolset-7 ; fi; libs=$(./scripts/list_needed_libs.sh) && staticx $libs dist/gprofiler dist/gprofiler:
#106 0.690 ldd failed: ldd: exited with unknown exit code (139)

I suspect this is a bug in the QEMU user-mode emulation we use for Aarch64 binaries: exit code 139 means ldd was killed by SIGSEGV (128 + 11), so QEMU probably doesn't set up a correct environment for ldd.
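For reference, a minimal sketch of how the 139 in the log above can be decoded. It relies on the shell convention that an exit status above 128 means "terminated by signal (status - 128)"; the helper name describe_exit is hypothetical and not part of the gProfiler build scripts.

```shell
# Decode a shell exit status. By convention, a status above 128 means the
# process was terminated by signal number (status - 128).
# describe_exit is a hypothetical helper name used only for this sketch.
describe_exit() {
    status=$1
    if [ "$status" -gt 128 ]; then
        sig=$((status - 128))
        # kill -l <number> prints the signal name without the SIG prefix
        echo "killed by signal $sig (SIG$(kill -l "$sig"))"
    else
        echo "exited with status $status"
    fi
}

describe_exit 139
```

Running the same helper on the status of a failing build step makes it easier to tell a crash of the emulated ldd apart from an ordinary non-zero exit.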

Before we merge this PR, I plan to check whether now, after #350, I can provision an Aarch64 runner and move the Aarch64 build steps to run on it instead. This will also make the build faster :) but mostly it will not crash haha.
Alternatively, we need to fix the build when running on x86_64. Maybe upgrade QEMU? We all run multiarch/qemu-user-static:latest, so IDK...

How Has This Been Tested?

All current binaries used in Aarch64 builds of gProfiler are static; only PyPerf is dynamically built. Hence this PR is also preparation for #287, which finally uses PyPerf on Aarch64.
For the time being, the only thing "tested" here is that gProfiler's Python starts, since staticx provides us with libc. So the tests I have performed for Aarch64 are just gprofiler --version. For x86_64, I also tested PyPerf on appropriate kernel versions, not just gprofiler --version.

  • Aarch64 new glibc - > 2.28
  • Aarch64 old glibc - < 2.28
  • x86_64 very old (CentOS 6)
  • x86_64 gprofiler & PyPerf
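To tell which side of the glibc 2.28 boundary a test host falls on, something like the following sketch could be used. It assumes GNU sort (for -V) and a glibc system (for getconf GNU_LIBC_VERSION). The helper names version_ge and classify_glibc are hypothetical, and treating 2.28 itself as "new" is an assumption based on the title of #342.

```shell
# version_ge A B: succeeds when version A >= version B.
# Assumes GNU sort, which supports -V (version sort).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# classify_glibc VERSION: print which side of the 2.28 boundary VERSION is on.
# Assumption: 2.28 itself counts as "new".
classify_glibc() {
    if version_ge "$1" 2.28; then
        echo "new glibc"
    else
        echo "old glibc"
    fi
}

# On a glibc system, getconf prints something like "glibc 2.17";
# keep only the version field. Prints nothing on non-glibc systems.
host_version=$(getconf GNU_LIBC_VERSION 2>/dev/null | awk '{print $2}')
if [ -n "$host_version" ]; then
    echo "$host_version: $(classify_glibc "$host_version")"
fi
```

Running this on each test machine before gprofiler --version records which of the glibc cases in the list a given run actually covers.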

@Jongy Jongy added the enhancement New feature or request label May 1, 2022
@Jongy Jongy requested a review from LiorMosk May 1, 2022 00:06
LiorMosk previously approved these changes May 1, 2022

Jongy commented May 1, 2022

I think this will fix the build.

@Jongy Jongy requested a review from LiorMosk May 1, 2022 14:39
Development

Successfully merging this pull request may close these issues.

Aarch64 older than CentOS 8 (glibc 2.28) not supported
2 participants