Make and pkg-config toolchains are not reproducible #1313

Open
nmattia opened this issue Nov 5, 2024 · 2 comments
nmattia commented Nov 5, 2024

When using (a recent) rules_foreign_cc I'm getting determinism issues in the toolchains. In my case the checksums of both the make and pkgconfig binaries change every time they're built:

$ bazel build //my_c_target
...
$ sha256sum bazel-out/k8-opt-exec-ST-d57f47055a04/bin/external/rules_foreign_cc~/toolchains/private/make/bin/make
e00a81041a33af3ed7885449b88129c9f83661bcbec6cc84e51c73f77d93340f  bazel-out/k8-opt-exec-ST-d57f47055a04/bin/external/rules_foreign_cc~/toolchains/private/make/bin/make
$ bazel clean && bazel build //my_c_target
$ sha256sum bazel-out/k8-opt-exec-ST-d57f47055a04/bin/external/rules_foreign_cc~/toolchains/private/make/bin/make
95bf6a1b30b77689370e4e73a6f85a2684a6522db36d54d2744093d30532acb9  bazel-out/k8-opt-exec-ST-d57f47055a04/bin/external/rules_foreign_cc~/toolchains/private/make/bin/make

Note how each build produced a different make executable.
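
The manual check above can also be scripted. The following is a minimal sketch, not part of rules_foreign_cc: the helper names `sha256_of` and `changed_artifacts` are hypothetical, and it assumes you record the digests yourself after each `bazel clean && bazel build`.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def changed_artifacts(before: dict[str, str], after: dict[str, str]) -> list[str]:
    """List the artifacts (present in both builds) whose digest differs."""
    common = before.keys() & after.keys()
    return sorted(name for name in common if before[name] != after[name])
```

Build once, store `{"make": sha256_of(Path(...))}` for each toolchain binary, clean, rebuild, and do the same again; any name returned by `changed_artifacts` was not reproduced bit-for-bit.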

I've noticed that the build logs (Configure.log, BootstrapGNUMake.log, etc.) introduce non-determinism too, as they include full sandbox paths in their outputs (both in the commands and in the "Can not copy X" lines).
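
To check whether a given output (a log or a binary) embeds sandbox paths, something like the sketch below may help. The regular expression is an assumption modeled on the `.../sandbox/processwrapper-sandbox/<N>/...` paths that appear later in this thread; other sandbox strategies may use a different directory name, and the function names are hypothetical.

```python
import re
from pathlib import Path

# Assumed shape of a sandbox path component: /sandbox/<strategy>-sandbox/<number>/
SANDBOX_RE = re.compile(rb"/sandbox/[a-z]+-sandbox/\d+/")


def sandbox_fragments(data: bytes) -> set[bytes]:
    """Return the distinct sandbox path fragments found in raw bytes."""
    return set(SANDBOX_RE.findall(data))


def scan_file(path: Path) -> set[bytes]:
    """Scan a file (text log or binary) for embedded sandbox path fragments."""
    return sandbox_fragments(path.read_bytes())
```

A non-empty result for an output file means its contents depend on which sandbox directory the action happened to run in.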

The workaround for me is to use preinstalled toolchains:

register_toolchains(
    "@rules_foreign_cc//toolchains:preinstalled_pkgconfig_toolchain",
    "@rules_foreign_cc//toolchains:preinstalled_make_toolchain",
)

and remove build logs:

diff --git a/foreign_cc/private/framework.bzl b/foreign_cc/private/framework.bzl
index 33129b8..7326107 100644
--- a/foreign_cc/private/framework.bzl
+++ b/foreign_cc/private/framework.bzl
@@ -616,7 +616,7 @@ def wrap_outputs(ctx, lib_name, configure_name, script_text, env_prelude, build_
     cleanup_on_success_function = create_function(
         ctx,
         "cleanup_on_success",
-        "rm -rf $$BUILD_TMPDIR$$ $$EXT_BUILD_DEPS$$",
+        "rm -rf $$BUILD_TMPDIR$$ $$EXT_BUILD_DEPS$$ && echo > $$BUILD_LOG$$",
     )
     cleanup_on_failure_function = create_function(
         ctx,
nmattia added a commit to dfinity/ic that referenced this issue Nov 5, 2024
This includes a couple of fixes to make the replica build more
deterministic.

This in theory allows anyone building `//rs/replica` with Bazel in the
Docker image to get a bit-for-bit reproducible replica executable, and
this should also improve Bazel cache hit rates.

This fixes the following issues:

* Non-reproducible `jemalloc` build: the `tikv-jemalloc-sys` crate
  vendors `jemalloc` and builds it as part of `build.rs`. Unfortunately
  that build is not deterministic. To make it more deterministic we
  build `jemalloc` separately, which also speeds up rebuilds as the C
  code does not need to be rebuilt when rust versions change. We only
  enable this in Linux; this also includes a patch to support this in
  `rules_rust`: bazelbuild/rules_rust#2981

* Non-reproducible `rules_rust` build log: this includes a backport of a
  fix that disables build logs in `rules_rust` by default (backported
  because our build is not compatible with the latest `rules_rust`)
  bazelbuild/rules_rust#2974

* Non-reproducible make & pkgconfig toolchains: some toolchains packaged
  by `rules_foreign_cc` cause build determinism issues so instead we use
  the ones installed in the container and remove build logs: bazel-contrib/rules_foreign_cc#1313

* Non-reproducible obj file generation in cc-rs: the `cc-rs` crate used
  in many C builds, including the (ASM) build of `ring`'s crypto bits,
  generates object files that include the Bazel sandbox full path:
  rust-lang/cc-rs#1271

* Non-reproducible codegen: `cranelift-isle` and
  `cranelift-codegen-meta` include references to source files as
  absolute paths that include the Bazel sandbox path:
  bytecodealliance/wasmtime#9553
github-merge-queue bot pushed a commit to dfinity/ic that referenced this issue Nov 5, 2024
alin-at-dfinity pushed a commit to dfinity/ic that referenced this issue Nov 7, 2024
jjmaestro (Contributor) commented

I actually tested this further, just to learn more about reproducible builds, and here's what I found!

  • I followed @nmattia's steps of building-cleaning-building a couple of times in one of the projects I'm working on that uses rules_foreign_cc
  • Each time, I collected info from the binary:
    • readelf --all --string-dump=.rodata bazel-out/(...)/make
  • Then, I compared the readelf info with diff -brauN:
--- logs/make1.readelf2 2024-11-07 12:36:54.673277847 +0000
+++ logs/make2.readelf2 2024-11-07 12:38:59.881880125 +0000
@@ -3691,7 +3691,7 @@
   [  3798]  unknown output-sync type '%s'
   [  37b8]  make
   [  37c0]  true
-  [  37c8]  /postgres/.cache/bazel/_bazel_postgres/a08c2e4811c846650b733c6fc815a920/sandbox/processwrapper-sandbox/10/execroot/_main/bazel-out/aarch64-opt-exec-ST-d57f47055a04/bin/external/rules_foreign_cc~/toolchains/private/make/share/locale
+  [  37c8]  /postgres/.cache/bazel/_bazel_postgres/a08c2e4811c846650b733c6fc815a920/sandbox/processwrapper-sandbox/71/execroot/_main/bazel-out/aarch64-opt-exec-ST-d57f47055a04/bin/external/rules_foreign_cc~/toolchains/private/make/share/locale
   [  38b0]  getcwd
   [  38b8]  .VARIABLES
   [  38c8]  .RECIPEPREFIX
@@ -3943,7 +3943,7 @@
   [  5788]  $(MAKEFILES)
   [  5798]  GNUmakefile
   [  57a8]  Makefile
-  [  57b8]  /postgres/.cache/bazel/_bazel_postgres/a08c2e4811c846650b733c6fc815a920/sandbox/processwrapper-sandbox/10/execroot/_main/bazel-out/aarch64-opt-exec-ST-d57f47055a04/bin/external/rules_foreign_cc~/toolchains/private/make/include
+  [  57b8]  /postgres/.cache/bazel/_bazel_postgres/a08c2e4811c846650b733c6fc815a920/sandbox/processwrapper-sandbox/71/execroot/_main/bazel-out/aarch64-opt-exec-ST-d57f47055a04/bin/external/rules_foreign_cc~/toolchains/private/make/include
   [  58a0]  /usr/gnu/include
   [  58b8]  /usr/local/include
   [  58d0]  /usr/include
@@ -3999,7 +3999,7 @@
   [  5fc0]  '%s' is up to date.
   [  5fd8]  /lib
   [  5fe0]  /usr/lib
-  [  5ff0]  /postgres/.cache/bazel/_bazel_postgres/a08c2e4811c846650b733c6fc815a920/sandbox/processwrapper-sandbox/10/execroot/_main/bazel-out/aarch64-opt-exec-ST-d57f47055a04/bin/external/rules_foreign_cc~/toolchains/private/make/lib
+  [  5ff0]  /postgres/.cache/bazel/_bazel_postgres/a08c2e4811c846650b733c6fc815a920/sandbox/processwrapper-sandbox/71/execroot/_main/bazel-out/aarch64-opt-exec-ST-d57f47055a04/bin/external/rules_foreign_cc~/toolchains/private/make/lib
   [  60d0]  (%.o)
   [  60d8]   | 
   [  60e0]   .WAIT
@@ -4123,7 +4123,7 @@
 Displaying notes found in: .note.gnu.build-id
   Owner                Data size       Description
   GNU                  0x00000014      NT_GNU_BUILD_ID (unique build ID bitstring)
-    Build ID: e194acfe778de03c63dca3eb94acd48df4370f97
+    Build ID: 7f851e5b2a8d6c29336feffa291b0ae5ac5e6d4d
 
 Displaying notes found in: .note.gnu.gold-version
   Owner                Data size       Description

The binaries have different build IDs, which I think is caused by the different sandbox paths stored in them.
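
The collect-and-compare steps above can be sketched in a few lines. This is only an illustration: it shells out to the same `readelf --all --string-dump=.rodata` command used in this comment, uses Python's `difflib` in place of `diff`, and the function names and the `make1`/`make2` labels are hypothetical.

```python
import difflib
import subprocess


def readelf_dump(binary: str) -> list[str]:
    """Run readelf on a binary and return its output as a list of lines."""
    result = subprocess.run(
        ["readelf", "--all", "--string-dump=.rodata", binary],
        check=True,
        capture_output=True,
        encoding="utf-8",
        errors="replace",  # string dumps may contain non-UTF-8 bytes
    )
    return result.stdout.splitlines()


def diff_dumps(a: list[str], b: list[str], a_name: str = "make1", b_name: str = "make2") -> list[str]:
    """Return a unified diff of two dumps; an empty list means they are identical."""
    return list(difflib.unified_diff(a, b, fromfile=a_name, tofile=b_name, lineterm=""))
```

Call `readelf_dump` on the binary after each build, keep the results, and pass two of them to `diff_dumps`.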

Finally, note that the previous diff is a "nice one". If you repeat these steps, the sandbox ID may have more digits, giving a longer path that bumps the offsets and produces a much longer, harder-to-read diff in which the actual string differences are difficult to spot. E.g., in the following case, (...)/sandbox/processwrapper-sandbox/10/(...) vs. (...)/sandbox/processwrapper-sandbox/521/(...):

--- logs/make1.readelf2 2024-11-07 12:36:54.673277847 +0000
+++ logs/make2.readelf2 2024-11-07 12:40:13.279110434 +0000
@@ -53,10 +53,10 @@
   [14] .fini             PROGBITS         000000000002fa18  0002fa18
        0000000000000014  0000000000000000  AX       0     0     4
   [15] .rodata           PROGBITS         000000000002fa30  0002fa30
-       0000000000006afa  0000000000000000   A       0     0     16
-  [16] .eh_frame         PROGBITS         0000000000036530  00036530
+       0000000000006b0a  0000000000000000   A       0     0     16
+  [16] .eh_frame         PROGBITS         0000000000036540  00036540
        000000000000527c  0000000000000000   A       0     0     8
-  [17] .eh_frame_hdr     PROGBITS         000000000003b7ac  0003b7ac
+  [17] .eh_frame_hdr     PROGBITS         000000000003b7bc  0003b7bc
        0000000000000bb4  0000000000000000   A       0     0     4
   [18] .data.rel.ro[...] PROGBITS         000000000004f2d0  0003f2d0
        0000000000000420  0000000000000000  WA       0     0     16
@@ -103,14 +103,14 @@
                  0x000000000000001b 0x000000000000001b  R      0x1
       [Requesting program interpreter: /lib/ld-linux-aarch64.so.1]
   LOAD           0x0000000000000000 0x0000000000000000 0x0000000000000000
-                 0x000000000003c360 0x000000000003c360  R E    0x10000
+                 0x000000000003c370 0x000000000003c370  R E    0x10000
   LOAD           0x000000000003f2d0 0x000000000004f2d0 0x000000000004f2d0
                  0x00000000000023f8 0x0000000000005bf8  RW     0x10000
   DYNAMIC        0x000000000003f700 0x000000000004f700 0x000000000004f700
                  0x0000000000000210 0x0000000000000210  RW     0x8
   NOTE           0x0000000000000254 0x0000000000000254 0x0000000000000254
                  0x0000000000000044 0x0000000000000044  R      0x4
-  GNU_EH_FRAME   0x000000000003b7ac 0x000000000003b7ac 0x000000000003b7ac
+  GNU_EH_FRAME   0x000000000003b7bc 0x000000000003b7bc 0x000000000003b7bc

(...)

   1063: 0000000000009cd0     0 FUNC    LOCAL  HIDDEN    11 _init
@@ -3691,428 +3691,428 @@
   [  3798]  unknown output-sync type '%s'
   [  37b8]  make
   [  37c0]  true
-  [  37c8]  /postgres/.cache/bazel/_bazel_postgres/a08c2e4811c846650b733c6fc815a920/sandbox/processwrapper-sandbox/10/execroot/_main/bazel-out/aarch64-opt-exec-ST-d57f47055a04/bin/external/rules_foreign_cc~/toolchains/private/make/share/locale
-  [  38b0]  getcwd
-  [  38b8]  .VARIABLES
-  [  38c8]  .RECIPEPREFIX
-  [  38d8]  target-specific order-only second-expansion else-if shortest-stem undefine oneshell nocomment grouped-target extra-prereqs notintermediate shell-export archives jobserver jobserver-fifo output-sync check-symlink load
-  [  39b8]  .FEATURES

(...)

-  [  4160]  shuffle
-  [  4168]  jobserver-style
-  [  4178]    -b, -m                      Ignored for compatibility.\n
-  [  41b8]    -B, --always-make           Unconditionally make all targets.\n
-  [  4200]    -C DIRECTORY, --directory=DIRECTORY\n
+  [  37c8]  /postgres/.cache/bazel/_bazel_postgres/a08c2e4811c846650b733c6fc815a920/sandbox/processwrapper-sandbox/521/execroot/_main/bazel-out/aarch64-opt-exec-ST-d57f47055a04/bin/external/rules_foreign_cc~/toolchains/private/make/share/locale
+  [  38b8]  getcwd
+  [  38c0]  .VARIABLES
+  [  38d0]  .RECIPEPREFIX
+  [  38e0]  target-specific order-only second-expansion else-if shortest-stem undefine oneshell nocomment grouped-target extra-prereqs notintermediate shell-export archives jobserver jobserver-fifo output-sync check-symlink load
+  [  39c0]  .FEATURES

(...)

-  [  5768]  ...
-  [  5770]  Reading makefiles...\n
-  [  5788]  $(MAKEFILES)
-  [  5798]  GNUmakefile
-  [  57a8]  Makefile
-  [  57b8]  /postgres/.cache/bazel/_bazel_postgres/a08c2e4811c846650b733c6fc815a920/sandbox/processwrapper-sandbox/10/execroot/_main/bazel-out/aarch64-opt-exec-ST-d57f47055a04/bin/external/rules_foreign_cc~/toolchains/private/make/include
-  [  58a0]  /usr/gnu/include
-  [  58b8]  /usr/local/include
-  [  58d0]  /usr/include
-  [  58e0]  stat: 
-  [  58e8]  lstat: 

(...)

-  [  5f78]  Using default commands for '%s'.\n
-  [  5fa0]  Nothing to be done for '%s'.
-  [  5fc0]  '%s' is up to date.
-  [  5fd8]  /lib
-  [  5fe0]  /usr/lib
-  [  5ff0]  /postgres/.cache/bazel/_bazel_postgres/a08c2e4811c846650b733c6fc815a920/sandbox/processwrapper-sandbox/10/execroot/_main/bazel-out/aarch64-opt-exec-ST-d57f47055a04/bin/external/rules_foreign_cc~/toolchains/private/make/lib
-  [  60d0]  (%.o)
-  [  60d8]   | 
-  [  60e0]   .WAIT
-  [  60e8]  warning: ignoring prerequisites on suffix rule definition
-  [  6129]  # Implicit Rules

(...)

+  [  5770]  ...
+  [  5778]  Reading makefiles...\n
+  [  5790]  $(MAKEFILES)
+  [  57a0]  GNUmakefile
+  [  57b0]  Makefile
+  [  57c0]  /postgres/.cache/bazel/_bazel_postgres/a08c2e4811c846650b733c6fc815a920/sandbox/processwrapper-sandbox/521/execroot/_main/bazel-out/aarch64-opt-exec-ST-d57f47055a04/bin/external/rules_foreign_cc~/toolchains/private/make/include
+  [  58a8]  /usr/gnu/include
+  [  58c0]  /usr/local/include
+  [  58d8]  /usr/include
+  [  58e8]  stat: 
+  [  58f0]  lstat: 

(...)

+  [  5f80]  Using default commands for '%s'.\n
+  [  5fa8]  Nothing to be done for '%s'.
+  [  5fc8]  '%s' is up to date.
+  [  5fe0]  /lib
+  [  5fe8]  /usr/lib
+  [  5ff8]  /postgres/.cache/bazel/_bazel_postgres/a08c2e4811c846650b733c6fc815a920/sandbox/processwrapper-sandbox/521/execroot/_main/bazel-out/aarch64-opt-exec-ST-d57f47055a04/bin/external/rules_foreign_cc~/toolchains/private/make/lib
+  [  60d8]  (%.o)
+  [  60e0]   | 
+  [  60e8]   .WAIT
+  [  60f0]  warning: ignoring prerequisites on suffix rule definition
+  [  6131]  # Implicit Rules

(...)

+  [  6a88]  tmpfile: %s
+  [  6b03]   
+  [  6b05]  "
 
 
 Displaying notes found in: .note.ABI-tag
@@ -4123,7 +4123,7 @@
 Displaying notes found in: .note.gnu.build-id
   Owner                Data size       Description
   GNU                  0x00000014      NT_GNU_BUILD_ID (unique build ID bitstring)
-    Build ID: e194acfe778de03c63dca3eb94acd48df4370f97
+    Build ID: b2af58606dd42b02f2128c8bd0a74875a674fac9
 
 Displaying notes found in: .note.gnu.gold-version
   Owner                Data size       Description
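
The 8-byte shift seen in the diff above (e.g. `getcwd` moving from 0x38b0 to 0x38b8) can be reproduced with a little arithmetic. The sketch below assumes that each string in `.rodata` is NUL-terminated and padded to an 8-byte boundary, which is what the offsets in the dump suggest but is not confirmed by the toolchain; `next_offset` and `LOCALE_PATH` are hypothetical names.

```python
# Path taken from the dump above, with the sandbox ID left as a placeholder.
LOCALE_PATH = (
    "/postgres/.cache/bazel/_bazel_postgres/a08c2e4811c846650b733c6fc815a920"
    "/sandbox/processwrapper-sandbox/{sandbox_id}/execroot/_main/bazel-out"
    "/aarch64-opt-exec-ST-d57f47055a04/bin/external/rules_foreign_cc~"
    "/toolchains/private/make/share/locale"
)


def next_offset(start: int, text: str, align: int = 8) -> int:
    """Offset of the string that follows `text`, assuming NUL + padding to `align`."""
    size = len(text) + 1  # include the terminating NUL byte
    padded = (size + align - 1) // align * align
    return start + padded


short = next_offset(0x37C8, LOCALE_PATH.format(sandbox_id="10"))
long_ = next_offset(0x37C8, LOCALE_PATH.format(sandbox_id="521"))
print(hex(short), hex(long_))  # 0x38b0 0x38b8
```

With sandbox ID 10 the path plus its NUL byte fills exactly 232 bytes, so a single extra digit spills into the next 8-byte slot and pushes every later string along with it.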

jjmaestro (Contributor) commented

@nmattia also, I've asked in Bazel's Slack (check this thread). TIL about not just the build ID that I mentioned in my previous comment but also GCC's file-prefix-map / debug-prefix-map, and found:

Maybe this flag could be used to make the make build reproducible... :-?
