haskell.compiler.ghc*: fix cross-built native GHC #243619

Closed

Contributor

@AlexandreTunstall AlexandreTunstall commented Jul 15, 2023

Description of changes

This PR fixes all versions of GHC older than 9.6 to allow cross-compiling GHC itself (build ≠ host = target). In addition to that, it adds an overridable option to force building an unregisterised version.

Changes have also been made to 9.6 and newer, but they are merely tentative because Hadrian cannot currently cross-compile GHC.

Due to bugs in cc-wrapper and binutils-wrapper, cross-compiling GHC still doesn't produce binaries usable without emulation or further fixes, but this will at the very least allow users to port GHC to platforms that Nixpkgs doesn't support natively.

Things done
  • Built on platform(s)
    • x86_64-linux
      • pkgsCross.aarch64-multiplatform.haskell.compiler.integer-simple.ghc{884,8107}
      • pkgsCross.aarch64-multiplatform.haskell.compiler.native-bignum.ghc{902,924,925,926,927,928,942,943,944,945}
      • pkgsCross.riscv64.haskell.compiler.integer-simple.ghc884
      • pkgsCross.riscv64.haskell.compiler.native-bignum.ghc945
      • haskell.compiler.native-bignum.ghc945
      • haskell.compiler.native-bignum.ghc962
      • There are way too many affected derivations for me to try all of them; it's time-consuming and gobbles up tons of disk space.
    • aarch64-linux
    • x86_64-darwin
    • aarch64-darwin
  • [N/A] For non-Linux: Is sandbox = true set in nix.conf? (See Nix manual)
  • Tested, as applicable:
  • Tested compilation of all packages that depend on this change using nix-shell -p nixpkgs-review --run "nixpkgs-review rev HEAD". Note: all changes have to be committed, also see nixpkgs-review usage
  • Tested basic functionality of all binary files (usually in ./result/bin/)
    • x86_64-linux
      • pkgsCross.aarch64-multiplatform.haskell.compiler.integer-simple.ghc{884,8107}
      • pkgsCross.aarch64-multiplatform.haskell.compiler.native-bignum.ghc{902,924,925,926,927,928,942,943,944}
      • pkgsCross.aarch64-multiplatform.haskell.compiler.native-bignum.ghc945
      • pkgsCross.riscv64.haskell.compiler.integer-simple.ghc884
      • pkgsCross.riscv64.haskell.compiler.native-bignum.ghc945
      • haskell.compiler.native-bignum.ghc945
      • haskell.compiler.native-bignum.ghc962
  • 23.11 Release Notes (or backporting 23.05 Release notes)
    • (Package updates) Added a release notes entry if the change is major or breaking
    • (Module updates) Added a release notes entry if the change is significant
    • (Module addition) Added a release notes entry if adding a new NixOS module
  • Fits CONTRIBUTING.md.

@ghost

ghost commented Jul 16, 2023

Hi, thanks for writing this PR!

This is a really really big PR (not your fault; ghc still uses copy-paste-tweak style instead of switch-on-version style) in monolithic single-commit form which makes it very hard to review; would you mind breaking it up into a separate commit for each mechanical change (like "search and replace FOO with BAR" or "replace libffi with targetPackages.libffi")? Then we can focus our review effort on the remaining commits with non-mechanical changes, which should be short and focused.

Also, have you tried building anything that uses Template Haskell? Last time I tried to get build,host=x86_64-linux target=aarch64-linux to work using qemu-user for Template Haskell, I hit a known GHC bug in their custom linker that caused segfaults. I can dig up the GHC bug link if you like.

Unfortunately Template Haskell is widely used in the Haskell ecosystem; there are a couple of pervasive dependencies (like Vty) that use it, so everything that uses those needs it.

@sternenseemann
Member

Also, have you tried building anything that uses Template Haskell? Last time I tried to get build,host=x86_64-linux target=aarch64-linux to work using qemu-user for Template Haskell, I hit a known GHC bug in their custom linker that caused segfaults. I can dig up the GHC bug link if you like.

That is kind of orthogonal to this PR which gets build != host, host == target to work properly (I recall that building with build, host, target distinct is impossible). -fexternal-interpreter is something we should still tackle, but it makes more sense to do so in a different context, I'd say.

@ghost

ghost commented Jul 16, 2023

That is kind of orthogonal to this PR which gets build != host, host == target to work properly

Ah, so, first line of the PR should probably be adjusted to:

This PR fixes all versions of GHC older than 9.6 to allow both building GHC as a cross-compiler and cross-compiling GHC itself.

@AlexandreTunstall
Contributor Author

would you mind breaking it up into a separate commit for each mechanical change (like "search and replace FOO with BAR" or "replace libffi with targetPackages.libffi")?

@amjoseph-nixpkgs I assume by this you mean split it up by change (e.g. apply X change to each version) and not split it up by version (e.g. fix 9.4.5), right? Sure.

Also, have you tried building anything that uses Template Haskell?

I have encountered a few errors that mentioned -fexternal-interpreter while trying to use a riscv64 cross-compiled GHC to bootstrap GHC 9.6.2 natively, but so far they've all been in test suites (so I disabled checks to get around them). My goal with this isn't to build a fully-functional GHC, but just something to bootstrap GHC on otherwise unsupported platforms. As far as I'm aware, building GHC does not require TH support.

I haven't tried using the cross-compilers to build any TH-using code. As sternenseemann mentioned, this is more about fixing cross-compiling GHC itself. I assume Nixpkgs already supports building GHC as a cross-compiler, but if it doesn't, this PR also makes that possible.

@ghost

ghost commented Jul 16, 2023

@amjoseph-nixpkgs I assume by this you mean split it up by change (e.g. apply X change to each version) and not split it up by version (e.g. fix 9.4.5), right? Sure.

Correct!

My goal with this isn't to build a fully-functional GHC, but just something to bootstrap GHC on otherwise unsupported platforms.

Ah, thanks, now I get it! Yes this will be very useful, because the ancient GHC-supplied binaries we've been using to bootstrap on powerpc64le have ceased working. So I'd like to use this there too. Awesome!

@AlexandreTunstall
Contributor Author

pkgsCross.aarch64-multiplatform.pkgsBuildHost.haskell.compiler.native-bignum.ghc945 seems to build fine on Nixpkgs already, so I've amended the PR description.

@AlexandreTunstall
Contributor Author

@amjoseph-nixpkgs I have split this into 11 separate (hopefully atomic) commits.

I've also spent much of my week trying to compile GHC 9.6.2 natively on RISC-V using an unregisterised GHC 9.4.5 cross-compiled using these changes, and I've figured out what it takes.

pkgs.haskell.compiler.ghc962.override (old: rec {
  bootPkgs = pkgs.haskell.packages.ghc945.override {
    buildHaskellPackages = bootPkgs;

    ghc = let
      passthru = {
        targetPrefix = "";
        enableShared = false;
        hasHaddock = false;

        llvmPackages = pkgs.llvmPackages_12;

        haskellCompilerName = "ghc-9.4.5";
      };
    in passthru // {
      version = "9.4.5";

      outPath = builtins.storePath boot/ghc;

      inherit passthru;

      meta = {
        license = lib.licenses.bsd3;
        platforms = [ "riscv64-linux" ];
      };
    };

    overrides = self: super: {
      mkDerivation = args: super.mkDerivation ({
        enableLibraryProfiling = false;
      } // args);

      # These test suites don't compile with boot GHC
      alex = pkgs.haskell.lib.compose.dontCheck super.alex;
      data-array-byte = pkgs.haskell.lib.compose.dontCheck super.data-array-byte;
      doctest = pkgs.haskell.lib.compose.dontCheck super.doctest;
      hashable = pkgs.haskell.lib.compose.dontCheck super.hashable;
      optparse-applicative = pkgs.haskell.lib.compose.dontCheck super.optparse-applicative;
      QuickCheck = pkgs.haskell.lib.compose.dontCheck super.QuickCheck;
      temporary = pkgs.haskell.lib.compose.dontCheck super.temporary;
      vector = pkgs.haskell.lib.compose.dontCheck super.vector;
    };
  };
})

And I needed to patch Nixpkgs so that Hadrian wouldn't try to use RTS flags only supported by the threaded runtime.
I couldn't work out how to do this with override/overrideAttrs.

diff --git a/pkgs/development/tools/haskell/hadrian/default.nix b/pkgs/development/tools/haskell/hadrian/default.nix
index 5911c34982b..da4d194e220 100644
--- a/pkgs/development/tools/haskell/hadrian/default.nix
+++ b/pkgs/development/tools/haskell/hadrian/default.nix
@@ -29,6 +29,8 @@ mkDerivation {
     # Additionally we need to recompile it on every change of UserSettings.hs.
     # See https://gitlab.haskell.org/ghc/ghc/-/merge_requests/1190
     "-O0"
+    # Don't use threaded-only RTS flags at runtime
+    "-f-threaded"
   ];
   isLibrary = false;
   isExecutable = true;

I'd like to make it possible to use a non-threaded GHC to build GHC, but I'm not sure how. Should it be as above? Should it be an overridable option to GHC which is propagated to Hadrian? Should it be a flag on GHC derivations similar to hasHaddock? Anyway, that's a change for another PR.

@AlexandreTunstall
Contributor Author

Is it possible to have Hydra cross-compile GHC for all platforms?

It could potentially simplify bootstrapping on unsupported platforms by substituting the derivation instead of having to manually build it on a supported platform, copy the closure to the host, and write some rather elaborate code to convince Nixpkgs to use it as a boot compiler.

In fact, we could even configure Nixpkgs to do so by default on unsupported platforms, so that users can just nix-shell -p ghc and get a working compiler.

@Janik-Haag Janik-Haag added the 12. first-time contribution This PR is the author's first one; please be gentle! label Jul 29, 2023
@sternenseemann
Member

It could potentially simplify bootstrapping on unsupported platforms by substituting the derivation instead of having to manually build it on a supported platform, copy the closure to the host, and write some rather elaborate code to convince Nixpkgs to use it as a boot compiler.

We can via release-cross.nix. Relying on substitution would be bad form here; rather, we should maintain a derivation that downloads a known good build artefact using an ordinary fetcher, fixes runtime dependencies with patchelf, etc. Not sure yet whether we need to upload something to tarballs.nixos.org for this or can fetch from the normal binary cache.

Member

@sternenseemann sternenseemann left a comment

Sorry I took so long to get back to you! This looks great, I'll be sure to play around with it a bit as well. It is also somehow simpler than I expected, e.g. it is not necessary to set any make variables telling it to build stage 2?

All suggestions are only noted once, since the changes are duplicated (very neatly, I must say) over the expressions.

I normally dislike tentative changes that don't work yet, but I guess the changes to the hadrian expression are alright as they retain some measure of consistency with the old expressions!

Also since this PR was opened, 9.4.6.nix was added, so we'll need to remember to take care of that at the end.

pkgs/development/compilers/ghc/8.10.7.nix
@@ -250,6 +251,9 @@ stdenv.mkDerivation (rec {

postPatch = "patchShebangs .";

# GHC is unable to build a cross-compiler without this set.

Member

Can you be more specific why this is needed? I'm guessing the build->build CC needs to be exposed as $CC?

Contributor Author

Unfortunately I have no idea myself.

I figured out it was needed by diffing the environment variables between stdenv.mkDerivation and pkgsBuildTarget.stdenv.mkDerivation (IIRC).

@@ -132,6 +132,7 @@ let
pkgsBuildTarget.targetPackages.stdenv.cc
] ++ lib.optional useLLVM buildTargetLlvmPackages.llvm;

buildCC = pkgsBuildHost.stdenv.cc;

Member

Use buildPackages.stdenv.cc here, not because it makes an actual difference, but because it makes everything a bit clearer, since it is the convention.

Alternatively, for consistency with the others, pkgsBuildBuild.targetPackages.stdenv.cc.

@@ -278,6 +282,9 @@ stdenv.mkDerivation (rec {
# LLVM backend on Darwin needs clang: https://downloads.haskell.org/~ghc/latest/docs/html/users_guide/codegens.html#llvm-code-generator-fllvm
export CLANG="${buildTargetLlvmPackages.clang}/bin/${buildTargetLlvmPackages.clang.targetPrefix}clang"
'' + ''
export CC_STAGE0="${buildCC}/bin/${buildCC.targetPrefix}cc"
export LD_STAGE0="${buildCC.bintools}/bin/${buildCC.bintools.targetPrefix}ld${lib.optionalString useLdGold ".gold"}"

Member

We can't reuse useGold here, since it is about the targetCC only. There is a possibility we are compiling with mismatched bintools (LLVM/GNU), so we'd need to consider this separately here.

Copy link

I can't believe that "use gold" isn't an option we can pass to the bintools-wrapper. It seems like that would be the right place for this kind of setting rather than repeating it in every package. But this is a criticism of bintools-wrapper, not a criticism of ghc or this PR.

Member

@sternenseemann sternenseemann Aug 18, 2023

You can now (#239247), but we want to use gold whenever it is available, even when it is not the default; so the setting wouldn't be global. (This does mean that we can simplify useGold a bit though.)

@@ -326,7 +326,8 @@ stdenv.mkDerivation (rec {
# `--with` flags for libraries needed for RTS linker
configureFlags = [
"--datadir=$doc/share/doc/ghc"
-"--with-curses-includes=${ncurses.dev}/include" "--with-curses-libraries=${ncurses.out}/lib"
+"--with-curses-includes=${pkgsBuildHost.ncurses.dev}/include"
+"--with-curses-libraries=${pkgsBuildHost.ncurses.out}/lib"

Member

Why does this matter? Isn't terminfo disabled as soon as there is any kind of cross compilation happening? Then it would make more sense to pass these flags only conditionally.

Contributor Author

It is supposedly disabled, but something still ends up using ncurses.

sed -i $out/lib/${targetPrefix}${passthru.haskellCompilerName}/settings \
-e "s!$CC!${installCC}/bin/${installCC.targetPrefix}cc!g" \
-e "s!$CXX!${installCC}/bin/${installCC.targetPrefix}c++!g" \
-e "s!$LD!${installCC.bintools}/bin/${installCC.bintools.targetPrefix}ld${lib.optionalString useLdGold ".gold"}!g" \

Member

Here technically a separate useGold condition makes sense, but I don't think bintools can in practice differ between pkgsHostTarget and pkgsBuildTarget.

@@ -134,6 +134,7 @@ let

buildCC = pkgsBuildHost.stdenv.cc;
targetCC = builtins.head toolsForTarget;
installCC = pkgsHostTarget.gcc;

Member

pkgsHostTarget.targetPackages.stdenv.cc is nicer, because it respects the default compiler and doesn't statically require gcc.

# option will force it to do an unregistered build when set to true.
# See https://gitlab.haskell.org/ghc/ghc/-/wikis/building/unregisterised
# Registerised RV64 cross-compiler currently produces programs that segfault
enableUnregisterised ? !stdenv.buildPlatform.isRiscV64 && stdenv.targetPlatform.isRiscV64

Member

Shouldn't we check stdenv.hostPlatform.isRiscV64? Or does it also segfault if the native compiler is cross-compiled? Is there an issue we can link to?

Contributor Author

In my testing, I found that a registerised stage 1 cross-compiler outputs programs that segfault, meaning that both stage 1 (outputs broken programs) and stage 2 (is itself broken) are unusable. I don't see the need to check hostPlatform, since targetPlatform is RiscV64 whether we want stage 1 or 2.

There isn't an upstream issue yet, but I've been meaning to create one. I'm currently doing a native build of 9.2.8 to verify that it isn't already fixed in 9.6.2 (the only version I've tested on native so far). I'll add a link to it in the code once I create it.

Contributor Author

I've since discovered that native RISC-V compilers also segfault, so I'll change this to stdenv.hostPlatform.isRiscV64 || stdenv.targetPlatform.isRiscV64 (hostPlatform to ensure stage 2 GHC doesn't crash, targetPlatform to ensure compiled Haskell programs don't crash).

Contributor Author

And I'll also include a link to a GHC issue.

@sternenseemann
Member

We can via release-cross.nix.

To clarify, one option would be to use release-cross.nix which tests the master branch and add it here:

gnuCommon = lib.recursiveUpdate common {
buildPackages.gcc = nativePlatforms;
coreutils = nativePlatforms;
haskell.packages.ghcHEAD.hello = nativePlatforms;
haskellPackages.hello = nativePlatforms;
};

The question is of course if this makes sense for all platforms gnuCommon is applied to.

Alternatively we can make a new section in release-haskell.nix which tests the haskell-updates branch. We test some obscure GHC variants there already and it is monitored more closely by the Haskell maintainers, so probably a better fit. Also changes usually go through haskell-updates first…

@AlexandreTunstall
Contributor Author

The first commit, which runs autoreconfHook for some versions and patches ./configure for others. This creates a real footgun. Let's run autoreconfHook everywhere and not patch ./configure.

This is no longer necessary due to the removal of 8.8.4.

Details? Bug report? A long-form explanation of what has been accomplished and what does/doesn't work should go in the commit message for the last commit.

See #173952, which is a fix for binutils-wrapper. cc-wrapper has the same problem: both wrapper scripts reference the build platform's shell when they should reference the host platform's shell.

This results in GHC not being runnable on the host platform without emulating the build platform.
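
To check whether a given wrapper is affected, one can look at the interpreter named on its shebang line. A minimal shell sketch, assuming a POSIX sh; `shebang_interpreter` is a hypothetical helper name, not something Nixpkgs provides:

```shell
# Hypothetical helper (not part of Nixpkgs): print the interpreter named on
# the shebang line of the script given as $1; fail if there is no shebang.
shebang_interpreter() {
  first_line=
  # Read only the first line; tolerate a missing trailing newline.
  IFS= read -r first_line < "$1" || [ -n "$first_line" ] || return 1
  case $first_line in
    '#!'*)
      rest=${first_line#'#!'}
      # Strip leading spaces, then any interpreter arguments.
      rest=${rest#"${rest%%[! ]*}"}
      printf '%s\n' "${rest%% *}"
      ;;
    *)
      return 1
      ;;
  esac
}
```

Passing the printed path to `file` on the host platform should then show which architecture the wrapper's shell was built for.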

I'm having trouble understanding this. If the cross-built GHC doesn't work, how does it "allow users to port GHC to platforms that Nixpkgs doesn't support natively"?

Emulation can be used as a workaround.

Please add an entry to tests.cross.sanity for a good build=x86_64 example of a build that should succeed with this PR? That will guard against anybody undoing all your hard work later on!

Done. I haven't limited it to x86_64, as I assume that cross-compiling should work on any build platform.

@skeuchel
Contributor

skeuchel commented Mar 29, 2024

This is great. Thanks for doing this! I hope we can use this to get "trusted" bootstrap tarballs for new platforms (riscv64) without waiting for GHCHQ to release bindists.

I have not yet built any of it but looked over some of the changes.

Unregisterised riscv64 builds

This is a pity, but of course unavoidable in general for new platforms. I would still like to get newer (e.g. >= 9.6) registerised builds working if possible. I think @AlexandreTunstall had some trouble with registerised cross-compiled GHCs. Can you comment what exactly went wrong and which versions you tried? In my experience, native registerised builds for >= 9.6 work out of the box. Also, Debian ships a native registerised 9.4.7 build, which I believe they got working by building against LLVM 15. That might be useful if you do any testing, since it is less painful than starting from an unregisterised one. Here is the info of that build:

$ ghc --info | egrep -i 'cross|boot|link|llvm|interp|ffi|register|ways'
 ,("C compiler link flags","-fuse-ld=bfd")
 ,("cross compiling","NO")
 ,("target has RTS linker","NO")
 ,("Unregisterised","NO")
 ,("LLVM target","riscv64-unknown-linux")
 ,("LLVM llc command","llc-15")
 ,("LLVM opt command","opt-15")
 ,("LLVM clang command","clang")
 ,("Use interpreter","YES")
 ,("RTS ways","debug thr thr_debug thr_p dyn debug_dyn thr_dyn thr_debug_dyn thr_debug_p debug_p")
 ,("Use LibFFI","YES")
 ,("Booter version","9.4.7")
 ,("Have interpreter","YES")
 ,("Target default backend","LLVM")
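
When comparing builds like this, it can help to pull a single field out of the `ghc --info` output, for example `Unregisterised`. A rough shell sketch, assuming the one-pair-per-line layout shown above; `ghc_info_field` is a hypothetical helper, and the key passed to it must not contain `/` or regex metacharacters:

```shell
# Hypothetical helper: read `ghc --info` output on stdin and print the value
# of the field whose key is given as $1.
ghc_info_field() {
  # Each entry looks like ` ,("key","value")`; the first one starts with `[`.
  sed -n "s/^ *[[,](\"$1\",\"\(.*\)\")\$/\1/p"
}

# Example usage (requires a ghc on PATH):
#   ghc --info | ghc_info_field 'Unregisterised'
```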

Interpreter in riscv64 hadrian builds

The move to hadrian initially disabled the interpreter on riscv64. This is a bit minor if you just want a compiler to bootstrap. There are two upstream MRs enabling that again:

@ofborg ofborg bot removed the 2.status: merge conflict This PR has merge conflicts with the target branch label Mar 29, 2024
# See https://gitlab.haskell.org/ghc/ghc/-/wikis/building/unregisterised
# Registerised RV64 compiler produces programs that segfault
# See https://gitlab.haskell.org/ghc/ghc/-/issues/23957
enableUnregisterised ? stdenv.hostPlatform.isRiscV64 || stdenv.targetPlatform.isRiscV64

Contributor

Registerised riscv64 builds of ghc >=9.6 work for me natively (build = host = target) and this is the default behaviour before this PR. Can you preserve that for all hadrian-based builds?

Contributor Author

Is that true even when building more complex programs like Pandoc?

From my testing with 9.6.2, registerised initially appeared to work until I tried compiling something more complex.

Contributor Author

I'll give building with LLVM 15 a try.

In the meantime, I'll undo this change in common-hadrian.nix.

Contributor Author

@AlexandreTunstall AlexandreTunstall Mar 29, 2024

For older versions of GHC, LLVM 15 is not officially supported, so I don't want to take the risk of subtly breaking them while trying to fix registerised builds.

Especially now that 9.6 is the default version.

Contributor

Is that true even when building more complex programs like Pandoc?

From my testing with 9.6.2, registerised initially appeared to work until I tried compiling something more complex.

Yes I got pandoc and shellcheck installed and tested basic functionality. Only cachix failed to compile with "error: cycle detected in build of '/nix/store/[..]-cachix-1.7.drv' in the references of output 'bin' from output 'out'" which I have yet to look into. I will try to rebuild a new registerised 9.6.4 from your branch and will report back.

Contributor Author

9.6.2 still segfaults on my old (~August) NixOS build when using LLVM 15.

[ 1 of 16] Compiling Language.Haskell.HsColour.Classify ( Language/Haskell/HsColour/Classify.hs, dist/build/Language/Haskell/HsColour/Classify.o, dist/bu>
/nix/store/mjlk72gj834v6lmbk9j9pvby88s686cz-stdenv-linux/setup: line 1597:   202 Segmentation fault      (core dumped) ./Setup build

This was bootstrapped using a native unregisterised 9.2.8, which was itself booted from a cross-compiled 8.10.7 compiled with the original version of this PR. All native packages were compiled for RV64GC_Zba_Zbb using gcc13Stdenv.

Some (likely useless) kernel logs:

ghc_worker[772846]: unhandled signal 11 code 0x1 at 0x0000004e3584b582
CPU: 2 PID: 772846 Comm: ghc_worker Not tainted 6.5.0 #1-NixOS
Hardware name: StarFive VisionFive 2 v1.3B (DT)
epc : 0000004e3584b582 ra : 0000003ff3e06900 sp : 00000004e3ffa730
 gp : 000000000006ea80 tp : 00000004e3fff8e0 t0 : 0000000000004000
 t1 : 0000003ff3e00628 t2 : 0000003ff6f86598 s0 : 00000004f148d652
 s1 : 0000003fedb8afd8 a0 : 0000000000000000 a1 : 0000003fed8dd038
 a2 : 0000000000000002 a3 : 0000000000000002 a4 : 0000000000000032
 a5 : 09e1854e3584b583 a6 : 0000000000000000 a7 : 0000003ff7ffdd50
 s2 : 00000004f311b9f8 s3 : 00000004f16d93c8 s4 : 0000003fed8dc5e0
 s5 : 00000004f31140c0 s6 : 00000004f148d638 s7 : 00000004f36d2870
 s8 : 00000004f2ce1000 s9 : 00000000000003bd s10: 0000003fedb59058
 s11: 00000004f31140c0 t3 : 0000003fed2f0514 t4 : 0000000000000001
 t5 : 0000003fed9c6e2a t6 : 000000000000002e
status: 8000000200006020 badaddr: 0000004e3584b582 cause: 000000000000000c

I'll try 9.6.4 on a more up-to-date Nixpkgs to see if the issue still happens.

Contributor Author

I discovered that the 6.5 kernel has a regression that causes some applications (notably rustc) to segfault. I've upgraded my kernel to 6.6, which doesn't have that issue, but the GHC 9.6.2 I've previously built still segfaults.

(The 9.6.4 build is still ongoing.)

Contributor Author

The 9.6.4 build is fully working. I have successfully built ShellCheck and nix-output-monitor with it, and not a single segfault has been seen.

Now I need to try rebuilding 9.6.2 to properly rule out the Linux kernel as the cause and try building 9.2 and 9.4 registerised to see if I can get those to work. This is going to take a lot of time.

@skeuchel
Contributor

skeuchel commented Mar 29, 2024

So I cross-compiled integer-simple.ghc8107 and native-bignum.ghc{902,925,926,927,928,945,946,947,948} from x86_64 to riscv64. Everything built just fine. However, there are x86_64 specific store paths leaking into the result. I tried to test basic functionality on a riscv64 machine.

The host platform's llvm is referenced. Is that even used for an unregisterised build?

$ ghc --version
The Glorious Glasgow Haskell Compilation System, version 8.10.7
$ ghc --info | grep -i llvm
 ,("LLVM target","riscv64-unknown-linux")
 ,("LLVM llc command","/nix/store/vcxvkk05f3fb3175bk8b9r4pa849l6cd-llvm-12.0.1/bin/llc")
 ,("LLVM opt command","/nix/store/vcxvkk05f3fb3175bk8b9r4pa849l6cd-llvm-12.0.1/bin/opt")
 ,("LLVM clang command","clang")
$ file /nix/store/vcxvkk05f3fb3175bk8b9r4pa849l6cd-llvm-12.0.1/bin/llc
/nix/store/vcxvkk05f3fb3175bk8b9r4pa849l6cd-llvm-12.0.1/bin/llc: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /nix/store/1rm6sr6ixxzipv5358x0cmaw8rs84g2j-glibc-2.38-44/lib/ld-linux-x86-64.so.2, BuildID[sha1]=8a085589d1caa5f6d7eb2bc59ec420fe188151d7, for GNU/Linux 3.10.0, not stripped
$ file /nix/store/vcxvkk05f3fb3175bk8b9r4pa849l6cd-llvm-12.0.1/bin/opt
/nix/store/vcxvkk05f3fb3175bk8b9r4pa849l6cd-llvm-12.0.1/bin/opt: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /nix/store/1rm6sr6ixxzipv5358x0cmaw8rs84g2j-glibc-2.38-44/lib/ld-linux-x86-64.so.2, BuildID[sha1]=7903cd1677264d41f42a61aa0faec1b20827042e, for GNU/Linux 3.10.0, not stripped

Compiling a simple hello world fails because it references the host platform's C compiler

$ ghc --info | grep 'C compiler command'
 ,("C compiler command","/nix/store/haai65bgzwfq3m954f3148yl76y2wgcc-gcc-wrapper-13.2.0/bin/cc")
$ ghc --make main.hs
[1 of 1] Compiling Main             ( main.hs, main.o )
/nix/store/haai65bgzwfq3m954f3148yl76y2wgcc-gcc-wrapper-13.2.0/bin/cc: runInteractiveProcess: posix_spawnp: invalid argument (Exec format error)
$ head -n1 /nix/store/haai65bgzwfq3m954f3148yl76y2wgcc-gcc-wrapper-13.2.0/bin/cc
#! /nix/store/5lr5n3qa4day8l1ivbwlcby2nknczqkq-bash-5.2p26/bin/bash
$ file /nix/store/5lr5n3qa4day8l1ivbwlcby2nknczqkq-bash-5.2p26/bin/bash
/nix/store/5lr5n3qa4day8l1ivbwlcby2nknczqkq-bash-5.2p26/bin/bash: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /nix/store/1rm6sr6ixxzipv5358x0cmaw8rs84g2j-glibc-2.38-44/lib/ld-linux-x86-64.so.2, BuildID[sha1]=3b212210bbdd58aef6576efc0c8bf8808e0ff53d, for GNU/Linux 3.10.0, not stripped
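
The manual checks above can be strung together. A small shell sketch, assuming a grep that supports `-o` (GNU, BSD and BusyBox grep all do); `info_store_paths` is a hypothetical helper that lists every store path mentioned in `ghc --info` output read from stdin:

```shell
# Hypothetical helper: print each distinct /nix/store path that appears in
# the text on stdin, one per line, sorted.
info_store_paths() {
  # A path ends at the closing quote of the field or at a space.
  grep -o '/nix/store/[^" ]*' | sort -u
}

# Example usage on the host platform (requires ghc and file on PATH):
#   ghc --info | info_store_paths | while IFS= read -r p; do file "$p"; done
```

Any path that `file` reports as an x86-64 executable on a riscv64 host is a build-platform reference that leaked into the result.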

@AlexandreTunstall
Contributor Author

AlexandreTunstall commented Mar 29, 2024

However, there are x86_64 specific store paths leaking into the result.

Ouch, that is correct. My first explanation was that it is possible to work around those issues by ensuring that you have a suitable LLVM in your PATH. Still not an issue, but this explanation is wrong; here's why.

[nixos@nixos:~/system]$ /etc/nixos/boot/ghc-8.10.7/bin/ghc --info
 ,("LLVM llc command","/nix/store/ivvw5czh7yl0flc3q0hi45v7hqfqirww-llvm-12.0.1/bin/llc")
 ,("LLVM opt command","/nix/store/ivvw5czh7yl0flc3q0hi45v7hqfqirww-llvm-12.0.1/bin/opt")

[nixos@nixos:~/system]$ /nix/store/ivvw5czh7yl0flc3q0hi45v7hqfqirww-llvm-12.0.1/bin/llc --version
-bash: /nix/store/ivvw5czh7yl0flc3q0hi45v7hqfqirww-llvm-12.0.1/bin/llc: cannot execute binary file: Exec format error

[nixos@nixos:~/system]$ ghc --info | grep -e llvm
 ,("LLVM llc command","/nix/store/3x5fkjh20mcf4lhs0ji297x8avlj81py-llvm-12.0.1/bin/llc")
 ,("LLVM opt command","/nix/store/3x5fkjh20mcf4lhs0ji297x8avlj81py-llvm-12.0.1/bin/opt")

[nixos@nixos:~/system]$ export PATH=/nix/store/3x5fkjh20mcf4lhs0ji297x8avlj81py-llvm-12.0.1/bin:$PATH

[nixos@nixos:~]$ cd ~/ghc-test/

[nixos@nixos:~/ghc-test]$ /etc/nixos/boot/ghc-8.10.7/bin/ghc Hello.hs
[1 of 1] Compiling Main             ( Hello.hs, Hello.o )
Linking Hello ...

[nixos@nixos:~/ghc-test]$ ./Hello
Hello, world!

I can't remember why I didn't fix that when I first wrote the PR, so I'll look into it.

@AlexandreTunstall
Contributor Author

Compiling a simple hello world fails because it references the host platform's C compiler

This is because of the aforementioned issue with cc-wrapper and binutils-wrapper. The C compiler used is the host platform's, but the wrappers use the build platform's bash, causing it to fail.

I have a (potentially hacky) fix in my ghc-cross-usable branch. I'm not looking to merge it in this PR as it's a completely unrelated change.
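
To see which interpreter a given wrapper would run under, something like the following may help. It is only a sketch, assuming the wrapper is a script whose first line is a `#!` line; `shebang_interpreter` is a hypothetical helper name.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the interpreter named on a script's "#!" line.

shebang_interpreter() {
  # $1 is the path to a wrapper script. Prints nothing if there is no shebang.
  local first
  IFS= read -r first < "$1" || true
  case "$first" in
    '#!'*)
      first=${first#'#!'}            # drop the "#!" marker
      first=${first# }               # drop one optional space after it
      printf '%s\n' "${first%% *}"   # keep the path, drop arguments like -e
      ;;
  esac
}
```

Passing the result to `file`, for example `file "$(shebang_interpreter /path/to/wrapped/cc)"` with the C compiler path from the settings file, shows whether that bash was built for the build platform or the host platform.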

@AlexandreTunstall
Contributor Author

Can you comment what exactly went wrong and which versions you tried?

I have no fresh memories of trying older versions, but there are more details in the corresponding GHC issue: https://gitlab.haskell.org/ghc/ghc/-/issues/23957

The [build] platform's llvm is referenced. Is that even used for an unregisterised build?

You're probably right about this, which is why I never noticed that the LLVM paths in the ghc --info output are incorrect.

I tried my earlier example of a working cross-compiled GHC without LLVM in PATH and it still worked. That GHC was built with patched cc-wrapper and binutils-wrapper.

This is to ensure that Haskell users on platforms that lack official
bindists still have a convenient means of getting GHC running natively.

In my admittedly somewhat limited testing on RISC-V, GHC 8.10.7 is able
to bootstrap native builds of 9.2.8 and 9.4.5. When cross-compiled, GHC
9.2.8 and 9.4.5 are unable to bootstrap themselves or 9.6.2.

If you're looking at this commit to see whether you can safely upgrade
the compiler used here to remove 8.10, please try cross-compiling 9.0 or
later and then booting a native GHC with it.
@skeuchel
Contributor

skeuchel commented Apr 4, 2024

To test the changes to 9.4.8 and the tentative changes made to 9.6, I've successfully tested the following

This is what I currently care about ;)

@skeuchel
Contributor

skeuchel commented Apr 4, 2024

As for the wrappers, looking through the code I can see that the settings file

postInstall = ''
  # Make the installed GHC use the host platform's tools.
  sed -i $out/lib/${targetPrefix}${passthru.haskellCompilerName}/settings \
    -e "s!$CC!${installCC}/bin/${installCC.targetPrefix}cc!g" \
    -e "s!$CXX!${installCC}/bin/${installCC.targetPrefix}c++!g" \
    -e "s!$LD!${installCC.bintools}/bin/${installCC.bintools.targetPrefix}ld${lib.optionalString useLdGold ".gold"}!g" \
    -e "s!$AR!${installCC.bintools.bintools}/bin/${installCC.bintools.targetPrefix}ar!g" \
    -e "s!$RANLIB!${installCC.bintools.bintools}/bin/${installCC.bintools.targetPrefix}ranlib!g"
  # Install the bash completion file.
  install -D -m 444 utils/completion/ghc.bash $out/share/bash-completion/completions/${targetPrefix}ghc
'';
is supposed to be populated with the host platform's tools (installCC comes from pkgsHostTarget).
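
As a rough illustration of what each `-e` expression above does, here is the same substitution isolated into a small function. `patch_tool_path` is a hypothetical name, and the store paths in the usage note are placeholders rather than real derivations.

```bash
#!/usr/bin/env bash
# Hypothetical helper mirroring one `-e "s!old!new!g"` expression.

patch_tool_path() {
  # $1: tool path recorded at build time, $2: host-platform replacement.
  # Settings text is read from stdin and written to stdout.
  # "!" is the sed delimiter, so the "/" in store paths needs no escaping.
  # Note that $1 is treated as a regular expression, so "." matches any
  # character; that is harmless for typical store paths.
  sed -e "s!$1!$2!g"
}
```

For example, `patch_tool_path /nix/store/build-gcc/bin/gcc /nix/store/host-gcc/bin/cc < settings` would rewrite every occurrence of the first path and leave all other entries alone.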

Do I understand correctly that the wrappers used in pkgsHostTarget are to blame? So (without looking at the details) this is fixed by #173952 for binutils?

If this is the case, then this seems a bit orthogonal and I would not block the PR because of this, but I'll let the maintainer decide.

Also, is the following comment still accurate, or is it outdated, at least for the tools patched in the postInstall?

# C compiler, bintools and LLVM are used at build time, but will also leak into
# the resulting GHC's settings file and used at runtime. This means that we are
# currently only able to build GHC if hostPlatform == buildPlatform.

As mentioned before, at least LLVM still seems to be leaking, so maybe the postInstall should also patch the settings file to use LLVM from pkgsHostTarget?
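
A sketch of what such a patch could look like, rewriting an entry by key rather than by its old value. This is an assumption about one possible approach, not something taken from the PR: `set_setting` is a hypothetical helper, the host LLVM path is a placeholder, and the entry layout is assumed to be `,("key", "value")`.

```bash
#!/usr/bin/env bash
# Hypothetical helper: set the value of one settings entry by key.

set_setting() {
  # Usage: set_setting KEY VALUE < settings
  # Replaces the value of the entry whose key is KEY; every other line
  # passes through unchanged. KEY and VALUE must not contain "!" or '"'.
  sed -e "s!^\( *,\{0,1\}(\"$1\", *\"\)[^\"]*\"!\1$2\"!"
}
```

Something like `set_setting "LLVM llc command" /path/to/host-llvm/bin/llc < settings` would then point the entry at a host-platform `llc`. Whether LLVM is consulted at all for a given build is a separate question that this thread leaves open.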

@sternenseemann
Member

#305392 has been merged into #339272. Further testing and feedback appreciated. Currently ironing out some regressions.

Labels
2.status: merge conflict This PR has merge conflicts with the target branch 6.topic: cross-compilation Building packages on a different platform than they will be used on 6.topic: haskell 10.rebuild-darwin: 501+ 10.rebuild-darwin: 5001+ 10.rebuild-linux: 501+ 10.rebuild-linux: 5001+ 12. first-time contribution This PR is the author's first one; please be gentle!
5 participants