Problem
Power users who want to get the most out of their program may want to tune the optimisations done by LLVM; for example, @exaexa determined based on assembly output that LLVM's auto-vectorisation pass was not skipping epilogue vectorisation in a loop where this would have been highly beneficial. Tuning epilogue-vectorization-minimum-VF (link) made their program faster.
Relatedly, accelerate-llvm currently does not set any fast math flags in the generated LLVM IR. However, this results in (relative) slowness in some applications; most damningly, sum is not vectorised in Accelerate at the moment for this reason.
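As a rough illustration of why the missing flags matter for sum: floating-point addition is not associative, so a reduction split over several vector lanes can give a different result from the sequential loop, and LLVM generally will not reorder it unless the IR allows reassociation. The sketch below is plain Haskell, not Accelerate code, and the names `sequentialSum` and `lanedSum` are made up for this example.

```haskell
import Data.List (foldl')

-- Strict left-to-right sum: the evaluation order of a scalar loop.
sequentialSum :: [Float] -> Float
sequentialSum = foldl' (+) 0

-- Sum using `lanes` independent accumulators that are combined at the end,
-- mimicking the evaluation order of a vectorised reduction.
lanedSum :: Int -> [Float] -> Float
lanedSum lanes xs = foldl' (+) 0 [ foldl' (+) 0 (lane i) | i <- [0 .. lanes - 1] ]
  where
    -- Elements whose index is congruent to i modulo the lane count.
    lane i = [ x | (j, x) <- zip [0 :: Int ..] xs, j `mod` lanes == i ]

main :: IO ()
main = do
  let xs = [1e8, 1, -1e8, 1] :: [Float]
  print (sequentialSum xs)  -- prints 1.0: the first 1 is absorbed by 1e8
  print (lanedSum 2 xs)     -- prints 2.0: the two 1s are added in their own lane
```

Both answers are valid roundings of the same sum, which is why the choice to allow such reordering is a policy decision rather than a pure optimisation.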
Solution
It may be good for accelerate-llvm to offer a way for users to influence LLVM's optimisation passes. The same holds for the fast math flags. Note that clang -ffast-math seems to influence only how C is lowered to LLVM IR, not how the IR itself is optimised, so to get fast-math behaviour, the accelerate-llvm codegen needs to be changed. Even if we make a default decision that is different from "fully safe", the user may still want to tune this.
There are multiple possible API designs here:
- Per kernel; this requires extensive additional annotation support. Robbert worked on this (see post below), but this has not yet been merged.
- Per Acc program, using an additional `runN` variant that takes a record with various settings. It would be good to ensure that users cannot rely on this record to have a particular number of fields, so that we can add more options in the future without breaking clients. A possible design here is like `Request` and `defaultRequest` in http-client.
- Per Haskell program, using additional `+ACC` flags, e.g. `+ACC -Xclang -ffast-math -ACC`, mirroring the API in `clang` itself for passing options to (I think!) `collect2`.
- Per Haskell program, alternative: using an additional environment variable; the llvm-pretty branch already responds to `ACCELERATE_LLVM_CLANG_PATH`, and we could add e.g. `ACCELERATE_LLVM_CLANG_OPTIONS="-ffast-math"`. I don't like this because it does not offer an obvious way to pass options containing spaces.
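
To make the per-Acc-program option a bit more concrete, here is a minimal sketch of what an http-client-style settings record could look like. Everything in it is hypothetical: `CompileOptions`, `defaultCompileOptions`, `optFastMath` and `optLLVMArgs` do not exist in accelerate-llvm, and how the extra arguments would actually reach LLVM is left open. The value 16 is only a placeholder.

```haskell
-- Hypothetical sketch; none of these names exist in accelerate-llvm.
-- In a library, this would live in a module whose export list is
--   (CompileOptions, defaultCompileOptions, optFastMath, optLLVMArgs)
-- i.e. the field names are exported but the constructor is not, so
-- clients can only obtain a value by updating defaultCompileOptions.
data CompileOptions = CompileOptions
  { optFastMath :: Bool      -- whether to emit fast-math flags in the generated IR
  , optLLVMArgs :: [String]  -- extra arguments intended for the LLVM optimiser
  } deriving (Show, Eq)

-- The "fully safe" defaults: no fast math, no extra arguments.
defaultCompileOptions :: CompileOptions
defaultCompileOptions = CompileOptions
  { optFastMath = False
  , optLLVMArgs = []
  }

main :: IO ()
main = do
  -- Clients use record update syntax, which keeps compiling when the
  -- library later adds more fields to the record.
  let opts = defaultCompileOptions
        { optFastMath = True
        , optLLVMArgs = ["-epilogue-vectorization-minimum-VF=16"]
        }
  print opts
```

Because each option is a separate list element, this design also avoids the problem with options containing spaces that the environment variable approach has.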