Allow influencing Clang/LLVM options and code generation #105

@tomsmeding

Problem
Power users who want to get the most out of their program may want to tune the optimisations done by LLVM. For example, @exaexa determined from assembly output that LLVM's auto-vectorisation pass was not skipping epilogue vectorisation in a loop where this would have been highly beneficial. Tuning epilogue-vectorization-minimum-VF made their program faster.

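Relatedly, accelerate-llvm currently does not set any fast-math flags in the generated LLVM IR. This results in (relative) slowness in some applications; most damningly, sum is not vectorised in Accelerate at the moment for this reason: floating-point addition is not associative, so without permission to reassociate, LLVM cannot split the reduction across vector lanes.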
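
To make the vectorisation point concrete, here is a minimal sketch in plain Haskell (not Accelerate code) of why a floating-point sum cannot be reordered without fast-math style permission. The names seqSum, laneSum and everyNth are made up for this illustration; laneSum only mimics the shape of a vectorised reduction, it is not what LLVM emits.

```haskell
import Data.List (foldl')

-- Strict left-to-right sum: the evaluation order a plain loop is bound to
-- when no reassociation is permitted.
seqSum :: [Double] -> Double
seqSum = foldl' (+) 0

-- Take elements 0, n, 2n, ... of a list (n is assumed to be positive).
everyNth :: Int -> [a] -> [a]
everyNth _ []       = []
everyNth n (y : ys) = y : everyNth n (drop (n - 1) ys)

-- Sum with n independent accumulators ("lanes") that are combined at the
-- end. This is roughly the shape of a vectorised reduction, and it adds
-- the elements in a different order than seqSum does.
laneSum :: Int -> [Double] -> Double
laneSum n xs = seqSum [ seqSum (everyNth n (drop i xs)) | i <- [0 .. n - 1] ]

-- Regrouping changes the rounded result, so the two orders are not
-- interchangeable under strict IEEE semantics.
reassociationMatters :: Bool
reassociationMatters =
  ((0.1 + 0.2) + 0.3 :: Double) /= (0.1 + (0.2 + 0.3) :: Double)
```

On exactly representable inputs both functions agree (for instance laneSum 4 [1 .. 10] and seqSum [1 .. 10] are both 55), but in general they may differ in the last bits, which is why the compiler needs explicit permission before it makes this transformation.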

Solution
It may be good for accelerate-llvm to offer a way for users to influence LLVM's optimisation passes. The fast-math flags need separate treatment: clang -ffast-math seems to influence only how C is lowered to LLVM IR, not how the IR itself is optimised, so to get fast-math behaviour, the accelerate-llvm codegen itself needs to be changed. Even if we choose a default that is different from "fully safe", the user may still want to tune this.

There are multiple possible API designs here:

  1. Per kernel; this requires extensive additional annotation support. Robbert worked on this (see post below), but this has not yet been merged.
  2. Per Acc program, using an additional runN variant that takes a record with various settings. It would be good to ensure that users cannot rely on this record having a particular number of fields, so that we can add more options in the future without breaking clients. A possible design here is like Request and defaultRequest in http-client.
  3. Per Haskell program, using additional +ACC flags, e.g. +ACC -Xclang -ffast-math -ACC, mirroring clang's own -Xclang option, which forwards the following argument to its cc1 frontend.
  4. Per Haskell program, alternative: using an additional environment variable; the llvm-pretty branch already responds to ACCELERATE_LLVM_CLANG_PATH, and we could add e.g. ACCELERATE_LLVM_CLANG_OPTIONS="-ffast-math". I don't like this because it does not offer an obvious way to pass options containing spaces.
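
To illustrate options 2 and 3, here is a rough sketch under stated assumptions, not a proposal for the actual API. CompileOptions, defaultCompileOptions, optFastMath, optClangArgs and splitAccArgs are all hypothetical names. The idea borrowed from http-client is to export the field names and the default value but not the constructor, so clients can only build a value by updating the default.

```haskell
-- Intended export list for a real module (constructor deliberately hidden):
--   ( CompileOptions, defaultCompileOptions
--   , optFastMath, optClangArgs, splitAccArgs )

-- Hypothetical settings record for option 2; the fields are illustrative.
data CompileOptions = CompileOptions
  { optFastMath  :: Bool      -- emit fast-math flags in the generated IR
  , optClangArgs :: [String]  -- extra arguments to forward to clang
  } deriving (Show, Eq)

-- "Fully safe" defaults; clients start from this value and update fields.
defaultCompileOptions :: CompileOptions
defaultCompileOptions = CompileOptions
  { optFastMath  = False
  , optClangArgs = []
  }

-- Option 3: split an argument list into (arguments found between +ACC and
-- -ACC, all remaining arguments). An unterminated +ACC is assumed to
-- extend to the end of the list.
splitAccArgs :: [String] -> ([String], [String])
splitAccArgs [] = ([], [])
splitAccArgs ("+ACC" : rest) =
  let (inside, after) = break (== "-ACC") rest
      (acc, other)    = splitAccArgs (drop 1 after)
  in  (inside ++ acc, other)
splitAccArgs (x : rest) =
  let (acc, other) = splitAccArgs rest
  in  (acc, x : other)
```

A client would write defaultCompileOptions { optFastMath = True }, which keeps compiling when more fields are added later. Because the argument list in option 3 is already tokenised by the shell, an option containing spaces arrives as a single element, which sidesteps the problem noted for the environment variable in option 4.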
