Fix sccache for CTK 11.1 and properly track compilations in stats #2285

trxcllnt · 2024-11-08T23:36:49Z

This PR has some fixes I neglected to add to #2247.

nvcc in CUDA Toolkit v11.1 didn't add the -D__CUDA_ARCH_LIST__= definition, so 5271494 expands the list of defines that indicate an nvcc host compiler invocation.
f160a0a and 1591089 report the compilation type (local or dist) and duration for forced-no-cache, forced-recache, and failed compilations. They also count and report the total number of compilations performed, not just compilations due to cache misses.
ccfc60b ensures compilations with --verbose are never dist-compiled, since the verbose output is parsed by tools like CMake and must reflect the local toolchain.
bdaf35e adds more clang flags so that using clang as a CUDA compiler with -Xclang doesn't fail.
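
To illustrate the --verbose rule, here is a minimal sketch in bash. It is not sccache's actual implementation (sccache is written in Rust); the function name `choose_compile_mode` and the "local"/"dist" labels are hypothetical and only show the decision: any -v or --verbose on the command line forces a local compile.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of sccache.
# Prints "local" if the command line contains -v or --verbose, otherwise "dist".
choose_compile_mode() {
  local arg
  for arg in "$@"; do
    case "$arg" in
      -v|--verbose)
        # Verbose output must describe the client's toolchain and paths.
        echo "local"
        return 0
        ;;
    esac
  done
  echo "dist"
}

choose_compile_mode nvcc -c kernel.cu -o kernel.o --verbose
choose_compile_mode nvcc -c kernel.cu -o kernel.o
```

The check is a plain scan of the arguments, so it only catches the flag when it appears as its own argument.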

Question for @sylvestre related to the last point -- do you know which bits of the clang toolchain (or CTK?) sccache should package when using clang as a device compiler? I am seeing errors like the following when attempting to dist-compile with ClangCUDA, but I'm not sure which files define the __nvvm_* symbols:

In file included from build/libcudacxx/test/internal_headers/headers/__barrier_async_contract_fulfillment.h.cu:1:
In file included from <built-in>:1:
In file included from /usr/lib/llvm-18/lib/clang/18/include/__clang_cuda_runtime_wrapper.h:73:
/usr/lib/llvm-18/lib/clang/18/include/__clang_cuda_builtin_vars.h:53:180: error: use of undeclared identifier '__nvvm_read_ptx_sreg_tid_x'
   53 |   __declspec(property(get = __fetch_builtin_x)) unsigned int x; static inline __attribute__((always_inline)) __attribute__((device)) unsigned int __fetch_builtin_x(void) { return __nvvm_read_ptx_sreg_tid_x(); };
...

sylvestre · 2024-11-12T13:52:07Z

Sorry, I don't know.

trxcllnt · 2024-11-14T20:37:22Z

This doesn't seem to be an issue with sccache. It appears clang can't compile its own preprocessor output:

#!/usr/bin/env bash

# Basic CUDA example from https://godbolt.org/
cat <<EOF >/tmp/test.cu
__global__ void square(int* array, int n) {
    int tid = blockDim.x * blockIdx.x + threadIdx.x;
    if (tid < n)
        array[tid] = array[tid] * array[tid];
}
EOF

# Preprocess
clang++ -x cuda -E --cuda-gpu-arch=sm_80 --cuda-path=/usr/local/cuda -Wno-unknown-cuda-version /tmp/test.cu > /tmp/test.cui

# Compile (fails)
clang++ -x cuda-cpp-output --cuda-gpu-arch=sm_80 --cuda-path=/usr/local/cuda -Wno-unknown-cuda-version -o /tmp/test.cu.o /tmp/test.cui

trxcllnt · 2024-11-19T17:43:06Z

cc: @robertmaynard for review

trxcllnt · 2024-11-29T19:03:00Z

This PR is ready to merge.

sylvestre · 2024-11-29T19:23:46Z

Given that the initial implementation had issues and you are fixing them here, could you please add tests to cover these missing cases?
thanks

trxcllnt · 2024-11-29T22:24:09Z

@sylvestre I added tests to ensure -v|--verbose and clang-cuda compilations aren't dist-compiled, and added CUDA Toolkit v11.1 to the CI jobs to ensure we accommodate nvcc changes introduced between CTK v11.1 and v11.8.

CTK v11.1 is the oldest version we test in CUDA Core Compute Libraries. I wouldn't mind testing earlier versions, but CTK v10.2 is only available up to ubuntu18.04, and GH has removed the ubuntu18.04 runners, so we'd have to containerize the test jobs.

trxcllnt added 5 commits November 8, 2024 10:48

check for more host-compiler nvcc defines to accommodate older nvcc versions

5271494

ensure dist_type is reported for failed and uncached compilations

f160a0a

report total compilation count and compile times for uncached and failed compilations

1591089

compiler invocations with -v or --verbose must not be dist-compiled, since tools like CMake parse the output and expect to see client paths not dist-server paths

ccfc60b

add more clang flags

bdaf35e

trxcllnt added 2 commits November 14, 2024 00:11

hash --gen_module_id_file and --module_id_file_name arguments

e93f775

Normalize nvcc subcommand order for CTK <12.0, ensuring the DAG is parsed by inputs/outputs even if the preprocessor, cicc, and ptxas commands are out of order.

b1acd9d
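
The idea of ordering subcommands by their inputs and outputs rather than by position can be sketched in bash with `tsort`. This is an illustration only, not the code from b1acd9d: the `name:input:output` record format, the function name `order_by_io`, and the file names are all hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical sketch, not sccache's implementation.
# Reads "name:input:output" records on stdin and prints the command names in
# dependency order: a command runs after the command that produces its input.
order_by_io() {
  local -a cmds=()
  local line
  while IFS= read -r line; do
    cmds+=("$line")
  done

  local a b a_name a_in a_out b_name b_in b_out
  for a in "${cmds[@]}"; do
    IFS=: read -r a_name a_in a_out <<<"$a"
    # A pair of identical items tells tsort the node exists without ordering it.
    echo "$a_name $a_name"
    for b in "${cmds[@]}"; do
      IFS=: read -r b_name b_in b_out <<<"$b"
      # a must run before b when a's output file is b's input file.
      if [ "$a_out" = "$b_in" ]; then
        echo "$a_name $b_name"
      fi
    done
  done | tsort
}

# The commands are deliberately listed out of order.
printf '%s\n' \
  'ptxas:kernel.ptx:kernel.cubin' \
  'cicc:kernel.cpp1.ii:kernel.ptx' \
  'preprocess:kernel.cu:kernel.cpp1.ii' | order_by_io
```

Because the edges come from matching file names, the result does not depend on the order in which the commands were listed.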

trxcllnt force-pushed the fix/cuda11.1-and-stats branch from 30a1a7f to b1acd9d Compare November 14, 2024 01:27

always add --gen_module_id_file if --module_id_file_name is specified

2d058b5

robertmaynard approved these changes Nov 19, 2024

add --default-stream arg, fix parsing concatenated form of nvcc -t1

5469482

trxcllnt force-pushed the fix/cuda11.1-and-stats branch from 3b0ae6c to 5469482 Compare November 22, 2024 21:40

trxcllnt changed the title ~~Fix sccache for CTK 11.1 and properly track compilations in stats~~ [do not merge] Fix sccache for CTK 11.1 and properly track compilations in stats Nov 25, 2024

trxcllnt added 4 commits November 26, 2024 13:46

read NVCC_{PREPEND,APPEND}_FLAGS from the compile environment, not the server environment

44d2db9

revert adding --gen_module_id_file when --module_id_file_name is present

0ddf354

include the output file name in nvcc trace logs

512853e

don't use leading digit in renamed file names

ac6372a
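
For context on 44d2db9: nvcc documents NVCC_PREPEND_FLAGS and NVCC_APPEND_FLAGS as environment variables whose contents are placed before and after the command-line arguments. The bash sketch below shows that composition under simplifying assumptions: the function name `effective_nvcc_args` is hypothetical, flags are split on whitespace only, and the calling shell's environment stands in for the client's compile environment.

```shell
#!/usr/bin/env bash
# Hypothetical sketch, not sccache's implementation.
# Prints the argument list nvcc would effectively see: prepend flags, then the
# command-line arguments, then append flags. The variables are read from the
# environment of the caller (standing in for the client that ran the compile).
effective_nvcc_args() {
  local -a pre=() post=()
  # Simplification: split the flag strings on whitespace, ignoring quoting.
  read -r -a pre <<<"${NVCC_PREPEND_FLAGS:-}"
  read -r -a post <<<"${NVCC_APPEND_FLAGS:-}"
  local -a all=("${pre[@]}" "$@" "${post[@]}")
  printf '%s\n' "${all[*]}"
}

NVCC_PREPEND_FLAGS='-ccbin g++' NVCC_APPEND_FLAGS='-O3' \
  effective_nvcc_args -c kernel.cu -o kernel.o
```

If these variables were taken from the server process instead, two clients with different settings would get the same effective command line, which is the mismatch the commit title describes.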

trxcllnt force-pushed the fix/cuda11.1-and-stats branch from 02f84c0 to ac6372a Compare November 26, 2024 23:10

trxcllnt changed the title ~~[do not merge] Fix sccache for CTK 11.1 and properly track compilations in stats~~ Fix sccache for CTK 11.1 and properly track compilations in stats Nov 27, 2024

trxcllnt added 3 commits November 29, 2024 11:42

don't generate a dist-compile command for clang-cuda

cdf635a

add test to ensure -v|--verbose are never dist-compiled

d935d21

test CTK 11.1 in CI

e7b529e

trxcllnt force-pushed the fix/cuda11.1-and-stats branch from 6b9faab to e7b529e Compare November 29, 2024 22:17

sylvestre merged commit a1c33d4 into mozilla:main Dec 3, 2024
59 checks passed

BrewTestBot mentioned this pull request Dec 9, 2024

sccache 0.9.0 Homebrew/homebrew-core#200608

Merged
