-
Notifications
You must be signed in to change notification settings - Fork 12.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CUDA code compile error: function template partial specialization is not allowed #68772
Comments
@llvm/issue-subscribers-c-1 Author: Jiang, Zhiwei (zhiweij1)
Smaller reproducer:
```
template<class T1, class T2>
struct A {
template <class T3>
void foo(T3 x);
};
template<typename T1, typename T2> }
|
I think out-of-line member function template definition looks the following way: template<typename T1, typename T2>
template<typename T3>
void A<T1, T2>::foo(T3 x) {} So Clang is correct to reject your code. |
Hi @Endilll , Actually, the key point here is that the original code is compliable witn nvcc (and it is treated as CUDA language). I changed the title from "C++" => "CUDA" |
I'm not sure we are bug-compatible with nvcc (or want to be) when it clashes with the C++ standard. That said, I think your code falls into conforming extension territory. |
We don't currently aim for compatibility with NVCC within Clang, though given that we aim for compatibility with GCC and MSVC, I would suspect an RFC with a plan for NVCC compatibility would be well-received once we knew the shape of what the goal was (ABI compat sufficient, or do we need bug for bug compat, etc). CC @erichkeane on this point. The official CUDA documentation lists C++ language extensions, but makes no mention of this being a language extension. So I think this is very plausibly a bug in NVCC that their headers are relying on. Have you reported this to NVidia and gotten their stance on it? |
I would presume that this is something that is a biproduct of using a different CFE that doesn't catch this (EDG) and is not an intentional extension. At the moment, there is no plan to get CUDA to be NVCC compatible, I don't believe we make any guarantees at all. AFAIK, Google did the implementation for the purposes of some internal products, so I doubt we'll see any additional work on it. I would think this is a situation where filing a bug against the FasterTransform library to see if they would use a correct type of C++ would be the best course of action. |
Agreed. I'm going to close this as not a bug, but if we do decide we want NVCC CUDA compatibility at some point in the future or if @zhiweij1 comes back with more information on why this is needed regardless, we can reopen the issue at that point. |
@AaronBallman @erichkeane This is probably an edg bug https://godbolt.org/z/81osqz3nd - it's not a partial specialization, it's just lets you repeat the parameter list exactly, probably a missing diag |
That was my conclusion as well (though I phrased it less explicitly). |
I have reported this issue in nvidia forum https://forums.developer.nvidia.com/t/nvcc-bug-incorrect-template-method-definition-syntax-can-pass-nvcc-compilation/269184 and in FasterTransformer community NVIDIA/FasterTransformer#765 |
Smaller reproducer:
https://godbolt.org/z/xn7vre9db
The original code comes from https://github.com/NVIDIA/FasterTransformer/blob/afdf9a9eb86f15363c0249117d166d6b45dbb371/src/fastertransformer/kernels/cutlass_kernels/fpA_intB_gemm/fpA_intB_gemm_template.h#L426
It can be built with nvcc successfully.
The text was updated successfully, but these errors were encountered: