Fix: has_inf_or_nan cuda invalid configuration argument
a710128 committed May 1, 2022
1 parent 81f1a03 commit cb06d14
Showing 3 changed files with 3 additions and 1 deletion.
Dockerfile (2 changes: 1 addition & 1 deletion)
@@ -10,7 +10,7 @@ RUN pip3 install torch==1.10.0 -i https://pypi.tuna.tsinghua.edu.cn/simple
 RUN pip3 install numpy -i https://pypi.tuna.tsinghua.edu.cn/simple
 RUN apt install iputils-ping opensm libopensm-dev libibverbs1 libibverbs-dev -y --no-install-recommends
 ENV TORCH_CUDA_ARCH_LIST=6.1;7.0;7.5
-ENV BMP_AVX512=1
+ENV BMT_AVX512=1
 ADD other_requirements.txt other_requirements.txt
 RUN pip3 install --upgrade pip && pip3 install -r other_requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple
 ADD . .
csrc/cuda/adam.cu (1 change: 1 addition & 0 deletions)
@@ -50,6 +50,7 @@ void adam_launcher(
     float bias_correction2
 ) {
     int32_t n = param_fp32.numel();
+    if (n <= 0) return;
     auto g_ptr = reinterpret_cast<half*>(g_fp16.data_ptr<at::Half>());
     auto m_ptr = reinterpret_cast<half*>(m_fp16.data_ptr<at::Half>());
     auto v_ptr = v_fp32.data_ptr<float>();
csrc/cuda/has_inf_nan.cu (1 change: 1 addition & 0 deletions)
@@ -73,6 +73,7 @@ void has_nan_inf_launcher(
     torch::Tensor out
 ) {
     int n = g_fp16.numel();
+    if (n <= 0) return;
     auto g_ptr = reinterpret_cast<half*>(g_fp16.data_ptr<at::Half>());
     auto mid_ptr = mid.data_ptr<uint8_t>();
     auto stream = at::cuda::getCurrentCUDAStream();
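
For context, here is a minimal standalone sketch of why the added early return matters. It is not BMTrain's actual kernel code (dummy_scan and launcher below are hypothetical names), and it assumes, as the commit title suggests, that the launchers derive their grid size from n further down in these functions: a zero-element tensor then yields a launch with 0 blocks, which CUDA rejects with cudaErrorInvalidConfiguration ("invalid configuration argument"). Returning early when n <= 0 skips the launch entirely.

#include <cstdio>
#include <cuda_runtime.h>

__global__ void dummy_scan(const float *g, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) { /* inspect g[i] here */ }
}

void launcher(const float *g, int n) {
    if (n <= 0) return;                              // the fix: nothing to scan, skip the launch
    const int threads = 1024;
    const int blocks = (n + threads - 1) / threads;  // would be 0 when n == 0
    dummy_scan<<<blocks, threads>>>(g, n);
}

int main() {
    // Reproduce the reported failure: a launch with a grid of 0 blocks.
    dummy_scan<<<0, 1024>>>(nullptr, 0);
    printf("0-block launch: %s\n", cudaGetErrorString(cudaGetLastError()));
    // prints: 0-block launch: invalid configuration argument

    // With the guard, nothing is launched and no error is recorded.
    launcher(nullptr, 0);
    printf("guarded call:   %s\n", cudaGetErrorString(cudaGetLastError()));
    // prints: guarded call:   no error
    return 0;
}

The same guard is added to both adam_launcher and has_nan_inf_launcher above, so empty tensors are handled before any pointer setup or kernel launch.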
