Include stdio.h #770

Open: wants to merge 1 commit into base: main

JihaoXin

As mentioned in #744, compilation fails because printf is undefined.
The likely cause is that nvcc in newer CUDA versions no longer includes <stdio.h> automatically.
To address this, we should explicitly include the header wherever these functions are used.
My environment: Ubuntu 20.04, CUDA 12.1, CMake 3.26.4, compiled for sm80 (A100) only.
Error message:

root@50e3724dbf8f:/workspace/mage/third_party/FasterTransformer/build# make -j12
[  0%] Built target cuda_driver_wrapper
[  1%] Built target logger
[  2%] Built target nvtx_utils
[  2%] Built target cuda_utils
[  2%] Built target cutlass_preprocessors
[  3%] Built target custom_ar_kernels
[  3%] Built target add_residual_kernels
[  3%] Built target activation_kernels
[  3%] Built target bert_preprocess_kernels
[  4%] Built target transpose_int8_kernels
[  5%] Building CUDA object src/fastertransformer/kernels/CMakeFiles/unfused_attention_kernels.dir/unfused_attention_kernels.cu.o
[  6%] Built target layernorm_kernels
[  6%] Built target matrix_vector_multiplication
[  6%] Building CUDA object src/fastertransformer/kernels/CMakeFiles/layout_transformer_int8_kernels.dir/layout_transformer_int8_kernels.cu.o
[  7%] Built target word_list
[  7%] Building CUDA object src/fastertransformer/kernels/CMakeFiles/quantization_int8_kernels.dir/quantization_int8_kernels.cu.o
[  7%] Built target cutlass_heuristic
[  7%] Building CUDA object src/fastertransformer/kernels/CMakeFiles/calibrate_quantize_weight_kernels.dir/calibrate_quantize_weight_kernels.cu.o
[  7%] Building CUDA object src/fastertransformer/kernels/CMakeFiles/gen_relative_pos_bias.dir/gen_relative_pos_bias.cu.o
[  7%] Built target layernorm_int8_kernels
[  7%] Built target activation_int8_kernels
[  7%] Built target ban_bad_words
[  8%] Built target stop_criteria
[  8%] Building CUDA object src/fastertransformer/kernels/CMakeFiles/softmax_int8_kernels.dir/softmax_int8_kernels.cu.o
[  8%] Building CUDA object src/fastertransformer/kernels/CMakeFiles/logprob_kernels.dir/logprob_kernels.cu.o
[  8%] Building CUDA object src/fastertransformer/kernels/CMakeFiles/matrix_transpose_kernels.dir/matrix_transpose_kernels.cu.o
[  8%] Building CUDA object src/fastertransformer/kernels/CMakeFiles/decoder_masked_multihead_attention.dir/decoder_masked_multihead_attention/decoder_masked_multihead_attention_112.cu.o
[  8%] Building CUDA object src/fastertransformer/kernels/CMakeFiles/add_bias_transpose_kernels.dir/add_bias_transpose_kernels.cu.o
[  8%] Building CUDA object src/fastertransformer/kernels/CMakeFiles/longformer_kernels.dir/longformer_kernels.cu.o
[  8%] Building CUDA object src/fastertransformer/kernels/CMakeFiles/online_softmax_beamsearch_kernels.dir/online_softmax_beamsearch_kernels.cu.o
[  8%] Linking CUDA device code CMakeFiles/quantization_int8_kernels.dir/cmake_device_link.o
/workspace/mage/third_party/FasterTransformer/src/fastertransformer/kernels/decoder_masked_multihead_attention_utils.h(1743): error: identifier "printf" is undefined
      printf("[ERROR] still no have implementation for vec_from_smem_transpose under __nv_fp8_e4m3 \n");
      ^

/workspace/mage/third_party/FasterTransformer/src/fastertransformer/kernels/decoder_masked_multihead_attention_utils.h(1852): error: identifier "printf" is undefined
      printf("[ERROR] still no have implementation for vec_from_smem_transpose under __nv_fp8_e4m3 \n");
      ^

[  8%] Linking CUDA static library ../../../lib/libquantization_int8_kernels.a
[  8%] Linking CUDA device code CMakeFiles/layout_transformer_int8_kernels.dir/cmake_device_link.o
[  8%] Built target quantization_int8_kernels
[  8%] Building CUDA object src/fastertransformer/kernels/CMakeFiles/decoding_kernels.dir/decoding_kernels.cu.o
[  8%] Linking CUDA static library ../../../lib/liblayout_transformer_int8_kernels.a
[  9%] Linking CUDA device code CMakeFiles/matrix_transpose_kernels.dir/cmake_device_link.o
[  9%] Built target layout_transformer_int8_kernels
[ 10%] Building CUDA object src/fastertransformer/kernels/CMakeFiles/gpt_kernels.dir/gpt_kernels.cu.o
[ 10%] Linking CUDA static library ../../../lib/libmatrix_transpose_kernels.a
[ 10%] Built target matrix_transpose_kernels
[ 10%] Building CUDA object src/fastertransformer/kernels/CMakeFiles/beam_search_penalty_kernels.dir/beam_search_penalty_kernels.cu.o
[ 10%] Linking CUDA device code CMakeFiles/add_bias_transpose_kernels.dir/cmake_device_link.o
[ 11%] Linking CUDA static library ../../../lib/libadd_bias_transpose_kernels.a
[ 11%] Built target add_bias_transpose_kernels
[ 11%] Building CUDA object src/fastertransformer/kernels/CMakeFiles/beam_search_topk_kernels.dir/beam_search_topk_kernels.cu.o
2 errors detected in the compilation of "/workspace/mage/third_party/FasterTransformer/src/fastertransformer/kernels/unfused_attention_kernels.cu".
make[2]: *** [src/fastertransformer/kernels/CMakeFiles/unfused_attention_kernels.dir/build.make:77: src/fastertransformer/kernels/CMakeFiles/unfused_attention_kernels.dir/unfused_attention_kernels.cu.o] Error 2
make[1]: *** [CMakeFiles/Makefile2:3129: src/fastertransformer/kernels/CMakeFiles/unfused_attention_kernels.dir/all] Error 2
make[1]: *** Waiting for unfinished jobs....
[ 11%] Building CUDA object src/fastertransformer/kernels/CMakeFiles/decoder_masked_multihead_attention.dir/decoder_masked_multihead_attention/decoder_masked_multihead_attention_128.cu.o
[ 12%] Linking CUDA device code CMakeFiles/calibrate_quantize_weight_kernels.dir/cmake_device_link.o
[ 12%] Linking CUDA static library ../../../lib/libcalibrate_quantize_weight_kernels.a
[ 12%] Built target calibrate_quantize_weight_kernels
[ 13%] Building CUDA object src/fastertransformer/kernels/CMakeFiles/decoder_masked_multihead_attention.dir/decoder_masked_multihead_attention/decoder_masked_multihead_attention_144.cu.o
[ 13%] Linking CUDA device code CMakeFiles/gen_relative_pos_bias.dir/cmake_device_link.o
[ 13%] Linking CUDA static library ../../../lib/libgen_relative_pos_bias.a
[ 13%] Built target gen_relative_pos_bias
[ 13%] Building CUDA object src/fastertransformer/kernels/CMakeFiles/decoder_masked_multihead_attention.dir/decoder_masked_multihead_attention/decoder_masked_multihead_attention_160.cu.o
[ 13%] Linking CUDA device code CMakeFiles/softmax_int8_kernels.dir/cmake_device_link.o
[ 13%] Linking CUDA static library ../../../lib/libsoftmax_int8_kernels.a
[ 13%] Built target softmax_int8_kernels
[ 13%] Building CUDA object src/fastertransformer/kernels/CMakeFiles/decoder_masked_multihead_attention.dir/decoder_masked_multihead_attention/decoder_masked_multihead_attention_192.cu.o
[ 13%] Linking CUDA device code CMakeFiles/logprob_kernels.dir/cmake_device_link.o
[ 13%] Linking CUDA static library ../../../lib/liblogprob_kernels.a
[ 13%] Built target logprob_kernels
[ 13%] Building CUDA object src/fastertransformer/kernels/CMakeFiles/decoder_masked_multihead_attention.dir/decoder_masked_multihead_attention/decoder_masked_multihead_attention_224.cu.o
[ 13%] Linking CUDA device code CMakeFiles/longformer_kernels.dir/cmake_device_link.o
[ 13%] Linking CUDA static library ../../../lib/liblongformer_kernels.a
[ 14%] Linking CUDA device code CMakeFiles/beam_search_penalty_kernels.dir/cmake_device_link.o
[ 14%] Built target longformer_kernels
[ 14%] Building CUDA object src/fastertransformer/kernels/CMakeFiles/decoder_masked_multihead_attention.dir/decoder_masked_multihead_attention/decoder_masked_multihead_attention_256.cu.o
[ 14%] Linking CXX static library ../../../lib/libbeam_search_penalty_kernels.a
[ 14%] Built target beam_search_penalty_kernels
[ 14%] Building CUDA object src/fastertransformer/kernels/CMakeFiles/decoder_masked_multihead_attention.dir/decoder_masked_multihead_attention/decoder_masked_multihead_attention_32.cu.o
[ 14%] Linking CUDA device code CMakeFiles/decoding_kernels.dir/cmake_device_link.o
[ 14%] Linking CUDA static library ../../../lib/libdecoding_kernels.a
[ 14%] Built target decoding_kernels
[ 14%] Building CUDA object src/fastertransformer/kernels/CMakeFiles/decoder_masked_multihead_attention.dir/decoder_masked_multihead_attention/decoder_masked_multihead_attention_48.cu.o
[ 14%] Linking CUDA device code CMakeFiles/gpt_kernels.dir/cmake_device_link.o
[ 14%] Linking CUDA static library ../../../lib/libgpt_kernels.a
[ 14%] Built target gpt_kernels
[ 14%] Building CUDA object src/fastertransformer/kernels/CMakeFiles/decoder_masked_multihead_attention.dir/decoder_masked_multihead_attention/decoder_masked_multihead_attention_64.cu.o
[ 15%] Building CUDA object src/fastertransformer/kernels/CMakeFiles/decoder_masked_multihead_attention.dir/decoder_masked_multihead_attention/decoder_masked_multihead_attention_80.cu.o
[ 15%] Building CUDA object src/fastertransformer/kernels/CMakeFiles/decoder_masked_multihead_attention.dir/decoder_masked_multihead_attention/decoder_masked_multihead_attention_96.cu.o
[ 15%] Building CUDA object src/fastertransformer/kernels/CMakeFiles/decoder_masked_multihead_attention.dir/decoder_masked_multihead_attention.cu.o
[ 15%] Linking CUDA device code CMakeFiles/beam_search_topk_kernels.dir/cmake_device_link.o
[ 15%] Linking CUDA static library ../../../lib/libbeam_search_topk_kernels.a
[ 15%] Built target beam_search_topk_kernels
[ 15%] Linking CUDA device code CMakeFiles/decoder_masked_multihead_attention.dir/cmake_device_link.o
[ 15%] Linking CUDA static library ../../../lib/libdecoder_masked_multihead_attention.a
[ 15%] Built target decoder_masked_multihead_attention
[ 15%] Linking CUDA device code CMakeFiles/online_softmax_beamsearch_kernels.dir/cmake_device_link.o
[ 15%] Linking CUDA static library ../../../lib/libonline_softmax_beamsearch_kernels.a
[ 15%] Built target online_softmax_beamsearch_kernels
make: *** [Makefile:136: all] Error 2
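
For illustration, here is a minimal sketch of the kind of change described above. It is not the actual diff: the macro FT_SKETCH_HOST_DEVICE and the function report_missing_impl are hypothetical names made up for this example, and the real header may look quite different. The point is only that a header calling printf includes <stdio.h> itself rather than depending on the toolchain to do so.

```cpp
// Explicit include: declares printf so the header does not rely on the
// CUDA toolchain pulling it in implicitly.
#include <stdio.h>

// FT_SKETCH_HOST_DEVICE is a hypothetical macro for this sketch: it expands
// to the CUDA qualifiers under nvcc and to nothing under a plain C++ compiler.
#ifdef __CUDACC__
#define FT_SKETCH_HOST_DEVICE __host__ __device__
#else
#define FT_SKETCH_HOST_DEVICE
#endif

// Hypothetical stand-in for a code path that has no implementation yet and
// reports that through printf. It simply forwards printf's return value.
FT_SKETCH_HOST_DEVICE inline int report_missing_impl(const char* type_name)
{
    return printf("[ERROR] no implementation for type %s\n", type_name);
}
```

With the include in place, the header should compile on its own regardless of which other headers the including .cu file happens to bring in first.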

@nacc

nacc commented Oct 31, 2023

There appear to be related issues with fprintf, stderr, etc.
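
As a rough sketch of why those symbols behave the same way: fprintf and stderr are also declared in <stdio.h>, so host-side code that uses them would need the same explicit include. The helper log_host_error below is a hypothetical name for this example only and is not taken from the repository.

```cpp
// fprintf and stderr are both declared in <stdio.h>; including it explicitly
// avoids depending on another header to provide them.
#include <stdio.h>

// Hypothetical host-side helper that writes an error line to stderr.
// It forwards fprintf's return value (characters written, or negative on error).
inline int log_host_error(const char* file, int line, const char* message)
{
    return fprintf(stderr, "[ERROR] %s:%d: %s\n", file, line, message);
}
```

A call such as log_host_error(__FILE__, __LINE__, "unsupported data type") would then compile without any implicit includes.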

@shannonphu

This helped fix the build for me.
