should be cudaD2Dcpy(weights_ptr[0], other.weights_ptr[0], max_seq_len_ * hidden_units_);
instead of cudaD2Dcpy(weights_ptr[0], other.weights_ptr[0], max_seq_len_ * vocab_size_);
Branch/Tag/Commit
main
Docker Image Version
nvcr.io/nvidia/pytorch:22.12-py3
GPU name
A10
CUDA Driver
535.54.03
Reproduced Steps
Abnormal Phenomena:
in
FasterTransformer/src/fastertransformer/kernels/decoding_kernels.cu
Line 137 in df4a753
FasterTransformer/src/fastertransformer/kernels/decoding_kernels.cu
Line 134 in df4a753
So I think the copy at
FasterTransformer/src/fastertransformer/models/decoding/DecodingWeight.h
Line 101 in df4a753
should be
cudaD2Dcpy(weights_ptr[0], other.weights_ptr[0], max_seq_len_ * hidden_units_);
instead of
cudaD2Dcpy(weights_ptr[0], other.weights_ptr[0], max_seq_len_ * vocab_size_);
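To illustrate why the element count matters, here is a minimal host-side sketch. It assumes the table behind `weights_ptr[0]` is laid out as `max_seq_len_` rows of `hidden_units_` elements, which is what the proposed fix implies. The helpers `positionTableElems`, `copyElems`, and `clonePositionTable` are hypothetical names, and `std::copy` on host memory stands in for the device-to-device copy. This is not FasterTransformer code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: element count of a [max_seq_len, hidden_units] table.
// Assumption: this is the shape of the buffer behind weights_ptr[0].
inline std::size_t positionTableElems(std::size_t max_seq_len, std::size_t hidden_units)
{
    return max_seq_len * hidden_units;
}

// Host-side stand-in for a device-to-device copy of `count` elements.
template<typename T>
void copyElems(T* dst, const T* src, std::size_t count)
{
    std::copy(src, src + count, dst);
}

// Clone the table, deriving the copy size from the same shape that sized the buffer.
template<typename T>
std::vector<T> clonePositionTable(const std::vector<T>& src,
                                  std::size_t           max_seq_len,
                                  std::size_t           hidden_units)
{
    const std::size_t count = positionTableElems(max_seq_len, hidden_units);
    // Guard against reading past the end of the source buffer.
    assert(src.size() >= count);
    std::vector<T> dst(count);
    copyElems(dst.data(), src.data(), count);
    return dst;
}
```

Under that assumption, a copy sized by `max_seq_len_ * vocab_size_` would read and write past the end of both buffers whenever `vocab_size_` is larger than `hidden_units_`.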
There are two similar cases:
FasterTransformer/src/fastertransformer/models/decoding/DecodingWeight.h
Line 77 in df4a753
FasterTransformer/src/fastertransformer/models/decoding/DecodingWeight.h
Line 118 in df4a753
I have opened a PR to try to fix it. @byshiue