
Vicuna-1.5 Quantized using AWQ Not Working - CUDA Illegal Memory Access #2264

Open
mmaaz60 opened this issue Aug 18, 2023 · 3 comments

mmaaz60 commented Aug 18, 2023

Dear Team,

I followed the instructions at https://github.com/mit-han-lab/llm-awq#usage to quantize the Vicuna-13B-v1.5 model, and then followed the instructions at https://github.com/lm-sys/FastChat/blob/55b2f8fdb0e0b80d64e043e9fc9018641bf7289f/docs/awq.md to perform inference. I am getting a CUDA illegal memory access error:

Invalid response object from API: '{"object":"error","message":"**NETWORK ERROR DUE TO HIGH TRAFFIC. PLEASE REGENERATE OR REFRESH THIS PAGE.**\\n\\n(CUDA error: an illegal memory access was encountered\\nCUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.\\nFor debugging consider passing CUDA_LAUNCH_BLOCKING=1.\\nCompile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.\\n)","code":50001}' (HTTP response code was 400)

Any pointers would be greatly appreciated. Thanks
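The error text itself suggests the standard first debugging step: re-run with synchronous CUDA kernel launches so the stack trace points at the kernel that actually faulted, rather than a later API call. A minimal sketch follows; the `CUDA_LAUNCH_BLOCKING` variable is standard PyTorch/CUDA behavior, while the worker command is a commented-out placeholder, not the exact invocation from this issue:

```shell
# Force synchronous CUDA kernel launches so the reported stack trace
# matches the failing kernel (as the error message itself suggests).
export CUDA_LAUNCH_BLOCKING=1

# Then restart the FastChat model worker in the same shell
# (placeholder command; adjust the model path and flags to your setup):
# python3 -m fastchat.serve.model_worker --model-path <your-awq-model>

echo "CUDA_LAUNCH_BLOCKING=$CUDA_LAUNCH_BLOCKING"
```

With launches serialized this way, the worker runs slower but the traceback should name the real failing kernel, which narrows the bug down to either the AWQ kernels or FastChat's serving code.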

@merrymercy
Member

cc @tonylins @kentang-mit @ys-2020

@digisomni
Contributor

If you haven't gotten help with this yet and still want to try quantized models with FastChat, you can give my PR a whirl: #2365

On your model worker you have to set gptq-transformers-bits to 4 (or whatever bit width you're using) and gptq-transformers-disable-exllama. It should work then.
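For concreteness, the worker invocation under that PR might look like the following. This is a hypothetical sketch: the two flag names are taken from the comment above, but the `--` option form, the module path, and the model path are assumptions and may differ in the actual PR:

```shell
# Hypothetical launch of a FastChat model worker with a GPTQ-quantized model.
# Flag names come from the comment above; everything else is an assumption.
python3 -m fastchat.serve.model_worker \
    --model-path ./vicuna-13b-v1.5-gptq \
    --gptq-transformers-bits 4 \
    --gptq-transformers-disable-exllama
```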

@surak
Collaborator

surak commented Oct 22, 2023

@mmaaz60 Have you tried @digisomni's suggestion? Did it work out for you?
