Invalid response object from API: '{"object":"error","message":"**NETWORK ERROR DUE TO HIGH TRAFFIC. PLEASE REGENERATE OR REFRESH THIS PAGE.**\\n\\n(CUDA error: an illegal memory access was encountered\\nCUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.\\nFor debugging consider passing CUDA_LAUNCH_BLOCKING=1.\\nCompile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.\\n)","code":50001}' (HTTP response code was 400)
Dear Team,
I followed the instructions at https://github.com/mit-han-lab/llm-awq#usage to quantize the Vicuna-13B-1.5 model, then followed the instructions at https://github.com/lm-sys/FastChat/blob/55b2f8fdb0e0b80d64e043e9fc9018641bf7289f/docs/awq.md to run inference. Inference fails with a CUDA illegal memory access error.
Any pointers would be greatly appreciated. Thanks
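The error text itself suggests rerunning with `CUDA_LAUNCH_BLOCKING=1`, which makes kernel launches synchronous so the reported stack trace is more likely to point at the failing call. A minimal sketch of one way to do that from Python follows. The helper names `blocking_cuda_env` and `run_with_blocking_cuda` are hypothetical, and the demo command is only a placeholder that echoes the variable; it would need to be replaced with the actual inference command being debugged.

```python
import os
import subprocess
import sys


def blocking_cuda_env(base_env=None):
    """Return a copy of the environment with CUDA_LAUNCH_BLOCKING=1 set."""
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_LAUNCH_BLOCKING"] = "1"
    return env


def run_with_blocking_cuda(command):
    """Run `command` (a list of argv strings) in a child process with the variable set."""
    return subprocess.run(
        command,
        env=blocking_cuda_env(),
        capture_output=True,
        text=True,
    )


if __name__ == "__main__":
    # Placeholder command: it only prints the variable as seen by the child process.
    demo = [sys.executable, "-c", "import os; print(os.environ['CUDA_LAUNCH_BLOCKING'])"]
    result = run_with_blocking_cuda(demo)
    print(result.stdout.strip())
```

Setting the variable on the child process, rather than in an already running interpreter, ensures it is in place before any CUDA context is created.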