I'm trying to run RoSA finetuning on an Nvidia Quadro RTX 6000. The GPU architecture doesn't support bfloat16, so I tried loading the model in 4-bit instead (similar to the suggestion for the Colab T4 GPU). Finetuning completes, but when I load the model and run inference I get `RuntimeError: no kernel image is available for execution on the device`. Is there a workaround for this? Full finetuning (FFT) doesn't work for me either, as I run out of GPU RAM.
Can you please try adding the argument `--dtype fp32` to the evaluation commands here and here, and see whether that resolves the issue?
As you mentioned, this is probably due to your GPU not supporting bf16; currently the default dtype for loading the model at evaluation time is bf16 (here). Changing that default locally to fp32 should fix the problem.
A more permanent solution would be to read the default dtype from the `MODEL_PRECISION` argument in the config file.
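Roughly something like the sketch below — note this is illustrative, not the repo's actual code: the flag name, the precision-to-dtype mapping, and the assumption that `config.sh` is sourced first (so `MODEL_PRECISION` is visible as an environment variable) are all mine.

```python
# Sketch: derive the evaluation dtype from MODEL_PRECISION instead of a
# hard-coded bf16 default. Assumes config.sh has been sourced beforehand,
# so MODEL_PRECISION is available as an environment variable.
import argparse
import os

import torch

PRECISION_TO_DTYPE = {
    "bf16": torch.bfloat16,
    "fp16": torch.float16,
    "fp32": torch.float32,
}

parser = argparse.ArgumentParser()
# --dtype on the command line still wins; otherwise fall back to the
# config's MODEL_PRECISION, and finally to fp32 for GPUs without bf16.
parser.add_argument("--dtype", default=os.environ.get("MODEL_PRECISION", "fp32"))
args = parser.parse_args()

# Values like "4bit" fall back to fp32 here, since quantized loading is
# handled by a separate code path rather than a torch dtype.
torch_dtype = PRECISION_TO_DTYPE.get(args.dtype, torch.float32)
# ... then pass torch_dtype to the model-loading call, e.g.
# model = AutoModelForCausalLM.from_pretrained(name, torch_dtype=torch_dtype)
```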
I tried modifying the dtype in `config.sh`, but sadly I run out of memory with `fp32`. When I set the precision to `4bit`, I'm able to finetune the model, but something goes wrong at the end: the adapters aren't saved and I get the runtime error described above. I searched for the error, and several sources suggest it occurs when the CUDA kernels were compiled for a GPU architecture that isn't compatible with the one running them.
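One possibility I'm considering: on Turing cards the bf16 kernels simply don't exist, so if the 4-bit path uses bf16 as its compute dtype, it would raise exactly this error at inference time. Below is a sketch of the kind of loading call I'd expect to work — it uses the standard transformers/bitsandbytes API, but whether (and where) the RoSA scripts expose this knob is my assumption, and the model id is a placeholder.

```python
# Sketch: load the model in 4-bit with a float16 compute dtype, since
# bf16 kernels are unavailable on Turing GPUs (e.g. Quadro RTX 6000).
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    # The key change: compute in fp16 rather than bf16 on pre-Ampere GPUs.
    bnb_4bit_compute_dtype=torch.float16,
)

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",  # placeholder model id
    quantization_config=bnb_config,
    device_map="auto",
)
```

That said, if the kernels in question were compiled only for a newer compute capability than the RTX 6000's, a different compute dtype wouldn't help and the kernels would need to be rebuilt for this architecture.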