Falcon 40B : too slow and random answers #204
Comments
Hi @ArnaudHureaux! I haven't used RunPod before, and there could be multiple reasons for this issue:
We will attempt to reproduce the issue using tiiuae/falcon-40b on our local machine later.
Hi,
In my case, the answer was completely random, with messages like "był AbramsPlayEvent磨}$,ocempreferred LaceKUZOOOoodlesWCHawaiiVEsecured cardvue ...". I didn't see this behavior with other implementations, so I think the problem comes from this implementation?
Using only Hugging Face:
or:
If it helps, inference with a short prompt:
GPUs when loaded:
Am I the only one who got an error saying I need to install the "einops" library when trying to deploy the Falcon 40B model? This library is not listed in the requirements.txt of version 0.19.0.
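For anyone hitting the same error, here is a minimal sketch of a pre-flight check that could be run before deploying. It assumes the model's remote code imports "einops" at load time, as the error above suggests. The helper name `missing_packages` is hypothetical and not part of Basaran.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level package names that cannot be found in this environment."""
    # find_spec returns None when a top-level module is not installed.
    return [name for name in names if importlib.util.find_spec(name) is None]


# Assumption: "einops" is the only extra dependency the model code needs.
missing = missing_packages(["einops"])
if missing:
    print("Install before deploying: pip install " + " ".join(missing))
```

If the check prints a package name, installing it with pip before starting the server should avoid the import error at load time.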
Hi,
When I deployed the Falcon 40B model on the Basaran WebUI, I had:
- random answers; for example, when I said "hi", I got: " był AbramsPlayEvent磨}$,ocempreferred LaceKUZOOOoodlesWCHawaiiVEsecured cardvue ..."
- very slow inference, even though I was using a RunPod server costing $10 per hour with 4 A100 80GB GPUs
I tried customizing the settings like this:
kwargs = {
    "local_files_only": local_files_only,
    "trust_remote_code": trust_remote_code,
    "torch_dtype": torch.bfloat16,
    "device_map": "auto",
}
Any idea how I could handle this issue?
Thanks (and congrats on this beautiful WebUI!)