I have recently been hitting frequent OOM errors while running MMLU evaluation tasks with lm_eval using vllm as the backend. After some investigation, I found that the issue arises because the sampling parameters lm_eval actually uses are inconsistent with the default parameters vllm uses to estimate peak memory during its profiling run. As a result, the memory lm_eval consumes at runtime is far higher than vllm's estimated peak: for the Meta-Llama-8B-Instruct model with batch size auto, roughly 10 GB estimated vs. 50 GB actually used.

I have already reported this issue to vllm to ask whether they can expose a way for lm_eval to configure the default sampling parameters. In the meantime, modifying vllm's default sampling parameters can temporarily bypass the OOM, and I believe this workaround may help others hitting the same problem.
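For anyone who would rather not patch vllm itself, another way to buy headroom is to shrink the fraction of GPU memory vllm reserves based on its (under-estimated) profiling run. This is only a sketch of the kind of invocation I mean; the model path and the numeric values are illustrative and would need tuning for your hardware, and it works around the symptom rather than fixing the estimation mismatch:

```shell
# Illustrative workaround: lower gpu_memory_utilization so the KV-cache
# budget leaves slack for the real (under-estimated) peak, and cap the
# context length. The model path and values here are examples, not a
# recommendation; adjust for your GPU.
lm_eval --model vllm \
    --model_args pretrained=meta-llama/Meta-Llama-8B-Instruct,gpu_memory_utilization=0.6,max_model_len=4096 \
    --tasks mmlu \
    --batch_size auto
```

If `--batch_size auto` still overshoots, pinning a small fixed batch size (e.g. `--batch_size 8`) removes the auto-sizing step that relies on the same flawed memory estimate.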