Loading checkpoint shards take forever: llama-13b #962
Unanswered: Cherchercher asked this question in Q&A (0 replies)
What does "Loading checkpoint shards" do, and is it actually needed? If it isn't strictly needed, how can I skip it?
I'm running the model as is to test simple queries, but it takes forever to "load checkpoint shards":
docker run --rm -it -p 3000:3000 ghcr.io/bentoml/openllm start llama --model-id huggyllama/llama-13b --backend pt
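For context on what that progress bar means: large models such as llama-13b are stored on disk as several weight files ("shards", e.g. `pytorch_model-00001-of-00003.bin`), and "Loading checkpoint shards" is the bar transformers shows while each shard is read from disk into memory, so it cannot simply be skipped. The toy sketch below mimics that shard-then-reassemble pattern with plain pickle files; the file naming and the `save_sharded`/`load_sharded` helpers are illustrative assumptions, not the real Hugging Face layout or API.

```python
# Toy illustration of sharded checkpoint saving/loading.
# Filenames and helpers are hypothetical, for explanation only.
import os
import pickle
import tempfile


def save_sharded(state, outdir, shard_size=2):
    """Split a dict of weights into several shard files; return a
    key -> filename index, like an HF weight index file."""
    items = list(state.items())
    shards = [dict(items[i:i + shard_size])
              for i in range(0, len(items), shard_size)]
    index = {}
    for n, shard in enumerate(shards, 1):
        fname = f"model-{n:05d}-of-{len(shards):05d}.bin"
        with open(os.path.join(outdir, fname), "wb") as f:
            pickle.dump(shard, f)
        for key in shard:
            index[key] = fname
    return index


def load_sharded(index, outdir):
    """Reassemble the full state dict one shard file at a time.
    This loop is the part a 'Loading checkpoint shards' bar tracks:
    for a 13B model each shard is gigabytes, so disk speed dominates."""
    state = {}
    for fname in sorted(set(index.values())):
        with open(os.path.join(outdir, fname), "rb") as f:
            state.update(pickle.load(f))
    return state


with tempfile.TemporaryDirectory() as d:
    weights = {f"layer{i}.weight": [float(i)] * 4 for i in range(5)}
    idx = save_sharded(weights, d)
    restored = load_sharded(idx, d)
    print(len(set(idx.values())), len(restored))  # 3 shard files, 5 keys
```

In the real case the time is dominated by reading ~26 GB of llama-13b weights from disk (and any first-time download), so slow loads usually point at storage or network speed rather than something that can be turned off.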