Loading checkpoint shards take forever: llama-13b #962
Unanswered: Cherchercher asked this question in Q&A (0 replies)
What does "Loading checkpoint shards" do, and is it actually needed? If it isn't strictly needed, how can I skip it?
I'm running the model as is to test simple queries, but it takes forever to "load checkpoint shards":
docker run --rm -it -p 3000:3000 ghcr.io/bentoml/openllm start llama --model-id huggyllama/llama-13b --backend pt
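For context on what that progress bar means: large models such as llama-13b are stored on disk as several weight files ("shards", e.g. `pytorch_model-00001-of-00003.bin`), and "Loading checkpoint shards" is the bar transformers shows while each shard is read from disk into memory, so it cannot simply be skipped. The toy sketch below mimics that shard-then-reassemble pattern with plain pickle files; the file naming and the `save_sharded`/`load_sharded` helpers are illustrative assumptions, not the real Hugging Face layout or API.

```python
# Toy illustration of sharded checkpoint saving/loading.
# Filenames and helpers are hypothetical, for explanation only.
import os
import pickle
import tempfile


def save_sharded(state, outdir, shard_size=2):
    """Split a dict of weights into several shard files; return a
    key -> filename index, like an HF weight index file."""
    items = list(state.items())
    shards = [dict(items[i:i + shard_size])
              for i in range(0, len(items), shard_size)]
    index = {}
    for n, shard in enumerate(shards, 1):
        fname = f"model-{n:05d}-of-{len(shards):05d}.bin"
        with open(os.path.join(outdir, fname), "wb") as f:
            pickle.dump(shard, f)
        for key in shard:
            index[key] = fname
    return index


def load_sharded(index, outdir):
    """Reassemble the full state dict one shard file at a time.
    This loop is the part a 'Loading checkpoint shards' bar tracks:
    for a 13B model each shard is gigabytes, so disk speed dominates."""
    state = {}
    for fname in sorted(set(index.values())):
        with open(os.path.join(outdir, fname), "rb") as f:
            state.update(pickle.load(f))
    return state


with tempfile.TemporaryDirectory() as d:
    weights = {f"layer{i}.weight": [float(i)] * 4 for i in range(5)}
    idx = save_sharded(weights, d)
    restored = load_sharded(idx, d)
    print(len(set(idx.values())), len(restored))  # 3 shard files, 5 keys
```

In the real case the time is dominated by reading ~26 GB of llama-13b weights from disk (and any first-time download), so slow loads usually point at storage or network speed rather than something that can be turned off.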