I encountered the same issue. There were no problems when using Docker or the GPU operator in Kubernetes, but the issue appeared after migrating to HAMi.
What happened:
Using the image provided by text-embeddings-inference
ghcr.io/huggingface/text-embeddings-inference:89-1.6
to deploy bge-m3, the server hangs at startup. I'm not sure whether this problem is related to HAMi, but running the same image with plain docker run (without HAMi) works fine:
docker run:
docker run --name bge-m3-tei -v /data/bge-m3:/data/bge-m3 -e "NVIDIA_VISIBLE_DEVICES=0" ghcr.io/huggingface/text-embeddings-inference:89-1.6 --model-id /data/bge-m3 --port 56246 --hostname 0.0.0.0 --tokenization-workers 1
Container logs:
When deployed via a Kubernetes Deployment, startup hangs at
Starting FlashBert model on Cuda(CudaDevice(DeviceId(1)))
However, judging from nvidia-smi, the model appears to have been loaded into GPU memory, and the memory usage is about the same as with docker run (roughly 1516MiB). Deployment info:
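For reference, a Deployment equivalent to the docker run command above might look like the following sketch. This is an assumption reconstructed from the command line, not the actual manifest used in this report: the names (bge-m3-tei, app labels), the hostPath volume, and the resource key nvidia.com/gpu (HAMi's default resource name) are all hypothetical and should be adjusted to the real cluster configuration.

```
apiVersion: apps/v1
kind: Deployment
metadata:
  name: bge-m3-tei            # hypothetical name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: bge-m3-tei
  template:
    metadata:
      labels:
        app: bge-m3-tei
    spec:
      containers:
      - name: tei
        image: ghcr.io/huggingface/text-embeddings-inference:89-1.6
        # Same arguments as the docker run command above
        args: ["--model-id", "/data/bge-m3", "--port", "56246",
               "--hostname", "0.0.0.0", "--tokenization-workers", "1"]
        resources:
          limits:
            nvidia.com/gpu: 1   # HAMi's default device resource; assumption
        volumeMounts:
        - name: model
          mountPath: /data/bge-m3
      volumes:
      - name: model
        hostPath:
          path: /data/bge-m3    # assumes the model dir exists on the node
```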
pod log:
How to reproduce it (as minimally and precisely as possible):
Anything else we need to know?:
/etc/docker/daemon.json
sudo journalctl -r -u kubelet
Environment: