Describe the bug
The server gets stuck at "unloading model" when you try to load a new model after a "Job required xx pages" error.
If you then press Ctrl+C, it shows: "waiting connection to close(Ctrl+C to force quit)"
Reproduction steps
Load a model with a low max_seq_len, then send a query longer than that limit; you'll get an error. Now load a new model: it gets stuck at unloading the current model.
Expected behavior
The job is properly cancelled and the new model loads.
Logs
No response
Additional context
The details are lost, but this has happened multiple times already and can probably be reproduced again later.
Acknowledgements
I have looked for similar issues before submitting this one.
I have read the disclaimer, and this issue is related to a code bug. If I have a question, I will use the Discord server.
I understand that the developers have lives and my issue will be answered when possible.
I understand the developers of this program are human, and I will ask my questions politely.
Able to reproduce locally. Interestingly, it seems that if you first unload the model manually via the /v1/model/unload endpoint before trying to load the new model, it doesn't get stuck like this.
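As a stopgap until this is fixed, the unload-first workaround could be scripted roughly as below. This is only a sketch: the /v1/model/unload path comes from the comment above, but the base URL, the x-admin-key header, the /v1/model/load path, and the "name" field in the load payload are assumptions that may not match your setup.

```python
import json
import urllib.request
from typing import Callable, List, Optional


def build_request(base_url: str, path: str, payload: Optional[dict],
                  admin_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST request to the server."""
    body = json.dumps(payload).encode("utf-8") if payload is not None else None
    return urllib.request.Request(
        url=base_url.rstrip("/") + path,
        data=body,
        method="POST",
        # Assumption: admin endpoints are authorized via an "x-admin-key" header.
        headers={"Content-Type": "application/json", "x-admin-key": admin_key},
    )


def send_request(request: urllib.request.Request) -> int:
    """Send the request and return the HTTP status code."""
    with urllib.request.urlopen(request, timeout=600) as response:
        return response.status


def swap_model(new_model: str,
               base_url: str = "http://127.0.0.1:5000",  # assumed address
               admin_key: str = "changeme",              # placeholder key
               send: Callable[[urllib.request.Request], int] = send_request,
               ) -> List[int]:
    """Unload the current model explicitly, then load the new one."""
    unload = build_request(base_url, "/v1/model/unload", None, admin_key)
    # Assumption: the load endpoint takes the model name in a "name" field.
    load = build_request(base_url, "/v1/model/load", {"name": new_model}, admin_key)
    return [send(unload), send(load)]
```

Calling swap_model("my-model") with your real address and key sends the two requests in order; the `send` parameter exists only so the sequence can be checked without a running server.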
OS
Windows
GPU Library
CUDA 12.x
Python version
3.12