Replies: 1 comment
-
So we can definitely fix this one case. In general, keeping the server in a valid state when it receives invalid requests is a good idea, but it will require some vigilance and handling issues on a case-by-case basis. I'll send a fix for this one case, but if you notice other request sequences that put it in an irrecoverable state, let us know and we can fix them as well.
-
It seems that currently, if any error occurs while the server is running, you have to restart the server.
For example, if I send an invalid request that does not specify the model name, I get a connection error or a NotFoundError. If I then send a request with the correct model name, I get the following error:
mlx_lm/server.py", line 557, in handle_chat_completions
prompt = self.tokenizer.encode(prompt)
AttributeError: 'NoneType' object has no attribute 'encode'
Currently, the only way to recover is to restart the mlx-lm server, which is rather inconvenient.
It would be great if the server could keep working when a valid request is sent, even if it follows an invalid one.
I am not sure whether this behavior is intentional, but I would like to bring it up for open discussion.
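As a rough illustration of the behavior requested here, the sketch below loads a model into local variables and only updates the cached state once loading has succeeded, so a failed request cannot leave the tokenizer set to None. This is a minimal sketch under assumptions, not the actual mlx-lm code: `CachedModelProvider`, `fake_loader`, and `FakeTokenizer` are hypothetical names made up for the example, and the real cause of the error above may differ.

```python
class FakeTokenizer:
    """Hypothetical stand-in for a real tokenizer."""

    def encode(self, text):
        # Map each character to its code point, just to return something.
        return [ord(c) for c in text]


def fake_loader(name):
    """Hypothetical loader: only one model name is considered valid."""
    if name != "good-model":
        raise ValueError(f"unknown model: {name!r}")
    return object(), FakeTokenizer()


class CachedModelProvider:
    """Hypothetical provider that caches one loaded model at a time."""

    def __init__(self, loader):
        # `loader` takes a model name and returns (model, tokenizer),
        # or raises if the model cannot be loaded.
        self._loader = loader
        self.model_key = None
        self.model = None
        self.tokenizer = None

    def load(self, name):
        # Reuse the cached model when the same name is requested again.
        if name == self.model_key and self.model is not None:
            return self.model, self.tokenizer
        # Load into locals first. If the loader raises, the cached
        # state below is never touched, so it stays valid.
        model, tokenizer = self._loader(name)
        self.model_key, self.model, self.tokenizer = name, model, tokenizer
        return model, tokenizer


provider = CachedModelProvider(fake_loader)
provider.load("good-model")

try:
    provider.load("")  # invalid request: no model name
except ValueError as err:
    print("request rejected:", err)

# The earlier state survives, so a valid request still works.
_, tokenizer = provider.load("good-model")
print(tokenizer.encode("hi"))
```

The point of the design is that the error path and the state update are separated: the request handler can turn the exception into an error response while the provider keeps serving the previously loaded model.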