[Bug/Feature request] mlx-lm server has to be restarted every time a runtime error occurs #887

cbschen1 · 2024-07-13T05:53:41Z


It seems that currently, if any error occurs while the server is running, you have to restart the server.

For example, if I send an invalid request that does not specify the model name, I run into a connection error or a NotFoundError. If I then send a request with the correct model name, I encounter the following error:

mlx_lm/server.py", line 557, in handle_chat_completions
prompt = self.tokenizer.encode(prompt)
AttributeError: 'NoneType' object has no attribute 'encode'

Currently, the only way to recover is to restart the mlx-lm server, which is rather inconvenient.

It would be great if the server kept working when a valid request is sent, even if that request follows an invalid one.

I am not sure if this behavioral design is intentional, but I would like to bring this up for an open discussion.
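
To make the failure mode concrete, here is a minimal sketch of one way it could arise and be avoided. It is not mlx-lm's actual code: `ModelCache`, `fake_loader`, and `FakeTokenizer` are hypothetical names, and the cause is an assumption based only on the traceback above, which suggests the tokenizer was left as `None` after the failed request. The idea is to load into local variables first and only update the cached state once loading has succeeded, so a bad request cannot leave the cache half-initialized.

```python
class FakeTokenizer:
    """Hypothetical tokenizer used only for this illustration."""

    def encode(self, text):
        return [ord(ch) for ch in text]


def fake_loader(name):
    """Hypothetical loader: raises for any model name it does not know."""
    if name != "good-model":
        raise ValueError(f"unknown model: {name!r}")
    return object(), FakeTokenizer()


class ModelCache:
    """Hypothetical stand-in for the server's cached model and tokenizer."""

    def __init__(self, loader):
        self._loader = loader
        self.name = None
        self.model = None
        self.tokenizer = None

    def load(self, name):
        # Reuse the cached objects only if a previous load fully succeeded.
        if name == self.name and self.tokenizer is not None:
            return self.model, self.tokenizer
        # Load into locals first; if the loader raises, cached state is untouched.
        model, tokenizer = self._loader(name)
        self.name, self.model, self.tokenizer = name, model, tokenizer
        return model, tokenizer


cache = ModelCache(fake_loader)

# Invalid request: the load fails, but the cache stays in its previous state.
try:
    cache.load("missing-model")
except ValueError as err:
    print("request rejected:", err)

# A later valid request still works without restarting anything.
_, tokenizer = cache.load("good-model")
print(tokenizer.encode("hi"))
```

With this ordering, the invalid request is rejected on its own and the next valid request proceeds as if the failure had never happened.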

awni (Maintainer) · 2024-07-15T21:36:58Z

We can definitely fix this one case. In general, keeping the server in a valid state when it receives invalid requests is a good idea, but it will require some vigilance and handling issues on a case-by-case basis.

I'll send a fix for this one case, but if you notice other request sequences that put the server in an irrecoverable state, let us know and we can fix them as well.
