I am facing a weird issue while running multi-model-server in a Docker image. The inference setup is as follows.
I have two processes: one is multi-model-server (MMS) and the other listens for messages on a queue. Whenever a user triggers an inference, the queue process receives the message, downloads the image, registers the models in multi-model-server, and then runs the prediction. The queue-based service and the MMS service communicate over 0.0.0.0. All of this runs in one Docker container.
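For reference, the registration step is roughly equivalent to the sketch below. The function names are hypothetical and this is not my exact code. It assumes the fields from the logged payload are sent as query parameters to the management API on port 8081, and it adds a client-side timeout so the caller gets an error instead of blocking forever when MMS does not answer.

```python
import urllib.error
import urllib.parse
import urllib.request

# Assumption: management API address as it appears in the logs of this issue.
MANAGEMENT_URL = "http://0.0.0.0:8081"


def build_register_url(model_name, initial_workers=1, response_timeout=120):
    """Build the POST /models URL carrying the same fields as the logged payload."""
    params = {
        "url": model_name,
        "model_name": model_name,
        "initial_workers": initial_workers,
        "response_timeout": response_timeout,
        "synchronous": "true",
    }
    return f"{MANAGEMENT_URL}/models?{urllib.parse.urlencode(params)}"


def register_model(model_name, client_timeout=180):
    """POST the registration request.

    Returns (status_code, body) when the server answers, or (None, error_text)
    when no answer arrives within client_timeout seconds.
    """
    request = urllib.request.Request(build_register_url(model_name), method="POST")
    try:
        with urllib.request.urlopen(request, timeout=client_timeout) as response:
            return response.status, response.read().decode("utf-8")
    except urllib.error.HTTPError as err:
        # The server answered, but with a non-2xx status code.
        return err.code, err.read().decode("utf-8")
    except OSError as err:
        # Covers connection errors and socket timeouts.
        return None, str(err)
```

With a client-side timeout longer than response_timeout, a hung registration shows up as `(None, ...)` in the queue process instead of a silent stall.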
The problem is that model registration works for the first 2 models. When a 3rd model is registered, the POST API (used for registration) does not respond at all.
The logs of my queue-based process look like this (nothing is logged after the last line):
2023-08-17 19:59:33.188 | INFO | manage:put:92 - test_v1.33 not registeted. Registering model = test_v1.33 with initial workers = 1
2023-08-17 19:59:33.188 | INFO | manage:put:94 - Current cache = 2, max capacity = 5
2023-08-17 19:59:33.188 | INFO | utils.api_handler:post:124 - Doing post request at api url = http://0.0.0.0:8081/models/ with payload = {'url': 'test_v1.33', 'model_name': 'test_v1.33', 'initial_workers': 1, 'response_timeout': 120, 'synchronous': True}
When the registration is successful, the usual logs look like this:
2023-08-17 19:58:02.536 | INFO | manage:put:92 - Test_detection_12 not registeted. Registering model = Test_detection_12 with initial workers = 1
2023-08-17 19:58:02.536 | INFO | manage:put:94 - Current cache = 1, max capacity = 5
2023-08-17 19:58:02.536 | INFO | utils.api_handler:post:124 - Doing post request at api url = http://0.0.0.0:8081/models/ with payload = {'url': 'Test_detection_12', 'model_name': 'Test_detection_12', 'initial_workers': 1, 'response_timeout': 120, 'synchronous': True}
2023-08-17 19:58:04.666 | INFO | utils.api_handler:post:126 - Received status code = 200, reason = OK
As you can see, when the registration is successful, a final line with status code = 200 is logged. To understand more, I checked the logs in mms_log.log. They are given below
As you can see from Backend response time: 11064, the model is registered, but this is followed by the message worker pid is not available yet. After this, multi-model-server logs nothing more and MMS appears to be stuck. I think this is the issue, and I would like to understand:
Why is this message being logged?
What does it mean, and how can it be resolved?
Why was this message not logged for the first 2 registrations, which were successful? (It appears only after 2 models have been registered.)
I also went inside the Docker container using docker exec and ran the following command, but it hangs and I get no response:
curl http://0.0.0.0:8080/ping
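The same health check can be run from Python with a timeout, so that a stuck server is reported rather than blocking the caller. This is only a sketch: the `ping` helper is hypothetical, and the address is the inference endpoint used in the curl command above.

```python
import urllib.request


def ping(base_url="http://0.0.0.0:8080", timeout=5):
    """Return the /ping response body, or None if there is no successful answer in time."""
    try:
        with urllib.request.urlopen(f"{base_url}/ping", timeout=timeout) as response:
            return response.read().decode("utf-8")
    except OSError:
        # Connection refused, socket timeout, or a non-2xx HTTP status.
        return None
```

Calling `ping()` before each registration would show whether MMS is already unresponsive before the third model is registered.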
Do let me know if you need more information.
Thanks.