[Bug] topk is larger #441
Comments
Could you share the reproduction method?
python3 -m lmdeploy.serve.openai.api_server ./workspace 0.0.0.0 10000 --instance_num 32 --tp 8
Which version are you using?
Even with v0.0.9, I still experience crashes with the same error.
Can you share your client code to help us reproduce it?
With Llama 70B (HF format), LMDeploy only uses 25% of VRAM. Is it possible to make it use more, as vLLM does, to support more concurrent requests? I only have this problem when there are too many requests.
I will try this: #460
The problem is actually still present. It occurs when too many requests are made within a short period of time, even after updating to LMDeploy 0.1.0.
For example, if you make 5 concurrent requests over a period of 1s, you will get the error.
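For reference, a minimal client sketch along these lines, assuming the api_server launched with the command above is listening on 0.0.0.0:10000 and exposes an OpenAI-compatible /v1/chat/completions endpoint; the model name and prompt are placeholders:

```python
# Minimal reproduction sketch. Assumptions: the api_server launched with the command
# above is listening on 0.0.0.0:10000 and serves an OpenAI-compatible
# /v1/chat/completions endpoint; the model name and prompt are placeholders.
import concurrent.futures

import requests

URL = "http://0.0.0.0:10000/v1/chat/completions"


def send_request(i: int) -> int:
    """Send one chat completion request and return the HTTP status code."""
    payload = {
        "model": "llama-2-70b-chat",  # placeholder model name
        "messages": [{"role": "user", "content": f"Hello, this is request {i}"}],
        "max_tokens": 64,
    }
    resp = requests.post(URL, json=payload, timeout=120)
    return resp.status_code


# Fire 5 requests concurrently within roughly one second.
with concurrent.futures.ThreadPoolExecutor(max_workers=5) as pool:
    print(list(pool.map(send_request, range(5))))
```

Under a load like this, the server logs the topk warnings and eventually the CUDA out-of-memory error shown in the Reproduction section below.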
@AllentDan Could you help investigate it?
@hatrexltd Which API did you use? And how did you send the requests?
This issue is marked as stale because it has been marked as invalid or awaiting response for 7 days without any further response. It will be closed in 5 days if the stale label is not removed or if there is no further response.
This issue is closed because it has been stale for 5 days. Please open a new issue if you have similar issues or any new updates.
Checklist
Describe the bug
When there are too many simultaneous requests, errors occur and the server crashes. How can this problem be fixed?
Reproduction
1|triton | [WARNING] topk (471476112) is larger than max supported number (1024) for token 5 clip to max supported number 1024.
1|triton | [WARNING] topk (471476112) is larger than max supported number (1024) for token 5 clip to max supported number 1024.
1|triton | [WARNING] topk (471476112) is larger than max supported number (1024) for token 5 clip to max supported number 1024.
1|triton | [WARNING] topk (471476112) is larger than max supported number (1024) for token 5 clip to max supported number 1024.
1|triton | [WARNING] topk (471476112) is larger than max supported number (1024) for token 5 clip to max supported number 1024.
1|triton | [WARNING] topk (471476112) is larger than max supported number (1024) for token 5 clip to max supported number 1024.
1|triton | [WARNING] topk (471476112) is larger than max supported number (1024) for token 5 clip to max supported number 1024.
1|triton | [WARNING] topk (471476112) is larger than max supported number (1024) for token 5 clip to max supported number 1024.
1|triton | [WARNING] topk (471476112) is larger than max supported number (1024) for token 5 clip to max supported number 1024.
1|triton | [WARNING] topk (471476112) is larger than max supported number (1024) for token 5 clip to max supported number 1024.
1|triton | [WARNING] topk (471476112) is larger than max supported number (1024) for token 5 clip to max supported number 1024.
1|triton | [WARNING] topk (471476112) is larger than max supported number (1024) for token 5 clip to max supported number 1024.
1|triton | [WARNING] topk (471476112) is larger than max supported number (1024) for token 5 clip to max supported number 1024.
1|triton | [WARNING] topk (471476112) is larger than max supported number (1024) for token 5 clip to max supported number 1024.
1|triton | what(): [TM][ERROR] CUDA runtime error: out of memory /lmdeploy/src/turbomind/utils/allocator.h:223
(24GB/80GB per GPU used)
Error traceback
No response