GptManager occupies the entire CPU #917
Comments
Does the program occupy all threads or only a single thread? GptManager creates a thread to wait for and collect incoming requests.
I use
Then I think it is expected.
If the thread only waits for and collects requests, why does it occupy an entire CPU core? Theoretically, if no requests come in, waiting should not consume CPU resources.
It uses a while loop to wait for requests to arrive.
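The behavior described above can be illustrated with a minimal sketch (this is hypothetical standalone code, not GptManager's actual implementation): a busy-wait loop stays runnable and burns a full core even when idle, whereas a blocking wait deschedules the thread until work arrives.

```python
import queue
import threading
import time

def busy_wait_worker(requests, results, stop):
    # Busy-wait: polls the queue in a tight loop, so the thread never
    # sleeps and consumes ~100% of a core even with no requests.
    while not stop.is_set():
        try:
            req = requests.get_nowait()
        except queue.Empty:
            continue  # loop again immediately -> CPU spins while idle
        results.append(req * 2)

def blocking_worker(requests, results, stop):
    # Blocking wait: Queue.get() parks the thread in the kernel until a
    # request arrives (or the timeout fires), so idle CPU use is ~0%.
    while not stop.is_set():
        try:
            req = requests.get(timeout=0.1)
        except queue.Empty:
            continue
        results.append(req * 2)

if __name__ == "__main__":
    requests, results, stop = queue.Queue(), [], threading.Event()
    t = threading.Thread(target=blocking_worker, args=(requests, results, stop))
    t.start()
    for i in range(3):
        requests.put(i)
    time.sleep(0.3)
    stop.set()
    t.join()
    print(results)
```

Both workers produce the same results; the difference is only in idle CPU usage, which is why a polling loop inside GptManager would show up as a fully occupied core even before any request is submitted.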
```python
import argparse

async def main(model_dir, tokenizer_dir):
    ...

if __name__ == "__main__":
    ...
```

Hello, do you have the complete code? I don't see in-flight batching implemented with TensorRT-LLM here. May I ask where your code is?
In-flight batching is enabled when the TRT engine is built, and it is used through AsyncLLMEngine.
Hi @Masterlk, do you still have any further issues or questions? If not, we'll close this soon.
I use the Python bindings of TensorRT-LLM and tried GptManager for the in-flight batching feature. However, the initialization of GptManager seems to occupy an entire CPU core, even though it hasn't accepted a single request yet.
My expectation is that when there are no requests, CPU utilization should be near 0. What is the possible reason?
I use the main branch (the newest version), and this code reproduces the problem: