Simulate how LLM serving engines such as vLLM use Python's asyncio.Queue to achieve dynamic (continuous) batching: batch generation at the iteration level rather than at the request level.

hitpoint6/llm-continuous-batching-simulator
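
The core mechanism, as a minimal sketch (not the repository's actual code; the names Request, engine_loop, and client are illustrative): clients put requests onto a shared asyncio.Queue, and a single engine loop drains newly arrived requests into the active batch at the start of every decode iteration, runs one token-generation step for the whole batch, and retires finished requests so their slots are freed immediately instead of waiting for the entire batch to complete.

```python
import asyncio
import random


class Request:
    """One generation request: a prompt plus a future the client awaits."""

    def __init__(self, prompt: str, max_tokens: int):
        self.prompt = prompt
        self.max_tokens = max_tokens
        self.tokens: list[str] = []
        self.done: asyncio.Future = asyncio.get_running_loop().create_future()


async def engine_loop(queue: asyncio.Queue) -> None:
    """Iteration-level (continuous) batching: each loop iteration admits any
    newly queued requests, runs ONE decode step for the whole batch, and
    retires requests that finished, so freed slots are reused immediately."""
    batch: list[Request] = []
    while True:
        # Admit new requests without blocking the requests already running.
        while not queue.empty():
            batch.append(queue.get_nowait())
        if not batch:
            # Nothing active: block until the next request arrives.
            batch.append(await queue.get())

        # One decode iteration for every active request (a real engine would
        # run a single batched forward pass of the model here).
        await asyncio.sleep(0.01)  # stand-in for the batched forward pass
        for req in batch:
            req.tokens.append(f"tok{len(req.tokens)}")

        # Retire finished requests at iteration granularity.
        still_running = []
        for req in batch:
            if len(req.tokens) >= req.max_tokens:
                req.done.set_result(" ".join(req.tokens))
            else:
                still_running.append(req)
        batch = still_running


async def client(queue: asyncio.Queue, name: str) -> None:
    """A caller: enqueue a request, then await its completed generation."""
    req = Request(prompt=f"prompt from {name}", max_tokens=random.randint(2, 6))
    await queue.put(req)
    result = await req.done
    print(f"{name} ({req.max_tokens} tokens): {result}")


async def main() -> None:
    queue: asyncio.Queue = asyncio.Queue()
    engine_task = asyncio.create_task(engine_loop(queue))
    await asyncio.gather(*(client(queue, f"req{i}") for i in range(4)))
    engine_task.cancel()


if __name__ == "__main__":
    asyncio.run(main())
```

Running the sketch prints each request as soon as it reaches its own max_tokens, which is the property iteration-level batching provides: short requests finish and leave the batch without waiting for longer ones.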

