Red Hat
San Jose, California
Pinned
- vllm-project/vllm (Public): A high-throughput and memory-efficient inference and serving engine for LLMs
- IBM/text-generation-inference (Public): IBM development fork of https://github.com/huggingface/text-generation-inference
- netty/netty (Public): Netty project - an event-driven asynchronous network application framework
- IBM/kv-utils (Public): Abstracted helper classes providing consistent key-value store functionality, with zookeeper and etcd3 implementations
941 contributions in the last year
Contribution activity
April 2025
Created 2 commits in 1 repository
Created a pull request in vllm-project/vllm that received 33 comments:

- [V1] DP scale-out (2/N): Decouple engine process management and comms
  This decouples the management of engine processes from the IPC, and adds support for a mix of local and/or remote engines (where remote are running…)
  +474 −206 lines changed • 33 comments
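The PR summary above describes separating engine *process* lifecycle management from the comms layer so local and remote engines can sit behind one interface. A minimal sketch of that idea (an illustration only, not vLLM's actual code; the class and method names here are hypothetical):

```python
# Sketch: a handle that owns a local engine process but exposes only a
# comms API to callers, so a remote-engine handle (whose process we do
# not manage) could implement the same submit() interface.
import multiprocessing as mp


def _engine_worker(conn):
    """Toy engine loop: echo requests back with a tag."""
    while True:
        req = conn.recv()
        if req is None:  # shutdown sentinel
            break
        conn.send(f"handled:{req}")


class LocalEngineHandle:
    def __init__(self):
        self._parent_conn, child_conn = mp.Pipe()
        self._proc = mp.Process(target=_engine_worker, args=(child_conn,))
        self._proc.start()

    # -- comms interface (shared with a hypothetical RemoteEngineHandle) --
    def submit(self, req):
        self._parent_conn.send(req)
        return self._parent_conn.recv()

    # -- lifecycle interface (only meaningful for locally managed engines) --
    def shutdown(self):
        self._parent_conn.send(None)
        self._proc.join(timeout=5)


if __name__ == "__main__":
    engine = LocalEngineHandle()
    print(engine.submit("req-1"))  # handled:req-1
    engine.shutdown()
```

Keeping lifecycle methods off the shared comms interface is what lets a coordinator hold a mixed pool of local and remote engine handles without caring which is which.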
Opened 4 other pull requests in vllm-project/vllm (2 open, 2 merged):

- [V1][DP] More robust DP/EP dummy request coordination (Apr 8)
- [V1][BugFix] Exit properly if engine core fails during startup (Apr 6)
- [BugFix][Frontend] Fix LLM.chat() tokenization (Apr 5)
- [V1] DP scale-out (1/N): Use zmq ROUTER/DEALER sockets for input queue (Apr 1)
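The last PR's title names the zmq ROUTER/DEALER pattern. As a hedged sketch of that pattern in general (assuming pyzmq; this is not the PR's code, and the endpoint name is illustrative): a single ROUTER socket can serve many DEALER peers over one endpoint, with replies routed back by peer identity.

```python
# ROUTER/DEALER round trip over an in-process endpoint.
import zmq

ctx = zmq.Context.instance()

router = ctx.socket(zmq.ROUTER)
router.bind("inproc://input_queue")  # illustrative endpoint name

dealer = ctx.socket(zmq.DEALER)
dealer.setsockopt(zmq.IDENTITY, b"engine-0")  # stable routing identity
dealer.connect("inproc://input_queue")

dealer.send(b"request-1")

# ROUTER receives [identity, payload] and uses the identity frame
# to address its reply back to the right peer.
identity, payload = router.recv_multipart()
router.send_multipart([identity, b"ack:" + payload])

reply = dealer.recv()
print(reply)  # b'ack:request-1'

dealer.close()
router.close()
```

The appeal of this pattern for an input queue is that the server side needs only one socket regardless of how many engine peers connect or reconnect.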
Reviewed 15 pull requests in vllm-project/vllm:

- [V1][Performance] Implement custom serializaton for MultiModalKwargs (Apr 9)
- [BugFix] logger is not callable (Apr 9)
- [Hardware] add platform-specific request validation api (Apr 9)
- [V1][Bugfix]: vllm v1 verison metric num_gpu_blocks is None (Apr 8)
- [V1] DP scale-out (2/N): Decouple engine process management and comms (Apr 7)
- [core] do not send error across process (Apr 7)
- [Bugfix] fix use-ep bug to enable ep by dp/tp size > 1 (Apr 7)
- [V1] DP scale-out (1/N): Use zmq ROUTER/DEALER sockets for input queue (Apr 6)
- [V1][Minor] Optimize get_cached_block (Apr 6)
- [TPU] Switch Test to Non-Sliding Window (Apr 3)
- example: add async example for offline inference (Apr 3)
- Use custom address for listening socket (Apr 3)
- [V1][TPU] TPU-optimized top-p implementation (avoids scattering). (Apr 2)
- [BugFix] fix speculative decoding memory leak when speculation is disabled (Apr 2)
- [BugFix] make sure socket close (Apr 1)
1 contribution in private repositories (Apr 4)