Red Hat
San Jose, California
Pinned
- vllm-project/vllm (Public): A high-throughput and memory-efficient inference and serving engine for LLMs
- IBM/text-generation-inference (Public): IBM development fork of https://github.com/huggingface/text-generation-inference
- netty/netty (Public): Netty project - an event-driven asynchronous network application framework
- IBM/kv-utils (Public): Abstracted helper classes providing consistent key-value store functionality, with zookeeper and etcd3 implementations
941 contributions in the last year
Contribution activity
April 2025
Created 2 commits in 1 repository
Created a pull request in vllm-project/vllm that received 33 comments:

- [V1] DP scale-out (2/N): Decouple engine process management and comms
  This decouples the management of engine processes from the IPC, and adds support for a mix of local and/or remote engines (where remote are running…)
  +474 −206 lines changed • 33 comments
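The PR summary above describes separating engine *process* lifecycle management from the comms layer so local and remote engines can sit behind one interface. A minimal sketch of that idea (an illustration only, not vLLM's actual code; the class and method names here are hypothetical):

```python
# Sketch: a handle that owns a local engine process but exposes only a
# comms API to callers, so a remote-engine handle (whose process we do
# not manage) could implement the same submit() interface.
import multiprocessing as mp


def _engine_worker(conn):
    """Toy engine loop: echo requests back with a tag."""
    while True:
        req = conn.recv()
        if req is None:  # shutdown sentinel
            break
        conn.send(f"handled:{req}")


class LocalEngineHandle:
    def __init__(self):
        self._parent_conn, child_conn = mp.Pipe()
        self._proc = mp.Process(target=_engine_worker, args=(child_conn,))
        self._proc.start()

    # -- comms interface (shared with a hypothetical RemoteEngineHandle) --
    def submit(self, req):
        self._parent_conn.send(req)
        return self._parent_conn.recv()

    # -- lifecycle interface (only meaningful for locally managed engines) --
    def shutdown(self):
        self._parent_conn.send(None)
        self._proc.join(timeout=5)


if __name__ == "__main__":
    engine = LocalEngineHandle()
    print(engine.submit("req-1"))  # handled:req-1
    engine.shutdown()
```

Keeping lifecycle methods off the shared comms interface is what lets a coordinator hold a mixed pool of local and remote engine handles without caring which is which.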
Opened 4 other pull requests in vllm-project/vllm (2 open, 2 merged):

- [V1][DP] More robust DP/EP dummy request coordination (Apr 8)
- [V1][BugFix] Exit properly if engine core fails during startup (Apr 6)
- [BugFix][Frontend] Fix LLM.chat() tokenization (Apr 5)
- [V1] DP scale-out (1/N): Use zmq ROUTER/DEALER sockets for input queue (Apr 1)
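The last PR's title names the zmq ROUTER/DEALER pattern. As a hedged sketch of that pattern in general (assuming pyzmq; this is not the PR's code, and the endpoint name is illustrative): a single ROUTER socket can serve many DEALER peers over one endpoint, with replies routed back by peer identity.

```python
# ROUTER/DEALER round trip over an in-process endpoint.
import zmq

ctx = zmq.Context.instance()

router = ctx.socket(zmq.ROUTER)
router.bind("inproc://input_queue")  # illustrative endpoint name

dealer = ctx.socket(zmq.DEALER)
dealer.setsockopt(zmq.IDENTITY, b"engine-0")  # stable routing identity
dealer.connect("inproc://input_queue")

dealer.send(b"request-1")

# ROUTER receives [identity, payload] and uses the identity frame
# to address its reply back to the right peer.
identity, payload = router.recv_multipart()
router.send_multipart([identity, b"ack:" + payload])

reply = dealer.recv()
print(reply)  # b'ack:request-1'

dealer.close()
router.close()
```

The appeal of this pattern for an input queue is that the server side needs only one socket regardless of how many engine peers connect or reconnect.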
Reviewed 15 pull requests in vllm-project/vllm:

- [V1][Performance] Implement custom serializaton for MultiModalKwargs (Apr 9)
- [BugFix] logger is not callable (Apr 9)
- [Hardware] add platform-specific request validation api (Apr 9)
- [V1][Bugfix]: vllm v1 verison metric num_gpu_blocks is None (Apr 8)
- [V1] DP scale-out (2/N): Decouple engine process management and comms (Apr 7)
- [core] do not send error across process (Apr 7)
- [Bugfix] fix use-ep bug to enable ep by dp/tp size > 1 (Apr 7)
- [V1] DP scale-out (1/N): Use zmq ROUTER/DEALER sockets for input queue (Apr 6)
- [V1][Minor] Optimize get_cached_block (Apr 6)
- [TPU] Switch Test to Non-Sliding Window (Apr 3)
- example: add async example for offline inference (Apr 3)
- Use custom address for listening socket (Apr 3)
- [V1][TPU] TPU-optimized top-p implementation (avoids scattering). (Apr 2)
- [BugFix] fix speculative decoding memory leak when speculation is disabled (Apr 2)
- [BugFix] make sure socket close (Apr 1)
1 contribution in private repositories (Apr 4)