Enable mypy checking on V1 code #11105

markmc · 2024-12-11T14:38:39Z

Some of these definitely deserve closer scrutiny e.g.:

~~commit 7974f1d - was there a bug with the hash_block_tokens() call?~~
~~commit 226fd552089ea8f0aeb49c8c2a4cc62f2b32742a - feels we'll be fighting an uphill battle with mypy and this args/kwargs pattern~~
~~commit 8a6d88c42df22d9ef8d1deb444a9335117b91dc6 - MultiprocExecutor.workers and UniprocExecutor.worker can be set to None, so we have to litter the code with is not None assertions~~

github-actions · 2024-12-11T14:38:53Z

👋 Hi! Thank you for contributing to the vLLM project.
Just a reminder: PRs would not trigger full CI run by default. Instead, it would only run fastcheck CI which starts running only a small and essential subset of CI tests to quickly catch errors. You can run other CI tests on top of those by going to your fastcheck build on Buildkite UI (linked in the PR checks section) and unblock them. If you do not have permission to unblock, ping simon-mo or khluu to add you in our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can do one of these:

Add ready label to the PR
Enable auto-merge.

🚀

vllm/v1/core/kv_cache_manager.py

vllm/v1/core/kv_cache_utils.py

vllm/v1/engine/core_client.py

mergify · 2024-12-12T07:18:19Z

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @markmc.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

vllm/v1/worker/gpu_model_runner.py

markmc · 2024-12-12T12:01:05Z

@DarkLight1337 @WoosukKwon note that I've rebased and fixed a bunch of new issues. Those new fixes are in the 5 latest commits

markmc · 2024-12-12T17:41:07Z

rebased again to fix conflicts

@comaniac would you mind reviewing commit 9190f342ffc4edfae851c4f0216ffc1231cde4ec ?

For reference, in #9972 you said:

Enhance the block hash type from int to Tuple[int, Tuple[int]], which is (hash_value, (toke_ids,)). This guarantees no hash conflicts

Pasting the commit message:

The current block hash format is:

  (hash((parent_hash, *token_id_list)), token_id_list)

i.e. it is a tuple with two parts:

A hash of a tuple combining the parent hash with token ids
A tuple containing the same token ids

and this (hash, token_ids) tuple itself must be hashable in order to be a dictionary key.

This is suspicious - why include the same token IDs twice in the same hash?

And then this code:

def hash_request_tokens(block_size: int,
                        token_ids: List[int]) -> List[BlockHashType]:
    ...
    ret = []
    parent_block_hash = None
    for ...:
        tuple(token_ids[start:end])
        ...
        block_hash = hash_block_tokens(parent_block_hash, block_token_ids)
        parent_block_hash = block_hash
    ...

looks wrong because hash_block_tokens() expects an integer hash for the parent_block_hash argument:

def hash_block_tokens(parent_block_hash: Optional[int],
                      curr_block_token_ids: Tuple[int]) -> BlockHashType:

It appears like it would be perfectly sufficient to use the integer hash of the (parent_hash + token_ids) tuple.

If we do this, then the curr_block_token_ids argument no longer needs to be a tuple - but it does need to be a sliceable sequence, so let's make sure ConstantList satisfies that requirement.

comaniac · 2024-12-12T22:31:38Z

This is suspicious - why include the same token IDs twice in the same hash?

This is to avoid potential hash collision, because Python dict guarantees no collision if the dict key is tuple.
But yeah the type is incorrect as you said. I'll submit a fix PR in minutes.

markmc · 2024-12-13T07:31:39Z

This is suspicious - why include the same token IDs twice in the same hash?

This is to avoid potential hash collision, because Python dict guarantees no collision if the dict key is tuple. But yeah the type is incorrect as you said. I'll submit a fix PR in minutes.

Thanks for the fix! Rebased again to resolve merge conflicts and fixed a few mypy errors with this prefix caching code.

markmc · 2024-12-13T19:17:42Z

Moving to draft while I debug the test failures

Fixes: vllm/v1/core/scheduler.py:157: error: Item "None" of "Optional[list[KVCacheBlock]]" has no attribute "__iter__" (not iterable) [union-attr] Now: Found 21 errors in 3 files (checked 30 source files) Signed-off-by: Mark McLoughlin <[email protected]>

Fixes: vllm/v1/engine/detokenizer.py:258: error: Argument "stop_reason" to "add_tokens" of "IncrementalDetokenizer" has incompatible type "Union[int, str, None]"; expected "Optional[str]" [arg-type] Now: Found 20 errors in 2 files (checked 30 source files) Signed-off-by: Mark McLoughlin <[email protected]>

Fixes: vllm/v1/attention/backends/flash_attn.py:162: error: Value of type "Optional[Any]" is not indexable [index] Now: Found 19 errors in 1 file (checked 30 source files) Signed-off-by: Mark McLoughlin <[email protected]>

Fixes: vllm/v1/worker/gpu_model_runner.py:164: error: Incompatible types in assignment (expression has type "NewRequestData", variable has type "RunningRequestData") [assignment] vllm/v1/worker/gpu_model_runner.py:166: error: "RunningRequestData" has no attribute "sampling_params" [attr-defined] vllm/v1/worker/gpu_model_runner.py:175: error: "RunningRequestData" has no attribute "prompt_token_ids" [attr-defined] vllm/v1/worker/gpu_model_runner.py:176: error: "RunningRequestData" has no attribute "prompt" [attr-defined] vllm/v1/worker/gpu_model_runner.py:177: error: "RunningRequestData" has no attribute "mm_inputs" [attr-defined] vllm/v1/worker/gpu_model_runner.py:178: error: "RunningRequestData" has no attribute "mm_positions" [attr-defined] vllm/v1/worker/gpu_model_runner.py:181: error: "RunningRequestData" has no attribute "block_ids"; maybe "new_block_ids"? [attr-defined] vllm/v1/worker/gpu_model_runner.py:188: error: Incompatible types in assignment (expression has type "ResumedRequestData", variable has type "RunningRequestData") [assignment] vllm/v1/worker/gpu_model_runner.py:192: error: "RunningRequestData" has no attribute "block_ids"; maybe "new_block_ids"? [attr-defined] Now: Found 10 errors in 1 file (checked 30 source files) Signed-off-by: Mark McLoughlin <[email protected]>

Fixes: vllm/v1/worker/gpu_model_runner.py:230: error: Invalid index type "Optional[str]" for "dict[str, int]"; expected type "str" [index] Now: Found 9 errors in 1 file (checked 30 source files) Signed-off-by: Mark McLoughlin <[email protected]>

Fixes: vllm/v1/worker/gpu_model_runner.py:365: error: Argument 1 to "append" of "list" has incompatible type "tuple[str, int]"; expected "tuple[int, int]" [arg-type] vllm/v1/worker/gpu_model_runner.py:381: error: Incompatible types in assignment (expression has type "int", variable has type "str") [assignment] Now: Found 6 errors in 1 file (checked 30 source files) Signed-off-by: Mark McLoughlin <[email protected]>

Fixes: vllm/v1/worker/gpu_model_runner.py:394: error: Invalid index type "Optional[str]" for "dict[str, int]"; expected type "str" [index] vllm/v1/worker/gpu_model_runner.py:395: error: Invalid index type "Optional[str]" for "dict[str, CachedRequestState]"; expected type "str" [index] vllm/v1/worker/gpu_model_runner.py:489: error: Invalid index type "Optional[str]" for "dict[str, CachedRequestState]"; expected type "str" [index] vllm/v1/worker/gpu_model_runner.py:491: error: Invalid index type "Optional[str]" for "dict[str, int]"; expected type "str" [index] vllm/v1/worker/gpu_model_runner.py:810: error: Invalid index type "Optional[str]" for "dict[str, int]"; expected type "str" [index] Now: Found 1 error in 1 file (checked 30 source files) Signed-off-by: Mark McLoughlin <[email protected]>

Fixes: vllm/v1/executor/abstract.py:47: error: Bracketed expression "[...]" is not valid as a type [valid-type] vllm/v1/executor/multiproc_executor.py:106: error: Bracketed expression "[...]" is not valid as a type [valid-type] vllm/v1/executor/multiproc_executor.py:128: error: "float" not callable [operator] vllm/v1/executor/multiproc_executor.py:175: error: Incompatible types in assignment (expression has type "None", variable has type "list[WorkerProcHandle]") [assignment] vllm/v1/executor/uniproc_executor.py:78: error: Incompatible types in assignment (expression has type "None", variable has type "Worker") [assignment] vllm/v1/engine/llm_engine.py:115: error: Incompatible types in assignment (expression has type "type[UniprocExecutor]", variable has type "type[MultiprocExecutor]") [assignment] vllm/v1/engine/async_llm.py:138: error: Incompatible types in assignment (expression has type "type[UniprocExecutor]", variable has type "type[MultiprocExecutor]") [assignment] Found 7 errors in 5 files (checked 33 source files) Signed-off-by: Mark McLoughlin <[email protected]>

As per the similar v0 code. Fixes: vllm/v1/engine/__init__.py:46: error: Unexpected keyword argument "array_like" for "__init_subclass__" of "object" [call-arg] vllm/v1/engine/__init__.py:46: error: Unexpected keyword argument "omit_defaults" for "__init_subclass__" of "object" [call-arg] vllm/v1/engine/__init__.py:46: error: Unexpected keyword argument "gc" for "__init_subclass__" of "object" [call-arg] vllm/v1/engine/__init__.py:58: error: Unexpected keyword argument "array_like" for "__init_subclass__" of "object" [call-arg] vllm/v1/engine/__init__.py:58: error: Unexpected keyword argument "omit_defaults" for "__init_subclass__" of "object" [call-arg] vllm/v1/engine/__init__.py:58: error: Unexpected keyword argument "gc" for "__init_subclass__" of "object" [call-arg] Signed-off-by: Mark McLoughlin <[email protected]>

Signed-off-by: Woosuk Kwon <[email protected]>

Fixes: vllm/v1/utils.py:110: error: Need type annotation for "cache" [var-annotated] Signed-off-by: Mark McLoughlin <[email protected]>

Fixes: vllm/v1/engine/mm_input_mapper.py:52: error: Missing return statement [return] Signed-off-by: Mark McLoughlin <[email protected]>

Fixes: vllm/v1/engine/mm_input_mapper.py:76: error: Argument 1 to "len" has incompatible type "Optional[list[str]]"; expected "Sized" [arg-type] vllm/v1/engine/mm_input_mapper.py:77: error: Argument 1 to "len" has incompatible type "Optional[list[str]]"; expected "Sized" [arg-type] vllm/v1/engine/mm_input_mapper.py:82: error: Need type annotation for "ret_hashes" [var-annotated] vllm/v1/engine/mm_input_mapper.py:92: error: Value of type "Optional[list[str]]" is not indexable [index] vllm/v1/engine/mm_input_mapper.py:110: error: Argument 1 to "put" of "LRUDictCache" has incompatible type "Optional[str]"; expected "str" [arg-type] vllm/v1/engine/mm_input_mapper.py:116: error: Item "None" of "Optional[list[str]]" has no attribute "append" [union-attr] vllm/v1/engine/mm_input_mapper.py:116: error: Argument 1 to "append" of "list" has incompatible type "Optional[str]"; expected "str" [arg-type] vllm/v1/engine/mm_input_mapper.py:119: error: Incompatible return value type (got "tuple[list[Any], Optional[list[str]]]", expected "list[Any]") [return-value] vllm/v1/engine/core.py:99: error: Argument 1 to "process_inputs" of "MMInputMapperServer" has incompatible type "Optional[list[Optional[Any]]]"; expected "list[Optional[Any]]" [arg-type] Signed-off-by: Mark McLoughlin <[email protected]>

Fixes: vllm/v1/executor/multiproc_executor.py:190: error: Item "None" of "Optional[list[WorkerProcHandle]]" has no attribute "__iter__" (not iterable) [union-attr] Signed-off-by: Mark McLoughlin <[email protected]>

Seems this will never contain None values: Fixes: vllm/v1/engine/processor.py:143: error: Argument 5 to "EngineCoreRequest" has incompatible type "Optional[list[str]]"; expected "Optional[list[Optional[str]]]" [arg-type] Signed-off-by: Mark McLoughlin <[email protected]>

BlockHashType token IDs should be Tuple[int, ...] - it's not just one int. Also, pass the token IDs as a Sequence[int] and convert to a tuple only when we need it in hash_block_tokens(). Make ConstantList a Sequence subclass to allow this. Fixes: vllm/v1/core/kv_cache_utils.py:206: error: Argument 2 to "hash_block_tokens" has incompatible type "tuple[int, ...]"; expected "tuple[int]" [arg-type] vllm/v1/core/kv_cache_manager.py:89: error: Argument 2 to "hash_request_tokens" has incompatible type "ConstantList[int]"; expected "list[int]" [arg-type] vllm/v1/core/kv_cache_manager.py:401: error: Argument 2 to "hash_block_tokens" has incompatible type "tuple[int, ...]"; expected "tuple[int]" [arg-type] Signed-off-by: Mark McLoughlin <[email protected]>

Not very satisfactory, but other attempts caused regressions. Fixes: vllm/v1/engine/core_client.py:158: error: "make_engine_core_process" of "EngineCoreProc" gets multiple values for keyword argument "input_path" [misc] vllm/v1/engine/core_client.py:158: error: "make_engine_core_process" of "EngineCoreProc" gets multiple values for keyword argument "output_path" [misc] vllm/v1/engine/core_client.py:158: error: "make_engine_core_process" of "EngineCoreProc" gets multiple values for keyword argument "ready_path" [misc] Signed-off-by: Mark McLoughlin <[email protected]>

@tlrmchlsmth

Suggestion from @tlrmchlsmth. Signed-off-by: Mark McLoughlin <[email protected]>

@tlrmchlsmth

Suggestion from @tlrmchlsmth. Signed-off-by: Mark McLoughlin <[email protected]>

@tlrmchlsmth

Suggestion by @tlrmchlsmth. Signed-off-by: Mark McLoughlin <[email protected]>

Fixes: vllm/v1/engine/core_client.py:196: error: Incompatible types in assignment (expression has type "None", variable has type "EngineCoreProcHandle") [assignment] Signed-off-by: Mark McLoughlin <[email protected]>

WoosukKwon · 2024-12-14T17:54:54Z

@markmc Finally merged! Thanks for your patience and great work! Really appreciate it.

markmc requested review from WoosukKwon, robertgshaw2-redhat, njhill, ywang96, comaniac and alexm-redhat as code owners December 11, 2024 14:38

markmc force-pushed the v1-mypy branch 4 times, most recently from f339531 to 7cb644e Compare December 11, 2024 15:04

DarkLight1337 reviewed Dec 11, 2024

View reviewed changes

mergify bot added the needs-rebase label Dec 12, 2024

WoosukKwon reviewed Dec 12, 2024

View reviewed changes

vllm/v1/worker/gpu_model_runner.py Outdated Show resolved Hide resolved

WoosukKwon reviewed Dec 12, 2024

View reviewed changes

vllm/v1/worker/gpu_model_runner.py Show resolved Hide resolved

markmc force-pushed the v1-mypy branch from 28c9004 to 86ebb2f Compare December 12, 2024 11:35

mergify bot removed the needs-rebase label Dec 12, 2024

markmc force-pushed the v1-mypy branch from 86ebb2f to 4c30fb1 Compare December 12, 2024 11:59

markmc force-pushed the v1-mypy branch from 4c30fb1 to 9190f34 Compare December 12, 2024 17:37

comaniac mentioned this pull request Dec 12, 2024

[Misc][V1] Fix type in v1 prefix caching #11151

Merged

markmc force-pushed the v1-mypy branch from 9190f34 to bd310cf Compare December 13, 2024 07:30

markmc marked this pull request as draft December 13, 2024 19:17

markmc force-pushed the v1-mypy branch from bd310cf to 8d022d8 Compare December 13, 2024 21:11

markmc marked this pull request as ready for review December 13, 2024 21:56

markmc and others added 21 commits December 14, 2024 03:46

v1: copy flash atten forward pass assertion from v0

c3d8ba7

Fixes: vllm/v1/attention/backends/flash_attn.py:162: error: Value of type "Optional[Any]" is not indexable [index] Now: Found 19 errors in 1 file (checked 30 source files) Signed-off-by: Mark McLoughlin <[email protected]>

[V1] Run mypy on

c96d77f

Signed-off-by: Woosuk Kwon <[email protected]>

v1: add type hints to LRUDictCache

b2d3729

Fixes: vllm/v1/utils.py:110: error: Need type annotation for "cache" [var-annotated] Signed-off-by: Mark McLoughlin <[email protected]>

v1: remove return type from cache_hit_ratio()

8e85f42

Fixes: vllm/v1/engine/mm_input_mapper.py:52: error: Missing return statement [return] Signed-off-by: Mark McLoughlin <[email protected]>

v1: add another MultiprocExecutor.workers assertion

7839436

Fixes: vllm/v1/executor/multiproc_executor.py:190: error: Item "None" of "Optional[list[WorkerProcHandle]]" has no attribute "__iter__" (not iterable) [union-attr] Signed-off-by: Mark McLoughlin <[email protected]>

v1: Make MultiprocExecutor.workers non-optional

4276cab

Suggestion from @tlrmchlsmth. Signed-off-by: Mark McLoughlin <[email protected]>

v1: Make UniprocExecutor.worker non-optional

fdf070d

Suggestion from @tlrmchlsmth. Signed-off-by: Mark McLoughlin <[email protected]>

v1: fix MultiprocExecutor.collective_rpc() return type

e953f47

Suggestion by @tlrmchlsmth. Signed-off-by: Mark McLoughlin <[email protected]>

v1: MPClient.proc-handle can be None

d140a50

Fixes: vllm/v1/engine/core_client.py:196: error: Incompatible types in assignment (expression has type "None", variable has type "EngineCoreProcHandle") [assignment] Signed-off-by: Mark McLoughlin <[email protected]>

markmc force-pushed the v1-mypy branch from 75cd514 to d140a50 Compare December 14, 2024 08:58

WoosukKwon merged commit 6d917d0 into vllm-project:main Dec 14, 2024
51 checks passed

DarkLight1337 mentioned this pull request Dec 18, 2024

Enable mypy type checking #3680

Closed

markmc deleted the v1-mypy branch February 5, 2025 11:52

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

Enable mypy checking on V1 code #11105

Enable mypy checking on V1 code #11105

Uh oh!

markmc commented Dec 11, 2024 •

edited by github-actions bot

Loading

Uh oh!

github-actions bot commented Dec 11, 2024

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

mergify bot commented Dec 12, 2024

Uh oh!

Uh oh!

Uh oh!

markmc commented Dec 12, 2024

Uh oh!

markmc commented Dec 12, 2024

Uh oh!

comaniac commented Dec 12, 2024 •

edited

Loading

Uh oh!

markmc commented Dec 13, 2024

Uh oh!

markmc commented Dec 13, 2024

Uh oh!

Uh oh!

WoosukKwon commented Dec 14, 2024

Uh oh!

Uh oh!

Uh oh!

Enable mypy checking on V1 code #11105

Enable mypy checking on V1 code #11105

Uh oh!

Conversation

markmc commented Dec 11, 2024 • edited by github-actions bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

github-actions bot commented Dec 11, 2024

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

mergify bot commented Dec 12, 2024

Uh oh!

Uh oh!

Uh oh!

markmc commented Dec 12, 2024

Uh oh!

markmc commented Dec 12, 2024

Uh oh!

comaniac commented Dec 12, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

markmc commented Dec 13, 2024

Uh oh!

markmc commented Dec 13, 2024

Uh oh!

Uh oh!

WoosukKwon commented Dec 14, 2024

Uh oh!

Uh oh!

markmc commented Dec 11, 2024 •

edited by github-actions bot

Loading

comaniac commented Dec 12, 2024 •

edited

Loading