Enable mypy checking on V1 code #11105

Merged: 31 commits, Dec 14, 2024

Conversation

@markmc (Contributor) commented Dec 11, 2024

As per #10959

Some of these definitely deserve closer scrutiny e.g.:

  • commit 7974f1d - was there a bug with the hash_block_tokens() call?
  • commit 226fd55 - it feels like we'll be fighting an uphill battle with mypy and this args/kwargs pattern
  • commit 8a6d88c - MultiprocExecutor.workers and UniprocExecutor.worker can be set to None, so we have to litter the code with is not None assertions (see the sketch below)
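For example, a minimal sketch of that last pattern (illustrative names only, not the actual executor code):

from typing import List, Optional

class Worker:
    def execute(self) -> str:
        return "ok"

class MultiprocExecutor:
    def __init__(self) -> None:
        # Cleared to None on shutdown, so mypy sees Optional[List[Worker]]
        self.workers: Optional[List[Worker]] = None

    def start(self) -> None:
        self.workers = [Worker()]

    def run_all(self) -> List[str]:
        # Without this assertion, mypy reports: Item "None" of
        # "Optional[list[Worker]]" has no attribute "__iter__"
        assert self.workers is not None
        return [w.execute() for w in self.workers]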


👋 Hi! Thank you for contributing to the vLLM project.
Just a reminder: PRs do not trigger a full CI run by default. Instead, only the fastcheck CI runs, covering a small and essential subset of tests to quickly catch errors. You can run other CI tests on top of those by going to your fastcheck build in the Buildkite UI (linked in the PR checks section) and unblocking them. If you do not have permission to unblock, ping simon-mo or khluu to add you to our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can do one of these:

  • Add the ready label to the PR
  • Enable auto-merge.

🚀

@markmc force-pushed the v1-mypy branch 4 times, most recently from f339531 to 7cb644e on December 11, 2024 at 15:04
Inline review comments (since resolved) on vllm/v1/core/kv_cache_manager.py, vllm/v1/core/kv_cache_utils.py, and vllm/v1/engine/core_client.py.

mergify bot commented Dec 12, 2024

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @markmc.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@mergify bot added the needs-rebase label on Dec 12, 2024
@markmc commented Dec 12, 2024

@DarkLight1337 @WoosukKwon note that I've rebased and fixed a bunch of new issues. Those new fixes are in the 5 latest commits.

@markmc commented Dec 12, 2024

Rebased again to fix conflicts.

@comaniac would you mind reviewing commit 9190f34 ?

For reference, in #9972 you said:

Enhance the block hash type from int to Tuple[int, Tuple[int]], which is (hash_value, (token_ids,)). This guarantees no hash conflicts.

Pasting the commit message:

The current block hash format is:

  (hash((parent_hash, *token_id_list)), token_id_list)

i.e. it is a tuple with two parts:

  1. A hash of a tuple combining the parent hash with token ids
  2. A tuple containing the same token ids

and this (hash, token_ids) tuple itself must be hashable in order to be a dictionary key.

This is suspicious - why include the same token IDs twice in the same hash?

And then this code:

def hash_request_tokens(block_size: int,
                        token_ids: List[int]) -> List[BlockHashType]:
    ret = []
    parent_block_hash = None
    for start in range(0, len(token_ids), block_size):
        end = start + block_size
        block_token_ids = tuple(token_ids[start:end])
        ...
        block_hash = hash_block_tokens(parent_block_hash, block_token_ids)
        ret.append(block_hash)
        parent_block_hash = block_hash
    return ret

looks wrong because hash_block_tokens() expects an integer hash for the parent_block_hash argument:

def hash_block_tokens(parent_block_hash: Optional[int],
                      curr_block_token_ids: Tuple[int]) -> BlockHashType:

It appears it would be perfectly sufficient to use just the integer hash of the (parent_hash + token_ids) tuple.

If we do this, then the curr_block_token_ids argument no longer needs to be a tuple - but it does need to be a sliceable sequence, so let's make sure ConstantList satisfies that requirement.
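As a rough sketch, the direction this commit message proposes would look like the following (a simplification under the assumptions above, not necessarily the final merged code):

from typing import Optional, Sequence

def hash_block_tokens(parent_block_hash: Optional[int],
                      curr_block_token_ids: Sequence[int]) -> int:
    # Convert to a tuple only here, where hashability is needed; the
    # caller may pass any sliceable sequence (e.g. a ConstantList).
    return hash((parent_block_hash, *curr_block_token_ids))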

@comaniac (Collaborator) commented Dec 12, 2024

This is suspicious - why include the same token IDs twice in the same hash?

This is to avoid potential hash collisions: a Python dict compares keys for equality when hashes collide, so using the full (hash, token_ids) tuple as the key guarantees distinct token sequences never share an entry.
But yeah, the type is incorrect as you said. I'll submit a fix PR in minutes.
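A tiny illustration of why that works: dict lookups compare full keys for equality after matching hashes, so even if two blocks' hash values collided, the differing token-ID tuples keep the entries distinct.

collided_hash = 12345  # pretend both blocks' contents hashed to this value
key_a = (collided_hash, (1, 2))
key_b = (collided_hash, (3, 4))
cache = {key_a: "block A", key_b: "block B"}
assert cache[key_a] == "block A"
assert cache[key_b] == "block B"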

@markmc commented Dec 13, 2024

This is suspicious - why include the same token IDs twice in the same hash?

This is to avoid potential hash collisions: a Python dict compares keys for equality when hashes collide, so using the full (hash, token_ids) tuple as the key guarantees distinct token sequences never share an entry. But yeah, the type is incorrect as you said. I'll submit a fix PR in minutes.

Thanks for the fix! Rebased again to resolve merge conflicts and fixed a few mypy errors with this prefix caching code.

@markmc marked this pull request as draft on December 13, 2024 at 19:17
@markmc commented Dec 13, 2024

Moving to draft while I debug the test failures

@markmc marked this pull request as ready for review on December 13, 2024 at 21:56
markmc and others added 24 commits December 14, 2024 03:46
Fixes:

vllm/v1/engine/async_llm.py:368: error: Incompatible return value type (got "type[Exception]", expected "BaseException")  [return-value]

Now:

Found 30 errors in 6 files (checked 30 source files)

Signed-off-by: Mark McLoughlin <[email protected]>
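A hedged guess at the shape of that fix, with a hypothetical property name: a member declared to return BaseException must return an exception instance, not the exception class itself.

class AsyncLLM:
    @property
    def dead_error(self) -> BaseException:
        # was: return Exception  (that is type[Exception], the class object)
        return Exception("engine is dead")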
Fixes:

vllm/v1/core/kv_cache_utils.py:111: error: Incompatible types in assignment (expression has type "Optional[KVCacheBlock]", variable has type "KVCacheBlock")  [assignment]
vllm/v1/core/kv_cache_utils.py:114: error: Incompatible types in assignment (expression has type "Optional[KVCacheBlock]", variable has type "KVCacheBlock")  [assignment]
vllm/v1/core/kv_cache_utils.py:150: error: Incompatible types in assignment (expression has type "Optional[KVCacheBlock]", variable has type "KVCacheBlock")  [assignment]

Now:

Found 27 errors in 6 files (checked 30 source files)

Signed-off-by: Mark McLoughlin <[email protected]>
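These errors come from walking a linked list of free blocks; a common remedy, sketched here with simplified, hypothetical fields, is to annotate the traversal variable itself as Optional:

from typing import Optional

class KVCacheBlock:
    def __init__(self) -> None:
        self.next_free_block: Optional["KVCacheBlock"] = None

def count_free_blocks(head: Optional[KVCacheBlock]) -> int:
    count = 0
    # Annotated Optional to match what gets assigned to it below
    curr: Optional[KVCacheBlock] = head
    while curr is not None:
        count += 1
        curr = curr.next_free_block
    return count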
Fixes:

vllm/v1/core/kv_cache_manager.py:268: error: Incompatible types in assignment (expression has type "reversed[KVCacheBlock]", variable has type "list[KVCacheBlock]")  [assignment]

Now:

Found 22 errors in 4 files (checked 30 source files)

Signed-off-by: Mark McLoughlin <[email protected]>
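The likely shape of this fix: reversed() returns a reversed iterator, not a list, so it must be materialized before being assigned to a list-typed variable.

blocks: list[int] = [3, 2, 1]
# was: blocks = reversed(blocks)  # type is reversed[int], not list[int]
blocks = list(reversed(blocks))
assert blocks == [1, 2, 3]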
Fixes:

vllm/v1/core/scheduler.py:157: error: Item "None" of "Optional[list[KVCacheBlock]]" has no attribute "__iter__" (not iterable)  [union-attr]

Now:

Found 21 errors in 3 files (checked 30 source files)

Signed-off-by: Mark McLoughlin <[email protected]>
Fixes:

vllm/v1/engine/detokenizer.py:258: error: Argument "stop_reason" to "add_tokens" of "IncrementalDetokenizer" has incompatible type "Union[int, str, None]"; expected "Optional[str]"  [arg-type]

Now:

Found 20 errors in 2 files (checked 30 source files)

Signed-off-by: Mark McLoughlin <[email protected]>
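One plausible shape for this fix (hypothetical signature, not the actual detokenizer code): widen the parameter to the union the caller actually passes, rather than asserting str at every call site.

from typing import Union

def add_tokens(token_ids: list[int],
               stop_reason: Union[int, str, None]) -> None:
    # stop_reason may be a stop token id (int), a stop string (str), or None
    if stop_reason is not None:
        print(f"request stopped: {stop_reason!r}")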
Fixes:

vllm/v1/attention/backends/flash_attn.py:162: error: Value of type "Optional[Any]" is not indexable  [index]

Now:

Found 19 errors in 1 file (checked 30 source files)

Signed-off-by: Mark McLoughlin <[email protected]>
Fixes:

vllm/v1/worker/gpu_model_runner.py:164: error: Incompatible types in assignment (expression has type "NewRequestData", variable has type "RunningRequestData")  [assignment]
vllm/v1/worker/gpu_model_runner.py:166: error: "RunningRequestData" has no attribute "sampling_params"  [attr-defined]
vllm/v1/worker/gpu_model_runner.py:175: error: "RunningRequestData" has no attribute "prompt_token_ids"  [attr-defined]
vllm/v1/worker/gpu_model_runner.py:176: error: "RunningRequestData" has no attribute "prompt"  [attr-defined]
vllm/v1/worker/gpu_model_runner.py:177: error: "RunningRequestData" has no attribute "mm_inputs"  [attr-defined]
vllm/v1/worker/gpu_model_runner.py:178: error: "RunningRequestData" has no attribute "mm_positions"  [attr-defined]
vllm/v1/worker/gpu_model_runner.py:181: error: "RunningRequestData" has no attribute "block_ids"; maybe "new_block_ids"?  [attr-defined]
vllm/v1/worker/gpu_model_runner.py:188: error: Incompatible types in assignment (expression has type "ResumedRequestData", variable has type "RunningRequestData")  [assignment]
vllm/v1/worker/gpu_model_runner.py:192: error: "RunningRequestData" has no attribute "block_ids"; maybe "new_block_ids"?  [attr-defined]

Now:

Found 10 errors in 1 file (checked 30 source files)

Signed-off-by: Mark McLoughlin <[email protected]>
Fixes:

vllm/v1/worker/gpu_model_runner.py:230: error: Invalid index type "Optional[str]" for "dict[str, int]"; expected type "str"  [index]

Now:

Found 9 errors in 1 file (checked 30 source files)

Signed-off-by: Mark McLoughlin <[email protected]>
Fixes:

vllm/v1/worker/gpu_model_runner.py:365: error: Argument 1 to "append" of "list" has incompatible type "tuple[str, int]"; expected "tuple[int, int]"  [arg-type]
vllm/v1/worker/gpu_model_runner.py:381: error: Incompatible types in assignment (expression has type "int", variable has type "str")  [assignment]

Now:

Found 6 errors in 1 file (checked 30 source files)

Signed-off-by: Mark McLoughlin <[email protected]>
Fixes:

vllm/v1/worker/gpu_model_runner.py:394: error: Invalid index type "Optional[str]" for "dict[str, int]"; expected type "str"  [index]
vllm/v1/worker/gpu_model_runner.py:395: error: Invalid index type "Optional[str]" for "dict[str, CachedRequestState]"; expected type "str"  [index]
vllm/v1/worker/gpu_model_runner.py:489: error: Invalid index type "Optional[str]" for "dict[str, CachedRequestState]"; expected type "str"  [index]
vllm/v1/worker/gpu_model_runner.py:491: error: Invalid index type "Optional[str]" for "dict[str, int]"; expected type "str"  [index]
vllm/v1/worker/gpu_model_runner.py:810: error: Invalid index type "Optional[str]" for "dict[str, int]"; expected type "str"  [index]

Now:

Found 1 error in 1 file (checked 30 source files)

Signed-off-by: Mark McLoughlin <[email protected]>
Fixes:

vllm/v1/executor/abstract.py:47: error: Bracketed expression "[...]" is not valid as a type  [valid-type]
vllm/v1/executor/multiproc_executor.py:106: error: Bracketed expression "[...]" is not valid as a type  [valid-type]
vllm/v1/executor/multiproc_executor.py:128: error: "float" not callable  [operator]
vllm/v1/executor/multiproc_executor.py:175: error: Incompatible types in assignment (expression has type "None", variable has type "list[WorkerProcHandle]")  [assignment]
vllm/v1/executor/uniproc_executor.py:78: error: Incompatible types in assignment (expression has type "None", variable has type "Worker")  [assignment]
vllm/v1/engine/llm_engine.py:115: error: Incompatible types in assignment (expression has type "type[UniprocExecutor]", variable has type "type[MultiprocExecutor]")  [assignment]
vllm/v1/engine/async_llm.py:138: error: Incompatible types in assignment (expression has type "type[UniprocExecutor]", variable has type "type[MultiprocExecutor]")  [assignment]
Found 7 errors in 5 files (checked 33 source files)

Signed-off-by: Mark McLoughlin <[email protected]>
As per the similar v0 code.

Fixes:

vllm/v1/engine/__init__.py:46: error: Unexpected keyword argument "array_like" for "__init_subclass__" of "object"  [call-arg]
vllm/v1/engine/__init__.py:46: error: Unexpected keyword argument "omit_defaults" for "__init_subclass__" of "object"  [call-arg]
vllm/v1/engine/__init__.py:46: error: Unexpected keyword argument "gc" for "__init_subclass__" of "object"  [call-arg]
vllm/v1/engine/__init__.py:58: error: Unexpected keyword argument "array_like" for "__init_subclass__" of "object"  [call-arg]
vllm/v1/engine/__init__.py:58: error: Unexpected keyword argument "omit_defaults" for "__init_subclass__" of "object"  [call-arg]
vllm/v1/engine/__init__.py:58: error: Unexpected keyword argument "gc" for "__init_subclass__" of "object"  [call-arg]

Signed-off-by: Mark McLoughlin <[email protected]>
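Presumably this mirrors the v0 pattern of suppressing the call-arg errors on the msgspec.Struct subclass line; a hedged sketch (class name and fields illustrative):

import msgspec

# mypy does not see msgspec's __init_subclass__ keywords when analyzing
# the subclass, so the definition line carries a targeted ignore.
class EngineCoreRequest(msgspec.Struct, array_like=True, omit_defaults=True, gc=False):  # type: ignore[call-arg]
    request_id: str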
Signed-off-by: Woosuk Kwon <[email protected]>
Fixes:

vllm/v1/utils.py:110: error: Need type annotation for "cache"  [var-annotated]

Signed-off-by: Mark McLoughlin <[email protected]>
Fixes:

vllm/v1/engine/mm_input_mapper.py:52: error: Missing return statement  [return]

Signed-off-by: Mark McLoughlin <[email protected]>
Fixes:

vllm/v1/engine/mm_input_mapper.py:76: error: Argument 1 to "len" has incompatible type "Optional[list[str]]"; expected "Sized"  [arg-type]
vllm/v1/engine/mm_input_mapper.py:77: error: Argument 1 to "len" has incompatible type "Optional[list[str]]"; expected "Sized"  [arg-type]
vllm/v1/engine/mm_input_mapper.py:82: error: Need type annotation for "ret_hashes"  [var-annotated]
vllm/v1/engine/mm_input_mapper.py:92: error: Value of type "Optional[list[str]]" is not indexable  [index]
vllm/v1/engine/mm_input_mapper.py:110: error: Argument 1 to "put" of "LRUDictCache" has incompatible type "Optional[str]"; expected "str"  [arg-type]
vllm/v1/engine/mm_input_mapper.py:116: error: Item "None" of "Optional[list[str]]" has no attribute "append"  [union-attr]
vllm/v1/engine/mm_input_mapper.py:116: error: Argument 1 to "append" of "list" has incompatible type "Optional[str]"; expected "str"  [arg-type]
vllm/v1/engine/mm_input_mapper.py:119: error: Incompatible return value type (got "tuple[list[Any], Optional[list[str]]]", expected "list[Any]")  [return-value]
vllm/v1/engine/core.py:99: error: Argument 1 to "process_inputs" of "MMInputMapperServer" has incompatible type "Optional[list[Optional[Any]]]"; expected "list[Optional[Any]]"  [arg-type]

Signed-off-by: Mark McLoughlin <[email protected]>
Fixes:

vllm/v1/executor/multiproc_executor.py:190: error: Item "None" of "Optional[list[WorkerProcHandle]]" has no attribute "__iter__" (not iterable)  [union-attr]

Signed-off-by: Mark McLoughlin <[email protected]>
It seems this will never contain None values.

Fixes:

vllm/v1/engine/processor.py:143: error: Argument 5 to "EngineCoreRequest" has incompatible type "Optional[list[str]]"; expected "Optional[list[Optional[str]]]"  [arg-type]

Signed-off-by: Mark McLoughlin <[email protected]>
BlockHashType token IDs should be Tuple[int, ...] - it's not just
one int.

Also, pass the token IDs as a Sequence[int] and convert to a tuple
only when we need it in hash_block_tokens().

Make ConstantList a Sequence subclass to allow this.
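A minimal sketch of the shape described here (condensed from this message, not copied from the source):

from typing import NamedTuple, Optional, Sequence, Tuple

class BlockHashType(NamedTuple):
    hash_value: int
    token_ids: Tuple[int, ...]  # Tuple[int, ...]: a block holds many token IDs

def hash_block_tokens(parent_block_hash: Optional[int],
                      curr_block_token_ids: Sequence[int]) -> BlockHashType:
    # Accept any Sequence[int] (e.g. a ConstantList) and convert to a
    # tuple only here, where hashability is needed.
    token_ids = tuple(curr_block_token_ids)
    return BlockHashType(hash((parent_block_hash, *token_ids)), token_ids)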

Fixes:

vllm/v1/core/kv_cache_utils.py:206: error: Argument 2 to "hash_block_tokens" has incompatible type "tuple[int, ...]"; expected "tuple[int]"  [arg-type]
vllm/v1/core/kv_cache_manager.py:89: error: Argument 2 to "hash_request_tokens" has incompatible type "ConstantList[int]"; expected "list[int]"  [arg-type]
vllm/v1/core/kv_cache_manager.py:401: error: Argument 2 to "hash_block_tokens" has incompatible type "tuple[int, ...]"; expected "tuple[int]"  [arg-type]

Signed-off-by: Mark McLoughlin <[email protected]>
Not very satisfactory, but other attempts caused regressions.

Fixes:

vllm/v1/engine/core_client.py:158: error: "make_engine_core_process" of "EngineCoreProc" gets multiple values for keyword argument "input_path"  [misc]
vllm/v1/engine/core_client.py:158: error: "make_engine_core_process" of "EngineCoreProc" gets multiple values for keyword argument "output_path"  [misc]
vllm/v1/engine/core_client.py:158: error: "make_engine_core_process" of "EngineCoreProc" gets multiple values for keyword argument "ready_path"  [misc]

Signed-off-by: Mark McLoughlin <[email protected]>
Suggestion from @tlrmchlsmth.

Signed-off-by: Mark McLoughlin <[email protected]>
Suggestion from @tlrmchlsmth.

Signed-off-by: Mark McLoughlin <[email protected]>
Fixes:

vllm/v1/engine/core_client.py:196: error: Incompatible types in assignment (expression has type "None", variable has type "EngineCoreProcHandle")  [assignment]

Signed-off-by: Mark McLoughlin <[email protected]>
@WoosukKwon merged commit 6d917d0 into vllm-project:main on Dec 14, 2024
51 checks passed
@WoosukKwon (Collaborator) commented:

@markmc Finally merged! Thanks for your patience and great work! Really appreciate it.

BKitor pushed a commit to BKitor/vllm that referenced this pull request Dec 30, 2024
joennlae pushed a commit to 44ai-labs/vllm that referenced this pull request Jan 19, 2025
Labels: ready