[Misc] Consolidate and optimize logic for building padded tensors #6541

DarkLight1337 · 2024-07-18T15:10:18Z

Following #6442, this PR introduces a small refactor to clean up the code.

cc @peng1999

DarkLight1337

Some explanations

DarkLight1337 · 2024-07-18T15:12:29Z

tests/conftest.py

-_STR_DTYPE_TO_TORCH_DTYPE = {
-    "half": torch.half,
-    "bfloat16": torch.bfloat16,
-    "float": torch.float,
-}
-


I found that this mapping is a duplicate of the one in vllm.utils, so I've removed it.


DarkLight1337 · 2024-07-18T15:13:46Z

vllm/model_executor/sampling_metadata.py

@@ -466,22 +465,30 @@ def from_lists(cls, temperatures: List[float], top_ps: List[float],
        do_penalties = prompt_tokens or output_tokens

        if do_penalties:


I have merged the two if-else blocks that branch on do_penalties into a single block. I'm not sure why they were separate in the first place.
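As a rough illustration of this kind of refactor (not the actual vLLM code), the sketch below shows two branches on the same flag being folded into one. The helper names `pad_rows_to`, `build_before`, and `build_after` are hypothetical and exist only for this example; the real `from_lists` method works on tensors and takes many more arguments.

```python
from typing import List, Optional, Tuple

Rows = List[List[int]]


def pad_rows_to(rows: Rows, width: int, fill: int) -> Rows:
    """Right-pad every row with `fill` up to `width` (hypothetical helper)."""
    return [row + [fill] * (width - len(row)) for row in rows]


def build_before(prompt: Rows, output: Rows,
                 fill: int = 0) -> Tuple[Optional[Rows], Optional[Rows]]:
    """Original shape: the same flag is tested in two separate places."""
    do_penalties = bool(prompt or output)

    if do_penalties:
        prompt_width = max(map(len, prompt), default=0)
        output_width = max(map(len, output), default=0)

    # ... unrelated work could sit between the two branches ...

    if do_penalties:
        return (pad_rows_to(prompt, prompt_width, fill),
                pad_rows_to(output, output_width, fill))
    else:
        return None, None


def build_after(prompt: Rows, output: Rows,
                fill: int = 0) -> Tuple[Optional[Rows], Optional[Rows]]:
    """Merged shape: one branch computes the widths and uses them at once."""
    do_penalties = bool(prompt or output)

    if do_penalties:
        prompt_width = max(map(len, prompt), default=0)
        output_width = max(map(len, output), default=0)
        return (pad_rows_to(prompt, prompt_width, fill),
                pad_rows_to(output, output_width, fill))
    else:
        return None, None
```

Both versions return the same values; the merged one simply keeps the width computation next to the only code that consumes it.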

DarkLight1337 · 2024-07-18T15:20:07Z

vllm/utils.py


    The padding is applied to the end of each inner list until it reaches
    `max_len`.
    """
-    padded_x = np.zeros([len(x), max_len], dtype=np.int32) + pad
+    padded_x = np.full((len(x), max_len), pad, dtype=dtype)


np.full(..., pad) is more efficient than np.zeros(...) + pad. Try it yourself:

python -m timeit "import numpy as np; np.zeros(100000) + 2"
python -m timeit "import numpy as np; np.full(100000, 2)"
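For a comparison that excludes the import from the timed statement and pins both sides to the same dtype, a minimal sketch using the standard `timeit` module could look like this. The array size matches the commands above; the `bench` helper, the repeat counts, and the choice of `np.int32` are assumptions made for this example. No timing result is claimed here, since the numbers depend on the machine and the NumPy version.

```python
import timeit

import numpy as np

N = 100_000
PAD = 2


def bench(stmt: str, number: int = 500) -> float:
    """Return the best-of-5 time per call, in seconds, for `stmt`."""
    timer = timeit.Timer(stmt, globals={"np": np, "N": N, "PAD": PAD})
    return min(timer.repeat(repeat=5, number=number)) / number


# Allocate zeros, then add the pad value (creates a second array).
t_zeros_plus = bench("np.zeros(N, dtype=np.int32) + PAD")
# Allocate once and fill with the pad value directly.
t_full = bench("np.full(N, PAD, dtype=np.int32)")

print(f"np.zeros(...) + pad: {t_zeros_plus * 1e6:.1f} us per call")
print(f"np.full(..., pad):   {t_full * 1e6:.1f} us per call")

# Sanity check: both expressions build the same array.
same = np.array_equal(np.zeros(N, dtype=np.int32) + PAD,
                      np.full(N, PAD, dtype=np.int32))
```

Without an explicit dtype, `np.zeros(100000) + 2` yields a float64 array while `np.full(100000, 2)` yields an integer array, so passing `dtype` keeps the comparison like for like.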

I've also fixed the dtype so that it is consistent with the PyTorch one.
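To make the change concrete, here is a minimal sketch of a padding helper built on `np.full` with a caller-supplied dtype. The name `pad_rows`, its signature, and the `np.int64` default are assumptions for illustration, not the actual helper in vllm/utils.py. Computing `max_len` when it is omitted is likewise an assumption, loosely modelled on the later "Automatically compute max length" commit.

```python
from typing import List, Optional

import numpy as np
import numpy.typing as npt


def pad_rows(rows: List[List[int]],
             pad: int,
             dtype: npt.DTypeLike = np.int64,
             max_len: Optional[int] = None) -> np.ndarray:
    """Right-pad each inner list with `pad` and stack into a 2-D array.

    Hypothetical helper: if `max_len` is None, the longest row sets the width.
    """
    if max_len is None:
        max_len = max(map(len, rows), default=0)

    # Allocate the output already filled with the pad value, in the target
    # dtype, instead of allocating zeros and adding `pad` afterwards.
    padded = np.full((len(rows), max_len), pad, dtype=dtype)

    for i, row in enumerate(rows):
        if len(row) > max_len:
            raise ValueError(f"row {i} has length {len(row)} > {max_len}")
        padded[i, :len(row)] = row

    return padded
```

For example, `pad_rows([[1, 2, 3], [4], []], pad=-1)` fills the shorter rows with `-1` at the end. Passing the dtype explicitly means the array can later be handed to a tensor of the matching type without an extra cast.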

vllm/model_executor/sampling_metadata.py

comaniac

LGTM. Just nits

vllm/utils.py

vllm/worker/neuron_model_runner.py

vllm/model_executor/sampling_metadata.py


Consolidate logic for building padded tensor

8866cd6

DarkLight1337 requested a review from Yard1 July 18, 2024 15:10


DarkLight1337 changed the title ~~[Misc] Consolidate logic for building padded tensors~~ [Misc] Consolidate and optimize logic for building padded tensors Jul 18, 2024

DarkLight1337 added the "ready" label (label description: "ONLY add when PR is ready to merge/full CI is needed") Jul 18, 2024


Fix pin_memory=True not working

38c5ab8

DarkLight1337 force-pushed the refactor-tensor-pad branch from 64382b3 to 38c5ab8 July 18, 2024 15:57

DarkLight1337 added 3 commits July 20, 2024 00:39

Automatically compute max length

cee1823

Add note

c2015fd

Merge branch 'upstream' into refactor-tensor-pad

73a8930

DarkLight1337 requested a review from comaniac July 20, 2024 00:58

comaniac approved these changes Jul 20, 2024


Address comment

75fc93c

DarkLight1337 enabled auto-merge (squash) July 20, 2024 03:37

DarkLight1337 merged commit 9042d68 into vllm-project:main Jul 20, 2024
72 of 73 checks passed

DarkLight1337 deleted the refactor-tensor-pad branch July 20, 2024 04:17

xjpang pushed a commit to xjpang/vllm that referenced this pull request Jul 24, 2024

[Misc] Consolidate and optimize logic for building padded tensors (vl…

98519e9

…lm-project#6541)

gnpinkert pushed a commit to gnpinkert/vllm that referenced this pull request Jul 26, 2024

[Misc] Consolidate and optimize logic for building padded tensors (vl…

9e853a1

…lm-project#6541)

cduk pushed a commit to cduk/vllm-pascal that referenced this pull request Aug 6, 2024

[Misc] Consolidate and optimize logic for building padded tensors (vl…

053e4f3

…lm-project#6541)

Alvant pushed a commit to compressa-ai/vllm that referenced this pull request Oct 26, 2024

[Misc] Consolidate and optimize logic for building padded tensors (vl…

6979fd0

…lm-project#6541) Signed-off-by: Alvant <[email protected]>

KuntaiDu pushed a commit to KuntaiDu/vllm that referenced this pull request Nov 20, 2024

[Misc] Consolidate and optimize logic for building padded tensors (vl…

da98067

…lm-project#6541)
