
[WIP] Add HPU support to vLLM v1 - cont. #609

Open

kzawora-intel wants to merge 44 commits into habana_main

Conversation

@kzawora-intel commented Dec 10, 2024

Up-to-date variant of #487 after rebase, now with functional tensor parallelism (TP). The diff will become more readable once #605 is merged. Current status:

  • Implemented v1 HPU attn backend, worker, model_runner and executor
  • VLLM_USE_V1=1 properly selects V1 HPU components (see the usage sketch after this list)
  • V1 HPU executor loads model properly
  • V1 HPU executor allocates KV cache properly
  • V1 HPU model runner is constructed properly and initializes bucketing
  • V1 HPU attention backend gets selected automatically
  • profile_run works on dummy data
  • V1 HPU model_runner prepares input tensors based on SchedulerOutputs (rather than SequenceGroupMetadata)
  • V1 HPU model_runner differentiates prefill and decode sequences
  • V1 HPU model_runner execute_model runs for prefill
  • V1 HPU model_runner execute_model runs for decode
  • V1 HPU model_runner handles mixed-batch scenarios
  • V1 HPU model_runner prefill returns correct results
  • V1 HPU model_runner decode returns correct results (w/ flat PA)
  • V1 HPU model_runner decode returns correct results (w/ contiguous PA)
  • V1 HPU model_runner prefill runs at BS>1
  • V1 standard greedy and random sampling work on HPU
  • Capturing and replaying HPU Graphs work
  • Llama3.1-8B runs on GSM-8k with SOTA accuracy
  • V1 HPU model_runner warmup works properly
  • V1 HPU automatic prefix caching works properly
  • Tensor parallelism works
  • torch.compile works
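
For context, a minimal sketch of how the V1 HPU path above could be exercised end to end. VLLM_USE_V1 is vLLM's standard V1-engine switch; the model name, TP degree, and prompt below are illustrative assumptions, not taken from this PR:

```python
# Minimal sketch: exercising the V1 engine path described above.
# Assumptions (not from this PR): model name, TP degree, and prompt.
import os

# Select V1 components (including the HPU backend on Gaudi); must be set
# before vLLM is imported.
os.environ["VLLM_USE_V1"] = "1"

from vllm import LLM, SamplingParams

llm = LLM(
    model="meta-llama/Llama-3.1-8B",  # the model used for the GSM-8k check above
    tensor_parallel_size=2,           # TP is reported functional in this PR
)

# temperature=0.0 -> greedy sampling; temperature>0 -> random sampling,
# both of which the checklist reports working on HPU.
params = SamplingParams(temperature=0.0, max_tokens=64)
outputs = llm.generate(["What is 12 * 7?"], params)
print(outputs[0].outputs[0].text)
```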
