After torch.compile with 0.2.0, speed becomes very slow #727
Comments
Yes, that is observed in #709; you can cherry-pick the changes there.
I tested it, but that hotfix can break the graph when compiling with fullgraph. If I don't use fullgraph, the compile succeeds and the speed is OK, but I need fullgraph when using vLLM's fast inductor compilation.
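(For context, a minimal sketch of the two modes being compared; `f` is a stand-in, not the reporter's model. With `fullgraph=True`, torch.compile raises an error at the first graph break instead of silently splitting the graph and running the broken piece eagerly.)

```python
import torch

def f(x):
    # Stand-in function; imagine an attention call here that Dynamo
    # cannot trace.
    return torch.relu(x) * 2

compiled_partial = torch.compile(f)               # tolerates graph breaks, falls back to eager
compiled_full = torch.compile(f, fullgraph=True)  # raises on any graph break

x = torch.randn(8)
print(compiled_full(x))
```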
@MichoChan how do you use vLLM with torch.compile?
torch.compile inside vLLM itself is fine, but when I use vLLM's compilation implementation in my own framework, my model code causes a graph break during compile, which then trips the assert `not self._called, "VllmBackend can only be called once"`. I am using fullgraph with FlashInfer 0.2.0. vLLM already registers attention as a custom op for torch.compile; I used the same method with FlashInfer 0.1.6 and everything worked. So FlashInfer 0.2.0 can't be used with torch.compile in fullgraph mode.
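(A hedged sketch of the "register attention as a custom op" pattern the comment refers to; the op name and body here are hypothetical, not vLLM's actual registration. Wrapping the attention call in a custom op makes Dynamo treat it as a single opaque node, so untraceable Python inside it cannot break the graph.)

```python
import torch

@torch.library.custom_op("mylib::paged_attention", mutates_args=())
def paged_attention(q: torch.Tensor, k: torch.Tensor, v: torch.Tensor) -> torch.Tensor:
    # Stand-in body; a real implementation would call the FlashInfer
    # wrapper (e.g. its decode/prefill run method) here.
    return torch.softmax(q @ k.transpose(-2, -1), dim=-1) @ v

@paged_attention.register_fake
def _(q, k, v):
    # Shape/dtype propagation so the op can be traced with fake tensors.
    return torch.empty_like(q)

def model_fwd(q, k, v):
    # Dynamo sees one opaque op instead of tracing into the attention body.
    return torch.ops.mylib.paged_attention(q, k, v)

compiled = torch.compile(model_fwd, fullgraph=True)
q = k = v = torch.randn(2, 4, 8)
print(compiled(q, k, v).shape)
```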
Can you explain this? I don't see why fullgraph works for v0.1.6 but not for v0.2.0.
Sorry, not 0.2.0; it's master with #709 that can't use fullgraph. I tested and found that #709 can break the graph: `BatchDecodeMlaWithPagedKVCacheWrapper.run` breaks the graph.
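(One way to confirm where the break happens is `torch._dynamo.explain`; a minimal sketch below, where `decode_step` is a hypothetical stand-in for a function that calls `BatchDecodeMlaWithPagedKVCacheWrapper.run` internally.)

```python
import torch
import torch._dynamo as dynamo

def decode_step(x):
    # Stand-in; in the real repro this would invoke the MLA decode wrapper.
    return torch.relu(x)

explanation = dynamo.explain(decode_step)(torch.randn(4))
print(explanation.graph_break_count)  # number of graph breaks Dynamo hit
print(explanation.break_reasons)      # why each break occurred
```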