[torch.compile] Add torch inductor pass for fusing silu_and_mul with subsequent scaled_fp8_quant operations #10867
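
For readers unfamiliar with the two ops named in the title, here is a rough pure-Python sketch of what fusing them means numerically. It assumes `silu_and_mul` applies SiLU to the first half of the input and multiplies by the second half, and that static scaled quantization divides by a fixed scale and clamps to the float8_e4m3fn range. The function names below are illustrative only, not the kernels added in this PR, and the final rounding to an actual FP8 value is left out.

```python
import math

# Largest finite magnitude representable in float8_e4m3fn.
FP8_E4M3_MAX = 448.0


def silu(v: float) -> float:
    """SiLU activation: v * sigmoid(v)."""
    return v / (1.0 + math.exp(-v))


def silu_and_mul(x):
    """Unfused step 1: SiLU on the first half, times the second half."""
    d = len(x) // 2
    return [silu(x[i]) * x[d + i] for i in range(d)]


def static_scaled_quant(x, scale):
    """Unfused step 2: divide by a fixed scale and clamp to the FP8 range.

    Assumption: rounding to a real FP8 value is omitted in this sketch.
    """
    return [max(-FP8_E4M3_MAX, min(FP8_E4M3_MAX, v / scale)) for v in x]


def fused_silu_mul_quant(x, scale):
    """Hypothetical fused version: one pass, no intermediate buffer."""
    d = len(x) // 2
    out = []
    for i in range(d):
        v = silu(x[i]) * x[d + i]
        out.append(max(-FP8_E4M3_MAX, min(FP8_E4M3_MAX, v / scale)))
    return out
```

The fused function returns the same values as calling the two steps back to back. The point of a fusion pass is to avoid writing the intermediate activation to memory and reading it back.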

Open
wants to merge 36 commits into base: main
Commits
2e0031a
init
SageMoore Dec 2, 2024
8a957c7
remove backend format changes
SageMoore Dec 3, 2024
2913716
format
SageMoore Dec 3, 2024
11c6fae
move activation_quant_kernels to the quantization dir
SageMoore Dec 3, 2024
2dfecb5
added replacement unit test
SageMoore Dec 4, 2024
702fa46
added kernel unit test
SageMoore Dec 5, 2024
583ff4c
misc cleanup
SageMoore Dec 6, 2024
e5680f7
move activation quant fusion to its own pass
SageMoore Dec 6, 2024
4b775c4
update test
SageMoore Dec 6, 2024
d5ff865
format
SageMoore Dec 6, 2024
c970dec
format
SageMoore Dec 6, 2024
596c445
format
SageMoore Dec 6, 2024
7ab3e18
format
SageMoore Dec 6, 2024
d347431
format
SageMoore Dec 6, 2024
553d99c
format
SageMoore Dec 6, 2024
774559d
format
SageMoore Dec 6, 2024
e2fda7f
format
SageMoore Dec 6, 2024
6915fa2
minor comment fix
SageMoore Dec 9, 2024
6d4b8d0
minor updates
SageMoore Dec 9, 2024
6b631b0
fix fix-functionalization
SageMoore Dec 12, 2024
5b78d80
add opcheck test for fused op
SageMoore Dec 13, 2024
391eea5
fix fix_functionalization tests
SageMoore Dec 13, 2024
546b411
Merge branch 'main' of https://github.com/neuralmagic/vllm into sage/…
SageMoore Dec 13, 2024
0d79c17
fix fix_functionalization again
SageMoore Dec 13, 2024
1041529
Merge branch 'main' of https://github.com/neuralmagic/vllm into sage/…
SageMoore Dec 13, 2024
3198f64
format
SageMoore Dec 13, 2024
58111a9
fixup includes
SageMoore Dec 14, 2024
9a18085
refactor math.hpp
SageMoore Dec 16, 2024
5ae5fe0
Merge branch 'main' of https://github.com/neuralmagic/vllm into sage/…
SageMoore Dec 17, 2024
e051b24
fix amd build
SageMoore Dec 18, 2024
bfdac35
Merge branch 'main' of https://github.com/neuralmagic/vllm into sage/…
SageMoore Dec 19, 2024
8514b0e
review comments and format
SageMoore Dec 19, 2024
ec1290a
fix amd build
SageMoore Dec 19, 2024
008b725
review comments and format
SageMoore Dec 20, 2024
4a0ac7e
minor test fix
SageMoore Dec 20, 2024
554012e
Merge branch 'main' of https://github.com/neuralmagic/vllm into sage/…
SageMoore Jan 2, 2025
format
Signed-off-by: Sage Moore <[email protected]>
SageMoore committed Dec 6, 2024
commit d347431834e6b9923a3d89696226f4930f2b8966
2 changes: 1 addition & 1 deletion vllm/compilation/fusion.py
@@ -231,7 +231,7 @@ def process_matches(self, graph: torch.fx.Graph):

fused_node = graph.call_function(
auto_functionalized,
-(torch.ops._C.fused_add_rms_norm_static_fp8_quant.default,
+(torch.ops._C.fused_add_rms_norm_static_fp8_quant.default,
),
kwargs=kwargs)