1 file changed, +1 -2 lines changed

@@ -1915,7 +1915,7 @@ def _import_hop_flex_attention(
     - score_mod: Optional submodule/callable for score modification (imported as function)
     - block_mask: Optional BlockMask tuple containing mask_mod function and runtime tensors
     - scale: Optional float for attention score scaling
-    - enable_gqa: Boolean for grouped query attention support (TODO: NYI)
+    - enable_gqa: Boolean for grouped query attention support
     - kernel_options: Dict of performance tuning options (TODO: NYI)

     This creates a call to aten.flex_attention with function symbol references for
@@ -1932,7 +1932,6 @@ def _import_hop_flex_attention(
             node.args[:6]
         )

-        # TODO: Add support for enable_gqa (grouped query attention)
         # This is a boolean flag that enables GQA optimization
         enable_gqa = node.args[6] if len(node.args) > 6 else False

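For context (not part of this change), the sketch below shows what the `enable_gqa` flag means at a direct `flex_attention` call site in PyTorch: with grouped query attention, the query tensor carries more heads than key/value, and the flag lets the op broadcast the KV heads across query-head groups. Tensor shapes here are illustrative only.

```python
# Illustrative sketch (not from this PR): behavior of enable_gqa when calling
# PyTorch's flex_attention directly. With GQA the query has more heads than
# key/value; enable_gqa=True allows the KV heads to be shared across
# query-head groups instead of requiring matching head counts.
import torch
from torch.nn.attention.flex_attention import flex_attention

B, Hq, Hkv, S, D = 2, 8, 2, 128, 64   # 8 query heads share 2 KV heads (group size 4)
q = torch.randn(B, Hq, S, D)
k = torch.randn(B, Hkv, S, D)
v = torch.randn(B, Hkv, S, D)

out = flex_attention(q, k, v, enable_gqa=True)
print(out.shape)  # torch.Size([2, 8, 128, 64])
```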