add window_left option
ajtejankar committed Oct 25, 2024
1 parent 8540461 commit d880464
Showing 1 changed file with 2 additions and 0 deletions.
2 changes: 2 additions & 0 deletions server/lorax_server/utils/flashinfer_attention.py
@@ -108,6 +108,7 @@ def use_prefill_state(
     num_kv_heads: int,
     head_size: int,
     query_dtype: str = "float16",
+    window_left: int,
 ):
     """
     Context manager to set the active flashinfer prefill state to the given
@@ -124,6 +125,7 @@ def use_prefill_state(
             num_kv_heads=num_kv_heads,
             head_dim=head_size,
             q_data_type=query_dtype,
+            # window_left=window_left, TODO
         )
         yield
     finally:
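
The diff above accepts window_left but leaves the forwarding line commented out as a TODO. As a rough illustration of what resolving that TODO could look like, here is a minimal, GPU-free sketch. Everything in it beyond the parameter names visible in the diff is an assumption: FakePrefillWrapper is a hypothetical stand-in that only records its arguments, active_prefill_state is a hypothetical helper, the default of -1 for window_left is illustrative, and whether the real flashinfer wrapper accepts window_left in begin_forward has not been verified here.

```python
from contextlib import contextmanager


class FakePrefillWrapper:
    """Hypothetical stand-in for the flashinfer prefill wrapper.

    It only records the keyword arguments it receives, so the sketch
    runs without flashinfer or a GPU.
    """

    def __init__(self):
        self.plan = None

    def begin_forward(self, **kwargs):
        self.plan = kwargs

    def end_forward(self):
        self.plan = None


# Module-level slot holding the currently active state (illustrative only).
_active = {"state": None}


def active_prefill_state():
    """Return the state set by the innermost use_prefill_state block."""
    return _active["state"]


@contextmanager
def use_prefill_state(
    *,
    state,
    num_kv_heads: int,
    head_size: int,
    query_dtype: str = "float16",
    window_left: int = -1,  # assumption: -1 stands for "no sliding window"
):
    previous = _active["state"]
    _active["state"] = state
    try:
        state.begin_forward(
            num_kv_heads=num_kv_heads,
            head_dim=head_size,
            q_data_type=query_dtype,
            # Assumption: this is the line the TODO would enable.
            window_left=window_left,
        )
        yield
    finally:
        # Always release the plan and restore the previous state.
        state.end_forward()
        _active["state"] = previous
```

Used as "with use_prefill_state(state=w, num_kv_heads=8, head_size=128, window_left=4096): ...", the stand-in records window_left=4096 for the duration of the block and is reset on exit, including when the body raises.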
