
Disable fp8 kv cache for lovelace (#520)
tgaddair authored Jun 18, 2024
1 parent 559fc3b commit 49bb52f
Showing 1 changed file with 2 additions and 1 deletion.
server/lorax_server/utils/paged_attention.py (2 additions & 1 deletion)
@@ -15,7 +15,8 @@
 )
 
 if torch.cuda.is_available():
-    fp8_supported = torch.cuda.get_device_capability()[0] >= 9 or (torch.cuda.get_device_capability()[0] == 8 and torch.cuda.get_device_capability()[1] >= 9)
+    # TODO(travis): fix for CUDA 8.9 (Lovelace)
+    fp8_supported = torch.cuda.get_device_capability()[0] >= 9 #or (torch.cuda.get_device_capability()[0] == 8 and torch.cuda.get_device_capability()[1] >= 9)
 else:
     fp8_supported = False

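For context, torch.cuda.get_device_capability() returns a (major, minor) compute-capability tuple: Ada Lovelace is (8, 9) and Hopper is (9, 0), so commenting out the second clause restricts fp8 KV cache support to Hopper and newer. A minimal sketch of the resulting gate as a standalone helper (the function name fp8_kv_cache_supported is illustrative, not from the repo):

import torch

def fp8_kv_cache_supported() -> bool:
    # Illustrative helper, not part of the commit: mirrors the gate above.
    # Only compute capability 9.0+ (Hopper and newer) reports fp8 support;
    # Lovelace (8, 9) stays excluded until the TODO above is resolved.
    if not torch.cuda.is_available():
        return False
    major, minor = torch.cuda.get_device_capability()
    return major >= 9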
