Merge branch 'main' into cuda-graph
tgaddair authored Jan 4, 2024
2 parents e00a09f + 6bfc3a2 commit 08061d3
Showing 1 changed file with 0 additions and 2 deletions.
server/lorax_server/models/flash_causal_lm.py (0 additions, 2 deletions):

```diff
@@ -941,8 +941,6 @@ def decode(self, generated_ids: Union[torch.Tensor, List[int]]) -> str:
         )
 
     def forward(self, batch: FlashCausalLMBatch, adapter_data: AdapterBatchData) -> Tuple[torch.Tensor, torch.Tensor]:
-        global CACHE_MANAGER
-
         prefill = batch.cu_seqlen_prefill is not None
         model = self.model
         if (
```
