Model: Raise an error if the context length is too large
The dynamic generator already raised a not-so-helpful exception that
basically said not to exceed the max sequence length. Instead of risking
undefined behavior, error out explicitly.

Signed-off-by: kingbri <[email protected]>
bdashore3 committed Sep 20, 2024
1 parent b30336c commit 75af974
Showing 1 changed file with 2 additions and 3 deletions.
5 changes: 2 additions & 3 deletions backends/exllamav2/model.py
@@ -1228,10 +1228,9 @@ async def generate_gen(
         # The first index will always be the positive prompt
         context_len = input_ids[0].size(dim=-1)
         if context_len > self.config.max_seq_len:
-            logger.warning(
+            raise ValueError(
                 f"Context length {context_len} is greater than max_seq_len "
-                f"{self.config.max_seq_len}. Generation is truncated and "
-                "metrics may not be accurate."
+                f"{self.config.max_seq_len}"
             )

     # Automatically set max_tokens to fill up the context
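The change above can be illustrated with a minimal, self-contained sketch of the same guard. The names `MockConfig` and `check_context_len` are illustrative stand-ins, not part of the actual exllamav2 backend; the real check runs inside `generate_gen` against tensor shapes.

```python
from dataclasses import dataclass


@dataclass
class MockConfig:
    """Stand-in for the backend config; only max_seq_len matters here."""
    max_seq_len: int


def check_context_len(input_ids, config):
    """Raise instead of silently truncating when the prompt is too long.

    Mirrors the commit's behavior: a hard ValueError replaces the old
    logger.warning, so callers can never proceed into undefined behavior.
    """
    # The first index will always be the positive prompt
    context_len = len(input_ids[0])
    if context_len > config.max_seq_len:
        raise ValueError(
            f"Context length {context_len} is greater than max_seq_len "
            f"{config.max_seq_len}"
        )
    return context_len


config = MockConfig(max_seq_len=4)
print(check_context_len([[1, 2, 3]], config))  # → 3

try:
    check_context_len([[1, 2, 3, 4, 5]], config)
except ValueError as e:
    print("raised:", e)
```

Failing fast here is the design point: a warning lets generation continue with a truncated prompt and misleading metrics, while the exception surfaces the misconfiguration at the request boundary.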
