Model: Log metrics before yielding a stop
Yielding the finish reason before the logging causes the function to
terminate early. Instead, log before yielding and breaking out of the
generation loop.

Signed-off-by: kingbri <[email protected]>
bdashore3 committed Mar 20, 2024
1 parent 09a4c79 commit b74603d
Showing 1 changed file with 11 additions and 11 deletions.
22 changes: 11 additions & 11 deletions backends/exllamav2/model.py
@@ -975,19 +975,19 @@ def generate_gen_sync(self, prompt: str, **kwargs):
                 last_chunk_time = now

             if eos or generated_tokens == max_tokens:
+                # Print response
+                log_response(full_response)
+
+                # Print metrics
+                elapsed_time = last_chunk_time - start_time
+                context_len = None if ids is None else context_len
+
+                log_metrics(
+                    generated_tokens, elapsed_time, context_len, self.config.max_seq_len
+                )
+
                 finish_reason = "length" if generated_tokens == max_tokens else "stop"
                 generation = {"finish_reason": finish_reason}
                 yield generation

                 break
-
-        # Print response
-        log_response(full_response)
-
-        # Print metrics
-        elapsed_time = last_chunk_time - start_time
-        context_len = None if ids is None else context_len
-
-        log_metrics(
-            generated_tokens, elapsed_time, context_len, self.config.max_seq_len
-        )
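
The ordering problem described in the commit message can be reproduced with a small standalone sketch. It is not code from this repository: `finish_then_log`, `log_then_finish`, and `consume_until_finish` are hypothetical names, and the consumer that closes the generator as soon as it sees a finish reason is an assumption about how the caller might behave. Under that assumption, anything placed after the final `yield` never runs.

```python
def finish_then_log(events):
    """Old ordering: the log call sits after the final yield."""
    yield {"finish_reason": "stop"}
    # Skipped if the consumer stops iterating at the yield above.
    events.append("logged")


def log_then_finish(events):
    """New ordering: log first, then yield the finish reason."""
    events.append("logged")
    yield {"finish_reason": "stop"}


def consume_until_finish(gen):
    """Hypothetical consumer: stop as soon as a finish reason arrives."""
    for chunk in gen:
        if "finish_reason" in chunk:
            # Closing raises GeneratorExit at the suspended yield,
            # so no further generator code is executed.
            gen.close()
            return chunk
    return None


old_events, new_events = [], []
consume_until_finish(finish_then_log(old_events))
consume_until_finish(log_then_finish(new_events))

print(old_events)  # []
print(new_events)  # ['logged']
```

Moving the logging ahead of the `yield`, as the diff does, makes it independent of whether the caller keeps iterating after the finish reason.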
