
fix(llm): prevent base llm flow truncation (#522) #659


Open · wants to merge 3 commits into main
Conversation

AlankritVerma01 (Contributor)

Fixes: #522

Previously, BaseLlmFlow.run_async would loop indefinitely whenever the LLM response was truncated by hitting its max_tokens limit. The loop's break condition only checked for "no events" or last_event.is_final_response(); neither fires for a partial (truncated) chunk, so the generator re-invoked _run_one_step_async endlessly.

With this change, we:

  • Treat any partial (truncated) event as terminal by adding or last_event.partial to the break condition in run_async (see the sketch after this list).
  • Ensure that when the LLM emits a truncated chunk (partial=True), run_async yields it exactly once and then terminates, avoiding the infinite loop.
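For reference, a minimal sketch of the changed loop, paraphrasing the structure of BaseLlmFlow.run_async in base_llm_flow.py (surrounding details simplified; the added or last_event.partial clause is the fix):

```python
# Inside BaseLlmFlow (base_llm_flow.py), simplified.
async def run_async(self, invocation_context):
    while True:
        last_event = None
        async for event in self._run_one_step_async(invocation_context):
            last_event = event
            yield event
        # Terminate on: no events produced, a final response, or a partial
        # (truncated) chunk. Previously the partial case satisfied neither
        # existing check, so _run_one_step_async was re-invoked forever.
        if not last_event or last_event.is_final_response() or last_event.partial:
            break
```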

Test Plan

  • A new unit test, test_base_llm_flow_truncation.py, verifies that, given a flow that only ever yields one partial event, run_async emits exactly one event and then stops (a sketch follows below).
  • Run it with:
    pytest tests/unittests/flows/llm_flows/test_base_llm_flow_truncation.py
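The test's shape, roughly (import paths, the Event constructor fields, and the stub context are assumptions for illustration, not the exact test source):

```python
import pytest

# Assumed import paths based on the adk-python layout.
from google.adk.events.event import Event
from google.adk.flows.llm_flows.base_llm_flow import BaseLlmFlow


class _TruncatingFlow(BaseLlmFlow):
    """Stub flow whose one step always yields a single partial event."""

    async def _run_one_step_async(self, invocation_context):
        # Simulate an LLM hitting max_tokens: a truncated chunk, never final.
        yield Event(author='model', partial=True)


@pytest.mark.asyncio  # requires the pytest-asyncio plugin
async def test_run_async_stops_after_partial_event():
    flow = _TruncatingFlow()
    # The stub ignores the context, so a placeholder object suffices here;
    # the real test would build a proper InvocationContext.
    events = [e async for e in flow.run_async(object())]
    # The partial chunk is yielded exactly once, then run_async terminates.
    assert len(events) == 1
    assert events[0].partial
```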

@AlankritVerma01 (Contributor, Author)

Hey @Jacksunwei @hangfei, please review the PR. Thanks, team!

@AlankritVerma01 AlankritVerma01 changed the title Fix/522 base llm flow truncation fix(llm): prevent base llm flow truncation (#522) May 10, 2025
Merging this pull request may close: base_llm_flow.py has running problem (#522)