Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bugfix] Multiple fixes to tool streaming with hermes and mistral #10979

Merged
merged 2 commits into from
Dec 12, 2024

Conversation

cedonley
Copy link
Contributor

@cedonley cedonley commented Dec 7, 2024

REDO of PR #10782

FIX #10781
FIX #10589
FIX #10821
FIX #10900

Key fixed issues:

  • UTF-8 escaping mismatches result in argument corruption
  • Final delta often not returned when chunks are small
  • Short initial arguments can corrupt entire arguments return
  • Mistral tool call id generation is inconsistent between stream/non-stream and causes failures

To be reviewed in future PRs:

  • Simplifying overall tool streaming -- reduce reliance on json rewriting and diff's
  • Identify similar issues in other tool parsers
  • When using speculative decoding, final delta can sometimes still be dropped

@K-Mistele @gcalmettes

Copy link

github-actions bot commented Dec 7, 2024

👋 Hi! Thank you for contributing to the vLLM project.
Just a reminder: PRs would not trigger full CI run by default. Instead, it would only run fastcheck CI which starts running only a small and essential subset of CI tests to quickly catch errors. You can run other CI tests on top of those by going to your fastcheck build on Buildkite UI (linked in the PR checks section) and unblock them. If you do not have permission to unblock, ping simon-mo or khluu to add you in our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can do one of these:

  • Add ready label to the PR
  • Enable auto-merge.

🚀

@mergify mergify bot added the frontend label Dec 7, 2024
@cedonley
Copy link
Contributor Author

cedonley commented Dec 7, 2024

Regarding failure in entrypoints-test:

  • Pythonic tool test failure is not a regression
  • Appears to be an intermittent issue with that parser similar to the ones I've fixed with hermes and mistral
    • Final delta of arguments seems to be missing
  • The issue appears in fastcheck runs from other recent non-tool PRs, so it is not caused by this PR
  • My local run of pytest on this branch passes for the same tests

I looked at that parser to see if there was an easy fix, but the logic is somewhat different from the others and I'd rather not try to overload this PR.

Copy link
Contributor

@gcalmettes gcalmettes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your work on this.

@gcalmettes
Copy link
Contributor

I agree that the overall tool streaming extraction logic could/should be simplified, and should be addressed in another PR.

@K-Mistele
Copy link
Contributor

I agree that the overall tool streaming extraction logic could/should be simplified, and should be addressed in another PR.

Yes, I have been of half a mind to rewrite from scratch using finite state machines which seems like the most appropriate & clean approach; just haven't had time to dig into it yet

@K-Mistele K-Mistele mentioned this pull request Dec 9, 2024
1 task
@cedonley
Copy link
Contributor Author

cedonley commented Dec 9, 2024

@K-Mistele Yeah. I was thinking the same thing.

It doesn't help that the OpenAI tool interfaces have changed multiple times and the underlying models don't even stay consistent within families. Going to be hard to completely future-proof this, so may want to let things settle a bit before the rewrite.

@cedonley
Copy link
Contributor Author

@DarkLight1337 @mgoin This is good to merge per reviews from @K-Mistele and @gcalmettes

Thanks!

Copy link
Member

@DarkLight1337 DarkLight1337 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for fixing!

@DarkLight1337 DarkLight1337 enabled auto-merge (squash) December 11, 2024 22:37
@github-actions github-actions bot added the ready ONLY add when PR is ready to merge/full CI is needed label Dec 11, 2024
@DarkLight1337
Copy link
Member

Can you merge from main to fix the CI failures?

@DarkLight1337 DarkLight1337 merged commit 7439a8b into vllm-project:main Dec 12, 2024
51 checks passed
@cedonley cedonley deleted the fix_tool_streaming branch December 12, 2024 01:12
Akshat-Tripathi pushed a commit to krai/vllm that referenced this pull request Dec 12, 2024
sleepwalker2017 pushed a commit to sleepwalker2017/vllm that referenced this pull request Dec 13, 2024
BKitor pushed a commit to BKitor/vllm that referenced this pull request Dec 30, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
frontend ready ONLY add when PR is ready to merge/full CI is needed
Projects
None yet
4 participants