Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUGFIX] Raise an error for no draft token case when draft_tp>1 #6369

Merged
merged 13 commits into from
Jul 19, 2024

Conversation

wooyeonlee0
Copy link
Contributor

This PR adds a simple patch to raise an error to prevent users from hitting the hang error stated in #5814.
This error happens only when the skip speculation feature is activated and there are no generated draft tokens for "all" sequences in a step, and draft_tp > 1.

We may revisit this issue later because it's not resolved completely.

@wooyeonlee0
Copy link
Contributor Author

@cadedaniel @zifeitong @comaniac
This PR is to implement the second option (raise an error) in cade's suggestion.
Could any one of you review this?
Maybe I can revisit later to completely fix it, following the first option.

We need to either fix it or raise an error so that users don't hit this.
#5414 (comment)

Copy link
Collaborator

@comaniac comaniac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Can we enable the test and expect to catch this exception?

@wooyeonlee0 wooyeonlee0 force-pushed the temporal-fix-skip-spec branch from 96ec39e to 02dc475 Compare July 15, 2024 02:11
@wooyeonlee0
Copy link
Contributor Author

@comaniac Thanks for the review :)
I've added the test code that catches the error.

But I'm not sure, the CI seems to have stopped.

@DarkLight1337 DarkLight1337 added the ready ONLY add when PR is ready to merge/full CI is needed label Jul 15, 2024
@wooyeonlee0
Copy link
Contributor Author

@comaniac The CI test passed :)

Copy link
Collaborator

@comaniac comaniac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Comment on lines 474 to 475
if not self.allow_no_draft_tokens and sum(
proposals.proposal_lens) == 0:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit worry about the overhead that sum brings, but I feel it's fine for now given that it won't be triggered with draft model TP=1. wdyt @cadedaniel?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, is it possible to store this field in proposals when we create proposals? that way we don't need an additional CPU-GPU-CPU sync

"""
self.proposer_worker = proposer_worker
self.scorer_worker = scorer_worker
self.disable_by_batch_size = disable_by_batch_size or float("inf")
self.spec_decode_sampler = spec_decode_sampler
self.allow_no_draft_tokens = allow_zero_draft_token_step
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: can we mark this private, e.g. _allow_no_draft_tokens? we should have done this for all properties but we missed it

Comment on lines 474 to 475
if not self.allow_no_draft_tokens and sum(
proposals.proposal_lens) == 0:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, is it possible to store this field in proposals when we create proposals? that way we don't need an additional CPU-GPU-CPU sync

@wooyeonlee0
Copy link
Contributor Author

wooyeonlee0 commented Jul 18, 2024

Thanks for the review! I'm gonna handle it right now :)

@wooyeonlee0 wooyeonlee0 force-pushed the temporal-fix-skip-spec branch 5 times, most recently from 2861e99 to 3876858 Compare July 19, 2024 03:54
@wooyeonlee0
Copy link
Contributor Author

@comaniac
I re-initiated the CI test multiple times, but CI failed in one of the following cases: 'build image' or 'documentation build'.
Link: https://buildkite.com/vllm/ci-aws/builds/5151#0190c4c4-6cde-4a9f-a587-63f29a3b5dbd
Link: https://buildkite.com/vllm/ci-aws/builds/5218#0190c8b9-3ee9-4548-a691-cbe56abcff24

Is there any problem in CI now?

@wooyeonlee0 wooyeonlee0 force-pushed the temporal-fix-skip-spec branch from 3876858 to 4b338d2 Compare July 19, 2024 04:33
@comaniac
Copy link
Collaborator

The failure you posted seems random. I'll monitor the current CI run and manually retry failed ones.

@wooyeonlee0
Copy link
Contributor Author

@cadedaniel @comaniac I've updated the code as your suggestion and I think the code passed the test.
Would you take a look? :)

@comaniac CI has finished. Would you retry the 'documentation-build' test? Thank you!

@cadedaniel cadedaniel enabled auto-merge (squash) July 19, 2024 08:21
@cadedaniel
Copy link
Collaborator

@simon-mo can we get a force merge, doc build seems broken

@simon-mo simon-mo merged commit a921e86 into vllm-project:main Jul 19, 2024
70 of 72 checks passed
@wooyeonlee0 wooyeonlee0 deleted the temporal-fix-skip-spec branch July 19, 2024 13:16
cduk pushed a commit to cduk/vllm-pascal that referenced this pull request Aug 6, 2024
Alvant pushed a commit to compressa-ai/vllm that referenced this pull request Oct 26, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ready ONLY add when PR is ready to merge/full CI is needed
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants