Changes LLMRayActor to use vllm's AsyncLLMEngine #1016
Open
finbarrtimbers wants to merge 400 commits into main from async-engine
Commits
4dd25d6 Attempt at fixing issue (finbarrtimbers)
703599e Moves delete (finbarrtimbers)
2c63fe7 Fixed bug in tool code. (finbarrtimbers)
78dd885 Ran linter. Tool script seems to run fine. (finbarrtimbers)
aa26e41 Fixed bug in non-tool use. (finbarrtimbers)
fcd4d78 Updated code to fix index bug. (finbarrtimbers)
02a403b Undid host network change to tool_grpo_fast.sh. (finbarrtimbers)
80194ec Fixed no host networking command from tool_grpo_fast.sh. (finbarrtimbers)
22882af Fixed incorrect index field. (finbarrtimbers)
c15f081 Fixed index (finbarrtimbers)
02fa45b Fix request tracking and training_step propagation for non-tool use (finbarrtimbers)
5543750 Fix KeyError by accessing training_step before cleanup (finbarrtimbers)
293b498 Add assertion to verify all n sub-requests are tracked (finbarrtimbers)
96ef484 Fix index field timing issue in non-tool mode with n>1 (finbarrtimbers)
1092363 Fix CompletionOutput index for non-tool mode with n>1 (finbarrtimbers)
812699b Changed loop (finbarrtimbers)
790b2cc Set verbose to true (finbarrtimbers)
b5d7567 Fixed tracking key bug and added a health check for prefetch worker t… (finbarrtimbers)
13db536 changed removal from vllm_active_requests to occur after _finalize (finbarrtimbers)
55dfcc9 Fix sub-request tracking race condition in non-tools mode (finbarrtimbers)
fb9344b Removes debugging code from `backfill-prompts` (#1008) (finbarrtimbers)
afbc8ab Merge branch 'main' into backfill-prompts (finbarrtimbers)
cd2c7d8 Fixed linter error. (finbarrtimbers)
91352b2 Add inflight_updates argument to enable quick pausing/resumption (finbarrtimbers)
9562997 Refactor process_from_queue loop to use _should_exit method (finbarrtimbers)
9111672 Removed comment. (finbarrtimbers)
3279680 Longer benchmark (finbarrtimbers)
0fa431d undid changes to benchmark (finbarrtimbers)
10c2b10 Now, uses the async engine. (finbarrtimbers)
6279598 Fixed errors (finbarrtimbers)
b6f576d Removed host networking from single (finbarrtimbers)
02db12e much simpler flow (finbarrtimbers)
f89ecbe Removed tracking variable. (finbarrtimbers)
3f24d9e Cleans up lifecycle. (finbarrtimbers)
9aec035 Updated generate_one_completion. (finbarrtimbers)
787b6bc Cleaned up main loop. (finbarrtimbers)
af08dc1 Refactored _process_request significantly (finbarrtimbers)
ea42341 Simplified process_from_queue (finbarrtimbers)
0fefb63 Updates code (finbarrtimbers)
4ed483d Updated clusters (finbarrtimbers)
a342d68 updated script (finbarrtimbers)
e83fb31 updated script (finbarrtimbers)
459c9b0 updated script (finbarrtimbers)
383ad6c Merge branch 'main' into async-engine (finbarrtimbers)
a7cfcee Updated script (finbarrtimbers)
cd05a9c Updated script to match launch_benchmark.sh (finbarrtimbers)
1e76887 Fixed bug. (finbarrtimbers)
7f41a14 updated priority (finbarrtimbers)
c58e055 Fixed kv_cache_specs (finbarrtimbers)
b1e6167 Fixed kv_cache_specs (finbarrtimbers)
9e6920b Added logging (finbarrtimbers)
5028e72 Fixed methods (finbarrtimbers)
dd37616 Ran linter. (finbarrtimbers)
43dfecb Fix blocking ray.get in async actor (finbarrtimbers)
076d310 Improve async _should_stop to prevent blocking and duplicate requests (finbarrtimbers)
7831265 Added timeouts for all the scripts (finbarrtimbers)
706264b Now, we always run the engine loop. (finbarrtimbers)
bb0b08f Fixed engine initialization. (finbarrtimbers)
4a4715c added await to generator. (finbarrtimbers)
007f089 Changed loggign for vllm (finbarrtimbers)
6658cc1 Fixed logging levels (finbarrtimbers)
8195d65 Added more timing (finbarrtimbers)
aa523a5 Fixed deadlock (finbarrtimbers)
70e8778 set logging to debug for vllm (finbarrtimbers)
fa96dfe set logging to debug for vllm_utils3.py (finbarrtimbers)
94c8185 Fixed timeout bug (finbarrtimbers)
48a8324 an attempted fix (finbarrtimbers)
03fe974 Add async engine implementation for vLLM (finbarrtimbers)
5c9b277 Merge branch 'async-engine-rebased' into async-engine (finbarrtimbers)
7a9fd8c Attempt at fixing (finbarrtimbers)
1ce0b89 Add detailed logging to track vLLM generation hang (finbarrtimbers)
442525c fixed error (finbarrtimbers)
f2d18da Fixed wait stalling (finbarrtimbers)
da54862 fix issues (finbarrtimbers)
534953e Add comprehensive logging to debug queue hang issue (finbarrtimbers)
8aa3ff9 Add detailed logging to trace request flow through queues (finbarrtimbers)
10cce87 Fix process_from_queue hanging issue (finbarrtimbers)
a43772b Add assertion to detect race condition in request ID reuse (finbarrtimbers)
bd9306f Fix premature exit in vllm_utils3 when using tools (finbarrtimbers)
9a21126 Removed metadata from logs (finbarrtimbers)
338c987 Add detailed logging to _process_request to debug tool use hang (finbarrtimbers)
9754f52 Add detailed logging around tokenizer access to debug hang (finbarrtimbers)
a52d9b6 Updated endpoint (finbarrtimbers)
3763476 Add detailed logging between lines 653-682 to identify exact hang loc… (finbarrtimbers)
7828a69 Add detailed logging to debug token_ids access hang (finbarrtimbers)
1e3eb39 Fix tuple/list concatenation issue in vllm_utils3.py (finbarrtimbers)
9fc1bb9 Fix tuple concatenation issue in tool execution path (finbarrtimbers)
e03df7d Add detailed debugging to track types during token concatenation (finbarrtimbers)
bc303d5 Fix attribute access for max_model_len in tool execution path (finbarrtimbers)
6106303 updated to use timeouterror (finbarrtimbers)
4a05808 fixed syntax error (finbarrtimbers)
8692271 More logging (finbarrtimbers)
ba528ff Updated code (finbarrtimbers)
765470f Fixed loop (finbarrtimbers)
3833725 Fixed logging (finbarrtimbers)
82cf644 Attempted to mirror the synchronous loop. (finbarrtimbers)
517f7d3 hold the lock less. (finbarrtimbers)
8ec2912 less frequent logging (finbarrtimbers)
38ac8e3 Updated behaviour to match (finbarrtimbers)
c77145c Fixed max exceeded calls (finbarrtimbers)
e7f8809 Updated tool path behaviour (finbarrtimbers)
09abb3d Fix request ID collision in async engine for tool continuations (finbarrtimbers)
41ac860 Removed the lock (finbarrtimbers)
a4d78f9 CLeaned up PR. (finbarrtimbers)
bf9eb4c Update async engine (#1043) (finbarrtimbers)
6268971 Removed debugging code (finbarrtimbers)
fdbf577 Ran linter (finbarrtimbers)
c768c5a Merge branch 'main' into async-engine (finbarrtimbers)
46d0ff0 Fixed issue. (finbarrtimbers)
bba171a Minimized differences between new code and old. (finbarrtimbers)
bc6bb3b Merge branch 'main' into async-engine (finbarrtimbers)
70de1f5 Fixed cluster warning in large_test_script.sh. (finbarrtimbers)
294b5c1 Cleaned up PR. (finbarrtimbers)
c33e091 Updated assert threaded actor class. (finbarrtimbers)
75dc516 Fixed class. (finbarrtimbers)
ac07388 Merge branch 'main' into async-engine (finbarrtimbers)
aa7d0b2 Set default values for large_test_script.sh (finbarrtimbers)
7400450 set enforce eager (finbarrtimbers)
65cc8f2 now, we set infligth updates false (finbarrtimbers)
b909b84 Now, we don't set enforce eager. (finbarrtimbers)
dec49db Merge branch 'main' into async-engine (finbarrtimbers)
c1cbe4a Updated large_test_script.sh (finbarrtimbers)
b45c123 Fixed env var issue (finbarrtimbers)
ec252e3 Now we set inflight updates true (finbarrtimbers)
d44d2a7 trying to start/stop background loop (finbarrtimbers)
8697831 Merge branch 'main' into async-engine (finbarrtimbers)
e78f6fc Removed start of loop (finbarrtimbers)
2a17b18 now, we use sleep/wake_up to make things work. (finbarrtimbers)
77f48b8 Set inflight true on single (finbarrtimbers)
27494dd Removed sleep/wakeup code (finbarrtimbers)
8d6dad9 Updated code (finbarrtimbers)
2642fe3 Fixed typo (finbarrtimbers)
4f069a2 Ran linter (finbarrtimbers)
3acc95c Fixed bug (finbarrtimbers)
6b6ce66 switched to use the v1 engine. (finbarrtimbers)
5d9af9c updated code (finbarrtimbers)
fc678b4 Fixed issue where we were calling v0 APIs. (finbarrtimbers)
f9f9d13 Fixed hanging issue (finbarrtimbers)
9863c82 Updated code to remove pause_generation calls. (finbarrtimbers)
77f3ffa updated code (finbarrtimbers)
b85eb97 Fixed abort issue (finbarrtimbers)
36810e6 updated code (finbarrtimbers)
d395428 Add diagnostic logging and fix vLLM v1 compatibility (finbarrtimbers)
25fa2ce Set vllm logs to debug (finbarrtimbers)
cdfebaf Updated vllm version to 10.2. (finbarrtimbers)
2d5e9a9 Updated flash attn version (finbarrtimbers)
0896efa Ran uv sync (finbarrtimbers)
7c14485 Fix AsyncLLMEngine hanging by creating it within running event loop (finbarrtimbers)
a38b870 Move _init_engine_async to module-level function (finbarrtimbers)
191e6dd Add comprehensive diagnostic logging for async task tracing (finbarrtimbers)
17e647d Add diagnostic logging to trace process_from_queue exit behavior (finbarrtimbers)
ec05c5a Add diagnostic logging for weight sync stop_requested toggle (finbarrtimbers)
55517f6 Add diagnostic logging to trace weight broadcast deadlock (finbarrtimbers)
5a729b4 Add event loop diagnostic logging to update_weight (finbarrtimbers)
a599b0d Fix event loop mismatch in async RPC calls (finbarrtimbers)
302e4d5 Add assertions to verify event loop consistency (finbarrtimbers)
2eb35de Fix async RPC deadlock by using sync-to-async bridge (finbarrtimbers)
8729027 Tried more fixes (finbarrtimbers)
0b3511d Updated to remove generate thread (finbarrtimbers)
87ccb6d Updated code to add processing (finbarrtimbers)
294577d Chnage architecture (finbarrtimbers)
149d6ec Now we set vllm_insecure (finbarrtimbers)
d647af4 Set inflight false (finbarrtimbers)
30b531d removed message serialization (finbarrtimbers)
8179d30 removed some logs (finbarrtimbers)
d167793 Another attempt to fix hang (finbarrtimbers)
2647209 Merge branch 'main' into async-engine (finbarrtimbers)
c3f3966 lots of logging changes (finbarrtimbers)
2eafb9c Ran linter. (finbarrtimbers)
b542744 Reset scripts. (finbarrtimbers)
292a272 Undid changes to mason.py (finbarrtimbers)
c9b9c59 Cleaned up PR. (finbarrtimbers)
7c6fbef Cleaned up PR. (finbarrtimbers)
d5cd6f7 Cleaned up PR. (finbarrtimbers)
e019aa8 Cleaned up PR. (finbarrtimbers)
2c80999 Removed timeouterrro (finbarrtimbers)
e72cfa5 Cleaned up PR (finbarrtimbers)
5c4d405 Uses async for (finbarrtimbers)
f33190b Now, we handle tools. (finbarrtimbers)
ad6986f Cleaned assert code. (finbarrtimbers)
5d10dc3 Attempty at fixing code. (finbarrtimbers)
f690312 Cleaned up assert (finbarrtimbers)
c1f83c0 Another attempt at fixing the bug (finbarrtimbers)
4165820 Updated code (finbarrtimbers)
cc3ba93 Fix tool execution hanging by using dedicated executor instead of asy… (finbarrtimbers)
daba023 Fix async event loop issue - use get_running_loop() instead of get_ev… (finbarrtimbers)
72eb57e Add logging to check if executor is None during tool execution (finbarrtimbers)
1d42889 Add detailed logging to track tool execution and triggering (finbarrtimbers)
591d11f Fix async event loop hanging by using unique request IDs for each ite… (finbarrtimbers)
bf6728a Add detailed logging to trace async generation hang (finbarrtimbers)
d1f0a50 Add detailed logging after tool execution to trace iteration hang (finbarrtimbers)
b599b15 Add fine-grained logging to debug model config access hang (finbarrtimbers)
08db870 Add detailed logging to debug prompt concatenation hang (finbarrtimbers)
a59d9f0 Cache prompt_token_ids to avoid hang when accessing TokensPrompt prop… (finbarrtimbers)
3675f06 Fix undefined variable in assert_threaded_actor (finbarrtimbers)
9a9a5c0 Updated code (finbarrtimbers)
cafc0d2 Set inflight false (finbarrtimbers)
49724e8 Fixed duplicate flag (finbarrtimbers)
7a49068 Simplified significantly (finbarrtimbers)
be451be Removed logs (finbarrtimbers)
2cf5750 Simplified threading model (finbarrtimbers)
025fb40 Added handling for inflight_updates (finbarrtimbers)
97610cd Inlined generate_one_completion (finbarrtimbers)
10f9e1a Clean up (finbarrtimbers)
f32879f More clean up (finbarrtimbers)
cb10def Set inflight to true (finbarrtimbers)
9708fc4 Cleaned up code. (finbarrtimbers)
fff7d87 lots of cleanup (finbarrtimbers)
03db661 Major refactor (finbarrtimbers)
e0a0960 More PR cleanup (finbarrtimbers)
60897fe Merge branch 'main' into async-engine (finbarrtimbers)
2f592b5 Fixed code (finbarrtimbers)
d2203de Cleaned up code. (finbarrtimbers)
35b4a1c Merge branch 'main' into async-engine (finbarrtimbers)
e42a953 undid changes (finbarrtimbers)
e51ecce Removed self.logger (finbarrtimbers)
6cdf576 A bunch of changes to minimize differences. (finbarrtimbers)
6e2b3d0 Merge branch 'main' into async-engine (finbarrtimbers)
84ecb7a fixed error (finbarrtimbers)
dcbfdf7 Cleane dup code. (finbarrtimbers)
710919c use mp (finbarrtimbers)
8372871 Set multiprocessing (finbarrtimbers)
f74cc92 Updated code (finbarrtimbers)
0994057 trying more changes (finbarrtimbers)
a963332 fixing logprob issue (finbarrtimbers)
0b8d273 Update code (finbarrtimbers)
9e90229 Old async engine2 (#1075) (finbarrtimbers)
9ec9622 Merge branch 'main' into async-engine (finbarrtimbers)
6c58ad6 Merge branch 'main' into async-engine (finbarrtimbers)
c6eec7a Merge branch 'main' into async-engine (finbarrtimbers)
e2824d5 Ran uv sync (finbarrtimbers)
8a81bcf Simpler pyproject.toml (finbarrtimbers)
76bbbf0 Updated uv.lock (finbarrtimbers)
d7f3f78 Updated. (finbarrtimbers)
eab53dd Cleaned up code. (finbarrtimbers)
8eaeaad Updated code to correctly calculate kv_cache_info (finbarrtimbers)
3e77b18 Updated tests. (finbarrtimbers)
599e5b0 Updated code to use public APIs. (finbarrtimbers)
de8ad2b Updated kv cache method (finbarrtimbers)
a8645b1 Another attempt at fixing kv cache specs (finbarrtimbers)
353561a Updated code (finbarrtimbers)
a7dc5c7 Added health check on loop (finbarrtimbers)
ff23618 Attempt at fixing (finbarrtimbers)
04cca39 updated kv cache code (finbarrtimbers)
7b5a7d4 removed recursive health check (finbarrtimbers)
16ee997 Merge branch 'main' into async-engine (finbarrtimbers)
f3aaff1 Merge branch 'main' into async-engine (finbarrtimbers)
5a625df Removed constant (finbarrtimbers)
9a04eba fix tool use (hamishivi)
c7e403f prevent concurrency issue (hamishivi)
Bug: Token ID Mismatch in Mock Request
The `prompt_token_ids` in `mock_request_output` is inconsistent with the `prompt_token_ids` defined in `request_metadata` across two test cases. This mismatch in token counts could lead to inaccurate test validation.
Additional Locations (1)
open_instruct/test_vllm_utils3.py#L134-L135
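One way to avoid this kind of drift is to build both test fixtures from a single prompt list. The sketch below is illustrative only and does not reproduce the code in `test_vllm_utils3.py`: `MockRequestOutput`, `make_mock_pair`, `prompts_match`, and the `"req-0"` request ID are hypothetical names, and the dict layout of `request_metadata` is an assumption.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class MockRequestOutput:
    """Hypothetical stand-in for the mocked request output used in a test."""
    request_id: str
    prompt_token_ids: List[int]


def make_mock_pair(
    request_id: str, prompt_token_ids: List[int]
) -> Tuple[MockRequestOutput, Dict[str, dict]]:
    """Build the mock output and its metadata entry from one shared prompt.

    Both objects receive a copy of the same list, so their token counts
    cannot diverge when a test case is edited.
    """
    mock_request_output = MockRequestOutput(request_id, list(prompt_token_ids))
    # Assumed layout: metadata keyed by request ID.
    request_metadata = {request_id: {"prompt_token_ids": list(prompt_token_ids)}}
    return mock_request_output, request_metadata


def prompts_match(output: MockRequestOutput, metadata: Dict[str, dict]) -> bool:
    """Return True when the output's prompt equals the one stored in metadata."""
    return output.prompt_token_ids == metadata[output.request_id]["prompt_token_ids"]
```

A test can then call `make_mock_pair("req-0", [1, 2, 3])` once and assert `prompts_match(...)` before exercising the code under test, so a mismatch fails early instead of skewing the result being validated.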