Add step_map to track token decoding order in DLLM #4057
+200
−15
Motivation
In DLLM (Diffusion LLM) mode, tokens are generated in blocks and progressively unmasked in non-sequential order. Currently, there is no way to tell at which decoding step each token was revealed. This information is valuable for:
This PR adds a `step_map` field to track the decoding step number for each generated token.
Modification
This PR introduces a `step_map` feature that records the step at which each token was decoded in DLLM mode:

Core tracking logic (`lmdeploy/pytorch/strategies/dllm/sequence.py`):
- Add `history_step_map` field to `SchedulerSequenceDLLM` to store step numbers
- Add `_current_step` counter to track decoding steps
- Add `step_map` and `generated_step_map` properties
- Update `_update_token_ids_decode()` to record step numbers when tokens transition from MASKED to UNMASKED

Engine layer (`lmdeploy/pytorch/engine/engine.py`):
- Add `step_map` field to `InferOutput` dataclass
- Extract `step_map` from messages in `_make_infer_outputs()`
- Propagate `step_map` through response data

Instance layer (`lmdeploy/pytorch/engine/engine_instance.py`):
- Pass `step_map` to `EngineOutput`

API layer (`lmdeploy/messages.py`):
- Add `step_map` field to `Response` dataclass
- Add `step_map` field to `EngineOutput` dataclass
- Update `Response.__repr__()` to display `step_map`

Async engine layer (`lmdeploy/serve/async_engine.py`):
- Add `step_map` field to `GenOut` dataclass
- Update `_gen_out_to_response()` to pass `step_map`
- Update `_append_response()` to accumulate `step_map` across iterations

How it works:
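A minimal sketch of the tracking idea described above, not the code from this PR. It assumes a toy `StepTracker` class, a hypothetical `MASK_ID` sentinel for masked positions, and a step value of `0` for positions not yet revealed; the real `SchedulerSequenceDLLM` fields and mask representation may differ.

```python
from dataclasses import dataclass, field
from typing import List

MASK_ID = -1     # hypothetical sentinel for a still-masked position
NOT_DECODED = 0  # hypothetical step value for positions not yet revealed


@dataclass
class StepTracker:
    """Toy stand-in for per-sequence step bookkeeping (not lmdeploy code)."""
    token_ids: List[int] = field(default_factory=list)
    step_map: List[int] = field(default_factory=list)
    current_step: int = 0

    def add_block(self, block_size: int) -> None:
        """Append a fully masked block of `block_size` positions."""
        self.token_ids.extend([MASK_ID] * block_size)
        self.step_map.extend([NOT_DECODED] * block_size)

    def update(self, new_token_ids: List[int]) -> None:
        """Apply one decoding step given the full updated token list."""
        assert len(new_token_ids) == len(self.token_ids)
        self.current_step += 1
        for pos, (old, new) in enumerate(zip(self.token_ids, new_token_ids)):
            # Record the step only on a MASKED -> UNMASKED transition.
            if old == MASK_ID and new != MASK_ID:
                self.step_map[pos] = self.current_step
        self.token_ids = list(new_token_ids)


tracker = StepTracker()
tracker.add_block(4)
tracker.update([MASK_ID, 11, MASK_ID, MASK_ID])  # step 1 reveals position 1
tracker.update([10, 11, MASK_ID, 13])            # step 2 reveals positions 0 and 3
tracker.update([10, 11, 12, 13])                 # step 3 reveals position 2
print(tracker.step_map)  # [2, 1, 3, 2]
```

Each entry of `step_map` lines up with the token at the same index, so position 1 was revealed first and position 2 last.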
BC-breaking (Optional)
No breaking changes. This is a backward-compatible addition:
- `step_map` field defaults to `None` in all dataclasses
- `None`
Use cases (Optional)
Example usage:
Analysis example:
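A minimal sketch of one possible analysis, assuming `step_map` is a flat list of integers with one entry per generated token. The helper names `tokens_per_step`, `decoding_order`, and `out_of_order_fraction` are hypothetical and not part of lmdeploy; the sample list is made up for illustration.

```python
from collections import Counter
from typing import Dict, List


def tokens_per_step(step_map: List[int]) -> Dict[int, int]:
    """Count how many tokens were revealed at each decoding step."""
    return dict(sorted(Counter(step_map).items()))


def decoding_order(step_map: List[int]) -> List[int]:
    """Token positions sorted by reveal step; ties keep left-to-right order."""
    return sorted(range(len(step_map)), key=lambda i: step_map[i])


def out_of_order_fraction(step_map: List[int]) -> float:
    """Fraction of adjacent pairs whose right token was revealed strictly earlier."""
    if len(step_map) < 2:
        return 0.0
    inversions = sum(1 for a, b in zip(step_map, step_map[1:]) if b < a)
    return inversions / (len(step_map) - 1)


# Illustrative step_map for four generated tokens (made-up values).
step_map = [2, 1, 3, 2]
print(tokens_per_step(step_map))   # {1: 1, 2: 2, 3: 1}
print(decoding_order(step_map))    # [1, 0, 3, 2]
print(out_of_order_fraction(step_map))
```

A fraction of `0.0` would indicate strictly left-to-right decoding between neighbours, while larger values indicate more non-sequential unmasking.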
This helps researchers:
Checklist