Commit

[Bugfix][TPU] Set readonly=True for non-root devices (vllm-project#6980)
Signed-off-by: Alvant <[email protected]>
WoosukKwon authored and Alvant committed Oct 26, 2024
1 parent 4b14188 commit 205f811
Showing 1 changed file with 4 additions and 1 deletion.
5 changes: 4 additions & 1 deletion vllm/worker/tpu_worker.py
@@ -104,7 +104,10 @@ def init_device(self) -> None:
         # Use persistent cache to avoid XLA recompilation.
         # NOTE(woosuk): This does not completely eliminate the recompilation
         # overhead because dynamo does not cache the compiled results.
-        xr.initialize_cache(envs.VLLM_XLA_CACHE_PATH, readonly=False)
+        # NOTE(woosuk): Set readonly=False only for the rank 0 process to avoid
+        # race conditions.
+        xr.initialize_cache(envs.VLLM_XLA_CACHE_PATH,
+                            readonly=not self.is_driver_worker)

     def load_model(self):
         self.model_runner.load_model()
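
For illustration only, the flag logic in this change can be sketched without torch_xla. The `CacheConfig` record, the `cache_config_for_worker` helper, the `/tmp/xla_cache` path, and the assumption that rank 0 is the driver worker in a 4-worker job are all hypothetical and not part of vLLM; the only piece taken from the diff is the expression `readonly = not is_driver_worker`.

```python
from dataclasses import dataclass


@dataclass
class CacheConfig:
    """Hypothetical record of how one worker would open the XLA cache."""
    path: str
    readonly: bool


def cache_config_for_worker(path: str, is_driver_worker: bool) -> CacheConfig:
    # Mirrors `readonly=not self.is_driver_worker` from the diff:
    # only the driver worker is allowed to write to the cache.
    return CacheConfig(path=path, readonly=not is_driver_worker)


# Assumption for this sketch: rank 0 is the driver worker of a 4-worker job.
configs = [
    cache_config_for_worker("/tmp/xla_cache", is_driver_worker=(rank == 0))
    for rank in range(4)
]

# Collect the ranks that would open the cache with write access.
writers = [rank for rank, cfg in enumerate(configs) if not cfg.readonly]
```

Under these assumptions exactly one worker opens the cache writable, which is the property the commit relies on to avoid concurrent writers racing on the same cache directory.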
