OpenDAS / vllm_cscc
Commit 533d1932
Authored Jul 31, 2024 by Woosuk Kwon. Committed by GitHub, Jul 31, 2024.
[Bugfix][TPU] Set readonly=True for non-root devices (#6980)
Parent: 9f0e69b6
Showing 1 changed file with 4 additions and 1 deletion

vllm/worker/tpu_worker.py (+4, -1)
@@ -104,7 +104,10 @@ class TPUWorker(LoraNotSupportedWorkerBase, LocalOrDistributedWorkerBase):
         # Use persistent cache to avoid XLA recompilation.
         # NOTE(woosuk): This does not completely eliminate the recompilation
         # overhead because dynamo does not cache the compiled results.
-        xr.initialize_cache(envs.VLLM_XLA_CACHE_PATH, readonly=False)
+        # NOTE(woosuk): Set readonly=False only for the rank 0 process to avoid
+        # race conditions.
+        xr.initialize_cache(envs.VLLM_XLA_CACHE_PATH,
+                            readonly=not self.is_driver_worker)
 
     def load_model(self):
         self.model_runner.load_model()
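The single-writer rule this change introduces can be sketched in isolation. The following is a minimal illustration, not the vLLM implementation: `CacheConfig`, `cache_config_for_worker`, and the path `/tmp/xla_cache` are hypothetical names invented for the example, and no XLA call is made. It only shows that deriving `readonly` from `not is_driver_worker` leaves exactly one process (rank 0) with write access to the shared cache directory, which is how the commit avoids concurrent writers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CacheConfig:
    """Hypothetical stand-in for the arguments passed to the cache initializer."""
    path: str
    readonly: bool


def cache_config_for_worker(cache_path: str, is_driver_worker: bool) -> CacheConfig:
    # Mirrors the expression in the diff: only the driver (rank 0) opens the
    # persistent cache for writing; every other worker opens it read-only.
    return CacheConfig(path=cache_path, readonly=not is_driver_worker)


# Assumption for illustration: four workers, with rank 0 acting as the driver.
configs = [
    cache_config_for_worker("/tmp/xla_cache", is_driver_worker=(rank == 0))
    for rank in range(4)
]

# Workers that are allowed to write to the shared cache directory.
writers = [rank for rank, cfg in enumerate(configs) if not cfg.readonly]
```

With this rule, non-driver workers can still read previously compiled entries from the cache; they just never write new ones.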