OpenDAS / vllm_cscc
Commit 966c742e
Authored Apr 16, 2025 by Richard Zou
Committed by GitHub Apr 15, 2025
Disable remote caching when calling compile_fx (#16611)
Signed-off-by: rzou <zou3519@gmail.com>
parent 0d7d05f4
Showing 1 changed file with 13 additions and 0 deletions
vllm/compilation/compiler_interface.py (+13 -0)
@@ -290,6 +290,19 @@ class InductorAdaptor(CompilerInterface):
             # Dynamo metrics context, see method for more details.
             stack.enter_context(self.metrics_context())
+
+            # Disable remote caching. When these are on, on remote cache-hit,
+            # the monkey-patched functions never actually get called.
+            # vLLM today assumes and requires the monkey-patched functions to
+            # get hit.
+            # TODO(zou3519): we're going to replace this all with
+            # standalone_compile sometime.
+            if is_torch_equal_or_newer("2.6"):
+                stack.enter_context(
+                    torch._inductor.config.patch(fx_graph_remote_cache=False))
+                stack.enter_context(
+                    torch._functorch.config.patch(
+                        enable_remote_autograd_cache=False))
             compiled_graph = compile_fx(
                 graph,
                 example_inputs,
...
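For readers unfamiliar with the pattern in this hunk, here is a minimal sketch of how scoped config overrides compose on a `contextlib.ExitStack`. It does not import torch: `FakeConfig`, `inductor_config`, `functorch_config`, `fake_compile`, and `compile_with_remote_cache_disabled` are hypothetical stand-ins, and the `True` defaults are chosen only to make the override visible. They are not the real defaults of `torch._inductor.config` or `torch._functorch.config`.

```python
from contextlib import ExitStack, contextmanager


class FakeConfig:
    """Hypothetical stand-in for a config module that exposes patch()."""

    def __init__(self, **defaults):
        self.__dict__.update(defaults)

    @contextmanager
    def patch(self, **overrides):
        # Save the current values, apply the overrides, restore on exit.
        saved = {name: getattr(self, name) for name in overrides}
        self.__dict__.update(overrides)
        try:
            yield
        finally:
            self.__dict__.update(saved)


# Assumed defaults, for illustration only.
inductor_config = FakeConfig(fx_graph_remote_cache=True)
functorch_config = FakeConfig(enable_remote_autograd_cache=True)


def compile_with_remote_cache_disabled(compile_fn):
    """Run compile_fn with both remote-cache flags forced off."""
    with ExitStack() as stack:
        stack.enter_context(
            inductor_config.patch(fx_graph_remote_cache=False))
        stack.enter_context(
            functorch_config.patch(enable_remote_autograd_cache=False))
        # Both overrides are active only while compile_fn runs.
        return compile_fn()


def fake_compile():
    # Stand-in for compile_fx: report the flags it observes.
    return (inductor_config.fx_graph_remote_cache,
            functorch_config.enable_remote_autograd_cache)


seen_during_compile = compile_with_remote_cache_disabled(fake_compile)
```

Because the overrides are entered on the stack rather than assigned globally, the previous values come back once the `with` block exits, even if the compile step raises.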