Commit 9d5b4e4d (Unverified)
Authored Nov 11, 2024 by Woosuk Kwon
Committed by GitHub, Nov 11, 2024

[V1] Enable custom ops with piecewise CUDA graphs (#10228)

Signed-off-by: Woosuk Kwon <woosuk.kwon@berkeley.edu>

Parent: 8a7fe47d

Showing 1 changed file with 2 additions and 0 deletions.

vllm/v1/worker/gpu_model_runner.py @ 9d5b4e4d

+import os
 import time
 from dataclasses import dataclass
 from typing import TYPE_CHECKING, Dict, List, Optional, Set
...
@@ -405,6 +406,7 @@ class GPUModelRunner:
         if self.use_cuda_graph:
             # FIXME(woosuk): Currently, we do not use inductor to reduce the
             # compilation time and any potential issues with the inductor.
+            os.environ["VLLM_CUSTOM_OPS"] = "all"
             set_compilation_config(
                 CompilationConfig(
                     use_cudagraph=True,
...
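
For readers unfamiliar with the pattern, the change sets an environment variable before the compilation config is built, so that code reading it later sees the new value. The sketch below illustrates only that ordering. It is not vLLM's implementation: `read_custom_ops`, `requests_all_custom_ops`, and `configure` are hypothetical helpers, and the comma-separated parsing of `VLLM_CUSTOM_OPS` is an assumption. A plain dict stands in for `os.environ` so the example has no side effects.

```python
import os
from typing import List, MutableMapping


def read_custom_ops(env: MutableMapping[str, str]) -> List[str]:
    """Hypothetical reader: split VLLM_CUSTOM_OPS on commas, ignoring spaces.

    Assumption: the variable holds a comma-separated list of tokens.
    """
    raw = env.get("VLLM_CUSTOM_OPS", "")
    return [tok for tok in raw.replace(" ", "").split(",") if tok]


def requests_all_custom_ops(ops: List[str]) -> bool:
    """Hypothetical check: True when the token "all" is present."""
    return "all" in ops


def configure(use_cuda_graph: bool, env: MutableMapping[str, str]) -> bool:
    """Mirror the ordering in the diff: set the variable first, read it after."""
    if use_cuda_graph:
        # Same assignment as the added line, applied to the mapping passed in.
        env["VLLM_CUSTOM_OPS"] = "all"
    # Anything that consults the variable after this point sees the new value.
    return requests_all_custom_ops(read_custom_ops(env))


if __name__ == "__main__":
    fake_env: MutableMapping[str, str] = {}  # stand-in for os.environ
    print(configure(True, fake_env), fake_env)
```

Passing `os.environ` instead of the dict would apply the same assignment to the real process environment, which is what the committed line does.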