OpenDAS / vllm_cscc

Commit bc956b38 (Unverified)
Authored Jun 14, 2025 by Huy Do; committed by GitHub on Jun 14, 2025
Parent: 294fc1e2

Only build CUTLASS MoE kernels on Hopper (#19648)
Showing 1 changed file with 2 additions and 2 deletions.
CMakeLists.txt @ bc956b38

...
@@ -542,10 +542,10 @@ if(VLLM_GPU_LANG STREQUAL "CUDA")
   # CUTLASS MoE kernels
-  # The MoE kernel cutlass_moe_mm requires CUDA 12.3 or later (and only works
+  # The MoE kernel cutlass_moe_mm requires CUDA 12.3 or later (and ONLY works
   # on Hopper). get_cutlass_(pplx_)moe_mm_data should only be compiled
   # if it's possible to compile MoE kernels that use its output.
-  cuda_archs_loose_intersection(SCALED_MM_ARCHS "9.0a;10.0a" "${CUDA_ARCHS}")
+  cuda_archs_loose_intersection(SCALED_MM_ARCHS "9.0a" "${CUDA_ARCHS}")
   if(${CMAKE_CUDA_COMPILER_VERSION} VERSION_GREATER_EQUAL 12.3 AND SCALED_MM_ARCHS)
     set(SRCS "csrc/quantization/cutlass_w8a8/moe/grouped_mm_c3x.cu"
              "csrc/quantization/cutlass_w8a8/moe/moe_data.cu")
...
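The gating pattern in this hunk can be illustrated with a small standalone script. This is only a sketch: it uses an exact-match filter as a stand-in and does not reproduce the real `cuda_archs_loose_intersection` helper, whose matching rules are not shown in this diff. The variables `SUPPORTED_ARCHS`, `DEMO_CUDA_ARCHS`, `DEMO_CUDA_VERSION`, and `MOE_ARCHS` are hypothetical names chosen for illustration.

```cmake
# Illustrative stand-in for the gating logic in the hunk above.
# NOT vLLM's cuda_archs_loose_intersection; exact string matching is an assumption.
cmake_minimum_required(VERSION 3.20)

set(SUPPORTED_ARCHS "9.0a")              # after this commit: Hopper only
set(DEMO_CUDA_ARCHS "8.0;9.0a;10.0a")    # hypothetical list of requested targets
set(DEMO_CUDA_VERSION "12.4")            # hypothetical compiler version

# Keep only the requested architectures that the kernels support.
set(MOE_ARCHS "")
foreach(arch IN LISTS DEMO_CUDA_ARCHS)
  if(arch IN_LIST SUPPORTED_ARCHS)
    list(APPEND MOE_ARCHS "${arch}")
  endif()
endforeach()

# Same shape as the condition in the diff: a new enough compiler AND a
# non-empty architecture list are both required before adding sources.
if(DEMO_CUDA_VERSION VERSION_GREATER_EQUAL 12.3 AND MOE_ARCHS)
  message(STATUS "Building CUTLASS MoE kernels for: ${MOE_ARCHS}")
else()
  message(STATUS "Skipping CUTLASS MoE kernels")
endif()
```

Saved as a file (for example `gate_demo.cmake`, a hypothetical name), it can be run with `cmake -P gate_demo.cmake`. With the values above only `9.0a` survives the filter, so the build branch is taken. Changing `DEMO_CUDA_ARCHS` to a list without `9.0a` leaves `MOE_ARCHS` empty and the kernels are skipped, which is the effect the commit aims for on non-Hopper targets.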