OpenDAS / vllm_cscc
Commit 5d5146ee (Unverified)
Authored Oct 02, 2025 by Michael Goin
Committed by GitHub, Oct 02, 2025
[CI/Build] Conditionally register cutlass_fp4_group_mm to fix building on Hopper (#26138)
Signed-off-by: mgoin <mgoin64@gmail.com>
parent 2aaa4238
Showing 2 changed files with 7 additions and 1 deletion
csrc/quantization/fp4/nvfp4_blockwise_moe_kernel.cu  +6 -0
csrc/torch_bindings.cpp  +1 -1
csrc/quantization/fp4/nvfp4_blockwise_moe_kernel.cu
...
@@ -14,6 +14,8 @@
  * limitations under the License.
  */
+#include "core/registration.h"
 #include <torch/all.h>
 #include <cutlass/arch/arch.h>
...
@@ -418,3 +420,7 @@ void cutlass_fp4_group_mm(
       "12.8 or above.");
 #endif
 }
+TORCH_LIBRARY_IMPL_EXPAND(TORCH_EXTENSION_NAME, CUDA, m) {
+  m.impl("cutlass_fp4_group_mm", &cutlass_fp4_group_mm);
+}
csrc/torch_bindings.cpp
...
@@ -397,7 +397,7 @@ TORCH_LIBRARY_EXPAND(TORCH_EXTENSION_NAME, ops) {
       " Tensor a_blockscale, Tensor b_blockscales, Tensor alphas,"
       " Tensor problem_sizes, Tensor expert_offsets, Tensor sf_offsets) -> ()",
       {stride_tag});
-  ops.impl("cutlass_fp4_group_mm", torch::kCUDA, &cutlass_fp4_group_mm);
+  // conditionally compiled so impl registration is in source file
   // CUTLASS w8a8 GEMM, supporting symmetric per-tensor or per-row/column
   // quantization, as well as bias
...
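The change above moves the `impl` registration out of the always-compiled bindings file and into the kernel source, which the inline comment describes as conditionally compiled. A plausible reading is that taking `&cutlass_fp4_group_mm` in `torch_bindings.cpp` needs the symbol at link time, so the build fails whenever the kernel file is left out. The sketch below illustrates that self-registration pattern in plain C++ without PyTorch. Everything in it (`registry`, `Registrar`, `has_impl`, `call`, `ENABLE_FP4_SKETCH`, and the `*_sketch` functions) is hypothetical and stands in for the real dispatcher and macros; it is not the vLLM or PyTorch implementation.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for a dispatcher table: op name -> implementation.
using Kernel = std::function<int(int)>;

std::map<std::string, Kernel>& registry() {
  // Function-local static, so it is constructed on first use and is safe
  // to touch from other static initializers.
  static std::map<std::string, Kernel> table;
  return table;
}

// Registers a kernel from its constructor, i.e. during static initialization.
struct Registrar {
  Registrar(const std::string& name, Kernel k) {
    registry()[name] = std::move(k);
  }
};

// "Bindings" side: always compiled. It looks kernels up by name only and
// never takes the address of a kernel function, so nothing here can become
// an undefined symbol when a kernel is left out of the build.
bool has_impl(const std::string& name) { return registry().count(name) != 0; }

int call(const std::string& name, int x) {
  assert(has_impl(name) && "no implementation registered for this op");
  return registry().at(name)(x);
}

// An op that is always built, registered next to its definition.
int always_on_sketch(int x) { return x + 1; }
static Registrar always_on_registrar("always_on_sketch", &always_on_sketch);

// "Kernel source" side: built only when the (hypothetical) flag is set.
// The registration sits next to the definition, so both appear or vanish
// together.
#ifdef ENABLE_FP4_SKETCH
constexpr bool kFp4Built = true;
int fp4_group_mm_sketch(int x) { return 2 * x; }
static Registrar fp4_registrar("fp4_group_mm_sketch", &fp4_group_mm_sketch);
#else
constexpr bool kFp4Built = false;
#endif
```

From a `main` function, `call("always_on_sketch", 3)` dispatches through the table, and `has_impl("fp4_group_mm_sketch")` reports whether the optional kernel was compiled in. Compiling with or without `-DENABLE_FP4_SKETCH` succeeds either way, because the lookup code never references the optional function directly.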