OpenDAS / vllm_cscc / Commits / 300a59c4
Commit 300a59c4 (Unverified), authored Oct 03, 2025 by Matthew Bonanni
Committed by GitHub, Oct 03, 2025
Avoid division by zero in cache DS MLA kernel (#26174)
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
parent d76541a6
Changes: 1
Showing 1 changed file with 2 additions and 1 deletion
csrc/cache_kernels.cu (+2 -1), view file @ 300a59c4
--- a/csrc/cache_kernels.cu
+++ b/csrc/cache_kernels.cu
...
@@ -16,7 +16,7 @@
 #include <algorithm>
 #include <cassert>
-#include <cfloat>  // FLT_MIN
+#include <cfloat>
 #ifdef USE_ROCM
 #include <hip/hip_bf16.h>
...
@@ -479,6 +479,7 @@ __global__ void concat_and_cache_ds_mla_kernel(
   // Compute the scale for the tile
   float tile_scale = max_abs / 448.f;
+  tile_scale = fmaxf(tile_scale, FLT_MIN);
   // The first lane of each half-warp writes the scale to kv_cache
   if ((lane_idx == 0) || (lane_idx == 16)) {
...