OpenDAS / vllm_cscc
Commit db561fb6, authored Feb 04, 2025 by zhuwenwen
update cache_kernels.cu
parent afd0da21
Showing 1 changed file with 2 additions and 1 deletion
csrc/cache_kernels.cu
@@ -371,6 +371,7 @@ __global__ void read_cache_kernel(
      value[tgt_value_idx] = fp8::scaled_convert<scalar_t, cache_t, kv_dt>(tgt_value, 1.0);
    }
  }
}
template <typename scalar_t, typename cache_t, Fp8KVCacheDataType kv_dt>
...
@@ -660,6 +661,7 @@ void write_cache_multi_layers(
}
#define CALL_CONCAT_AND_CACHE_MLA(KV_T, CACHE_T, KV_DTYPE)     \
  vllm::concat_and_cache_mla_kernel<KV_T, CACHE_T, KV_DTYPE>   \
      <<<grid, block, 0, stream>>>(                            \
...
@@ -707,7 +709,6 @@ void concat_and_cache_mla(
      CALL_CONCAT_AND_CACHE_MLA);
}
namespace vllm {
template <typename Tout, typename Tin, Fp8KVCacheDataType kv_dt>
...