OpenDAS / vllm_cscc
Commit 5f2a473f (Unverified)
Authored Jan 08, 2026 by Andreas Karatzas
Committed by GitHub, Jan 08, 2026
[ROCm][CI] v1 cpu offloading attention backend fix (#31833)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
Parent: 6b2a672e
Showing 1 changed file with 4 additions and 2 deletions
tests/v1/kv_offload/test_cpu_offloading.py (+4 -2)
@@ -15,10 +15,12 @@ from vllm.distributed.kv_events import BlockStored, KVEventBatch
 from vllm.platforms import current_platform
 
 CPU_BLOCK_SIZES = [48]
-ATTN_BACKENDS = ["FLASH_ATTN", "TRITON_ATTN"]
+ATTN_BACKENDS = []
 
 if current_platform.is_cuda():
-    ATTN_BACKENDS.append("FLASHINFER")
+    ATTN_BACKENDS = ["FLASH_ATTN", "FLASHINFER", "TRITON_ATTN"]
+elif current_platform.is_rocm():
+    ATTN_BACKENDS = ["TRITON_ATTN"]
 
 
 class MockSubscriber:
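The effect of this change on each platform can be sketched in isolation. The helper below is hypothetical (it is not part of the commit, and `select_attn_backends` is an illustrative name); it takes plain booleans in place of `current_platform.is_cuda()` and `current_platform.is_rocm()` so it runs without vLLM installed.

```python
def select_attn_backends(is_cuda: bool, is_rocm: bool) -> list[str]:
    """Hypothetical stand-in for the module-level selection in the new version."""
    if is_cuda:
        # CUDA keeps all three backends, as before the change.
        return ["FLASH_ATTN", "FLASHINFER", "TRITON_ATTN"]
    if is_rocm:
        # ROCm is now restricted to the Triton backend only.
        return ["TRITON_ATTN"]
    # Any other platform now gets an empty list instead of the old
    # ["FLASH_ATTN", "TRITON_ATTN"] default.
    return []


cuda_backends = select_attn_backends(is_cuda=True, is_rocm=False)
rocm_backends = select_attn_backends(is_cuda=False, is_rocm=True)
other_backends = select_attn_backends(is_cuda=False, is_rocm=False)

print("CUDA: ", cuda_backends)
print("ROCm: ", rocm_backends)
print("Other:", other_backends)
```

Note that the set of backends on CUDA is unchanged (only the order differs), while ROCm no longer includes `FLASH_ATTN`. If the list feeds a `pytest.mark.parametrize` decorator, which is an assumption since the diff does not show the test functions, an empty list would cause pytest to skip those tests on other platforms.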