OpenDAS / vllm_cscc
Commit 1cbbcfe8
Authored Mar 23, 2026 by Nicolò Lucchesi
Committed by GitHub, Mar 23, 2026

[CI][PD] Add Hybrid SSM integration tests to CI (#37657)

Signed-off-by: NickLucche <nlucches@redhat.com>

Parent: aceadb5e
Showing 3 changed files with 14 additions and 2 deletions:

.buildkite/test_areas/distributed.yaml (+11 -0)
tests/v1/kv_connector/nixl_integration/config_sweep_accuracy_test.sh (+2 -2)
tests/v1/kv_connector/nixl_integration/test_accuracy.py (+1 -0)
.buildkite/test_areas/distributed.yaml
@@ -257,6 +257,17 @@ steps:
   - uv pip install --system -r /vllm-workspace/requirements/kv_connectors.txt
   - CROSS_LAYERS_BLOCKS=True bash v1/kv_connector/nixl_integration/config_sweep_accuracy_test.sh

+- label: Hyrbid SSM NixlConnector PD accuracy tests (4 GPUs)
+  timeout_in_minutes: 20
+  working_dir: "/vllm-workspace/tests"
+  num_devices: 4
+  source_file_dependencies:
+  - vllm/distributed/kv_transfer/kv_connector/v1/nixl_connector.py
+  - tests/v1/kv_connector/nixl_integration/
+  commands:
+  - uv pip install --system -r /vllm-workspace/requirements/kv_connectors.txt
+  - HYBRID_SSM=1 bash v1/kv_connector/nixl_integration/config_sweep_accuracy_test.sh
+
 - label: NixlConnector PD + Spec Decode acceptance (2 GPUs)
   timeout_in_minutes: 30
   device: a100
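The new step lists two `source_file_dependencies`: one file and one directory. The sketch below shows one plausible reading of that field, in which a step is selected when any changed path starts with a listed entry. The prefix-matching rule and the helper name `step_is_triggered` are assumptions for illustration. This commit does not show how the pipeline generator actually evaluates the field.

```python
# Hypothetical helper: assumes dependencies are matched as path prefixes.
def step_is_triggered(changed_files, source_file_dependencies):
    """Return True if any changed file falls under a listed dependency path."""
    return any(
        path.startswith(dep)
        for path in changed_files
        for dep in source_file_dependencies
    )


# Dependency entries copied from the new Buildkite step.
deps = [
    "vllm/distributed/kv_transfer/kv_connector/v1/nixl_connector.py",
    "tests/v1/kv_connector/nixl_integration/",
]

# A change to the accuracy test lives under the listed directory.
touched_tests = ["tests/v1/kv_connector/nixl_integration/test_accuracy.py"]
# An unrelated change (hypothetical path) matches neither entry.
touched_docs = ["docs/README.md"]

print(step_is_triggered(touched_tests, deps))
print(step_is_triggered(touched_docs, deps))
```

Under this reading, all three files touched by the commit would select a step with these dependencies only if they sit under `tests/v1/kv_connector/nixl_integration/`; the `.buildkite` change itself would not.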
tests/v1/kv_connector/nixl_integration/config_sweep_accuracy_test.sh
@@ -19,9 +19,9 @@ dp_ep_configs=(
   "DP_EP=1 GPU_MEMORY_UTILIZATION=0.8 PREFILLER_TP_SIZE=2 DECODER_TP_SIZE=2 MODEL_NAMES=deepseek-ai/deepseek-vl2-tiny" # MLA+P-TP2, D-DPEP=2 (TP=1)
 )
 hybrid_ssm_configs=(
-  "ENABLE_HMA_FLAG=1 GPU_MEMORY_UTILIZATION=0.8 MODEL_NAMES=nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-FP8 VLLM_SERVE_EXTRA_ARGS=--max-model-len,8192,--trust-remote-code"
+  "ENABLE_HMA_FLAG=1 GPU_MEMORY_UTILIZATION=0.8 MODEL_NAMES=ibm-granite/granite-4.0-h-tiny VLLM_SERVE_EXTRA_ARGS=--max-model-len,8192,--trust-remote-code"
   # TODO: (NickLucche) Address async scheduling issue with TP>1 separately as this may impact other models.
-  "ENABLE_HMA_FLAG=1 PREFILLER_TP_SIZE=2 DECODER_TP_SIZE=2 GPU_MEMORY_UTILIZATION=0.8 MODEL_NAMES=nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-FP8 VLLM_SERVE_EXTRA_ARGS=--max-model-len,8192,--trust-remote-code,--no-async-scheduling"
+  "ENABLE_HMA_FLAG=1 PREFILLER_TP_SIZE=2 DECODER_TP_SIZE=2 GPU_MEMORY_UTILIZATION=0.8 MODEL_NAMES=ibm-granite/granite-4.0-h-tiny VLLM_SERVE_EXTRA_ARGS=--max-model-len,8192,--trust-remote-code,--no-async-scheduling"
 )

 # Select config array based on DP_EP env var
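Each entry in `hybrid_ssm_configs` packs several `KEY=VALUE` settings into one space-separated string, with `VLLM_SERVE_EXTRA_ARGS` holding a comma-separated argument list. As a rough illustration of that layout, the Python sketch below splits one entry into an environment mapping and a list of extra server arguments. The real script is Bash and its parsing is not shown in this hunk, so `parse_config` is a hypothetical helper and the comma-to-argument expansion is an assumption.

```python
# Hypothetical helper illustrating the config string layout; the actual
# Bash script may consume these strings differently.
def parse_config(config):
    """Split a 'K1=V1 K2=V2 ...' string into (env dict, extra server args)."""
    env = {}
    for token in config.split():
        # Split on the first '=' only, so values may themselves contain '='.
        key, _, value = token.partition("=")
        env[key] = value
    # Assumption: the comma-separated value expands to one argument per item.
    raw_extra = env.pop("VLLM_SERVE_EXTRA_ARGS", "")
    extra_args = [arg for arg in raw_extra.split(",") if arg]
    return env, extra_args


# The TP=2 entry added by this commit.
config = (
    "ENABLE_HMA_FLAG=1 PREFILLER_TP_SIZE=2 DECODER_TP_SIZE=2 "
    "GPU_MEMORY_UTILIZATION=0.8 MODEL_NAMES=ibm-granite/granite-4.0-h-tiny "
    "VLLM_SERVE_EXTRA_ARGS=--max-model-len,8192,--trust-remote-code,"
    "--no-async-scheduling"
)

env, extra_args = parse_config(config)
print(env["MODEL_NAMES"])
print(extra_args)
```

Read this way, the two TP=2 entries differ from the single-GPU ones only by the two `*_TP_SIZE` settings and the trailing `--no-async-scheduling` argument, which the TODO comment ties to an async scheduling issue with TP>1.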
tests/v1/kv_connector/nixl_integration/test_accuracy.py
@@ -19,6 +19,7 @@ EXPECTED_VALUES = {
     "deepseek-ai/DeepSeek-V2-Lite-Chat": 0.65,
     "google/gemma-3-4b-it": 0.74,
     "nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-FP8": 0.84,
+    "ibm-granite/granite-4.0-h-tiny": 0.80,
 }

 SIMPLE_PROMPT = (
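The new `EXPECTED_VALUES` entry gives the Granite model a reference accuracy of 0.80, which is needed now that the sweep runs that model. The sketch below shows one way such a table could gate a test run: look up the reference for the model under test and accept a measured score that is no more than a tolerance below it. The comparison rule, the `TOLERANCE` value, and the `accuracy_ok` helper are assumptions for illustration; the rest of `test_accuracy.py` is not visible in this hunk.

```python
# Reference accuracies copied from the diff (subset shown).
EXPECTED_VALUES = {
    "nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-FP8": 0.84,
    "ibm-granite/granite-4.0-h-tiny": 0.80,
}

# Hypothetical tolerance; the value used by the real test is not shown here.
TOLERANCE = 0.03


def accuracy_ok(model_name, measured, tolerance=TOLERANCE):
    """Return True if `measured` is at most `tolerance` below the reference."""
    if model_name not in EXPECTED_VALUES:
        # A model without a reference value cannot be checked at all.
        raise KeyError(f"no expected accuracy recorded for {model_name!r}")
    return measured >= EXPECTED_VALUES[model_name] - tolerance


print(accuracy_ok("ibm-granite/granite-4.0-h-tiny", 0.81))
print(accuracy_ok("ibm-granite/granite-4.0-h-tiny", 0.70))
```

A lookup that raises on a missing model makes the dependency explicit: a model added to the sweep without a matching table entry fails immediately instead of being silently skipped.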