chore: Add note on vLLM DSR1 gibberish outputs upstream issue (#5353)

Signed-off-by: Piotr Tarasiewicz <ptarasiewicz@nvidia.com>

Commit efc9ebf0 (unverified), authored Jan 12, 2026 by ptarasiewiczNV, committed by GitHub Jan 12, 2026

Showing with 1 addition and 0 deletions

recipes/deepseek-r1/vllm/disagg/README.md +1 -0

--- a/recipes/deepseek-r1/vllm/disagg/README.md
+++ b/recipes/deepseek-r1/vllm/disagg/README.md
@@ -93,5 +93,6 @@ curl -sS http://localhost:8000/v1/chat/completions \
 - If your storage class differs, update `storageClassName` before applying the PVC.
 - **If you want to run multinode deployments, IBGDA (InfiniBand GPU Direct Async) must be enabled on your nodes.** To enable IBGDA, you can follow this configuration script: [configure_system_drivers.sh](https://github.com/vllm-project/vllm/blob/v0.11.2/tools/ep_kernels/configure_system_drivers.sh). The script configures NVIDIA driver parameters and requires a system reboot to take effect.
 - `VLLM_MOE_DP_CHUNK_SIZE` can be tuned further. The value 384 was chosen to be largest possible that still can be deployed on 16 H200s. This value should be greater than per rank concurrency.
+- Starting with vLLM v0.12.0 (Dynamo v0.8.0) DeepSeek-R1 in this configuration might return gibberish outputs, please track the upstream issue [vLLM #32190](https://github.com/vllm-project/vllm/issues/32190).