- **DP Attention**: For the DeepSeek V3.2 model, the kernels are customized for the `dp_size=8` use case, so DP attention is enabled by default for better stability and performance. Launching with pure TP is still under development.
- **Choices of Attention Kernels**: The attention backend is automatically set to the `nsa` backend for the DeepSeek V3.2 model. This backend implements different kernels for sparse prefill and decode, which can be selected with the `--nsa-prefill-backend` and `--nsa-decode-backend` server arguments (see the launch example after this list). The available NSA prefill/decode kernels include:
  - `flashmla_sparse`: the `flash_mla_sparse_fwd` kernel from the `flash_mla` library. Runs on both Hopper and Blackwell GPUs.
  - `flashmla_kv`: the `flash_mla_with_kvcache` kernel from the `flash_mla` library. Runs on both Hopper and Blackwell GPUs.
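
For concreteness, below is a minimal launch sketch that pins both NSA kernels explicitly. The `--nsa-prefill-backend` and `--nsa-decode-backend` flags are the arguments described above; the model path and `--tp-size` value are illustrative assumptions and should be adapted to your deployment.

```bash
# Sketch: launch DeepSeek V3.2 with explicit NSA kernel choices.
# Model path and tp-size are assumptions; the --nsa-* flags come from the docs above.
python -m sglang.launch_server \
  --model-path deepseek-ai/DeepSeek-V3.2-Exp \
  --tp-size 8 \
  --nsa-prefill-backend flashmla_sparse \
  --nsa-decode-backend flashmla_kv
```

If the `--nsa-*` flags are omitted, the server falls back to its default kernel selection for the `nsa` backend.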