fix typo

Commit f90db8bc (unverified), authored Feb 08, 2025 by Yineng Zhang, committed by GitHub Feb 08, 2025

Showing with 1 addition and 1 deletion

benchmark/deepseek_v3/README.md +1 -1

--- a/benchmark/deepseek_v3/README.md
+++ b/benchmark/deepseek_v3/README.md
@@ -131,7 +131,7 @@ docker run --gpus all \
    python3 -m sglang.bench_serving --backend sglang --dataset-name random --random-input 1 --random-output 512 --random-range-ratio 1 --num-prompts 1 --host 0.0.0.0 --port 40000 --output-file "deepseekv3_multinode.jsonl"
 ```

-### Example: Serving with four A100*4 nodes
+### Example: Serving with four A100*8 nodes
 To serve DeepSeek-V3 with A100 GPUs, we need to convert the [FP8 model checkpoints](https://huggingface.co/deepseek-ai/DeepSeek-V3) to BF16 with [script](https://github.com/deepseek-ai/DeepSeek-V3/blob/main/inference/fp8_cast_bf16.py) mentioned [here](https://github.com/deepseek-ai/DeepSeek-V3/blob/main/inference/fp8_cast_bf16.py) first.

 Since the BF16 model is over 1.3 TB, we need to prepare four A100 nodes, each with 8 80GB GPUs. Assume the first node's IP is `10.0.0.1`, and the converted model path is `/path/to/DeepSeek-V3-BF16`, we can have following commands to launch the server.
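
The heading fix can be sanity-checked with simple arithmetic. The sketch below is illustrative only: it takes the "over 1.3 TB" figure and the 80 GB per-GPU capacity from the README text, assumes 1 TB = 1000 GB, and ignores KV cache, activations, and other runtime overhead. The function and variable names are hypothetical.

```python
# Rough capacity check for the two headings in this diff.
# Assumptions: 1 TB = 1000 GB; only model weights are counted.

def total_gpu_memory_gb(nodes: int, gpus_per_node: int, gb_per_gpu: int = 80) -> int:
    """Aggregate GPU memory across all nodes, in GB."""
    return nodes * gpus_per_node * gb_per_gpu

# Lower bound on the BF16 checkpoint size, from the README ("over 1.3 TB").
model_size_gb = 1300

# Old heading: four nodes with 4 GPUs each.
old_heading_gb = total_gpu_memory_gb(nodes=4, gpus_per_node=4)
# New heading: four nodes with 8 GPUs each, matching the README body text.
new_heading_gb = total_gpu_memory_gb(nodes=4, gpus_per_node=8)

print(f"4 nodes x 4 GPUs: {old_heading_gb} GB, fits weights: {old_heading_gb > model_size_gb}")
print(f"4 nodes x 8 GPUs: {new_heading_gb} GB, fits weights: {new_heading_gb > model_size_gb}")
```

Under these assumptions, the old heading's configuration (1280 GB) could not hold the weights, while the corrected one (2560 GB) leaves room for runtime state, which is consistent with the body text describing four nodes with eight 80 GB GPUs each.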