Unverified Commit 8a2681e2 authored by Ke Bao, committed by GitHub

Update readme (#2625)

parent 5276a675
@@ -18,8 +18,9 @@ If you see errors when launching the server, please check if it has finished dow
 ### Using Docker (Recommended)
 ```bash
 docker run --gpus all --shm-size 32g -p 30000:30000 -v ~/.cache/huggingface:/root/.cache/huggingface --ipc=host lmsysorg/sglang:latest \
-    python3 -m sglang.launch_server --model deepseek-ai/DeepSeek-V3 --enable-dp-attention --tp 8 --trust-remote-code --port 30000
+    python3 -m sglang.launch_server --model deepseek-ai/DeepSeek-V3 --tp 8 --trust-remote-code --port 30000
 ```
+For large QPS scenarios, you can add the `--enable-dp-attention` argument to improve throughput.
 ### Using pip
 ```bash
@@ -27,7 +28,7 @@ docker run --gpus all --shm-size 32g -p 30000:30000 -v ~/.cache/huggingface:/roo
 pip install "sglang[all]==0.4.1.post1" --find-links https://flashinfer.ai/whl/cu124/torch2.4/flashinfer
 # Launch
-python3 -m sglang.launch_server --model deepseek-ai/DeepSeek-V3 --enable-dp-attention --tp 8 --trust-remote-code
+python3 -m sglang.launch_server --model deepseek-ai/DeepSeek-V3 --tp 8 --trust-remote-code
 ```
 ### Example with OpenAI API
...
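The "Example with OpenAI API" section itself is collapsed in this diff. As a minimal sketch that is not part of this commit, assuming the server launched by the commands above is listening on localhost:30000 and serves SGLang's OpenAI-compatible chat completions endpoint, a request could look like:

```bash
# Minimal sketch (assumption: the server from the launch commands above is
# running on localhost:30000 and exposes the OpenAI-compatible
# /v1/chat/completions endpoint).
curl http://localhost:30000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek-ai/DeepSeek-V3",
    "messages": [{"role": "user", "content": "What is the capital of France?"}],
    "max_tokens": 64
  }'
```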