Commit 8a2681e2
Authored Dec 28, 2024 by Ke Bao
Committed by GitHub, Dec 28, 2024
Update readme (#2625)
Parent: 5276a675
Showing 1 changed file with 3 additions and 2 deletions

benchmark/deepseek_v3/README.md (+3, -2)
@@ -18,8 +18,9 @@ If you see errors when launching the server, please check if it has finished dow
 ### Using Docker (Recommended)
 ```bash
 docker run --gpus all --shm-size 32g -p 30000:30000 -v ~/.cache/huggingface:/root/.cache/huggingface --ipc=host lmsysorg/sglang:latest \
-    python3 -m sglang.launch_server --model deepseek-ai/DeepSeek-V3 --enable-dp-attention --tp 8 --trust-remote-code --port 30000
+    python3 -m sglang.launch_server --model deepseek-ai/DeepSeek-V3 --tp 8 --trust-remote-code --port 30000
 ```
+For large QPS scenarios, you can add the `--enable-dp-attention` argument to improve throughput.
 ### Using pip
 ```bash
@@ -27,7 +28,7 @@ docker run --gpus all --shm-size 32g -p 30000:30000 -v ~/.cache/huggingface:/roo
 pip install "sglang[all]==0.4.1.post1" --find-links https://flashinfer.ai/whl/cu124/torch2.4/flashinfer
 # Launch
-python3 -m sglang.launch_server --model deepseek-ai/DeepSeek-V3 --enable-dp-attention --tp 8 --trust-remote-code
+python3 -m sglang.launch_server --model deepseek-ai/DeepSeek-V3 --tp 8 --trust-remote-code
 ```
 ### Example with OpenAI API
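
Review note: this commit drops `--enable-dp-attention` from both default launch commands and documents it as an opt-in for large QPS scenarios. A minimal sketch of how a deployment script might toggle it, assuming a hypothetical helper `build_launch_cmd` and a hypothetical mode name `high-qps` (neither is part of sglang; only the launch flags come from the diff). The helper prints the command rather than running it, so it can be inspected first.

```shell
#!/bin/sh
# build_launch_cmd is a hypothetical helper for illustration; it is not part of sglang.
# Usage: build_launch_cmd [default|high-qps]
build_launch_cmd() {
    # Launch command as it reads after this commit (DP attention off by default).
    cmd="python3 -m sglang.launch_server --model deepseek-ai/DeepSeek-V3 --tp 8 --trust-remote-code --port 30000"

    # Append the opt-in flag only when the caller asks for the high-QPS variant.
    if [ "$1" = "high-qps" ]; then
        cmd="$cmd --enable-dp-attention"
    fi

    # Print the command instead of executing it.
    echo "$cmd"
}

build_launch_cmd default
build_launch_cmd high-qps
```

To launch for real, the printed line can be pasted into a terminal or placed after the `docker run ... lmsysorg/sglang:latest \` prefix shown in the diff.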