Commit 484d0e02 authored by Qiaolin Yu, committed by GitHub

doc: add bench_one_batch_server in the benchmark doc (#8441)

parent 5922c0cb
@@ -4,10 +4,15 @@
- Benchmark the latency of running a single static batch without a server. The arguments are the same as for `launch_server.py`.
Note that this is a simplified test script without a dynamic batching server, so it may run out of memory for a batch size that a real server can handle. A real server truncates the prefill into several batches, while this simplified script does not.
- Without a server (no need to launch one)
```bash
python -m sglang.bench_one_batch --model-path meta-llama/Meta-Llama-3.1-8B-Instruct --batch 32 --input-len 256 --output-len 32
```
- With a server (launch one with `sglang.launch_server` first, then run the following command)
```bash
python -m sglang.bench_one_batch_server --base-url http://127.0.0.1:30000 --model-path meta-llama/Meta-Llama-3.1-8B-Instruct --batch-size 32 --input-len 256 --output-len 32
```
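To compare several batch sizes against the same running server, the command above can be wrapped in a small loop. This is a minimal sketch, not part of sglang: the `run` helper, the `DRY_RUN` switch, and the batch-size list `1 8 32` are illustrative assumptions, while the URL, model, and flags are copied from the example above. By default it only prints the commands it would run.

```bash
#!/usr/bin/env bash
# Sketch: repeat the server benchmark for a few batch sizes.
# DRY_RUN, run(), and the batch-size list are hypothetical helpers for illustration.
set -eu

BASE_URL="${BASE_URL:-http://127.0.0.1:30000}"
MODEL="${MODEL:-meta-llama/Meta-Llama-3.1-8B-Instruct}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print commands only, 0 = execute them

# Print the command in dry-run mode, otherwise execute it unchanged.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

for bs in 1 8 32; do
  run python -m sglang.bench_one_batch_server \
    --base-url "$BASE_URL" --model-path "$MODEL" \
    --batch-size "$bs" --input-len 256 --output-len 32
done
```

Set `DRY_RUN=0` once the server is up to execute the benchmarks instead of printing them.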
- Benchmark offline processing. This script will start an offline engine and run the benchmark.
......