@@ -198,7 +199,7 @@ Instructions for supporting a new model are [here](https://github.com/sgl-projec
...
@@ -198,7 +199,7 @@ Instructions for supporting a new model are [here](https://github.com/sgl-projec
### Benchmark Performance
### Benchmark Performance
- Benchmark a single static batch. Run the following command without launching a server. The arguments are the same as those for `launch_server.py`.
- Benchmark a single static batch by running the following command without launching a server. The arguments are the same as those for `launch_server.py`. Note that this is not a dynamic batching server, so it may run out of memory at a batch size that a real server can handle. A real server splits the prefill into several batches/chunks, while this unit test does not.