Commit 59791431 authored by Zheng Zeng, committed by GitHub

[doc] fix typo in opt inference tutorial (#2849)

parent 93534643
@@ -50,7 +50,7 @@ python opt_fastapi.py <model> --queue_size <QueueSize>
```
The `<QueueSize>` can be an integer in `[0, MAXINT]`. If it's `0`, the request queue size is infinite. If it's a positive integer, when the request queue is full, incoming requests will be dropped (the HTTP status code of response will be 406).
-### Configure bathcing
+### Configure batching
```shell
python opt_fastapi.py <model> --max_batch_size <MaxBatchSize>
```
@@ -85,4 +85,4 @@ Then open the web interface link which is on your console.
See [script/processing_ckpt_66b.py](./script/processing_ckpt_66b.py).
## OPT-175B
-See [script/process-opt-175b](./script/process-opt-175b/).
\ No newline at end of file
+See [script/process-opt-175b](./script/process-opt-175b/).
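
The queue-size rule quoted in the first hunk (`0` means an unbounded request queue; a full bounded queue drops the incoming request with HTTP status 406) can be illustrated with a minimal Python sketch. This is not taken from `opt_fastapi.py`: the names `make_request_queue` and `submit` are hypothetical, and the standard-library `queue.Queue` is only a stand-in for whatever queue the server actually uses.

```python
import queue

HTTP_OK = 200
HTTP_NOT_ACCEPTABLE = 406  # status the tutorial says a dropped request receives


def make_request_queue(queue_size: int) -> queue.Queue:
    """Build a request queue; queue_size == 0 means unbounded (hypothetical helper)."""
    if queue_size < 0:
        raise ValueError("queue_size must be >= 0")
    # queue.Queue treats maxsize <= 0 as "no limit", which matches the
    # "0 means infinite" rule described in the tutorial.
    return queue.Queue(maxsize=queue_size)


def submit(q: queue.Queue, request) -> int:
    """Try to enqueue a request; return the status code a client would see."""
    try:
        q.put_nowait(request)
    except queue.Full:
        return HTTP_NOT_ACCEPTABLE  # queue is full: the request is dropped
    return HTTP_OK


if __name__ == "__main__":
    bounded = make_request_queue(2)
    # The third request arrives while the queue is full and is dropped.
    print([submit(bounded, f"req-{i}") for i in range(3)])

    unbounded = make_request_queue(0)
    # With queue_size 0 every request is accepted.
    print(all(submit(unbounded, i) == HTTP_OK for i in range(1000)))
```

In this sketch nothing consumes the queue, so a bounded queue fills after `queue_size` submissions; in a real server a worker would drain it between requests.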