Commit f211331c authored by Reid, committed by GitHub

[Doc] small fix (#17277)


Signed-off-by: reidliu41 <reid201711@gmail.com>
Co-authored-by: reidliu41 <reid201711@gmail.com>
parent 9053d0b1
@@ -59,7 +59,7 @@ A code example can be found here: <gh-file:examples/offline_inference/basic/basi
### `LLM.beam_search`
-The {class}`~vllm.LLM.beam_search` method implements [beam search](https://huggingface.co/docs/transformers/en/generation_strategies#beam-search-decoding) on top of {class}`~vllm.LLM.generate`.
+The {class}`~vllm.LLM.beam_search` method implements [beam search](https://huggingface.co/docs/transformers/en/generation_strategies#beam-search) on top of {class}`~vllm.LLM.generate`.
For example, to search using 5 beams and output at most 50 tokens:
```python
......
@@ -793,6 +793,8 @@ or `--limit-mm-per-prompt` (online serving). For example, to enable passing up t
Offline inference:
```python
from vllm import LLM
llm = LLM(
model="Qwen/Qwen2-VL-7B-Instruct",
limit_mm_per_prompt={"image": 4},
......