Unverified commit 83f7bbb3, authored by TankNee, committed by GitHub

Add chat doc in quick start (#21213)


Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
parent b5dfb94f
@@ -98,6 +98,43 @@ for output in outputs:
    print(f"Prompt: {prompt!r}, Generated text: {generated_text!r}")
```
!!! note
    The `llm.generate` method does not automatically apply the model's chat template to the input prompt. If you are using an Instruct or Chat model, apply the corresponding chat template yourself to get the expected behavior. Alternatively, use the `llm.chat` method and pass a list of messages in the same format as those passed to OpenAI's `client.chat.completions`:

    ??? code

        ```python
        # `llm`, `prompts`, and `sampling_params` are assumed to be defined
        # as in the preceding example.

        # Using tokenizer to apply chat template
        from transformers import AutoTokenizer

        tokenizer = AutoTokenizer.from_pretrained("/path/to/chat_model")
        # One single-turn conversation per prompt.
        messages_list = [
            [{"role": "user", "content": prompt}]
            for prompt in prompts
        ]
        # Render each conversation to a plain string and open the assistant turn.
        texts = tokenizer.apply_chat_template(
            messages_list,
            tokenize=False,
            add_generation_prompt=True,
        )

        # Generate outputs
        outputs = llm.generate(texts, sampling_params)

        # Print the outputs.
        for output in outputs:
            prompt = output.prompt
            generated_text = output.outputs[0].text
            print(f"Prompt: {prompt!r}, Generated text: {generated_text!r}")

        # Using chat interface.
        outputs = llm.chat(messages_list, sampling_params)
        for idx, output in enumerate(outputs):
            prompt = prompts[idx]
            generated_text = output.outputs[0].text
            print(f"Prompt: {prompt!r}, Generated text: {generated_text!r}")
        ```
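
To make the effect of a chat template concrete, the sketch below renders a list of messages into a single prompt string by hand. The `render_chat` helper and the ChatML-style markers it emits are assumptions chosen purely for illustration; each model ships its own template, and in practice `tokenizer.apply_chat_template` or `llm.chat` applies it for you. The sample prompts are likewise placeholders.

```python
# Illustrative only: a hand-written, ChatML-style renderer.
# Real models define their own template; do not hard-code this format.


def render_chat(messages, add_generation_prompt=True):
    """Turn a list of {"role", "content"} dicts into one prompt string."""
    parts = [
        f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
        for message in messages
    ]
    if add_generation_prompt:
        # Open the assistant turn so the model replies rather than
        # continuing the user's text.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


# Placeholder prompts, wrapped as one single-turn conversation each.
prompts = ["Hello, my name is", "The capital of France is"]
messages_list = [[{"role": "user", "content": p}] for p in prompts]

texts = [render_chat(messages) for messages in messages_list]
print(texts[0])
```

The raw prompt and the rendered string differ only by the role markers and the opened assistant turn, which is the part `llm.generate` leaves out when it is given plain text.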
[](){ #quickstart-online }
## OpenAI-Compatible Server
......