Commit 6d1479ca (unverified), authored by Reid, committed by GitHub

[doc] add the print result (#17584)


Signed-off-by: reidliu41 <reid201711@gmail.com>
Co-authored-by: reidliu41 <reid201711@gmail.com>
parent b8b0859b
@@ -30,6 +30,7 @@ from vllm import LLM
 model = LLM("facebook/opt-125m", quantization="fp8")
 # INFO 06-10 17:55:42 model_runner.py:157] Loading model weights took 0.1550 GB
 result = model.generate("Hello, my name is")
+print(result[0].outputs[0].text)
 ```
 :::{warning}
@@ -106,6 +107,7 @@ Load and run the model in `vllm`:
 from vllm import LLM
 model = LLM("./Meta-Llama-3-8B-Instruct-FP8-Dynamic")
 model.generate("Hello my name is")
+print(result[0].outputs[0].text)
 ```
 Evaluate accuracy with `lm_eval` (for example on 250 samples of `gsm8k`):
@@ -188,4 +190,5 @@ from vllm import LLM
 model = LLM(model="Meta-Llama-3-8B-Instruct-FP8/")
 # INFO 06-10 21:15:41 model_runner.py:159] Loading model weights took 8.4596 GB
 result = model.generate("Hello, my name is")
+print(result[0].outputs[0].text)
 ```
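
Note on the added line: `print(result[0].outputs[0].text)` indexes into the list that `generate` returns (one entry per prompt, each holding a list of completions with a `text` field). In the second hunk the context line calls `model.generate("Hello my name is")` without assigning the return value, so `result` would be undefined there unless that line also becomes `result = model.generate(...)`. The sketch below only illustrates the access pattern. `FakeCompletion`, `FakeRequestOutput`, and `fake_generate` are hypothetical stand-ins written for this note, not vLLM classes or functions, so it runs without a GPU or model weights.

```python
from dataclasses import dataclass, field


# Hypothetical stand-ins for illustration only; these are NOT vLLM's classes.
@dataclass
class FakeCompletion:
    text: str


@dataclass
class FakeRequestOutput:
    prompt: str
    outputs: list[FakeCompletion] = field(default_factory=list)


def fake_generate(prompts):
    """Mimic the assumed return shape: one result per prompt."""
    if isinstance(prompts, str):
        prompts = [prompts]
    return [
        FakeRequestOutput(p, [FakeCompletion(f" <completion for {p!r}>")])
        for p in prompts
    ]


# The return value must be bound to a name before it can be indexed.
result = fake_generate("Hello, my name is")
print(result[0].outputs[0].text)
```

With the real library, the same two steps would be `result = model.generate(...)` followed by the `print` line this commit adds.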