[doc] add the print result (#17584)

Signed-off-by: reidliu41 <reid201711@gmail.com>
Co-authored-by: reidliu41 <reid201711@gmail.com>

6d1479ca · Reid · GitHub · b8b0859b · 6d1479ca
Unverified Commit 6d1479ca authored May 02, 2025 by Reid Committed by GitHub May 02, 2025

Showing with 3 additions and 0 deletions

docs/source/features/quantization/fp8.md +3 -0

--- a/docs/source/features/quantization/fp8.md
+++ b/docs/source/features/quantization/fp8.md
@@ -30,6 +30,7 @@ from vllm import LLM
 model = LLM("facebook/opt-125m", quantization="fp8")
 # INFO 06-10 17:55:42 model_runner.py:157] Loading model weights took 0.1550 GB
 result = model.generate("Hello, my name is")
+print(result[0].outputs[0].text)
 ```
 :::{warning}
@@ -106,6 +107,7 @@ Load and run the model in `vllm`:
 from vllm import LLM
 model = LLM("./Meta-Llama-3-8B-Instruct-FP8-Dynamic")
 result = model.generate("Hello my name is")
+print(result[0].outputs[0].text)
 ```
 Evaluate accuracy with `lm_eval` (for example on 250 samples of `gsm8k`):
@@ -188,4 +190,5 @@ from vllm import LLM
 model = LLM(model="Meta-Llama-3-8B-Instruct-FP8/")
 # INFO 06-10 21:15:41 model_runner.py:159] Loading model weights took 8.4596 GB
 result = model.generate("Hello, my name is")
+print(result[0].outputs[0].text)
 ```
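
The three added lines all read the first completion of the first request via `result[0].outputs[0].text`. `LLM.generate` returns a list of `RequestOutput` objects, each carrying a list of `CompletionOutput` objects in `.outputs`. Below is a minimal sketch of that access pattern which runs without a GPU or model download. `FakeCompletion`, `FakeRequestOutput`, and `first_texts` are hypothetical stand-ins invented for illustration and are not part of vLLM; they mirror only the attributes the diff touches.

```python
from dataclasses import dataclass, field


# Hypothetical stand-ins (not vLLM classes): they copy only the
# `.outputs` and `.text` attributes used by the added print lines.
@dataclass
class FakeCompletion:
    text: str


@dataclass
class FakeRequestOutput:
    prompt: str
    outputs: list = field(default_factory=list)


def first_texts(results):
    """Return the first completion's text for each request that has one."""
    return [r.outputs[0].text for r in results if r.outputs]


# Shaped like the return value of model.generate("Hello, my name is")
results = [
    FakeRequestOutput("Hello, my name is", [FakeCompletion(" (example completion)")])
]

# Equivalent to print(result[0].outputs[0].text) for a single prompt
print(first_texts(results)[0])
```

With a real model, the same indexing applies directly to the list returned by `model.generate(...)`; guarding against an empty `.outputs` list, as the helper does, is an optional precaution rather than something the docs require.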