Hotfixing the link. (#2811)

Commit a70dd299, authored Dec 10, 2024 by Nicolas Patry; committed by GitHub Dec 09, 2024

Showing 1 changed file with 1 addition and 1 deletion

docs/source/conceptual/chunking.md +1 -1

--- a/docs/source/conceptual/chunking.md
+++ b/docs/source/conceptual/chunking.md
@@ -72,7 +72,7 @@ Long:  `MODEL_ID=$MODEL_ID  HOST=localhost:8000 k6 run load_tests/long.js`

 ### Results

-![benchmarks_v3](https://github.com/huggingface/text-generation-inference/blob/main/assets/v3_benchmarks.png)
+![benchmarks_v3](https://github.com/huggingface/text-generation-inference/blob/042791fbd5742b1644d42c493db6bec669df6537/assets/v3_benchmarks.png)

 Our benchmarking results show significant performance gains, with a 13x speedup over vLLM with prefix caching, and up to 30x speedup without prefix caching. These results are consistent with our production data and demonstrate the effectiveness of our optimized LLM architecture.