Commit 359200f6 authored by Reid, committed by GitHub

[doc] fix link (#20417)


Signed-off-by: reidliu41 <reid201711@gmail.com>
parent 220aee90
......@@ -4,7 +4,7 @@ This script is used to profile the TPU performance of vLLM for specific prefill
Note: an actual running server is a mix of both prefill of many shapes and decode of many shapes.
-We assume you are on a TPU already (this was tested on TPU v6e) and have installed vLLM according to the [installation guide](https://docs.vllm.ai/en/latest/getting_started/installation/ai_accelerator/index.html).
+We assume you are on a TPU already (this was tested on TPU v6e) and have installed vLLM according to the [Google TPU installation guide](https://docs.vllm.ai/en/latest/getting_started/installation/google_tpu.html).
> In all examples below, we run several warmups before (so `--enforce-eager` is okay)
......