Commit 73e0f8ca (unverified), authored by J Wyman, committed by GitHub

docs: Fix Markdown Render Error (#1502)

parent 0e7d4d82
@@ -85,8 +85,8 @@ With the Dynamo repository, benchmarking image and model available, and **NATS a
./container/run.sh --mount-workspace
```
> [!Tip]
> The huggingface home source mount can be changed by setting `--hf-cache ~/.cache/huggingface`.
2. Start disaggregated services
@@ -95,8 +95,8 @@ With the Dynamo repository, benchmarking image and model available, and **NATS a
dynamo serve benchmarks.disagg:Frontend -f benchmarks/disagg.yaml 1> disagg.log 2>&1 &
```
> [!Tip]
> Check the `disagg.log` to make sure the service is fully started before collecting performance numbers.
3. Collect the performance numbers:
@@ -130,8 +130,8 @@ With the Dynamo repository, benchmarking image and model available, and **NATS a
./container/run.sh --mount-workspace
```
> [!Tip]
> The huggingface home source mount can be changed by setting `--hf-cache ~/.cache/huggingface`.
2. Config NATS and ETCD (node 1)
@@ -140,8 +140,8 @@ With the Dynamo repository, benchmarking image and model available, and **NATS a
export ETCD_ENDPOINTS="<node_0_ip_addr>:2379"
```
> [!Important]
> Node 1 must be able to reach Node 0 over the network for the above services.
3. Start workers (node 0)
@@ -150,8 +150,8 @@ With the Dynamo repository, benchmarking image and model available, and **NATS a
dynamo serve benchmarks.disagg_multinode:Frontend -f benchmarks/disagg_multinode.yaml 1> disagg_multinode.log 2>&1 &
```
> [!Tip]
> Check the `disagg_multinode.log` to make sure the service is fully started before collecting performance numbers.
4. Start workers (node 1)
@@ -160,8 +160,8 @@ With the Dynamo repository, benchmarking image and model available, and **NATS a
dynamo serve components.prefill_worker:PrefillWorker -f benchmarks/disagg_multinode.yaml 1> prefill_multinode.log 2>&1 &
```
> [!Tip]
> Check the `prefill_multinode.log` to make sure the service is fully started before collecting performance numbers.
5. Collect the performance numbers:
@@ -188,8 +188,8 @@ With the Dynamo repository and the benchmarking image available, perform the fol
./container/run.sh --mount-workspace
```
> [!Tip]
> The Hugging Face home source mount can be changed by setting `--hf-cache ~/.cache/huggingface`.
2. Start vLLM serve
@@ -212,10 +212,10 @@ With the Dynamo repository and the benchmarking image available, perform the fol
--port 8002 1> vllm_1.log 2>&1 &
```
> [!Tip]
> Check the `vllm_0.log` and `vllm_1.log` to make sure the service is fully started before collecting performance numbers.
>
> If benchmarking with two or more nodes, `--tensor-parallel-size 8` should be used and only run one `vllm serve` instance per node.
3. Use NGINX as load balancer
@@ -225,8 +225,8 @@ With the Dynamo repository and the benchmarking image available, perform the fol
service nginx restart
```
> [!Note]
> If benchmarking over 2 nodes, the `upstream` configuration will need to be updated to link to the `vllm serve` on the second node.
4. Collect the performance numbers:
@@ -258,7 +258,8 @@ Note: As each `perf.sh` adds a new artifacts directory in the `artifacts_root` a
> @ [GitHub](https://github.com/triton-inference-server/perf_analyzer) for additional information about how to run GenAI-Perf
> and how to interpret results.
-## Iterpreting Results
+## Interpreting Results
### Plotting Pareto Graphs