Unverified commit ef292944, authored by dagil-nvidia, committed by GitHub

docs: fix markdown formatting in Distributed_Inference README (#5947)

parent cd3f9bbd
......@@ -69,6 +69,7 @@ aiconfigurator cli default --model LLAMA3.1_70B --total_gpus 16 --system h200_sx
```
and from the output, you can see the Pareto curve with the suggested P/D settings
![text](images/pareto.png)
3. Start serving with 1 prefill worker using tensor parallelism 4 and 1 decode worker using tensor parallelism 8, as AI Configurator suggested. Update the `my-tag` in `disagg_router.yaml` with the latest Dynamo version and your local cache folder path, then run the following command.
![text](images/settings.png)
```sh
......