docs: Remove invalid link (#506)

Commit b865bd4f authored Apr 03, 2025 by Graham King, committed by GitHub Apr 03, 2025

Showing with 1 addition and 22 deletions

docs/guides/dynamo_run.md +1 -22

--- a/docs/guides/dynamo_run.md
+++ b/docs/guides/dynamo_run.md
@@ -357,31 +357,10 @@ Run:
 dynamo-run in=text out=trtllm --model-path /app/trtllm_engine/ --model-config ~/llm_models/Llama-3.2-3B-Instruct/
 ```

-Note that TRT-LLM uses it's own `.engine` format for weights. Repo models must be converted like so:
-
-+ Get the build container
-```
-docker run --gpus all -it nvcr.io/nvidian/nemo-llm/trtllm-engine-builder:0.2.0 bash
-```
-
-+ Fetch the model and convert
-```
-mkdir /tmp/model
-huggingface-cli download meta-llama/Llama-3.2-3B-Instruct --local-dir /tmp/model
-python convert_checkpoint.py --model_dir /tmp/model/ --output_dir ./converted --dtype [float16|bfloat16|whatever you want] --tp_size X --pp_size Y
-trtllm-build --checkpoint_dir ./converted --output_dir ./final/trtllm_engine --use_paged_context_fmha enable --gemm_plugin auto
-```
+Note that TRT-LLM uses it's own `.engine` format for weights.

 The `--model-path` you give to `dynamo-run` must contain the `config.json` (TRT-LLM's , not the model's) and `rank0.engine` (plus other ranks if relevant).

-+ Execute
-TRT-LLM is a C++ library that must have been previously built and installed. It needs a lot of memory to compile. Gitlab builds a container you can try:
-```
-sudo docker run --gpus all -it -v /home/user:/outside-home gitlab-master.nvidia.com:5005/dl/ai-services/libraries/rust/nim-nvllm/tensorrt_llm_runtime:85fa4a6f
-```
-
-Copy the trt-llm engine, the model's `.json` files (for the model deployment card) and the `nio` binary built for the correct glibc (container is Ubuntu 22.04 currently) into that container.
-
 ## Echo Engines

 Dynamo includes two echo engines for testing and debugging purposes: