Commit 00e54337 authored by Graham King, committed by GitHub

chore: Make debug profile use all optimizations (#317)

It hardly slows the build down, and it makes things run much faster. That allows us to switch to the debug (default) profile for development, and keep the release profile for, well, releasing.

Motivated by changes in https://github.com/ai-dynamo/dynamo/pull/279
parent d4d93b6a
@@ -71,6 +71,11 @@ prometheus = { version = "0.13" }
 [profile.dev.package]
 insta.opt-level = 3
+
+[profile.dev]
+# release level optimizations otherwise everything feels slow
+opt-level = 3
+
 [profile.release]
 # These make the build much slower but shrink the binary, and could help performance
 codegen-units = 1
 lto = true
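Taken together, this hunk leaves the manifest with profile sections roughly like the following (a sketch reconstructed from the diff, not copied from the repo):

```toml
[profile.dev]
# release level optimizations otherwise everything feels slow
opt-level = 3

[profile.dev.package]
insta.opt-level = 3

[profile.release]
# These make the build much slower but shrink the binary, and could help performance
codegen-units = 1
lto = true
```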
@@ -116,24 +116,26 @@ Optionally can run `cargo build` from any location with arguments:
- Linux with GPU and CUDA (tested on Ubuntu):
```
-cargo build --release --features mistralrs,cuda
+cargo build --features cuda
```
- macOS with Metal:
```
-cargo build --release --features mistralrs,metal
+cargo build --features metal
```
- CPU only:
```
-cargo build --release --features mistralrs
+cargo build
```
-The binary will be called `dynamo-run` in `target/release`
+The binary will be called `dynamo-run` in `target/debug`
```
-cd target/release
+cd target/debug
```
+Build with `--release` for a smaller binary and better performance, but longer build times. The binary will be in `target/release`.
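The debug-by-default workflow described here boils down to one rule: the output path follows the profile. A small illustrative helper (hypothetical, not part of the repo) makes that mapping explicit:

```python
# Illustrative only: where cargo places the dynamo-run binary
# depending on whether --release is passed to the build.
def binary_path(release: bool = False) -> str:
    profile = "release" if release else "debug"
    return f"target/{profile}/dynamo-run"

print(binary_path())      # default (debug) profile
print(binary_path(True))  # release profile
```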
## sglang
1. Setup the python virtual env:
@@ -149,7 +151,7 @@ uv pip install "sglang[all]==0.4.2" --find-links https://flashinfer.ai/whl/cu124
2. Build
```
-cargo build --release --features sglang
+cargo build --features sglang
```
3. Run
@@ -168,7 +170,7 @@ dynamo-run in=none out=sglang --model-path ~/llm_models/DeepSeek-R1-Distill-Llam
## llama_cpp
-- `cargo build --release --features llamacpp,cuda`
+- `cargo build --features llamacpp,cuda`
- `dynamo-run out=llama_cpp --model-path ~/llm_models/Llama-3.2-3B-Instruct-Q6_K.gguf --model-config ~/llm_models/Llama-3.2-3B-Instruct/`
@@ -197,7 +199,7 @@ uv pip install vllm==0.7.3 setuptools
Build:
```
-cargo build --release --features vllm
+cargo build --features vllm
```
Run (still inside that virtualenv) - HF repo:
@@ -230,7 +232,7 @@ You can provide your own engine in a Python file. The file must provide a generator:
async def generate(request):
```
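The required entry point can be sketched as a minimal engine file. Only the `async def generate(request)` signature comes from the text above; the request shape, the dict-chunk responses, and the echo behaviour are illustrative assumptions, not the real dynamo protocol:

```python
# engine.py -- minimal bring-your-own-engine sketch.
# Only the `generate(request)` generator signature is from the docs;
# the request/response shapes below are illustrative assumptions.
async def generate(request):
    # A real engine would run inference here and stream tokens back.
    # This sketch just echoes the request as a single response chunk.
    yield {"text": f"echo: {request}"}
```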
-Build: `cargo build --release --features python`
+Build: `cargo build --features python`
### Python does the pre-processing
@@ -343,7 +345,7 @@ TensorRT-LLM. Requires `clang` and `libclang-dev`.
Build:
```
-cargo build --release --features trtllm
+cargo build --features trtllm
```
Run:
......