Commit 00e54337 authored by Graham King, committed by GitHub

chore: Make debug profile use all optimizations (#317)

It hardly slows the build down, and it makes things run much faster. That allows us to switch to the debug (default) profile for development, and keep the release profile for, well, releasing.

Motivated by changes in https://github.com/ai-dynamo/dynamo/pull/279
parent d4d93b6a
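One consequence of the `[profile.dev]` change in this commit may be worth keeping in mind: raising `opt-level` to 3 does not turn the dev profile into the release profile. Assuming Cargo's documented defaults for the remaining dev settings, debug assertions and integer overflow checks stay enabled. The sketch below is not part of the commit, and `describe_build` is a hypothetical helper used only for illustration. It reports which kind of build the binary came from.

```rust
/// Hypothetical helper (not from the repository): describe the build
/// based on whether debug assertions were compiled in.
fn describe_build(debug_assertions: bool) -> &'static str {
    if debug_assertions {
        "dev-like build: debug assertions enabled"
    } else {
        "release-like build: debug assertions disabled"
    }
}

fn main() {
    // cfg!(debug_assertions) is evaluated at compile time. Under the dev
    // profile it defaults to true, independent of the opt-level setting;
    // under the release profile it defaults to false.
    println!("{}", describe_build(cfg!(debug_assertions)));
}
```

Compiling this with a plain `cargo build` and again with `cargo build --release` should select different branches, which is a quick way to confirm which profile a given binary was built with.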
@@ -71,6 +71,11 @@ prometheus = { version = "0.13" }
 [profile.dev.package]
 insta.opt-level = 3
+[profile.dev]
+# release level optimizations otherwise everything feels slow
+opt-level = 3
 [profile.release]
+# These make the build much slower but shrink the binary, and could help performance
 codegen-units = 1
 lto = true
@@ -116,24 +116,26 @@ Optionally can run `cargo build` from any location with arguments:
 - Linux with GPU and CUDA (tested on Ubuntu):
 ```
-cargo build --release --features mistralrs,cuda
+cargo build --features cuda
 ```
 - macOS with Metal:
 ```
-cargo build --release --features mistralrs,metal
+cargo build --features metal
 ```
 - CPU only:
 ```
-cargo build --release --features mistralrs
+cargo build
 ```
-The binary will be called `dynamo-run` in `target/release`
+The binary will be called `dynamo-run` in `target/debug`
 ```
-cd target/release
+cd target/debug
 ```
+Build with `--release` for a smaller binary and better performance, but longer build times. The binary will be in `target/release`.
 ## sglang
 1. Setup the python virtual env:
@@ -149,7 +151,7 @@ uv pip install "sglang[all]==0.4.2" --find-links https://flashinfer.ai/whl/cu124
 2. Build
 ```
-cargo build --release --features sglang
+cargo build --features sglang
 ```
 3. Run
@@ -168,7 +170,7 @@ dynamo-run in=none out=sglang --model-path ~/llm_models/DeepSeek-R1-Distill-Llam
 ## llama_cpp
-- `cargo build --release --features llamacpp,cuda`
+- `cargo build --features llamacpp,cuda`
 - `dynamo-run out=llama_cpp --model-path ~/llm_models/Llama-3.2-3B-Instruct-Q6_K.gguf --model-config ~/llm_models/Llama-3.2-3B-Instruct/`
@@ -197,7 +199,7 @@ uv pip install vllm==0.7.3 setuptools
 Build:
 ```
-cargo build --release --features vllm
+cargo build --features vllm
 ```
 Run (still inside that virtualenv) - HF repo:
@@ -230,7 +232,7 @@ You can provide your own engine in a Python file. The file must provide a genera
 async def generate(request):
 ```
-Build: `cargo build --release --features python`
+Build: `cargo build --features python`
 ### Python does the pre-processing
@@ -343,7 +345,7 @@ TensorRT-LLM. Requires `clang` and `libclang-dev`.
 Build:
 ```
-cargo build --release --features trtllm
+cargo build --features trtllm
 ```
 Run:
......