Commit 836d7417 (unverified), authored by Anant Sharma, committed by GitHub

feat: restructure source code for python packaging (#3201)


Signed-off-by: Anant Sharma <anants@nvidia.com>
parent 4e38d628
@@ -77,9 +77,10 @@ cargo build --locked --profile dev --features dynamo-llm/block-manager
 # echo "maturin is already installed"
 # fi
+# install ai-dynamo-runtime
 (cd $DYNAMO_HOME/lib/bindings/python && retry maturin develop)
-# installs overall python packages, grabs binaries from .build/target/debug
+# install ai-dynamo
 cd $DYNAMO_HOME && retry env DYNAMO_BIN_PATH=$CARGO_TARGET_DIR/debug uv pip install -e .
 { set +x; } 2>/dev/null
@@ -96,23 +97,6 @@ if [ -n "${PYTHONPATH:-}" ]; then
 echo "# PYTHONPATH modified from /workspace to use DYNAMO_HOME" >> ~/.bashrc
 show_and_run echo "export PYTHONPATH=\"$PYTHONPATH\"" >> ~/.bashrc
 fi
-else
-# PYTHONPATH not set, extract from README.md or use backup
-PYTHONPATH_LINE=$(grep "^export PYTHONPATH=" $DYNAMO_HOME/README.md | head -n1)
-if [ -n "$PYTHONPATH_LINE" ]; then
-# Remove the ${PYTHONPATH}: prefix if it exists, then replace $(pwd) with the actual path
-MODIFIED_LINE=$(echo "$PYTHONPATH_LINE" | sed 's/\${PYTHONPATH}://g' | sed "s|\$(pwd)|$DYNAMO_HOME|g")
-eval "$MODIFIED_LINE"
-# Also add to .bashrc for persistence (with expanded path)
-if ! grep -q "export PYTHONPATH=" ~/.bashrc; then
-# MODIFIED_LINE already has $DYNAMO_HOME expanded to /home/ubuntu/dynamo
-echo "# PYTHONPATH is derived from the README.md" >> ~/.bashrc
-show_and_run echo "$MODIFIED_LINE" >> ~/.bashrc
-fi
-else
-# Back-up version. Make sure to sync this with the README.md's PYTHONPATH. This is the version from 2025-08-19
-show_and_run export PYTHONPATH=$DYNAMO_HOME/components/frontend/src:$DYNAMO_HOME/components/planner/src:$DYNAMO_HOME/components/backends/vllm/src:$DYNAMO_HOME/components/backends/sglang/src:$DYNAMO_HOME/components/backends/trtllm/src:$DYNAMO_HOME/components/backends/llama_cpp/src:$DYNAMO_HOME/components/backends/mocker/src
-fi
 fi
 if ! grep -q "export GPG_TTY=" ~/.bashrc; then
......
@@ -29,18 +29,21 @@ vllm: &vllm
 - 'container/deps/requirements.vllm.txt'
 - 'container/deps/vllm/**'
 - 'components/backends/vllm/**'
+- 'components/src/dynamo/vllm/**'
 - 'tests/serve/test_vllm.py'
 sglang: &sglang
 - 'container/Dockerfile.sglang'
 - 'container/Dockerfile.sglang-wideep'
 - 'components/backends/sglang/**'
+- 'components/src/dynamo/sglang/**'
 - 'container/build.sh'
 - 'tests/serve/test_sglang.py'
 trtllm: &trtllm
 - 'container/Dockerfile.trtllm'
 - 'components/backends/trtllm/**'
+- 'components/src/dynamo/trtllm/**'
 - 'container/build.sh'
 - 'container/build_trtllm_wheel.sh'
 - 'container/deps/**'
......
@@ -314,14 +314,9 @@ maturin develop --uv
 ```
 cd $PROJECT_ROOT
-uv pip install .
-# For development, use
-export PYTHONPATH="${PYTHONPATH}:$(pwd)/components/frontend/src:$(pwd)/components/planner/src:$(pwd)/components/backends/vllm/src:$(pwd)/components/backends/sglang/src:$(pwd)/components/backends/trtllm/src:$(pwd)/components/backends/llama_cpp/src:$(pwd)/components/backends/mocker/src"
+uv pip install -e .
 ```
-> [!Note]
-> Editable (`-e`) does not work because the `dynamo` package is split over multiple directories, one per backend.
 You should now be able to run `python -m dynamo.frontend`.
 Remember that nats and etcd must be running (see earlier).
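Once the editable install from the hunk above has run, one quick sanity check is to ask Python's import system whether `dynamo.frontend` resolves before launching it. The following is a minimal sketch only; the helper name `is_importable` is hypothetical and is not part of the Dynamo repository.

```python
import importlib.util


def is_importable(module_name: str) -> bool:
    """Return True if the import system can locate module_name (hypothetical helper)."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        # For dotted names, find_spec imports the parent package first, so a
        # missing parent (e.g. no `dynamo` installed) raises ModuleNotFoundError,
        # which is a subclass of ImportError.
        return False


if __name__ == "__main__":
    for name in ("dynamo", "dynamo.frontend"):
        status = "found" if is_importable(name) else "missing"
        print(f"{name}: {status}")
```

If either name is reported as missing, the install step most likely did not run in the active environment. Note that checking a dotted name imports its parent package as a side effect.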
......
@@ -31,7 +31,7 @@ Each engine provides launch scripts for different deployment patterns in their r
 ## Core Components
-### [Backends](backends/)
+### [Backends](src/dynamo/)
 The backends directory contains inference engine integrations and implementations, with a key focus on:
@@ -40,7 +40,7 @@ The backends directory contains inference engine integrations and implementation
 - **TensorRT-LLM** - TensorRT-LLM integration with disaggregated serving capabilities
-### [Frontend](frontend/)
+### [Frontend](src/dynamo/frontend/)
 The frontend component provides the HTTP API layer and request processing:
@@ -49,7 +49,7 @@ The frontend component provides the HTTP API layer and request processing:
 - **Router** - Routes requests to appropriate workers based on load and KV cache state
 - **Auto-discovery** - Automatically discovers and registers available workers
-### [Planner](planner/)
+### [Planner](src/dynamo/planner/)
 The planner component monitors system state and dynamically adjusts worker allocation:
......
@@ -116,11 +116,9 @@ uv pip install maturin
 cd $DYNAMO_HOME/lib/bindings/python
 maturin develop --uv
 cd $DYNAMO_HOME
-uv pip install .
-export PYTHONPATH="${PYTHONPATH}:$(pwd)/components/backends/sglang/src"
-# install target sglang version (you can choose any version)
-# we include the prerelease flag in order to install flashinfer rc versions
-uv pip install --prerelease=allow sglang[all]==0.4.9.post6
+# installs sglang supported version along with dynamo
+# include the prerelease flag to install flashinfer rc versions
+uv pip install --prerelease=allow -e .[sglang]
 ```
 </details>
......
@@ -61,8 +61,8 @@ git checkout $(git describe --tags $(git rev-list --tags --max-count=1))
 ### Large Scale P/D and WideEP Features
 | Feature | TensorRT-LLM | Notes |
-|--------------------|--------------|-----------------------------------------------------------------------|
+|--------------------|--------------|-----------------------------------------------------------------|
 | **WideEP** | ✅ | |
 | **DP Rank Routing**| ✅ | |
 | **GB200 Support** | ✅ | |
......
@@ -13,10 +13,10 @@ python -m dynamo.llama_cpp --model-path /data/models/Qwen3-0.6B-Q8_0.gguf [args]
 ## Request Migration
-You can enable [request migration](../../../docs/architecture/request_migration.md) to handle worker failures gracefully. Use the `--migration-limit` flag to specify how many times a request can be migrated to another worker:
+You can enable [request migration](/docs/architecture/request_migration.md) to handle worker failures gracefully. Use the `--migration-limit` flag to specify how many times a request can be migrated to another worker:
 ```bash
 python3 -m dynamo.llama_cpp ... --migration-limit=3
 ```
-This allows a request to be migrated up to 3 times before failing. See the [Request Migration Architecture](../../../docs/architecture/request_migration.md) documentation for details on how this works.
+This allows a request to be migrated up to 3 times before failing. See the [Request Migration Architecture](/docs/architecture/request_migration.md) documentation for details on how this works.
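To make the "up to 3 times before failing" wording concrete, here is a toy model of a migration limit. It is a sketch under stated assumptions, not Dynamo's implementation: `WorkerFailure`, `run_with_migration`, and the round-robin choice of the next worker are all hypothetical names and behavior invented for illustration.

```python
from typing import Callable, Sequence


class WorkerFailure(Exception):
    """Raised by a (hypothetical) worker that fails while serving a request."""


def run_with_migration(
    request: str,
    workers: Sequence[Callable[[str], str]],
    migration_limit: int,
) -> str:
    """Try a worker; on failure, move to the next one at most migration_limit times."""
    migrations = 0
    index = 0
    while True:
        worker = workers[index % len(workers)]
        try:
            return worker(request)
        except WorkerFailure:
            if migrations >= migration_limit:
                # Limit reached: the request fails instead of migrating again.
                raise
            migrations += 1
            index += 1  # assumption: the next worker is picked round-robin


def failing_worker(request: str) -> str:
    raise WorkerFailure("worker lost")


def healthy_worker(request: str) -> str:
    return f"done: {request}"
```

Under this model, a limit of 3 permits four attempts in total: the original one plus three migrations.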