Unverified commit 836d7417, authored by Anant Sharma, committed by GitHub

feat: restructure source code for python packaging (#3201)


Signed-off-by: Anant Sharma <anants@nvidia.com>
parent 4e38d628
@@ -77,9 +77,10 @@ cargo build --locked --profile dev --features dynamo-llm/block-manager
# echo "maturin is already installed"
# fi
# install ai-dynamo-runtime
(cd $DYNAMO_HOME/lib/bindings/python && retry maturin develop)
# installs overall python packages, grabs binaries from .build/target/debug
# install ai-dynamo
cd $DYNAMO_HOME && retry env DYNAMO_BIN_PATH=$CARGO_TARGET_DIR/debug uv pip install -e .
{ set +x; } 2>/dev/null
@@ -96,23 +97,6 @@ if [ -n "${PYTHONPATH:-}" ]; then
echo "# PYTHONPATH modified from /workspace to use DYNAMO_HOME" >> ~/.bashrc
show_and_run echo "export PYTHONPATH=\"$PYTHONPATH\"" >> ~/.bashrc
fi
else
# PYTHONPATH not set, extract from README.md or use backup
PYTHONPATH_LINE=$(grep "^export PYTHONPATH=" $DYNAMO_HOME/README.md | head -n1)
if [ -n "$PYTHONPATH_LINE" ]; then
# Remove the ${PYTHONPATH}: prefix if it exists, then replace $(pwd) with the actual path
MODIFIED_LINE=$(echo "$PYTHONPATH_LINE" | sed 's/\${PYTHONPATH}://g' | sed "s|\$(pwd)|$DYNAMO_HOME|g")
eval "$MODIFIED_LINE"
# Also add to .bashrc for persistence (with expanded path)
if ! grep -q "export PYTHONPATH=" ~/.bashrc; then
# MODIFIED_LINE already has $DYNAMO_HOME expanded to /home/ubuntu/dynamo
echo "# PYTHONPATH is derived from the README.md" >> ~/.bashrc
show_and_run echo "$MODIFIED_LINE" >> ~/.bashrc
fi
else
# Back-up version. Make sure to sync this with the README.md's PYTHONPATH. This is the version from 2025-08-19
show_and_run export PYTHONPATH=$DYNAMO_HOME/components/frontend/src:$DYNAMO_HOME/components/planner/src:$DYNAMO_HOME/components/backends/vllm/src:$DYNAMO_HOME/components/backends/sglang/src:$DYNAMO_HOME/components/backends/trtllm/src:$DYNAMO_HOME/components/backends/llama_cpp/src:$DYNAMO_HOME/components/backends/mocker/src
fi
fi
if ! grep -q "export GPG_TTY=" ~/.bashrc; then
......
@@ -29,18 +29,21 @@ vllm: &vllm
- 'container/deps/requirements.vllm.txt'
- 'container/deps/vllm/**'
- 'components/backends/vllm/**'
- 'components/src/dynamo/vllm/**'
- 'tests/serve/test_vllm.py'
sglang: &sglang
- 'container/Dockerfile.sglang'
- 'container/Dockerfile.sglang-wideep'
- 'components/backends/sglang/**'
- 'components/src/dynamo/sglang/**'
- 'container/build.sh'
- 'tests/serve/test_sglang.py'
trtllm: &trtllm
- 'container/Dockerfile.trtllm'
- 'components/backends/trtllm/**'
- 'components/src/dynamo/trtllm/**'
- 'container/build.sh'
- 'container/build_trtllm_wheel.sh'
- 'container/deps/**'
......
@@ -314,14 +314,9 @@ maturin develop --uv
```
cd $PROJECT_ROOT
uv pip install .
# For development, use
export PYTHONPATH="${PYTHONPATH}:$(pwd)/components/frontend/src:$(pwd)/components/planner/src:$(pwd)/components/backends/vllm/src:$(pwd)/components/backends/sglang/src:$(pwd)/components/backends/trtllm/src:$(pwd)/components/backends/llama_cpp/src:$(pwd)/components/backends/mocker/src"
uv pip install -e .
```
> [!Note]
> Editable (`-e`) does not work because the `dynamo` package is split over multiple directories, one per backend.
You should now be able to run `python -m dynamo.frontend`.
Remember that nats and etcd must be running (see earlier).
......
@@ -31,7 +31,7 @@ Each engine provides launch scripts for different deployment patterns in their r
## Core Components
### [Backends](backends/)
### [Backends](src/dynamo/)
The backends directory contains inference engine integrations and implementations, with a key focus on:
@@ -40,7 +40,7 @@ The backends directory contains inference engine integrations and implementation
- **TensorRT-LLM** - TensorRT-LLM integration with disaggregated serving capabilities
### [Frontend](frontend/)
### [Frontend](src/dynamo/frontend/)
The frontend component provides the HTTP API layer and request processing:
@@ -49,7 +49,7 @@ The frontend component provides the HTTP API layer and request processing:
- **Router** - Routes requests to appropriate workers based on load and KV cache state
- **Auto-discovery** - Automatically discovers and registers available workers
### [Planner](planner/)
### [Planner](src/dynamo/planner/)
The planner component monitors system state and dynamically adjusts worker allocation:
......
@@ -116,11 +116,9 @@ uv pip install maturin
cd $DYNAMO_HOME/lib/bindings/python
maturin develop --uv
cd $DYNAMO_HOME
uv pip install .
export PYTHONPATH="${PYTHONPATH}:$(pwd)/components/backends/sglang/src"
# install target sglang version (you can choose any version)
# we include the prerelease flag in order to install flashinfer rc versions
uv pip install --prerelease=allow sglang[all]==0.4.9.post6
# installs sglang supported version along with dynamo
# include the prerelease flag to install flashinfer rc versions
uv pip install --prerelease=allow -e .[sglang]
```
</details>
......
@@ -61,8 +61,8 @@ git checkout $(git describe --tags $(git rev-list --tags --max-count=1))
### Large Scale P/D and WideEP Features
| Feature | TensorRT-LLM | Notes |
|--------------------|--------------|-----------------------------------------------------------------------|
| Feature | TensorRT-LLM | Notes |
|--------------------|--------------|-----------------------------------------------------------------|
| **WideEP** | ✅ | |
| **DP Rank Routing**| ✅ | |
| **GB200 Support** | ✅ | |
......
@@ -13,10 +13,10 @@ python -m dynamo.llama_cpp --model-path /data/models/Qwen3-0.6B-Q8_0.gguf [args]
## Request Migration
You can enable [request migration](../../../docs/architecture/request_migration.md) to handle worker failures gracefully. Use the `--migration-limit` flag to specify how many times a request can be migrated to another worker:
You can enable [request migration](/docs/architecture/request_migration.md) to handle worker failures gracefully. Use the `--migration-limit` flag to specify how many times a request can be migrated to another worker:
```bash
python3 -m dynamo.llama_cpp ... --migration-limit=3
```
This allows a request to be migrated up to 3 times before failing. See the [Request Migration Architecture](../../../docs/architecture/request_migration.md) documentation for details on how this works.
This allows a request to be migrated up to 3 times before failing. See the [Request Migration Architecture](/docs/architecture/request_migration.md) documentation for details on how this works.
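
Reviewer note on the layout change: the hunks above replace the per-component roots (`components/frontend/src`, `components/backends/vllm/src`, and so on) with a single `components/src/dynamo/` tree, and drop the long `PYTHONPATH` export in favour of `uv pip install -e .`. The sketch below is only an illustration of why that helps, not code from this commit. It builds two throwaway layouts in a temporary directory, using the hypothetical package names `demo_split` and `demo_single` so it cannot collide with an installed `dynamo`, and shows that the split layout needs one search-path entry per component while the consolidated layout needs one entry in total.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def make_module(root: Path, dotted: str) -> None:
    """Create package `dotted` under `root`; only the leaf gets an __init__.py,
    so the top-level name behaves as a namespace package."""
    pkg_dir = root.joinpath(*dotted.split("."))
    pkg_dir.mkdir(parents=True, exist_ok=True)
    (pkg_dir / "__init__.py").write_text(f"NAME = {dotted!r}\n")


def import_names(roots, dotted_names):
    """Import each module with `roots` temporarily prepended to sys.path."""
    saved = list(sys.path)
    sys.path[:0] = [str(r) for r in roots]
    try:
        importlib.invalidate_caches()
        return [importlib.import_module(n).NAME for n in dotted_names]
    finally:
        sys.path[:] = saved  # restore the original search path


with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)

    # Old-style layout (assumed shape): one source root per component.
    split_roots = [base / "frontend" / "src", base / "backends" / "vllm" / "src"]
    make_module(split_roots[0], "demo_split.frontend")
    make_module(split_roots[1], "demo_split.vllm")

    # New-style layout (assumed shape): every component under one source root.
    single_root = base / "components" / "src"
    make_module(single_root, "demo_single.frontend")
    make_module(single_root, "demo_single.vllm")

    split_names = import_names(split_roots, ["demo_split.frontend", "demo_split.vllm"])
    single_names = import_names([single_root], ["demo_single.frontend", "demo_single.vllm"])

print(len(split_roots), "path entries ->", split_names)
print(1, "path entry ->", single_names)
```

With a single root, an editable install only has to expose one directory, which is consistent with the removed README note that `-e` did not work while the package was split across directories.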