Commit a38319e6 authored by Tushar Sharma, committed by GitHub

build: OPS-810: add dynamo frontend image w/EPP support (#4150)


Signed-off-by: Tushar Sharma <tusharma@nvidia.com>
parent f50c3861
# syntax=docker/dockerfile:1.10.0
# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
# SPDX-License-Identifier: Apache-2.0
ARG DYNAMO_BASE_IMAGE="dynamo:latest-none"
ARG EPP_IMAGE="us-central1-docker.pkg.dev/k8s-staging-images/gateway-api-inference-extension/epp:v0.5.1-dirty"
ARG PYTHON_VERSION=3.12
FROM ${DYNAMO_BASE_IMAGE} AS dynamo_base
FROM ${EPP_IMAGE} AS epp
FROM nvcr.io/nvidia/base/ubuntu:noble-20250619 AS frontend
ARG PYTHON_VERSION
RUN apt-get update -y \
&& apt-get install -y --no-install-recommends \
# required for EPP
ca-certificates \
libstdc++6 \
&& apt-get clean \
&& rm -rf /var/lib/apt/lists/*
# Create dynamo user with group 0 for OpenShift compatibility
RUN userdel -r ubuntu > /dev/null 2>&1 || true \
&& useradd -m -s /bin/bash -g 0 dynamo \
&& [ "$(id -u dynamo)" -eq 1000 ] \
&& mkdir -p /home/dynamo/.cache /opt/dynamo /workspace \
&& chown -R dynamo: /opt/dynamo /home/dynamo/.cache /workspace \
&& chmod -R g+w /opt/dynamo /home/dynamo/.cache /workspace
# Set HOME so ModelExpress can find the cache directory
ENV HOME=/home/dynamo
# Switch to dynamo user
USER dynamo
ENV DYNAMO_HOME=/opt/dynamo
WORKDIR /
COPY --chown=dynamo: --from=epp /epp /epp
COPY --chown=dynamo: container/launch_message.txt /opt/dynamo/.launch_screen
# Copy tests, examples, benchmarks, deploy, components and recipes with correct ownership
COPY --chown=dynamo: tests /workspace/tests
COPY --chown=dynamo: examples /workspace/examples
COPY --chown=dynamo: benchmarks /workspace/benchmarks
COPY --chown=dynamo: deploy /workspace/deploy
COPY --chown=dynamo: components/ /workspace/components/
COPY --chown=dynamo: recipes/ /workspace/recipes/
# Copy attribution files with correct ownership
COPY --chown=dynamo: ATTRIBUTION* LICENSE /workspace/
ENV VIRTUAL_ENV=/opt/dynamo/venv
ENV PATH="/opt/dynamo/venv/bin:$PATH"
# Copy uv to system /bin
COPY --from=dynamo_base /bin/uv /bin/uvx /bin/
RUN uv python install $PYTHON_VERSION
# Copy virtual environment directly from dynamo_base (dev image)
# This includes all installed packages: dynamo, nixl, requirements.txt, requirements.test.txt
COPY --chown=dynamo: --from=dynamo_base /opt/dynamo/venv/ /opt/dynamo/venv/
# Setup environment for all users
USER root
RUN chmod 755 /opt/dynamo/.launch_screen && \
echo 'source /opt/dynamo/venv/bin/activate' >> /etc/bash.bashrc && \
echo 'cat /opt/dynamo/.launch_screen' >> /etc/bash.bashrc
USER dynamo
ENTRYPOINT ["/epp"]
CMD ["/bin/bash"]
@@ -15,6 +15,8 @@ The NVIDIA Dynamo project uses containerized development and deployment to maint
- `Dockerfile.trtllm` - For TensorRT-LLM inference backend
- `Dockerfile.sglang` - For SGLang inference backend
- `Dockerfile` - Base/standalone configuration
- `Dockerfile.frontend` - For Kubernetes Gateway API Inference Extension integration with EPP
- `Dockerfile.epp` - For building the Endpoint Picker (EPP) image
### Why Containerization?
@@ -192,6 +194,55 @@ The `build.sh --dev-image` option takes a dev image and then builds a local-dev
./build.sh --dev-image dynamo:latest-vllm --framework vllm --dry-run
```
### Building the Frontend Image
The frontend image is a specialized container that bundles the Dynamo components (NATS, etcd, dynamo, NIXL, etc.) with the Endpoint Picker (EPP) for integration with the Kubernetes Gateway API Inference Extension. It is used primarily for inference gateway deployments.
**Step 1: Build the Custom Dynamo EPP Image**
Follow the instructions in [`deploy/inference-gateway/README.md`](../deploy/inference-gateway/README.md) under the "Build the custom EPP image" section. This process:
- Clones the Gateway API Inference Extension repository
- Applies Dynamo-specific patches for custom routing
- Builds the Dynamo router as a static library
- Creates a custom EPP image with integrated Dynamo routing capabilities
**Step 2: Build the Dynamo Base Image**
The base image contains the core Dynamo runtime components, NATS server, etcd, and Python dependencies:
```bash
# Build the base dev image (framework=none for frontend-only deployment)
./build.sh --framework none --target dev
```
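Before moving on, it may help to confirm that the image produced by this step exists locally. The following is a minimal sketch, not part of the repository: `require_image` is a hypothetical helper built on `docker image inspect`, and the tag `dynamo:latest-none-dev` is assumed from the `DYNAMO_BASE_IMAGE` value used in the Step 3 command. Adjust it if `build.sh` reports a different tag.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the Dynamo repository).
# Returns 0 if the named image exists in the local Docker image store,
# otherwise prints a message to stderr and returns 1.
require_image() {
    local image="$1"
    if ! docker image inspect "$image" > /dev/null 2>&1; then
        echo "image not found locally: $image" >&2
        return 1
    fi
}
```

For example, `require_image dynamo:latest-none-dev && echo "base image ready"` prints the confirmation only when the image is present.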
**Step 3: Build the Frontend Image**
Now build the frontend image that combines the Dynamo base with the EPP:
```bash
# Build the frontend image using the pre-built EPP image from Step 1
docker buildx build --load --platform linux/amd64 \
--build-arg DYNAMO_BASE_IMAGE=dynamo:latest-none-dev \
--build-arg EPP_IMAGE={EPP_IMAGE_TAG} \
--build-arg PYTHON_VERSION=3.12 \
-f container/Dockerfile.frontend \
-t dynamo:latest-none-frontend \
.
```
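Because `{EPP_IMAGE_TAG}` is a placeholder that must be replaced by hand, one option is to compose the command from variables and review it before running. The sketch below is illustrative only: `frontend_build_cmd` is a hypothetical helper, and its defaults simply mirror the values in the command above. It prints the command rather than executing it.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the Dynamo repository).
# Prints the Step 3 build command for a given EPP image so it can be
# reviewed before running. Values are inserted unquoted, so this is only
# suitable for ordinary image references without spaces.
frontend_build_cmd() {
    if [ "$#" -lt 1 ] || [ -z "$1" ]; then
        echo "usage: frontend_build_cmd <epp-image> [base-image] [output-tag]" >&2
        return 1
    fi
    local epp_image="$1"
    local base_image="${2:-dynamo:latest-none-dev}"
    local output_tag="${3:-dynamo:latest-none-frontend}"

    printf 'docker buildx build --load --platform linux/amd64'
    printf ' --build-arg DYNAMO_BASE_IMAGE=%s' "$base_image"
    printf ' --build-arg EPP_IMAGE=%s' "$epp_image"
    printf ' --build-arg PYTHON_VERSION=3.12'
    printf ' -f container/Dockerfile.frontend'
    printf ' -t %s .\n' "$output_tag"
}
```

Calling `frontend_build_cmd <your-epp-image>` from the repository root prints a single-line command equivalent to the one above, with the placeholder filled in.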
#### Frontend Image Contents
The frontend image includes:
- **EPP (Endpoint Picker)**: Handles request routing and load balancing for the inference gateway
- **Dynamo Runtime**: Core platform components and routing logic
- **NIXL**: NVIDIA Inference Xfer Library for high-performance data transfer
- **Benchmarking Tools**: Performance testing utilities (aiperf, aiconfigurator, etc.)
- **Python Environment**: Virtual environment with all required dependencies
- **NATS Server**: Message broker for Dynamo's distributed communication
- **etcd**: Distributed key-value store for configuration and coordination
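A quick way to sanity-check a built image is to look for the paths that `Dockerfile.frontend` populates. The sketch below is an assumption-laden example, not an official check: `check_frontend_layout` is a hypothetical helper, and the path list is taken from the `COPY` and `RUN` instructions in `Dockerfile.frontend` (`/epp`, the virtual environment under `/opt/dynamo/venv`, the launch screen, and `/workspace`). It does not verify NATS, etcd, or NIXL.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the Dynamo repository).
# Checks that the paths populated by Dockerfile.frontend exist under the
# given root directory (default: /). Reports each missing path on stderr
# and returns 1 if any are missing, 0 otherwise.
check_frontend_layout() {
    local root="${1:-/}"
    local missing=0
    local path
    for path in \
        epp \
        opt/dynamo/venv/bin/activate \
        opt/dynamo/.launch_screen \
        workspace
    do
        if [ ! -e "${root%/}/${path}" ]; then
            echo "missing: /${path}" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

Since the image's entrypoint is `/epp`, a shell for running such a check has to be requested explicitly, for example with `docker run --rm -it --entrypoint /bin/bash dynamo:latest-none-frontend`.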
#### Deployment
The frontend image is designed for Kubernetes deployment with the Gateway API Inference Extension. See [`deploy/inference-gateway/README.md`](../deploy/inference-gateway/README.md) for complete deployment instructions using Helm charts.
### run.sh - Container Runtime Manager
The `run.sh` script launches Docker containers with the appropriate configuration for development and inference workloads.