Commit c14e460c authored by ishandhanani, committed by GitHub

docs: hello world and vllm process docs (#525)

parent 4b6cfc1b
......@@ -15,8 +15,17 @@ See the License for the specific language governing permissions and
limitations under the License.
-->
# Hello World Example
## Overview
This example demonstrates the basic concepts of Dynamo by creating a simple multi-service pipeline. It shows how to:
1. Create and connect multiple Dynamo services
2. Pass data between services using Dynamo's runtime
3. Set up a simple HTTP API endpoint
4. Deploy and interact with a Dynamo service graph
Pipeline Architecture:
```
......@@ -38,16 +47,35 @@ Users/Clients (HTTP)
└─────────────┘
```
## Component Descriptions
### Frontend Service
- Serves as the entry point for external HTTP requests
- Exposes a `/generate` HTTP API endpoint that clients can call
- Processes incoming text and passes it to the Middle service
### Middle Service
- Acts as an intermediary service in the pipeline
- Receives requests from the Frontend
- Appends "-mid" to the text and forwards it to the Backend
### Backend Service
- Functions as the final service in the pipeline
- Processes requests from the Middle service
- Appends "-back" to the text and yields tokens
## Running the Example
## Unified serve
1. Launch all three services using a single command -
1. Launch all three services using a single command:
```bash
cd /workspace/examples/hello_world
dynamo serve hello_world:Frontend
```
2. Send request to frontend using curl -
The `dynamo serve` command deploys the entire service graph and automatically handles the dependencies between the Frontend, Middle, and Backend services.
2. Send a request to the Frontend using curl:
```bash
curl -X 'POST' \
......@@ -58,3 +86,16 @@ curl -X 'POST' \
"text": "test"
}'
```
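The same request can also be issued from Python. The following is a minimal sketch using only the standard library. The base URL is an assumption (the host and port are not visible in this diff), and the helper names `build_generate_request` and `send` are hypothetical; only the `/generate` path and the `{"text": ...}` body come from the example above.

```python
import json
import urllib.request

# Assumption: placeholder address; substitute the host and port that
# your `dynamo serve` deployment actually listens on.
ASSUMED_BASE_URL = "http://localhost:8000"


def build_generate_request(
    text: str, base_url: str = ASSUMED_BASE_URL
) -> urllib.request.Request:
    """Build a POST request for the Frontend's /generate endpoint."""
    payload = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/generate",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 60.0) -> str:
    """Send the request and return the raw response body as text."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

With the services running, `send(build_generate_request("test"))` would return the response body as a string.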
## Expected Output
When you send the request with "test" as input, the response will show how the text flows through each service:
```
Frontend: Middle: Backend: test-mid-back
```
This demonstrates how:
1. The Frontend receives "test"
2. The Middle service adds "-mid" to create "test-mid"
3. The Backend service adds "-back" to create "test-mid-back"
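The flow above can be reproduced without Dynamo using plain async generators. This is only a sketch of the data flow, not the Dynamo implementation: the function names are hypothetical stand-ins for the three services, and the `Frontend: ` prefix is inferred from the expected output rather than from the Frontend source.

```python
import asyncio
from typing import AsyncIterator, List


async def backend(text: str) -> AsyncIterator[str]:
    # Append "-back", then yield one whitespace-separated token at a time.
    for token in f"{text}-back".split():
        yield f"Backend: {token}"


async def middle(text: str) -> AsyncIterator[str]:
    # Append "-mid", forward to the backend, and relay each streamed chunk.
    async for response in backend(f"{text}-mid"):
        yield f"Middle: {response}"


async def frontend(text: str) -> AsyncIterator[str]:
    # Assumed behavior: pass the text through and prefix each chunk.
    async for response in middle(text):
        yield f"Frontend: {response}"


async def collect(text: str) -> List[str]:
    # Drain the stream into a list so the result is easy to inspect.
    return [chunk async for chunk in frontend(text)]


def run_pipeline(text: str) -> List[str]:
    return asyncio.run(collect(text))
```

Calling `run_pipeline("test")` returns a single chunk, `"Frontend: Middle: Backend: test-mid-back"`. Input containing whitespace produces one chunk per token, because the backend splits the final string before yielding.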
\ No newline at end of file
......@@ -13,10 +13,14 @@
# See the License for the specific language governing permissions and
# limitations under the License.
import logging
from pydantic import BaseModel
from dynamo.sdk import DYNAMO_IMAGE, api, depends, dynamo_endpoint, service
logger = logging.getLogger(__name__)
"""
Pipeline Architecture:
......@@ -48,32 +52,27 @@ class ResponseType(BaseModel):
@service(
resources={"cpu": "2"},
traffic={"timeout": 30},
dynamo={
"enabled": True,
"namespace": "inference",
},
workers=3,
image=DYNAMO_IMAGE,
)
class Backend:
def __init__(self) -> None:
print("Starting backend")
logger.info("Starting backend")
@dynamo_endpoint()
async def generate(self, req: RequestType):
"""Generate tokens."""
req_text = req.text
print(f"Backend received: {req_text}")
logger.info(f"Backend received: {req_text}")
text = f"{req_text}-back"
for token in text.split():
yield f"Backend: {token}"
@service(
resources={"cpu": "2"},
traffic={"timeout": 30},
dynamo={"enabled": True, "namespace": "inference"},
image=DYNAMO_IMAGE,
)
......@@ -81,23 +80,21 @@ class Middle:
backend = depends(Backend)
def __init__(self) -> None:
print("Starting middle")
logger.info("Starting middle")
@dynamo_endpoint()
async def generate(self, req: RequestType):
"""Forward requests to backend."""
req_text = req.text
print(f"Middle received: {req_text}")
logger.info(f"Middle received: {req_text}")
text = f"{req_text}-mid"
next_request = RequestType(text=text).model_dump_json()
async for response in self.backend.generate(next_request):
print(f"Middle received response: {response}")
logger.info(f"Middle received response: {response}")
yield f"Middle: {response}"
@service(
resources={"cpu": "1"},
traffic={"timeout": 60},
image=DYNAMO_IMAGE,
) # Regular HTTP API
class Frontend:
......
......@@ -157,9 +157,8 @@ See [multinode-examples.md](multinode-examples.md) for more details.
### Close deployment
Kill all dynamo processes managed by circusd.
> [!IMPORTANT]
> We are aware of an issue where vLLM subprocesses might not be killed when `ctrl-c` is pressed.
> We are working on addressing this. Relevant vLLM issues can be found [here](https://github.com/vllm-project/vllm/pull/8492) and [here](https://github.com/vllm-project/vllm/issues/6219#issuecomment-2439257824).
```
ctrl-c
pkill python3
```
To stop the deployment, press `ctrl-c`, which kills the different components. To kill any remaining vLLM subprocesses, run `nvidia-smi` and `kill -9` the remaining processes, or run `pkill python3` from inside the container.
\ No newline at end of file