# Dynamo SDK

Dynamo is a Python-based SDK for building and deploying distributed inference applications. Dynamo leverages concepts from open source projects like [BentoML](https://github.com/bentoml/bentoml) to provide a developer-friendly experience for going from local development to K8s deployment.

## Installation

```bash
pip install ai-dynamo
```

## Quickstart
Let's build a simple distributed pipeline with three components: `Frontend`, `Middle`, and `Backend`. The structure of the pipeline looks like this:

```
Users/Clients (HTTP)
       │
       ▼
┌─────────────┐
│  Frontend   │  HTTP API endpoint (/generate)
└─────────────┘
       │
       ▼
┌─────────────┐
│   Middle    │
└─────────────┘
       │
       ▼
┌─────────────┐
│  Backend    │
└─────────────┘
```

The code for the pipeline looks like this:

```python
# filename: pipeline.py
from fastapi import FastAPI
from fastapi.responses import StreamingResponse
from pydantic import BaseModel

from dynamo.sdk import DYNAMO_IMAGE, depends, dynamo_endpoint, service, dynamo_api


class RequestType(BaseModel):
    text: str


class ResponseType(BaseModel):
    text: str


@service(
    dynamo={"namespace": "inference"},
)
class Backend:
    @dynamo_endpoint()
    async def generate(self, req: RequestType):
        text = f"{req.text}-back"
        for token in text.split():
            yield f"Backend: {token}"


@service(
    dynamo={"namespace": "inference"},
)
class Middle:
    backend = depends(Backend)

    @dynamo_endpoint()
    async def generate(self, req: RequestType):
        text = f"{req.text}-mid"
        next_request = RequestType(text=text).model_dump_json()
        async for response in self.backend.generate(next_request):
            yield f"Middle: {response}"


app = FastAPI(title="Hello World!")


@service(
    dynamo={"namespace": "inference"},
    app=app,
)
class Frontend:
    middle = depends(Middle)

    @dynamo_api()
    async def generate(self, request: RequestType):
        async def content_generator():
            async for response in self.middle.generate(request.model_dump_json()):
                yield f"Frontend: {response}"

        return StreamingResponse(content_generator())
```
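Notice that `Middle` serializes its request with Pydantic's `model_dump_json()` before calling the next service. That round-trip can be sketched on its own with plain Pydantic (v2), no Dynamo runtime required:

```python
from pydantic import BaseModel


class RequestType(BaseModel):
    text: str


# Serialize a request to a JSON string, as Middle does before calling Backend
payload = RequestType(text="federer-mid").model_dump_json()
print(payload)  # {"text":"federer-mid"}

# The receiving endpoint can validate the string back into a typed model
restored = RequestType.model_validate_json(payload)
assert restored.text == "federer-mid"
```

Passing JSON strings between services keeps the wire format independent of any one process's Python objects.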

You can run this pipeline locally by first spinning up etcd and NATS:

```bash
# Spin up ETCD and NATS
docker compose -f deploy/metrics/docker-compose.yml up -d
```

then serving the pipeline:

```bash
# Run the pipeline
dynamo serve pipeline:Frontend
```

Once it's up and running, you can make a request to the pipeline using

```bash
curl -X POST http://localhost:8000/generate \
    -H "Content-Type: application/json" \
    -d '{"text": "federer"}'
```

You should see the following output:

```bash
Frontend: Middle: Backend: federer-mid-back
```
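Each stage appends its suffix and prefixes whatever the next stage yields. The chaining can be reproduced without the Dynamo runtime, since each endpoint is just an async generator (names below are illustrative, not part of the SDK):

```python
import asyncio


async def backend(text: str):
    # Backend appends "-back" and yields one chunk per whitespace token
    text = f"{text}-back"
    for token in text.split():
        yield f"Backend: {token}"


async def middle(text: str):
    # Middle appends "-mid", then prefixes everything Backend yields
    async for response in backend(f"{text}-mid"):
        yield f"Middle: {response}"


async def frontend(text: str):
    # Frontend prefixes everything Middle yields
    async for response in middle(text):
        yield f"Frontend: {response}"


async def main():
    return [chunk async for chunk in frontend("federer")]


print(asyncio.run(main()))  # ['Frontend: Middle: Backend: federer-mid-back']
```

Because `"federer-mid-back"` contains no whitespace, `text.split()` produces a single token, so the whole response arrives as one prefixed chunk.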

You can find in-depth documentation for the Dynamo SDK [here](./docs/sdk/README.md) and the Dynamo CLI [here](./docs/cli/README.md).

Please refer to [hello_world](../../../examples/hello_world/README.md) and [llm](../../../examples/llm/README.md) for examples.