<!--
SPDX-FileCopyrightText: Copyright (c) 2024-2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
SPDX-License-Identifier: Apache-2.0

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->

# Triton Distributed Runtime

<h4>A Datacenter Scale Distributed Inference Serving Framework</h4>

[![License](https://img.shields.io/badge/License-Apache_2.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)

Rust implementation of the Triton distributed runtime system, enabling distributed computing capabilities for machine learning workloads.

## 🛠️ Prerequisites

### Install Rust and Cargo using [rustup](https://rustup.rs/):

```bash
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
```

### Build

```
cargo build
cargo test
```

### Start Dependencies

#### Docker Compose

The simplest way to deploy the prerequisite services is with
[docker-compose](https://docs.docker.com/compose/install/linux/),
using the [docker-compose.yml](docker-compose.yml) in the project root.

```
docker-compose up -d
```

This deploys a [NATS.io](https://nats.io/) server and an [etcd](https://etcd.io/)
server, which components use to communicate with and discover each other at runtime.


#### Local (alternate)

To deploy the pre-requisite services locally instead of using `docker-compose`
above, you can manually launch each:

- [NATS.io](https://docs.nats.io/running-a-nats-service/introduction/installation) server with [JetStream](https://docs.nats.io/nats-concepts/jetstream)
    - example: `nats-server -js --trace`
- [etcd](https://etcd.io) server
    - follow the instructions in [etcd installation](https://etcd.io/docs/v3.5/install/) to start an `etcd` server locally


### Run Examples

When developing or running examples, any process or user that shares your core services (`etcd` and `nats.io`) will
be operating within your distributed runtime.

The current examples use a hard-coded `namespace`, so examples run against the same services can collide.
Addressing these `namespace` collisions is tracked in this
[issue](https://github.com/triton-inference-server/triton_distributed/issues/114).

All examples require the `etcd` and `nats.io` pre-requisites to be running and available.
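
Before running an example, it can help to confirm that both services accept connections. The following is a minimal sketch using only the Rust standard library. It assumes the services listen on their usual default client ports on localhost (4222 for NATS, 2379 for etcd); the project's `docker-compose.yml` may map different ports, so adjust the addresses as needed. The helper name `is_reachable` is hypothetical and not part of the runtime.

```rust
use std::net::{SocketAddr, TcpStream};
use std::time::Duration;

/// Returns true if a TCP connection to `addr` succeeds within `timeout`.
/// An address that does not parse as `ip:port` is reported as unreachable.
fn is_reachable(addr: &str, timeout: Duration) -> bool {
    match addr.parse::<SocketAddr>() {
        Ok(sock) => TcpStream::connect_timeout(&sock, timeout).is_ok(),
        Err(_) => false,
    }
}

fn main() {
    // Assumed defaults: NATS client port 4222, etcd client port 2379.
    let services = [("NATS", "127.0.0.1:4222"), ("etcd", "127.0.0.1:2379")];
    for (name, addr) in services.iter() {
        let status = if is_reachable(addr, Duration::from_secs(1)) {
            "reachable"
        } else {
            "NOT reachable"
        };
        println!("{} at {}: {}", name, addr, status);
    }
}
```

A successful TCP connection only shows that something is listening on the port; it does not verify that JetStream is enabled or that etcd is healthy.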

#### Rust `hello_world`

With two terminals open, in one window:

```
cd examples/hello_world
cargo run --bin server
```

In the second terminal, execute:

```
cd examples/hello_world
cargo run --bin client
```

This should produce output similar to:
```
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 6.25s
     Running `target/debug/client`
Annotated { data: Some("h"), id: None, event: None, comment: None }
Annotated { data: Some("e"), id: None, event: None, comment: None }
Annotated { data: Some("l"), id: None, event: None, comment: None }
Annotated { data: Some("l"), id: None, event: None, comment: None }
Annotated { data: Some("o"), id: None, event: None, comment: None }
Annotated { data: Some(" "), id: None, event: None, comment: None }
Annotated { data: Some("w"), id: None, event: None, comment: None }
Annotated { data: Some("o"), id: None, event: None, comment: None }
Annotated { data: Some("r"), id: None, event: None, comment: None }
Annotated { data: Some("l"), id: None, event: None, comment: None }
Annotated { data: Some("d"), id: None, event: None, comment: None }
```
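
Each line of that output is one streamed item whose `data` field carries a single character. The sketch below illustrates how such a stream can be reassembled into the full string. The `Annotated` struct here is a hypothetical stand-in that mirrors only the field names visible in the output above; the field types (`Option<String>`) and the helper functions are assumptions, and the real type in the runtime may differ.

```rust
/// Hypothetical stand-in mirroring the fields printed by the client.
/// The field types are assumed; the runtime's actual type may differ.
#[derive(Debug, Clone, PartialEq)]
struct Annotated {
    data: Option<String>,
    id: Option<String>,
    event: Option<String>,
    comment: Option<String>,
}

/// Wraps each character of `text` in its own `Annotated` item.
fn stream_chars(text: &str) -> Vec<Annotated> {
    text.chars()
        .map(|c| Annotated {
            data: Some(c.to_string()),
            id: None,
            event: None,
            comment: None,
        })
        .collect()
}

/// Concatenates the `data` payloads, skipping items that carry none.
fn collect_data(items: &[Annotated]) -> String {
    items.iter().filter_map(|a| a.data.as_deref()).collect()
}

fn main() {
    let items = stream_chars("hello world");
    for item in &items {
        // Uses the derived Debug format, one item per line.
        println!("{:?}", item);
    }
    println!("{}", collect_data(&items));
}
```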

#### Python

See the Python bindings [README.md](/lib/bindings/python/README.md) for details.

The Python and Rust `hello_world` client and server examples are interchangeable,
so you can start the Python `server.py` and talk to it from the Rust `client`.