# Multi-dimensional Parallelism with Colossal-AI
## 🚀 Quick Start
1. Install our model zoo.
```bash
pip install titans
```
2. Run with synthetic data (generated with the same shape as CIFAR10) by passing the `-s` flag.
```bash
colossalai run --nproc_per_node 4 train.py --config config.py -s
```

3. Modify the config file to experiment with different types of tensor parallelism. For example, set the tensor parallel size to 4 and the mode to `2d`, then run on 8 GPUs.
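Step 3 might look like the sketch below, assuming `config.py` follows Colossal-AI's `parallel = dict(...)` configuration convention; the exact field names should be verified against your installed version:

```python
# Hypothetical config.py sketch (field names assumed from Colossal-AI's
# legacy `parallel` config convention -- verify against your version).
parallel = dict(
    pipeline=2,                      # 2 pipeline stages
    tensor=dict(size=4, mode='2d'),  # 2D tensor parallelism across 4 GPUs
)

# Total devices = pipeline stages x tensor parallel size = 2 * 4 = 8,
# which matches launching with --nproc_per_node 8.
```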


## Install Titans Model Zoo
```bash
pip install titans
```


## Prepare Dataset

We use the CIFAR10 dataset in this example. You should either invoke `download_cifar10.py` in the tutorial root directory or directly run `auto_parallel_with_resnet.py`.
The dataset will be downloaded to `colossalai/examples/tutorials/data` by default.
If you wish to use a custom directory for the dataset, you can set the environment variable `DATA` via the following command.

```bash
export DATA=/path/to/data
```


## Run on a 2x2 Device Mesh

The current configuration in `config.py` sets TP=2 and PP=2.
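The TP=2, PP=2 setting might be expressed as below. This is a sketch assuming the same `parallel = dict(...)` convention used by Colossal-AI configs; the `'1d'` mode is an assumption, since the mode for this setting is not stated here:

```python
# Sketch of the described setting: TP=2, PP=2 on 4 GPUs
# (field names assumed from Colossal-AI's `parallel` config convention).
parallel = dict(
    pipeline=2,                      # PP=2: two pipeline stages
    tensor=dict(size=2, mode='1d'),  # TP=2; '1d' mode is an assumption
)

# 2 pipeline stages x tensor size 2 = 4 devices,
# matching --nproc_per_node 4 below.
```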

```bash
# train with cifar10
colossalai run --nproc_per_node 4 train.py --config config.py

# train with synthetic data
colossalai run --nproc_per_node 4 train.py --config config.py -s
```