Commit d53415bc authored by Frank Lee, committed by GitHub

[tutorial] added data script and updated readme (#1916)

parent 155e2023
@@ -7,18 +7,33 @@ Welcome to the [Colossal-AI](https://github.com/hpcaitech/ColossalAI) tutorial,
[Colossal-AI](https://github.com/hpcaitech/ColossalAI), a unified deep learning system for the big model era, integrates
many advanced technologies such as multi-dimensional tensor parallelism, sequence parallelism, heterogeneous memory management,
large-scale optimization, adaptive task scheduling, etc. By using Colossal-AI, we could help users to efficiently and
quickly deploy large AI model training and inference, reducing large AI model training budgets and scaling down the labor cost of learning and deployment.
### 🚀 Quick Links
[**Colossal-AI**](https://github.com/hpcaitech/ColossalAI) |
[**Paper**](https://arxiv.org/abs/2110.14883) |
[**Documentation**](https://www.colossalai.org/) |
[**Forum**](https://github.com/hpcaitech/ColossalAI/discussions) |
[**Slack**](https://join.slack.com/t/colossalaiworkspace/shared_invite/zt-z7b26eeb-CBp7jouvu~r0~lcFzX832w)
## Prerequisite
To run this example, you only need to have PyTorch and Colossal-AI installed. A sample script to install the dependencies is given below.
```
# install torch 1.12 with CUDA 11.3
# visit https://pytorch.org/get-started/locally/ to download other versions
pip install torch==1.12.1+cu113 torchvision==0.13.1+cu113 torchaudio==0.12.1 --extra-index-url https://download.pytorch.org/whl/cu113
# install latest ColossalAI
# visit https://colossalai.org/download to download corresponding version of Colossal-AI
pip install colossalai==0.1.11+torch1.12cu11.3 -f https://release.colossalai.org
```
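After running the install commands, it can help to confirm which versions actually ended up in the environment. The following is a minimal sketch, not part of the tutorial itself: the helper names `installed_version` and `report` are hypothetical, and it only assumes the pip distribution names `torch`, `torchvision`, and `colossalai` used in the commands above.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def report(packages=("torch", "torchvision", "colossalai")):
    """Map each distribution name to its installed version (None when missing)."""
    return {name: installed_version(name) for name in packages}


if __name__ == "__main__":
    # Print one line per package so a missing dependency is easy to spot.
    for name, version in report().items():
        print(f"{name}: {version or 'not installed'}")
```

This only reads package metadata, so it runs even when one of the packages failed to install.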
## Table of Content
- Multi-dimensional Parallelism
@@ -43,7 +58,15 @@ quickly deploy large AI model training and inference, reducing large AI model tr
- Acceleration of Stable Diffusion
- Stable Diffusion with Lightning
- Try Lightning Colossal-AI strategy to optimize memory and accelerate speed
## Prepare Common Dataset
**This tutorial folder aims to let users quickly try out the training scripts**. One major task in deep learning is data preparation. To save time on it, we use `CIFAR10` for most tutorials and synthetic datasets when the required dataset is too large. To share the `CIFAR10` dataset across the different examples, download it in the tutorial root directory with the following command.
```bash
python download_cifar10.py
```
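An individual example may want to check that the shared copy exists before training. The sketch below is one possible way to do that, under stated assumptions: the helpers `shared_data_root` and `cifar10_is_downloaded` are hypothetical names, the `data` folder matches the path used by `download_cifar10.py`, and `cifar-10-batches-py` is the folder torchvision's `CIFAR10` extracts into.

```python
from pathlib import Path

# Folder that torchvision's CIFAR10 dataset extracts into under its `root`.
CIFAR10_FOLDER = "cifar-10-batches-py"


def shared_data_root(tutorial_root):
    """Return the shared `data` directory inside the tutorial root."""
    return Path(tutorial_root) / "data"


def cifar10_is_downloaded(tutorial_root):
    """Check whether the extracted CIFAR10 folder exists in the shared data directory."""
    return (shared_data_root(tutorial_root) / CIFAR10_FOLDER).is_dir()


if __name__ == "__main__":
    # Assumption: this file sits directly in the tutorial root directory.
    root = Path(__file__).resolve().parent
    if cifar10_is_downloaded(root):
        print(f"CIFAR10 found in {shared_data_root(root)}")
    else:
        print("CIFAR10 not found; run download_cifar10.py first")
```

An example script stored in a subfolder would pass its own path to the tutorial root instead of `Path(__file__).resolve().parent`.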
## Discussion
@@ -51,4 +74,3 @@ Discussion about the [Colossal-AI](https://github.com/hpcaitech/ColossalAI) proj
If you think there is a need to discuss anything, you may jump to our [Slack](https://join.slack.com/t/colossalaiworkspace/shared_invite/zt-z7b26eeb-CBp7jouvu~r0~lcFzX832w).
If you encounter any problem while running these tutorials, you may want to raise an [issue](https://github.com/hpcaitech/ColossalAI/issues/new/choose) in this repository.
import os

from torchvision.datasets import CIFAR10


def main():
    dir_path = os.path.dirname(os.path.realpath(__file__))
    data_root = os.path.join(dir_path, 'data')
    dataset = CIFAR10(root=data_root, download=True)


if __name__ == '__main__':
    main()