OpenDAS / ColossalAI · Commits

Commit 35813ed3 (unverified)
Authored Dec 13, 2021 by Frank Lee; committed by GitHub on Dec 13, 2021
Parent: 7d371105

update examples and sphnix docs for the new api (#63)
Showing 20 changed files with 136 additions and 94 deletions.
- `docs/colossalai/colossalai.nn.model.vision_transformer.vision_transformer.rst` +0 −5
- `docs/colossalai/colossalai.nn.multi_tensor_apply.multi_tensor_apply.rst` +0 −5
- `docs/colossalai/colossalai.nn.optimizer.loss_scaler.rst` +0 −5
- `docs/colossalai/colossalai.nn.optimizer.rst` +4 −9
- `docs/colossalai/colossalai.nn.optimizer.zero_redundancy_optimizer_level_1.rst` +0 −5
- `docs/colossalai/colossalai.nn.optimizer.zero_redundancy_optimizer_level_2.rst` +0 −5
- `docs/colossalai/colossalai.nn.optimizer.zero_redundancy_optimizer_level_3.rst` +0 −5
- `docs/colossalai/colossalai.nn.rst` +4 −5
- `docs/colossalai/colossalai.registry.rst` +4 −4
- `docs/colossalai/colossalai.rst` +11 −9
- `docs/colossalai/colossalai.trainer.rst` +4 −3
- `docs/colossalai/colossalai.utils.data_sampler.rst` +5 −0
- `docs/colossalai/colossalai.utils.gradient_accumulation.rst` +5 −0
- `docs/colossalai/colossalai.utils.multi_tensor_apply.rst` +8 −0
- `docs/colossalai/colossalai.utils.rst` +7 −4
- `docs/colossalai/colossalai.zero.rst` +5 −0
- `docs/config.md` +9 −0
- `docs/installation.md` +5 −7
- `docs/run_demo.md` +64 −23
- `docs/zero.md` +1 −0
`docs/colossalai/colossalai.nn.model.vision_transformer.vision_transformer.rst` (deleted, 100644 → 0)

```diff
-colossalai.nn.model.vision\_transformer.vision\_transformer
-===========================================================
-
-.. automodule:: colossalai.nn.model.vision_transformer.vision_transformer
-   :members:
```
`docs/colossalai/colossalai.nn.multi_tensor_apply.multi_tensor_apply.rst` (deleted, 100644 → 0)

```diff
-colossalai.nn.multi\_tensor\_apply.multi\_tensor\_apply
-=======================================================
-
-.. automodule:: colossalai.nn.multi_tensor_apply.multi_tensor_apply
-   :members:
```
`docs/colossalai/colossalai.nn.optimizer.loss_scaler.rst` (deleted, 100644 → 0)

```diff
-colossalai.nn.optimizer.loss\_scaler
-====================================
-
-.. automodule:: colossalai.nn.optimizer.loss_scaler
-   :members:
```
`docs/colossalai/colossalai.nn.optimizer.rst` (modified)

```diff
 colossalai.nn.optimizer
 =======================
 
-.. automodule:: colossalai.nn.optimizer
-   :members:
-
 .. toctree::
    :maxdepth: 2
 
-   colossalai.nn.optimizer.fp16_optimizer
    colossalai.nn.optimizer.fused_adam
    colossalai.nn.optimizer.fused_lamb
    colossalai.nn.optimizer.fused_sgd
    colossalai.nn.optimizer.lamb
    colossalai.nn.optimizer.lars
-   colossalai.nn.optimizer.loss_scaler
-   colossalai.nn.optimizer.zero_redundancy_optimizer_level_1
-   colossalai.nn.optimizer.zero_redundancy_optimizer_level_2
-   colossalai.nn.optimizer.zero_redundancy_optimizer_level_3
+
+.. automodule:: colossalai.nn.optimizer
+   :members:
```
`docs/colossalai/colossalai.nn.optimizer.zero_redundancy_optimizer_level_1.rst` (deleted, 100644 → 0)

```diff
-colossalai.nn.optimizer.zero\_redundancy\_optimizer\_level\_1
-=============================================================
-
-.. automodule:: colossalai.nn.optimizer.zero_redundancy_optimizer_level_1
-   :members:
```

`docs/colossalai/colossalai.nn.optimizer.zero_redundancy_optimizer_level_2.rst` (deleted, 100644 → 0)

```diff
-colossalai.nn.optimizer.zero\_redundancy\_optimizer\_level\_2
-=============================================================
-
-.. automodule:: colossalai.nn.optimizer.zero_redundancy_optimizer_level_2
-   :members:
```

`docs/colossalai/colossalai.nn.optimizer.zero_redundancy_optimizer_level_3.rst` (deleted, 100644 → 0)

```diff
-colossalai.nn.optimizer.zero\_redundancy\_optimizer\_level\_3
-=============================================================
-
-.. automodule:: colossalai.nn.optimizer.zero_redundancy_optimizer_level_3
-   :members:
```
`docs/colossalai/colossalai.nn.rst` (modified)

```diff
 colossalai.nn
 =============
 
-.. automodule:: colossalai.nn
-   :members:
-
 .. toctree::
    :maxdepth: 2
 
-   colossalai.nn.data
    colossalai.nn.layer
    colossalai.nn.loss
    colossalai.nn.lr_scheduler
    colossalai.nn.model
-   colossalai.nn.multi_tensor_apply
    colossalai.nn.optimizer
+
+.. automodule:: colossalai.nn
+   :members:
```
`docs/colossalai/colossalai.registry.rst` (modified)

```diff
 colossalai.registry
 ===================
 
-.. automodule:: colossalai.registry
-   :members:
-
 .. toctree::
    :maxdepth: 2
 
    colossalai.registry.registry
+
+.. automodule:: colossalai.registry
+   :members:
```
`docs/colossalai/colossalai.rst` (modified)

```diff
 colossalai
 ==========
 
-.. automodule:: colossalai
-   :members:
+.. toctree::
+   :maxdepth: 2
+
+   colossalai.constants
+   colossalai.core
+   colossalai.initialize
 
 .. toctree::
    :maxdepth: 2
 
+   colossalai.amp
    colossalai.builder
    colossalai.communication
    colossalai.context
@@ -16,11 +22,7 @@ colossalai
    colossalai.registry
    colossalai.trainer
    colossalai.utils
+   colossalai.zero
 
-.. toctree::
-   :maxdepth: 2
-
-   colossalai.constants
-   colossalai.core
-   colossalai.initialize
+.. automodule:: colossalai
+   :members:
```
`docs/colossalai/colossalai.trainer.rst` (modified)

```diff
 colossalai.trainer
 ==================
 
-.. automodule:: colossalai.trainer
-   :members:
-
 .. toctree::
    :maxdepth: 2
 
@@ -14,3 +11,7 @@ colossalai.trainer
    :maxdepth: 2
 
    colossalai.trainer.metric
+
+.. automodule:: colossalai.trainer
+   :members:
```
`docs/colossalai/colossalai.nn.optimizer.fp16_optimizer.rst` → `docs/colossalai/colossalai.utils.data_sampler.rst` (renamed)

```diff
-colossalai.nn.optimizer.fp16\_optimizer
+colossalai.utils.data\_sampler
 =======================================
 
-.. automodule:: colossalai.nn.optimizer.fp16_optimizer
+.. automodule:: colossalai.utils.data_sampler
    :members:
```
`docs/colossalai/colossalai.utils.gradient_accumulation.rst` (new file, 0 → 100644)

```diff
+colossalai.utils.gradient\_accumulation
+=======================================
+
+.. automodule:: colossalai.utils.gradient_accumulation
+   :members:
```
`docs/colossalai/colossalai.nn.multi_tensor_apply.rst` → `docs/colossalai/colossalai.utils.multi_tensor_apply.rst` (renamed)

```diff
 colossalai.nn.multi\_tensor\_apply
 ==================================
 
-.. automodule:: colossalai.nn.multi_tensor_apply
+.. automodule:: colossalai.utils.multi_tensor_apply.multi_tensor_apply
    :members:
-
-.. toctree::
-   :maxdepth: 2
-
-   colossalai.nn.multi_tensor_apply.multi_tensor_apply
```
`docs/colossalai/colossalai.utils.rst` (modified)

```diff
 colossalai.utils
 ================
 
-.. automodule:: colossalai.utils
-   :members:
-
 .. toctree::
    :maxdepth: 2
 
@@ -12,5 +8,12 @@ colossalai.utils
    colossalai.utils.checkpointing
    colossalai.utils.common
    colossalai.utils.cuda
+   colossalai.utils.data_sampler
+   colossalai.utils.gradient_accumulation
    colossalai.utils.memory
+   colossalai.utils.multi_tensor_apply
    colossalai.utils.timer
+
+.. automodule:: colossalai.utils
+   :members:
```
`docs/colossalai/colossalai.zero.rst` (new file, 0 → 100644)

```diff
+colossalai.zero
+================
+
+.. automodule:: colossalai.zero
+   :members:
```
`docs/config.md` (modified)

````diff
@@ -18,6 +18,15 @@ fp16 = dict(
     initial_scale=2 ** 8
 )
 
+# optional
+# configuration for zero
+# you can refer to the Zero Redundancy optimizer and zero offload section for details
+# https://www.colossalai.org/zero.html
+zero = dict(
+    level=<int>,
+    ...
+)
+
 # optional
 # if you are using complex gradient handling
 # otherwise, you do not need this in your config file
````
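Purely as an editor's illustration (not part of this commit), a filled-in `zero` block could look like the sketch below; `level=2` is an arbitrary stand-in for the `<int>` placeholder, and the keys elided as `...` in the diff are described in `docs/zero.md`:

```python
# Hypothetical config.py snippet: the new optional `zero` block with a
# concrete level. Per docs/zero.md, level 2 partitions optimizer states
# and gradients; level 3 additionally partitions parameters.
zero = dict(
    level=2
)
```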
`docs/installation.md` (modified)

````diff
 # Setup
 
-## Install with pip
+### PyPI
 
 ```bash
 pip install colossalai
 ```
 
-## Install from source
+### Install From Source (Recommended)
+
+> We **recommend** you to install from source as the Colossal-AI is updating frequently in the early versions. The documentation will be in line with the main branch of the repository. Feel free to raise an issue if you encounter any problem. :)
 
 ```shell
-git clone git@github.com:hpcaitech/ColossalAI.git
+git clone https://github.com/hpcaitech/ColossalAI.git
 cd ColossalAI
 
 # install dependency
 pip install -r requirements/requirements.txt
@@ -22,8 +24,4 @@ Install and enable CUDA kernel fusion (compulsory installation when using fused
 ```shell
 pip install -v --no-cache-dir --global-option="--cuda_ext" .
-
-# install with editable enabled
-pip install -v --no-cache-dir --global-option="--cuda_ext" -e .
 ```
````
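After either install path, a quick smoke test is simply importing the package. A minimal sketch, assuming (but not guaranteeing) that this release exposes `__version__`:

```python
# Post-install smoke test: a clean import is the main signal of success.
import colossalai

# __version__ is an assumption here; fall back gracefully if absent.
print(getattr(colossalai, "__version__", "colossalai imported OK"))
```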
`docs/run_demo.md` (modified)

````diff
@@ -7,51 +7,92 @@ can also run on systems with only one GPU. Quick demos showing how to use Coloss
 ## Single GPU
 
 Colossal-AI can be used to train deep learning models on systems with only one GPU and achieve baseline
-performances. [Here](https://colab.research.google.com/drive/1fJnqqFzPuzZ_kn1lwCpG2nh3l2ths0KE?usp=sharing#scrollTo=cQ_y7lBG09LS)
-is an example showing how to train a LeNet model on the CIFAR10 dataset using Colossal-AI.
+performances. We provided an example to train ResNet on CIFAR10 data with only one GPU. You can find this example in
+`examples\resnet_cifar10_data_parallel` in the repository. Detailed instructions can be found in its `README.md`.
 
 ## Multiple GPUs
 
 Colossal-AI can be used to train deep learning models on distributed systems with multiple GPUs and accelerate the
 training process drastically by applying efficient parallelization techiniques, which will be elaborated in
-the [Parallelization](parallelization.md) section below. Run the code below on your distributed system with 4 GPUs,
-where `HOST` is the IP address of your system. Note that we use the [Slurm](https://slurm.schedmd.com/documentation.html) job scheduling system here.
+the [Parallelization](parallelization.md) section below.
 
-```bash
-HOST=xxx.xxx.xxx.xxx srun ./scripts/slurm_dist_train.sh ./examples/run_trainer.py ./configs/vit/vit_2d.py
-```
+You can turn the resnet example mentioned above into a multi-GPU training by setting `--nproc_per_node` to be the number of
+GPUs you have on your system. We also provide an example of Vision Transformer which relies on
+training with more GPUs. You can visit this example in `examples\vit_b16_imagenet_data_parallel`. It has a detailed instructional
+`README.md` for you too.
 
-`./configs/vit/vit_2d.py` is a config file, which is introduced in the [Config file](config.md) section below. These
-config files are used by Colossal-AI to define all kinds of training arguments, such as the model, dataset and training
-method (optimizer, lr_scheduler, epoch, etc.). Config files are highly customizable and can be modified so as to train
-different models.
+## Sample Training Script
 
-`./examples/run_trainer.py` contains a standard training script and is presented below, it reads the config file and
-realizes the training process.
+Below is a typical way of how you train the model using
 
 ```python
 import colossalai
-from colossalai.core import global_context as gpc
+from colossalai.amp import AMP_TYPE
 from colossalai.logging import get_dist_logger
-from colossalai.trainer import Trainer
+from colossalai.trainer import Trainer, hooks
+from colossalai.utils import get_dataloader
+
+CONFIG = dict(
+    parallel=dict(pipeline=1, tensor=1, mode=None),
+    fp16=dict(mode=AMP_TYPE.TORCH),
+    gradient_accumulation=4,
+    clip_grad_norm=1.0
+)
 
 def run_trainer():
-    engine, train_dataloader, test_dataloader = colossalai.initialize()
+    parser = colossalai.get_default_parser()
+    args = parser.parse_args()
+    colossalai.launch(config=CONFIG,
+                      rank=args.rank,
+                      world_size=args.world_size,
+                      host=args.host,
+                      port=args.port,
+                      backend=args.backend)
+
     logger = get_dist_logger()
-    logger.info("engine is built", ranks=[0])
 
+    # instantiate your compoentns
+    model = MyModel()
+    optimizer = MyOptimizer(model.parameters(), ...)
+    train_dataset = TrainDataset()
+    test_dataset = TestDataset()
+    train_dataloader = get_dataloader(train_dataset, ...)
+    test_dataloader = get_dataloader(test_dataset, ...)
+    lr_scheduler = MyScheduler()
+    logger.info("components are built")
+
+    engine, train_dataloader, test_dataloader, lr_scheduler = colossalai.initialize(model,
+                                                                                    optimizer,
+                                                                                    criterion,
+                                                                                    train_dataloader,
+                                                                                    test_dataloader,
+                                                                                    lr_scheduler)
+
     trainer = Trainer(engine=engine,
                       verbose=True)
-    logger.info("trainer is built", ranks=[0])
 
-    logger.info("start training", ranks=[0])
+    hook_list = [
+        hooks.LossHook(),
+        hooks.LRSchedulerHook(lr_scheduler=lr_scheduler, by_epoch=False),
+        hooks.AccuracyHook(),
+        hooks.TensorboardHook(log_dir='./tb_logs', ranks=[0]),
+        hooks.LogMetricByEpochHook(logger),
+        hooks.LogMemoryByEpochHook(logger),
+        hooks.SaveCheckpointHook(checkpoint_dir='./ckpt')
+    ]
+
     trainer.fit(
         train_dataloader=train_dataloader,
         test_dataloader=test_dataloader,
-        epochs=gpc.config.num_epochs,
-        hooks_cfg=gpc.config.hooks,
+        epochs=NUM_EPOCH,
+        hooks=hook_list,
         display_progress=True,
         test_interval=2
     )
````
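`MyModel`, `MyOptimizer`, `TrainDataset`, `TestDataset`, `MyScheduler`, `criterion`, and `NUM_EPOCH` in the new script are placeholders that the commit leaves undefined. Purely for illustration, here is one hypothetical way to fill them in with torchvision's CIFAR10 and ResNet; the `get_dataloader` keyword arguments are assumptions, not taken from this diff:

```python
# Illustrative stand-ins for the placeholders in the script above;
# none of this is part of commit 35813ed3.
import torch
import torch.nn as nn
from torchvision import datasets, transforms
from torchvision.models import resnet18

from colossalai.utils import get_dataloader

NUM_EPOCH = 10  # the script uses NUM_EPOCH without defining it

model = resnet18(num_classes=10)                         # MyModel()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)  # MyOptimizer(...)
criterion = nn.CrossEntropyLoss()                        # the undefined criterion
lr_scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=NUM_EPOCH)

transform = transforms.Compose([transforms.ToTensor()])
train_dataset = datasets.CIFAR10(root='./data', train=True, download=True,
                                 transform=transform)
test_dataset = datasets.CIFAR10(root='./data', train=False, download=True,
                                transform=transform)

# batch_size/shuffle are assumed to be forwarded to the underlying DataLoader.
train_dataloader = get_dataloader(train_dataset, batch_size=64, shuffle=True)
test_dataloader = get_dataloader(test_dataset, batch_size=64)
```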
`docs/zero.md` (modified)

````diff
@@ -19,6 +19,7 @@ Below are a few examples of ZeRO-3 configurations.
 ### Example of ZeRO-3 Configurations
 
+You can refer to the [DeepSpeed configuration](https://www.deepspeed.ai/docs/config-json/#zero-optimizations-for-fp16-training) for details.
 Here we use `Adam` as the initial optimizer.
 
 1. Use ZeRO to partition the optimizer states, gradients (level 2), and parameters (level 3).
````