OpenDAS / ColossalAI
Commit 2c45efc3
Authored Mar 31, 2022 by Liang Bowen; committed by GitHub on Mar 31, 2022
html refactor (#555)
parent d1211148
Changes: 133
Showing 20 changed files with 66 additions and 87 deletions
colossalai/amp/__init__.py +1 -1
colossalai/amp/apex_amp/__init__.py +1 -1
colossalai/builder/builder.py +8 -7
colossalai/builder/pipeline.py +8 -6
colossalai/communication/utils.py +3 -3
colossalai/context/parallel_context.py +2 -2
colossalai/context/random/seed_manager.py +4 -3
colossalai/nn/layer/parallel_3d/_operation.py +1 -1
colossalai/nn/loss/loss_2d.py +1 -1
colossalai/nn/loss/loss_2p5d.py +1 -1
colossalai/nn/loss/loss_3d.py +1 -1
colossalai/nn/loss/loss_moe.py +2 -2
colossalai/trainer/_trainer.py +1 -2
colossalai/utils/memory_tracer/async_memtracer.py +16 -15
colossalai/utils/profiler/prof_utils.py +16 -15
docs/colossalai/colossalai.amp.apex_amp.apex_amp.rst +0 -5
docs/colossalai/colossalai.amp.apex_amp.rst +0 -6
docs/colossalai/colossalai.amp.naive_amp.grad_scaler.base_grad_scaler.rst +0 -5
docs/colossalai/colossalai.amp.naive_amp.grad_scaler.constant_grad_scaler.rst +0 -5
docs/colossalai/colossalai.amp.naive_amp.grad_scaler.dynamic_grad_scaler.rst +0 -5
colossalai/amp/__init__.py
@@ -19,7 +19,7 @@ def convert_to_amp(model: nn.Module, optimizer: Optimizer, criterion: _Loss, mod
         optimizer (:class:`torch.optim.Optimizer`): your optimizer object.
         criterion (:class:`torch.nn.modules.loss._Loss`): your loss function object.
         mode (:class:`colossalai.amp.AMP_TYPE`): amp mode.
-        amp_config (:class:`colossalai.context.Config` or dict): configuration for different amp modes
+        amp_config (Union[:class:`colossalai.context.Config`, dict]): configuration for different amp modes.

     Returns:
         A tuple (model, optimizer, criterion).
colossalai/amp/apex_amp/__init__.py
@@ -9,7 +9,7 @@ def convert_to_apex_amp(model: nn.Module, optimizer: Optimizer, amp_config):
     Args:
         model (:class:`torch.nn.Module`): your model object.
         optimizer (:class:`torch.optim.Optimizer`): your optimizer object.
-        amp_config (:class: colossalai.context.Config or dict): configuration for initializing apex_amp.
+        amp_config (Union[:class:`colossalai.context.Config`, dict]): configuration for initializing apex_amp.

     The ``amp_config`` should include parameters below:
     ::
colossalai/builder/builder.py
@@ -29,21 +29,22 @@ def build_from_registry(config, registry: Registry):
     is specified by `registry`.

     Note:
-        the `config` is used to construct the return object such as `LAYERS`, `OPTIMIZERS`
-        and other support types in `registry`. The `config` should contain
-        all required parameters of corresponding object. The details of support
-        types in `registry` and the `mod_type` in `config` could be found in
-        `registry <https://github.com/hpcaitech/ColossalAI/blob/main/colossalai/registry/__init__.py>`_.
+        the `config` is used to construct the return object such as `LAYERS`, `OPTIMIZERS`
+        and other support types in `registry`. The `config` should contain
+        all required parameters of corresponding object. The details of support
+        types in `registry` and the `mod_type` in `config` could be found in
+        `registry <https://github.com/hpcaitech/ColossalAI/blob/main/colossalai/registry/__init__.py>`_.

     Args:
         config (dict or :class:`colossalai.context.colossalai.context.Config`): information
             used in the construction of the return object.
         registry (:class:`Registry`): A registry specifying the type of the return object

-    Returns: A Python object specified by `registry`
+    Returns:
+        A Python object specified by `registry`.

     Raises:
-        Exception: Raises an Exception if an error occurred when building from registry
+        Exception: Raises an Exception if an error occurred when building from registry.
     """
     config_ = config.copy()    # keep the original config untouched
     assert isinstance(
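The docstring in this hunk describes building an object from a config through a registry, copying the config first so the caller's copy is untouched. A minimal sketch of that pattern follows. The `Registry` class here is a hypothetical stand-in, not ColossalAI's implementation, and the `"type"` key used to select the class is an assumption; only the copy-then-build behaviour and the wrapped `Exception` come from the docstring.

```python
from typing import Any, Callable, Dict


class Registry:
    """Hypothetical stand-in for a module registry (not ColossalAI's class)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._modules: Dict[str, Callable[..., Any]] = {}

    def register(self, cls: Callable[..., Any]) -> Callable[..., Any]:
        # Usable as a decorator: stores the class under its own name.
        self._modules[cls.__name__] = cls
        return cls

    def get(self, key: str) -> Callable[..., Any]:
        return self._modules[key]


def build_from_registry(config: Dict[str, Any], registry: Registry) -> Any:
    config_ = dict(config)  # copy so the caller's config stays untouched
    assert isinstance(registry, Registry), "registry must be a Registry"
    mod_type = config_.pop("type")  # assumed key naming the class to build
    try:
        return registry.get(mod_type)(**config_)
    except Exception as e:
        raise Exception(f"failed to build {mod_type} from {registry.name}: {e}") from e


LAYERS = Registry("layers")


@LAYERS.register
class Linear:
    def __init__(self, in_features: int, out_features: int) -> None:
        self.in_features = in_features
        self.out_features = out_features


cfg = {"type": "Linear", "in_features": 4, "out_features": 8}
layer = build_from_registry(cfg, LAYERS)
print(type(layer).__name__, layer.in_features, layer.out_features)
print("type" in cfg)  # the original config still carries its "type" key
```

Because the function pops from a copy, the same `cfg` dict can be reused to build a second object.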
colossalai/builder/pipeline.py
@@ -163,17 +163,19 @@ def count_layer_params(layers):
 def build_pipeline_model_from_cfg(config, num_chunks: int = 1, partition_method: str = 'parameter', verbose: bool = False):
-    """An intializer to split the model into different stages for pipeline parallelism.
+    """An initializer to split the model into different stages for pipeline parallelism.

     An example for the model config is shown below. The class VisionTransformerFromConfig should
     inherit colossalai.nn.model.ModelFromConfig to allow this initializer to build model from a sequence
     of layer configurations.

-    model_config = dict(
-        type='VisionTransformerFromConfig',
-        embedding_cfg=dict(...),
-        ...
-    )
+    ::
+
+        model_config = dict(
+            type='VisionTransformerFromConfig',
+            embedding_cfg=dict(...),
+            ...
+        )

     Args:
         config (dict): Configuration of the model.
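The `::` line added in this hunk is reStructuredText's literal-block marker: after a blank line, the indented lines that follow are rendered verbatim instead of being reflowed as a paragraph. The sketch below shows the layout on a hypothetical function (`build_model` is illustrative only and not part of ColossalAI) and checks the structure with `inspect.getdoc`.

```python
import inspect


def build_model(config: dict) -> dict:
    """Hypothetical builder used only to illustrate the docstring layout.

    An example config is shown below.

    ::

        model_config = dict(
            type='VisionTransformerFromConfig',
            embedding_cfg=dict(...),
        )

    Args:
        config (dict): Configuration of the model.

    Returns:
        dict: A shallow copy of ``config``.
    """
    return dict(config)


doc = inspect.getdoc(build_model)  # strips the common leading indentation
lines = doc.splitlines()
marker = lines.index("::")
# A literal block is the indented run after the marker and one blank line.
print(lines[marker + 1] == "", lines[marker + 2].startswith("    "))
```

Without the marker and the extra indentation, Sphinx would treat the `dict(...)` lines as ordinary paragraph text and collapse their line breaks.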
colossalai/communication/utils.py
@@ -45,7 +45,7 @@ def recv_tensor_meta(tensor_shape, prev_rank=None):
         prev_rank (int): The rank of the source of the tensor.

     Returns:
-        torch.Size: The shape of the tensor to be received.
+        :class:`torch.Size`: The shape of the tensor to be received.
     """
     if tensor_shape is None:
         if prev_rank is None:
@@ -71,7 +71,7 @@ def split_tensor_into_1d_equal_chunks(tensor, new_buffer=False):
         new_buffer (bool, optional): Whether to use a new buffer to store sliced tensor.

     Returns:
-        torch.Tensor: The split tensor
+        :class:`torch.Size`: The split tensor
     """
     partition_size = torch.numel(tensor) // gpc.get_world_size(ParallelMode.PARALLEL_1D)
     start_index = partition_size * gpc.get_local_rank(ParallelMode.PARALLEL_1D)
@@ -92,7 +92,7 @@ def gather_split_1d_tensor(tensor):
     Args:
         tensor (torch.Tensor): Tensor to be gathered after communication.
     Returns:
-        gathered (torch.Tensor): The gathered tensor
+        :class:`torch.Size`: The gathered tensor.
     """
     world_size = gpc.get_world_size(ParallelMode.PARALLEL_1D)
     numel = torch.numel(tensor)
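The context lines of the last two hunks show the index arithmetic behind the 1D split: each rank owns `numel // world_size` elements starting at `partition_size * local_rank`. Here is a minimal sketch of that arithmetic on plain Python lists, with no `torch` or process groups involved. The `end_index` computation and the list-concatenation "gather" are assumptions standing in for code the diff does not show.

```python
from typing import List, Sequence


def split_into_1d_equal_chunks(flat: Sequence[float], world_size: int, rank: int) -> List[float]:
    """Return the slice of ``flat`` owned by ``rank`` (assumes len(flat) % world_size == 0)."""
    partition_size = len(flat) // world_size
    start_index = partition_size * rank
    end_index = start_index + partition_size  # assumed; not visible in the diff
    return list(flat[start_index:end_index])


def gather_split_1d(chunks: Sequence[Sequence[float]]) -> List[float]:
    """Concatenate per-rank chunks in rank order, standing in for an all-gather."""
    gathered: List[float] = []
    for chunk in chunks:
        gathered.extend(chunk)
    return gathered


flat = [float(i) for i in range(8)]
world_size = 4
chunks = [split_into_1d_equal_chunks(flat, world_size, r) for r in range(world_size)]
print(chunks[1])
print(gather_split_1d(chunks) == flat)
```

Gathering the chunks in rank order restores the original flat layout, which is the round trip the two functions are documented to provide.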
colossalai/context/parallel_context.py
@@ -193,7 +193,7 @@ class ParallelContext(metaclass=SingletonMeta):
         Returns:
             bool: a boolean value indicating whether the current device is the first one
-            among its group for `parallel_mode`.
+            among its group for `parallel_mode`.
         """
         rank = self.get_local_rank(parallel_mode)
         return rank == 0
@@ -211,7 +211,7 @@ class ParallelContext(metaclass=SingletonMeta):
         Returns:
             bool: a boolean value indicating whether the current device is the first one
-            among its group for `parallel_mode`.
+            among its group for `parallel_mode`.
         """
         rank = self.get_local_rank(parallel_mode)
         world_size = self.get_world_size(parallel_mode)
colossalai/context/random/seed_manager.py
@@ -34,6 +34,7 @@ class SeedManager:
     def set_state(self, parallel_mode: ParallelMode, state: Tensor):
         """Sets the state of the seed manager for `parallel_mode`.
+
         Args:
             parallel_mode (:class:`colossalai.context.ParallelMode`): The chosen parallel mode.
             state (:class:`torch.Tensor`): the state to be set.
@@ -66,9 +67,9 @@ class SeedManager:
             seed (int): The seed to be added.
             overwrtie (bool, optional): Whether allows to overwrite the seed that has been set already

-        Raises
-            AssertionError: Raises an AssertionError if `parallel_mode` is not an instance of
-            :class:`colossalai.context.ParallelMode` or the seed for `parallel_mode` has been added.
+        Raises:
+            AssertionError: Raises an AssertionError if `parallel_mode` is not an instance of
+            :class:`colossalai.context.ParallelMode` or the seed for `parallel_mode` has been added.
         """
         assert isinstance(parallel_mode, ParallelMode), 'A valid ParallelMode must be provided'
         if overwrtie is False:
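The `Raises:` section above documents two guards: the mode must be a `ParallelMode`, and a seed may not be added twice unless overwriting is allowed. A minimal sketch of those guards follows. The two-member enum and the `_seeds` dict are hypothetical, and the flag is spelled `overwrite` here, whereas the source spells the parameter `overwrtie`.

```python
from enum import Enum
from typing import Dict


class ParallelMode(Enum):
    # Hypothetical two-member enum; the real ParallelMode has more members.
    DATA = "data"
    TENSOR = "tensor"


class SeedManager:
    def __init__(self) -> None:
        self._seeds: Dict[ParallelMode, int] = {}

    def add_seed(self, parallel_mode: ParallelMode, seed: int, overwrite: bool = False) -> None:
        # Guard 1: reject anything that is not a ParallelMode member.
        assert isinstance(parallel_mode, ParallelMode), 'A valid ParallelMode must be provided'
        # Guard 2: refuse to replace an existing seed unless explicitly allowed.
        if overwrite is False:
            assert parallel_mode not in self._seeds, f'The seed for {parallel_mode} has been added'
        self._seeds[parallel_mode] = seed


manager = SeedManager()
manager.add_seed(ParallelMode.DATA, 1024)
try:
    manager.add_seed(ParallelMode.DATA, 2048)
except AssertionError as err:
    print("rejected:", err)
manager.add_seed(ParallelMode.DATA, 2048, overwrite=True)
print(manager._seeds[ParallelMode.DATA])
```

Note that `assert`-based guards are skipped when Python runs with `-O`, so this style only protects unoptimized runs.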
colossalai/nn/layer/parallel_3d/_operation.py
@@ -264,7 +264,7 @@ def layernorm_3d(input_: Tensor, weight: Tensor, bias: Tensor, normalized_shape:
 def split_tensor_3d(tensor: Tensor, dim: int, parallel_mode: ParallelMode) -> Tensor:
     r"""Splits 3D parallel tensor in specified dimension.

-    Args:
+    Args:
         tensor (:class:`torch.tensor`): Input tensor.
         dim (int): Specified dimension in which to split.
         parallel_mode (:class:`colossalai.context.parallel_mode.ParallelMode`, optional): Parallel mode.
colossalai/nn/loss/loss_2d.py
@@ -27,7 +27,7 @@ class CrossEntropyLoss2D(_Loss):
         reduce (bool, optional)
         label_smoothing (float, optional)

-    More details about args, kwargs and torch.nn.functional.cross_entropy could be found in
+    More details about ``args``, ``kwargs`` and ``torch.nn.functional.cross_entropy`` could be found in
     `Cross_entropy <https://pytorch.org/docs/stable/generated/torch.nn.functional.cross_entropy.html#torch.nn.functional.cross_entropy>`_.
     """
colossalai/nn/loss/loss_2p5d.py
@@ -27,7 +27,7 @@ class CrossEntropyLoss2p5D(_Loss):
         reduce (bool, optional)
         label_smoothing (float, optional)

-    More details about args, kwargs and torch.nn.functional.cross_entropy could be found in
+    More details about ``args``, ``kwargs`` and ``torch.nn.functional.cross_entropy`` could be found in
     `Cross_entropy <https://pytorch.org/docs/stable/generated/torch.nn.functional.cross_entropy.html#torch.nn.functional.cross_entropy>`_.
     """
     def __init__(self, reduction=True, *args, **kwargs):
colossalai/nn/loss/loss_3d.py
@@ -27,7 +27,7 @@ class CrossEntropyLoss3D(_Loss):
         reduce (bool, optional)
         label_smoothing (float, optional)

-    More details about args, kwargs and torch.nn.functional.cross_entropy could be found in
+    More details about ``args``, ``kwargs`` and ``torch.nn.functional.cross_entropy`` could be found in
     `Cross_entropy <https://pytorch.org/docs/stable/generated/torch.nn.functional.cross_entropy.html#torch.nn.functional.cross_entropy>`_.
     """
colossalai/nn/loss/loss_moe.py
@@ -23,7 +23,7 @@ class MoeCrossEntropyLoss(_Loss):
         reduction (str, optional)
         label_smoothing (float, optional)

-    More details about args, kwargs and torch.nn.functional.cross_entropy could be found in
+    More details about ``args``, ``kwargs`` and ``torch.nn.functional.cross_entropy`` could be found in
     `Cross_entropy <https://pytorch.org/docs/stable/generated/torch.nn.functional.cross_entropy.html#torch.nn.functional.cross_entropy>`_.
     """
@@ -40,7 +40,7 @@ class MoeCrossEntropyLoss(_Loss):
             input (:class:`torch.tensor`): Predicted unnormalized scores (often referred to as logits).
             target (:class:`torch.tensor`): Ground truth class indices or class probabilities.

-        More details about args, kwargs and torch.nn.functional.cross_entropy could be found in
+        More details about ``args``, ``kwargs`` and ``torch.nn.functional.cross_entropy`` could be found in
         `Cross_entropy <https://pytorch.org/docs/stable/generated/torch.nn.functional.cross_entropy.html#torch.nn.functional.cross_entropy>`_.
         """
         main_loss = self.loss(*args)
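The loss docstrings in these hunks defer to `torch.nn.functional.cross_entropy` for the meaning of the logits, the target class index and `label_smoothing`. As a rough illustration of that quantity, here is a pure-Python sketch for a single sample. It is not the ColossalAI or PyTorch implementation; it assumes a class-index target and the usual smoothing scheme in which a fraction `label_smoothing` of the target mass is spread uniformly over all classes.

```python
import math
from typing import Sequence


def cross_entropy(logits: Sequence[float], target: int, label_smoothing: float = 0.0) -> float:
    """Cross entropy of one sample from unnormalized scores and a class index."""
    num_classes = len(logits)
    # log-softmax, subtracting the max for numerical stability
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    log_probs = [x - log_z for x in logits]
    # smoothed target: (1 - eps) on the true class plus eps spread uniformly
    eps = label_smoothing
    loss = 0.0
    for k, lp in enumerate(log_probs):
        q = eps / num_classes + ((1.0 - eps) if k == target else 0.0)
        loss -= q * lp
    return loss


print(cross_entropy([2.0, 0.5, -1.0], target=0))
print(cross_entropy([2.0, 0.5, -1.0], target=0, label_smoothing=0.1))
```

With uniform logits the loss equals the log of the number of classes regardless of smoothing, which makes a convenient sanity check.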
colossalai/trainer/_trainer.py
@@ -307,8 +307,7 @@ class Trainer:
             max_steps (int, optional): Maximum number of running iterations.
             test_dataloader (:class:`torch.utils.data.DataLoader`, optional): DataLoader for validation.
             test_interval (int, optional): Interval of validation
-            hooks (list[`BaseHook <https://github.com/hpcaitech/ColossalAI/tree/main/colossalai/trainer/hooks>`_],
-                optional): A list of hooks used in training.
+            hooks (list[BaseHook], optional): A list of hooks used in training.
             display_progress (bool, optional): If True, a progress bar will be displayed.
         """
colossalai/utils/memory_tracer/async_memtracer.py
@@ -21,21 +21,22 @@ class AsyncMemoryMonitor:
     :type power: int

     Usage:
-    ```python
-    async_mem_monitor = AsyncMemoryMonitor()
-    input = torch.randn(2, 20).cuda()
-    OP1 = torch.nn.Linear(20, 30).cuda()
-    OP2 = torch.nn.Linear(30, 40).cuda()
-    async_mem_monitor.start()
-    output = OP1(input)
-    async_mem_monitor.finish()
-    async_mem_monitor.start()
-    output = OP2(output)
-    async_mem_monitor.finish()
-    async_mem_monitor.save('log.pkl')
-    ```
+    ::
+        ```python
+        async_mem_monitor = AsyncMemoryMonitor()
+        input = torch.randn(2, 20).cuda()
+        OP1 = torch.nn.Linear(20, 30).cuda()
+        OP2 = torch.nn.Linear(30, 40).cuda()
+        async_mem_monitor.start()
+        output = OP1(input)
+        async_mem_monitor.finish()
+        async_mem_monitor.start()
+        output = OP2(output)
+        async_mem_monitor.finish()
+        async_mem_monitor.save('log.pkl')
+        ```
     """
     def __init__(self, power: int = 10):
colossalai/utils/profiler/prof_utils.py
@@ -73,25 +73,26 @@ class ProfilerContext(object):
     """
     Profiler context manager
     Usage:
+    ::
-    ```python
-    world_size = 4
-    inputs = torch.randn(10, 10, dtype=torch.float32, device=get_current_device())
-    outputs = torch.empty(world_size, 10, 10, dtype=torch.float32, device=get_current_device())
-    outputs_list = list(torch.chunk(outputs, chunks=world_size, dim=0))
+        ```python
+        world_size = 4
+        inputs = torch.randn(10, 10, dtype=torch.float32, device=get_current_device())
+        outputs = torch.empty(world_size, 10, 10, dtype=torch.float32, device=get_current_device())
+        outputs_list = list(torch.chunk(outputs, chunks=world_size, dim=0))

-    cc_prof = CommProfiler()
+        cc_prof = CommProfiler()

-    with ProfilerContext([cc_prof]) as prof:
-        op = dist.all_reduce(inputs, async_op=True)
-        dist.all_gather(outputs_list, inputs)
-        op.wait()
-        dist.reduce_scatter(inputs, outputs_list)
-        dist.broadcast(inputs, 0)
-        dist.reduce(inputs, 0)
+        with ProfilerContext([cc_prof]) as prof:
+            op = dist.all_reduce(inputs, async_op=True)
+            dist.all_gather(outputs_list, inputs)
+            op.wait()
+            dist.reduce_scatter(inputs, outputs_list)
+            dist.broadcast(inputs, 0)
+            dist.reduce(inputs, 0)

-    prof.show()
-    ```
+        prof.show()
+        ```
     """
     def __init__(self, profilers: List[BaseProfiler] = None, enable: bool = True):
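The usage example in this docstring relies on `ProfilerContext` being a context manager that switches a list of profilers on at entry and off at exit. A minimal sketch of that pattern follows, without `torch`. The `BaseProfiler` interface (`enable` / `disable` / `show`) and the `WallClockProfiler` class are assumptions made for illustration; only the constructor signature `(profilers, enable)` and the `with ... as prof` / `prof.show()` usage come from the diff.

```python
import time
from typing import List, Optional


class BaseProfiler:
    """Hypothetical minimal profiler interface."""

    def enable(self) -> None:
        raise NotImplementedError

    def disable(self) -> None:
        raise NotImplementedError

    def show(self) -> None:
        raise NotImplementedError


class WallClockProfiler(BaseProfiler):
    """Illustrative profiler that accumulates wall-clock time while enabled."""

    def __init__(self) -> None:
        self.elapsed = 0.0
        self._start: Optional[float] = None

    def enable(self) -> None:
        self._start = time.perf_counter()

    def disable(self) -> None:
        if self._start is not None:
            self.elapsed += time.perf_counter() - self._start
            self._start = None

    def show(self) -> None:
        print(f"elapsed: {self.elapsed:.6f}s")


class ProfilerContext:
    def __init__(self, profilers: Optional[List[BaseProfiler]] = None, enable: bool = True) -> None:
        self.enable = enable
        self.profilers = list(profilers) if profilers else []

    def __enter__(self) -> "ProfilerContext":
        if self.enable:
            for prof in self.profilers:
                prof.enable()
        return self

    def __exit__(self, exc_type, exc_value, traceback) -> None:
        # Runs even if the body raised, so profilers are always switched off.
        if self.enable:
            for prof in self.profilers:
                prof.disable()

    def show(self) -> None:
        for prof in self.profilers:
            prof.show()


wall = WallClockProfiler()
with ProfilerContext([wall]) as prof:
    sum(range(100_000))
prof.show()
```

Passing `enable=False` turns the whole context into a no-op, which lets profiling be toggled without touching the code inside the `with` block.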
docs/colossalai/colossalai.amp.apex_amp.apex_amp.rst
deleted 100644 → 0
-colossalai.amp.apex\_amp.apex\_amp
-==================================
-
-.. automodule:: colossalai.amp.apex_amp.apex_amp
-   :members:
docs/colossalai/colossalai.amp.apex_amp.rst
@@ -3,9 +3,3 @@ colossalai.amp.apex\_amp
 .. automodule:: colossalai.amp.apex_amp
    :members:
-
-.. toctree::
-   :maxdepth: 2
-
-   colossalai.amp.apex_amp.apex_amp
docs/colossalai/colossalai.amp.naive_amp.grad_scaler.base_grad_scaler.rst
deleted 100644 → 0
-colossalai.amp.naive\_amp.grad\_scaler.base\_grad\_scaler
-=========================================================
-
-.. automodule:: colossalai.amp.naive_amp.grad_scaler.base_grad_scaler
-   :members:
docs/colossalai/colossalai.amp.naive_amp.grad_scaler.constant_grad_scaler.rst
deleted 100644 → 0
-colossalai.amp.naive\_amp.grad\_scaler.constant\_grad\_scaler
-=============================================================
-
-.. automodule:: colossalai.amp.naive_amp.grad_scaler.constant_grad_scaler
-   :members:
docs/colossalai/colossalai.amp.naive_amp.grad_scaler.dynamic_grad_scaler.rst
deleted 100644 → 0
-colossalai.amp.naive\_amp.grad\_scaler.dynamic\_grad\_scaler
-============================================================
-
-.. automodule:: colossalai.amp.naive_amp.grad_scaler.dynamic_grad_scaler
-   :members: