- 24 Sep, 2022 (1 commit)
  - HELSON authored
- 23 Sep, 2022 (8 commits)
  - HELSON authored
  - HELSON authored
  - YuliangLiu0306 authored
    * [tensor] use communication autograd func
    * change all-to-all comm spec info
    * rename pattern and distinguish fwd/bwd
    * polish code
  - YuliangLiu0306 authored
  - YuliangLiu0306 authored
  - Boyuan Yao authored
    * [fx] modify offload codegen
    * [fx] remove repeated hook definitions
    * [fx] modify offload test
  - YuliangLiu0306 authored
  - Super Daniel authored
    * [fx] tuned the meta info and rotor solver.
    * [fx] remove import.
    * [fx] tune the meta calculations.
    * [fx] polish comments.
    * [fx] remove assertions.
    * [fx] modify test cases.
    * [fx] optimize import.
    * [fx
- 22 Sep, 2022 (2 commits)
  - HELSON authored
    * remove forced FP32 modules
    * correct no_shard contexts' positions
  - Jiarui Fang authored
- 21 Sep, 2022 (1 commit)
  - Frank Lee authored
- 20 Sep, 2022 (5 commits)
  - Kirigaya Kazuto authored
    * [pipeline/tuning] improve dispatch performance in both time and space cost
    * [pipeline/converge] add interface for testing convergence
    * [NFC] polish colossalai/utils/multi_tensor_apply/multi_tensor_apply.py code style
    * Update PipelineBase.py
    * [pipeline/chimera] reconstruct PipelineBase and Worker to support more feasible custom schedules | finish Chimera
    * [pipeline/chimera] test chimera | fix bug of initializing
  - Jiarui Fang authored
  - YuliangLiu0306 authored
    * [fx] PoC of runtime shape consistency application
    * polish code
  - YuliangLiu0306 authored
  - Boyuan Yao authored
    * [fx] add pofo algorithm
    * [fx] add pofo solver
    * [fx] code refactor
    * [fx] fix test_linearize import
- 19 Sep, 2022 (1 commit)
  - Kirigaya Kazuto authored
    [pipeline/chimera] reconstruct PipelineBase and Worker to support more feasible custom schedules | finish Chimera (#1595)
    * [pipeline/tuning] improve dispatch performance in both time and space cost
    * [pipeline/converge] add interface for testing convergence
    * [NFC] polish colossalai/utils/multi_tensor_apply/multi_tensor_apply.py code style
    * Update PipelineBase.py
    * [pipeline/chimera] reconstruct PipelineBase and Worker to support more feasible custom schedules | finish Chimera
- 16 Sep, 2022 (1 commit)
  - YuliangLiu0306 authored
    * [autoparallel] add bcast op handler
    * polish code
    * add more BCAST FUNC OP
    * polish code
    * add exception handler
    * polish
- 14 Sep, 2022 (3 commits)
  - Boyuan Yao authored
    * [fx] add input activation offload to codegen
    * [fx] modify unit test
    * [fx] remove two skips in torch11
    * [fx] use all_input_nodes instead of _input_nodes
  - Super Daniel authored
    * [fx] add some comments and docstrings.
    * [fx] add dataflow analysis for an autograd graph.
    * add interpretation for graph analysis.
    * [fx] before doing save_tensor_hooks.
    * [fx] provide an accurate estimation of memory except for GPT-2.
    * [fx] a very accurate version on GPT-2.
    * [fx] refactor code.
    * [fx] remove redundant inplace=True.
    * [fx] dive into backward memory.
    * [fx] fix variable names in ckpt_solvers and unskip tests.
    * [fx] commit my changes.
    * [fx] restore skips.
    * [fx] change stage into phase.
  - YuliangLiu0306 authored
    * [autoparallel] add reshape handler
    * polish code
- 13 Sep, 2022 (5 commits)
  - Frank Lee authored
    * [autoparallel] refactored shape consistency to remove redundancy
    * polish code
  - YuliangLiu0306 authored
  - Frank Lee authored
  - YuliangLiu0306 authored
    * [autoparallel] adapt solver with resnet
    * polish code
  - CsRic authored
- 12 Sep, 2022 (1 commit)
  - Boyuan Yao authored
    * [fx] add nested activation_checkpoint codegen
    * undo algorithms commits
    * solver
    * undo some commits
    * [fx] torch11 add nested activation checkpoint codegen
    * remove some imports
    * [fx] add some comments in activation codegen
    * [fx] codegen instance error fix
- 08 Sep, 2022 (1 commit)
  - アマデウス authored
- 07 Sep, 2022 (4 commits)
  - Kirigaya Kazuto authored
  - Super Daniel authored
    * [fx] compute memory stat and flop count for MetaInfoProp.
    * [fx] modify node attribute.
    * [fx] modify ckpt_chen.
    * [fx] fix compatibility.
    * [fx] fix import error.
    * [fx] skip test for MetaInfoProp.
    * [fx] skip if torch 1.11.0.
    * [fx] recover MetaInfoProp support for PyTorch 1.11.
    * [fx] provide a stable but not accurate enough version of profiler.
    * [fx] fix compatibility in tests.
  - YuliangLiu0306 authored
  - YuliangLiu0306 authored
- 06 Sep, 2022 (1 commit)
  - Jiarui Fang authored
- 05 Sep, 2022 (1 commit)
  - CsRic authored
- 02 Sep, 2022 (1 commit)
  - Boyuan Yao authored
    * [fx] modify solver linearize and add test
    * [fx] add torch11 test of linearize but skip it
    * [fx] remove some unused imports
- 01 Sep, 2022 (4 commits)
  - Super Daniel authored
    * [fx] add test for meta tensor.
    * [fx] fix error.
  - YuliangLiu0306 authored
  - CsRic authored
  - Kirigaya Kazuto authored
    [pipeline/pipleline_process_group] finish PipelineProcessGroup to manage local and global rank in TP, DP and PP (#1508)
    * support p2p communication with any type of object | pass test
    * reconstruct pipeline schedule with p2p_v2.py (support communication with List[Any]) | pass test
    * [engin/schedule] use p2p_v2 to reconstruct pipeline_schedule
    * [pipeline/rpc] implement a demo for PP with cuda rpc framework
    * [pipeline/rpc] support interleaving | fix checkpoint bug | change logic when dispatching data in work_list to ensure steady 1F1B
    * [pipeline/rpc] implement distributed optimizer | test with assert_close
    * [pipeline/rpc] update outstanding mechanism | optimize dispatching strategy
    * [pipeline/pipleline_process_group] finish PipelineProcessGroup to manage local and global rank in TP, DP and PP
    * [pipeline/pipleline_process_group] remove comment
    * [pipeline/pipleline_process_group] skip process group test
    * [pipeline/pipleline_process_group] remove test named function