- 25 Mar, 2024 1 commit
-
-
Wenhao Chen authored
* fix: simplify merge_batch * fix: use return_outputs=False to eliminate extra memory consumption * feat: add return_outputs warning * style: remove `return_outputs=False` as it is the default value
-
- 28 Nov, 2023 1 commit
-
-
Wenhao Chen authored
* [shardformer] implement policy for all GPT-J models and test * [shardformer] support interleaved pipeline parallel for bert finetune * [shardformer] shardformer support falcon (#4883) * [shardformer]: fix interleaved pipeline for bert model (#5048) * [hotfix]: disable seq parallel for gptj and falcon, and polish code (#5093) * Add Mistral support for Shardformer (#5103) * [shardformer] add tests to mistral (#5105) --------- Co-authored-by:
Pengtai Xu <henryxu880@gmail.com> Co-authored-by:
ppt0011 <143150326+ppt0011@users.noreply.github.com> Co-authored-by:
flybird11111 <1829166702@qq.com> Co-authored-by:
eric8607242 <e0928021388@gmail.com>
-
- 24 Nov, 2023 1 commit
-
-
digger yu authored
-
- 22 Nov, 2023 1 commit
-
-
digger yu authored
-
- 17 Oct, 2023 1 commit
-
-
Baizhou Zhang authored
* add test * fix no_sync bug in low level zero plugin * fix test * add argument for grad accum * add grad accum in backward hook for gemini * finish implementation, rewrite tests * fix test * skip stuck model in low level zero test * update doc * optimize communication & fix gradient checkpoint * modify doc * cleaning codes * update cpu adam fp16 case
-
- 27 Sep, 2023 1 commit
-
-
Hongxin Liu authored
-
- 26 Sep, 2023 1 commit
-
-
Baizhou Zhang authored
* fix example format in docstring * polish shardformer doc
-
- 21 Sep, 2023 1 commit
-
-
Hongxin Liu authored
* [doc] clean up outdated docs * [doc] fix linking * [doc] fix linking
-
- 15 Sep, 2023 5 commits
-
-
Baizhou Zhang authored
* arrange position of chapters * fix typos in seq parallel doc
-
Bin Jia authored
* update doc of seq parallel * fix typo
-
flybird11111 authored
* [shardformer] update pipeline parallel document * [shardformer] update pipeline parallel document * [shardformer] update pipeline parallel document * [shardformer] update pipeline parallel document * [shardformer] update pipeline parallel document * [shardformer] update pipeline parallel document * [shardformer] update pipeline parallel document * [shardformer] update pipeline parallel document
-
Baizhou Zhang authored
* add compatibility matrix for shardformer doc * update tp doc
-
Baizhou Zhang authored
* create shardformer doc files * add docstring for seq-parallel * update ShardConfig docstring * add links to llama example * add outdated massage * finish introduction & supporting information * finish 'how shardformer works' * finish shardformer.md English doc * fix doctest fail * add Chinese document
-
- 05 Sep, 2023 3 commits
-
-
Hongxin Liu authored
-
Hongxin Liu authored
* [legacy] move engine to legacy * [example] fix seq parallel example * [example] fix seq parallel example * [test] test gemini pluging hang * [test] test gemini pluging hang * [test] test gemini pluging hang * [test] test gemini pluging hang * [test] test gemini pluging hang * [example] update seq parallel requirements
-
Hongxin Liu authored
* [legacy] move trainer to legacy * [doc] update docs related to trainer * [test] ignore legacy test
-
- 24 Aug, 2023 1 commit
-
-
Hongxin Liu authored
* [gemini] remove distributed-related part from colotensor (#4379) * [gemini] remove process group dependency * [gemini] remove tp part from colo tensor * [gemini] patch inplace op * [gemini] fix param op hook and update tests * [test] remove useless tests * [test] remove useless tests * [misc] fix requirements * [test] fix model zoo * [test] fix model zoo * [test] fix model zoo * [test] fix model zoo * [test] fix model zoo * [misc] update requirements * [gemini] refactor gemini optimizer and gemini ddp (#4398) * [gemini] update optimizer interface * [gemini] renaming gemini optimizer * [gemini] refactor gemini ddp class * [example] update gemini related example * [example] update gemini related example * [plugin] fix gemini plugin args * [test] update gemini ckpt tests * [gemini] fix checkpoint io * [example] fix opt example requirements * [example] fix opt example * [example] fix opt example * [example] fix opt example * [gemini] add static placement policy (#4443) * [gemini] add static placement policy * [gemini] fix param offload * [test] update gemini tests * [plugin] update gemini plugin * [plugin] update gemini plugin docstr * [misc] fix flash attn requirement * [test] fix gemini checkpoint io test * [example] update resnet example result (#4457) * [example] update bert example result (#4458) * [doc] update gemini doc (#4468) * [example] update gemini related examples (#4473) * [example] update gpt example * [example] update dreambooth example * [example] update vit * [example] update opt * [example] update palm * [example] update vit and opt benchmark * [hotfix] fix bert in model zoo (#4480) * [hotfix] fix bert in model zoo * [test] remove chatglm gemini test * [test] remove sam gemini test * [test] remove vit gemini test * [hotfix] fix opt tutorial example (#4497) * [hotfix] fix opt tutorial example * [hotfix] fix opt tutorial example
-
- 04 Aug, 2023 1 commit
-
-
flybird1111 authored
* [doc] fix gradient accumulation doc * [doc] fix gradient accumulation doc
-
- 28 Jun, 2023 1 commit
-
-
Jianghai authored
* fix some typos and problems in doc * fix some typos and problems in doc * add doc test
-
- 26 Jun, 2023 1 commit
-
-
Baizhou Zhang authored
-
- 09 Jun, 2023 1 commit
-
-
Frank Lee authored
-
- 07 Jun, 2023 1 commit
-
-
Hongxin Liu authored
* [doc] add lazy init en doc * [doc] add lazy init zh doc * [doc] add lazy init doc in sidebar * [doc] add lazy init doc test * [doc] fix lazy init doc link
-
- 06 Jun, 2023 1 commit
-
-
Baizhou Zhang authored
-
- 30 May, 2023 1 commit
-
-
jiangmingyan authored
* [doc] fix title of mixed precision * [doc]update document of zero with chunk * [doc] update document of zero with chunk, fix * [doc] update document of zero with chunk, fix * [doc] update document of zero with chunk, fix * [doc] update document of zero with chunk, add doc test * [doc] update document of zero with chunk, add doc test * [doc] update document of zero with chunk, fix installation * [doc] update document of zero with chunk, fix zero with chunk doc * [doc] update document of zero with chunk, fix zero with chunk doc
-
- 25 May, 2023 1 commit
-
-
jiangmingyan authored
-
- 24 May, 2023 2 commits
- 23 May, 2023 6 commits
-
-
jiangmingyan authored
-
jiangmingyan authored
-
jiangmingyan authored
-
Mingyan Jiang authored
-
Mingyan Jiang authored
-
jiangmingyan authored
* [doc]update gradient accumulation * [doc]update gradient accumulation * [doc]update gradient accumulation * [doc]update gradient accumulation * [doc]update gradient accumulation, fix * [doc]update gradient accumulation, fix * [doc]update gradient accumulation, fix * [doc]update gradient accumulation, add sidebars * [doc]update gradient accumulation, fix * [doc]update gradient accumulation, fix * [doc]update gradient accumulation, fix * [doc]update gradient accumulation, resolve comments * [doc]update gradient accumulation, resolve comments * fix
-
- 22 May, 2023 2 commits
-
-
jiangmingyan authored
* [doc] update gradient clipping document * [doc] update gradient clipping document * [doc] update gradient clipping document * [doc] update gradient clipping document * [doc] update gradient clipping document * [doc] update gradient clipping document * [doc] update gradient clipping doc, fix sidebars.json * [doc] update gradient clipping doc, fix doc test
-
Hongxin Liu authored
* [doc] add docstr for booster methods * [doc] fix autodoc
-
- 19 May, 2023 1 commit
-
-
Hongxin Liu authored
* [doc] add en cluster utils doc * [doc] add zh cluster utils doc * [doc] add cluster utils doc in sidebar
-
- 18 May, 2023 1 commit
-
-
jiangmingyan authored
-
- 26 Apr, 2023 1 commit
-
-
digger-yu authored
* Fixed several spelling errors under colossalai * Fix the spelling error in colossalai and docs directory * Cautious Changed the spelling error under the example folder * Update runtime_preparation_pass.py revert autograft to autograd * Update search_chunk.py utile to until * Update check_installation.py change misteach to mismatch in line 91 * Update 1D_tensor_parallel.md revert to perceptron * Update 2D_tensor_parallel.md revert to perceptron in line 73 * Update 2p5D_tensor_parallel.md revert to perceptron in line 71 * Update 3D_tensor_parallel.md revert to perceptron in line 80 * Update README.md revert to resnet in line 42 * Update reorder_graph.py revert to indice in line 7 * Update p2p.py revert to megatron in line 94 * Update initialize.py revert to torchrun in line 198 * Update routers.py change to detailed in line 63 * Update routers.py change to detailed in line 146 * Update README.md revert random number in line 402
-
- 17 Apr, 2023 1 commit
-
-
digger-yu authored
Display format optimization , same as fix#3562 Simultaneous modification of en version
-
- 04 Apr, 2023 1 commit
-
-
ver217 authored
* [zero] refactor low-level zero folder structure * [zero] fix legacy zero import path * [zero] fix legacy zero import path * [zero] remove useless import * [zero] refactor gemini folder structure * [zero] refactor gemini folder structure * [zero] refactor legacy zero import path * [zero] refactor gemini folder structure * [zero] refactor gemini folder structure * [zero] refactor gemini folder structure * [zero] refactor legacy zero import path * [zero] fix test import path * [zero] fix test * [zero] fix circular import * [zero] update import
-