Commits · 81ea66d25d9dc10fcd4d7331e7a2274e849f0909 · OpenDAS / ColossalAI

13 Feb, 2023 1 commit
- [release] v0.2.3 (#2669) · 81ea66d2
  Frank Lee authored Feb 13, 2023
```
* [release] v0.2.3

* polish code
```
  81ea66d2
10 Feb, 2023 2 commits
- [Docs] layout converting management (#2665) · 8de85051
  YuliangLiu0306 authored Feb 10, 2023
  
  8de85051
- [release] v0.2.2 (#2661) · b673e5f7
  Frank Lee authored Feb 10, 2023
  
  b673e5f7
09 Feb, 2023 3 commits
- [doc] fixed compatiblity with docusaurus (#2657) · cd4f02be
  Frank Lee authored Feb 09, 2023
  
  cd4f02be
- [doc] added docusaurus-based version control (#2656) · a4ae43f0
  Frank Lee authored Feb 09, 2023
  
  a4ae43f0
- [doc] migrate the markdown files (#2652) · 85b2303b
  Frank Lee authored Feb 09, 2023
  
  85b2303b
08 Feb, 2023 1 commit
- [doc] updated the sphinx theme (#2635) · d3480396
  Frank Lee authored Feb 08, 2023
  
  d3480396
18 Nov, 2022 1 commit
- Update requirements.txt · a01278e8
  binmakeswell authored Nov 18, 2022
  
  a01278e8
17 Nov, 2022 1 commit
- [Gemini] ZeROHookV2 -> GeminiZeROHook (#1972) · cc0ed7cf
  Jiarui Fang authored Nov 17, 2022
  
  cc0ed7cf
25 Oct, 2022 1 commit
- fix file name (#1759) · 63f250bb
  Ziyue Jiang authored Oct 25, 2022
```
Co-authored-by: Ziyue Jiang <ziyue.jiang@gmail.com>
```
  63f250bb
21 Jul, 2022 1 commit

- [doc] update rst and docstring (#1351) · d068af81
  ver217 authored Jul 21, 2022
```
* update rst

* add zero docstr

* fix docstr

* remove fx.tracer.meta_patch

* fix docstr

* fix docstr

* update fx rst

* fix fx docstr

* remove useless rst
```
  d068af81

14 Jul, 2022 1 commit
- [hotfix] remove potiential circle import (#1307) · 4165eabb
  Jiarui Fang authored Jul 14, 2022
```
* make it faster

* [hotfix] remove circle import
```
  4165eabb
19 Apr, 2022 1 commit
- [refactor] moving memtracer to gemini (#801) · 4d9332b4
  Jiarui Fang authored Apr 19, 2022
  
  4d9332b4
01 Apr, 2022 1 commit
- update rst (#615) · f69507dd
  ver217 authored Apr 01, 2022
  
  f69507dd
31 Mar, 2022 1 commit
- html refactor (#555) · 2c45efc3
  Liang Bowen authored Mar 31, 2022
  
  2c45efc3
30 Mar, 2022 1 commit
- [docs] updatad docs of hybrid adam and cpu adam (#552) · c44d7970
  LuGY authored Mar 30, 2022
  
  c44d7970
25 Mar, 2022 1 commit
- [doc] update apidoc (#530) · ffca99d1
  ver217 authored Mar 25, 2022
  
  ffca99d1
22 Mar, 2022 1 commit
- docs get correct release version (#489) · 9caa8b64
  ver217 authored Mar 22, 2022
  
  9caa8b64
21 Mar, 2022 1 commit
- [doc] update rst (#470) · 7e30068a
  ver217 authored Mar 21, 2022
```
* update rst

* remove empty rst
```
  7e30068a
11 Mar, 2022 4 commits
- update README and images path (#384) · ce7b2c9a
  binmakeswell authored Mar 11, 2022
  
  ce7b2c9a
- add community group and update issue template(#271) · 08eccfe6
  binmakeswell authored Feb 28, 2022
  
  08eccfe6
- update experimental visualization (#253) · 3312d716
  Sze-qq authored Feb 28, 2022
  
  3312d716
- add Chinese README · 753035ed
  binmakeswell authored Feb 18, 2022
  
  753035ed
21 Jan, 2022 1 commit
- update logo · 6fb550ac
  WANG-CR authored Jan 21, 2022
  
  6fb550ac
19 Jan, 2022 3 commits
- update doc requirements and rtd conf (#165) · 1949d3a8
  ver217 authored Jan 19, 2022
  
  1949d3a8
- removed tutorial markdown and refreshed rst files for consistency · be85a0f3
  Frank Lee authored Jan 19, 2022
  
  be85a0f3
- add logo at homepage, add forum in issue template (#161) · 17ce8569
  binmakeswell authored Jan 19, 2022
  
  17ce8569
18 Jan, 2022 1 commit
- AMP docstring/markdown update (#160) · 9473a1b9
  puck_WCR authored Jan 18, 2022
  
  9473a1b9
30 Dec, 2021 1 commit

- Optimize pipeline schedule (#94) · 96780e6e
  ver217 authored Dec 30, 2021
```
* add pipeline shared module wrapper and update load batch

* added model parallel process group for amp and clip grad (#86)

* added model parallel process group for amp and clip grad

* update amp and clip with model parallel process group

* remove pipeline_prev/next group (#88)

* micro batch offload

* optimize pipeline gpu memory usage

* pipeline can receive tensor shape (#93)

* optimize pipeline gpu memory usage

* fix grad accumulation step counter

* rename classes and functions
Co-authored-by: Frank Lee <somerlee.9@gmail.com>
```
  96780e6e

20 Dec, 2021 1 commit
- add interleaved pipeline, fix naive amp and update pipeline model initializer (#80) · 8f02a88d
  ver217 authored Dec 20, 2021
  
  8f02a88d
13 Dec, 2021 1 commit
- update examples and sphnix docs for the new api (#63) · 35813ed3
  Frank Lee authored Dec 13, 2021
  
  35813ed3
10 Dec, 2021 2 commits
- fix zero3 fp16 and add zero3 model context (#62) · 7d371105
  ver217 authored Dec 10, 2021
  
  7d371105
- update markdown docs (english) (#60) · 9a046653
  Frank Lee authored Dec 10, 2021
  
  9a046653
09 Dec, 2021 1 commit

- Develop/experiments (#59) · da01c234
  Frank Lee authored Dec 09, 2021
```
* Add gradient accumulation, fix lr scheduler

* fix FP16 optimizer and adapted torch amp with tensor parallel (#18)

* fixed bugs in compatibility between torch amp and tensor parallel and performed some minor fixes

* fixed trainer

* Revert "fixed trainer"

This reverts commit 2e0b0b76990e8d4e337add483d878c0f61cf5097.

* improved consistency between trainer, engine and schedule (#23)
Co-authored-by: 1SAA <c2h214748@gmail.com>

* Split conv2d, class token, positional embedding in 2d, Fix random number in ddp
Fix convergence in cifar10, Imagenet1000

* Integrate 1d tensor parallel in Colossal-AI (#39)

* fixed 1D and 2D convergence (#38)

* optimized 2D operations

* fixed 1D ViT convergence problem

* Feature/ddp (#49)

* remove redundancy func in setup (#19) (#20)

* use env to control the language of doc (#24) (#25)

* Support TP-compatible Torch AMP and Update trainer API (#27)

* Add gradient accumulation, fix lr scheduler

* fix FP16 optimizer and adapted torch amp with tensor parallel (#18)

* fixed bugs in compatibility between torch amp and tensor parallel and performed some minor fixes

* fixed trainer

* Revert "fixed trainer"

This reverts commit 2e0b0b76990e8d4e337add483d878c0f61cf5097.

* improved consistency between trainer, engine and schedule (#23)
Co-authored-by: 1SAA <c2h214748@gmail.com>
Co-authored-by: 1SAA <c2h214748@gmail.com>
Co-authored-by: ver217 <lhx0217@gmail.com>

* add an example of ViT-B/16 and remove w_norm clipping in LAMB (#29)

* add explanation for ViT example (#35) (#36)

* support torch ddp

* fix loss accumulation

* add log for ddp

* change seed

* modify timing hook
Co-authored-by: Frank Lee <somerlee.9@gmail.com>
Co-authored-by: 1SAA <c2h214748@gmail.com>
Co-authored-by: binmakeswell <binmakeswell@gmail.com>

* Feature/pipeline (#40)

* remove redundancy func in setup (#19) (#20)

* use env to control the language of doc (#24) (#25)

* Support TP-compatible Torch AMP and Update trainer API (#27)

* Add gradient accumulation, fix lr scheduler

* fix FP16 optimizer and adapted torch amp with tensor parallel (#18)

* fixed bugs in compatibility between torch amp and tensor parallel and performed some minor fixes

* fixed trainer

* Revert "fixed trainer"

This reverts commit 2e0b0b76990e8d4e337add483d878c0f61cf5097.

* improved consistency between trainer, engine and schedule (#23)
Co-authored-by: 1SAA <c2h214748@gmail.com>
Co-authored-by: 1SAA <c2h214748@gmail.com>
Co-authored-by: ver217 <lhx0217@gmail.com>

* add an example of ViT-B/16 and remove w_norm clipping in LAMB (#29)

* add explanation for ViT example (#35) (#36)

* optimize communication of pipeline parallel

* fix grad clip for pipeline
Co-authored-by: Frank Lee <somerlee.9@gmail.com>
Co-authored-by: 1SAA <c2h214748@gmail.com>
Co-authored-by: binmakeswell <binmakeswell@gmail.com>

* optimized 3d layer to fix slow computation ; tested imagenet performance with 3d; reworked lr_scheduler config definition; fixed launch args; fixed some printing issues; simplified apis of 3d layers (#51)

* Update 2.5d layer code to get a similar accuracy on imagenet-1k dataset

* update api for better usability (#58)

update api for better usability
Co-authored-by: 1SAA <c2h214748@gmail.com>
Co-authored-by: ver217 <lhx0217@gmail.com>
Co-authored-by: puck_WCR <46049915+WANG-CR@users.noreply.github.com>
Co-authored-by: binmakeswell <binmakeswell@gmail.com>
Co-authored-by: アマデウス <kurisusnowdeng@users.noreply.github.com>
Co-authored-by: BoxiangW <45734921+BoxiangW@users.noreply.github.com>
```
  da01c234

18 Nov, 2021 1 commit

- Support TP-compatible Torch AMP and Update trainer API (#27) · 3defa32a
  Frank Lee authored Nov 18, 2021
```
* Add gradient accumulation, fix lr scheduler

* fix FP16 optimizer and adapted torch amp with tensor parallel (#18)

* fixed bugs in compatibility between torch amp and tensor parallel and performed some minor fixes

* fixed trainer

* Revert "fixed trainer"

This reverts commit 2e0b0b76990e8d4e337add483d878c0f61cf5097.

* improved consistency between trainer, engine and schedule (#23)
Co-authored-by: 1SAA <c2h214748@gmail.com>
Co-authored-by: 1SAA <c2h214748@gmail.com>
Co-authored-by: ver217 <lhx0217@gmail.com>
```
  3defa32a

15 Nov, 2021 1 commit
- use env to control the language of doc (#24) (#25) · 2b05de4c
  ver217 authored Nov 15, 2021
  
  2b05de4c
03 Nov, 2021 1 commit
- fixed some typos in the documents, added blog link and paper author information in README · 05e7069a
  binmakeswell authored Nov 03, 2021
  
  05e7069a
02 Nov, 2021 1 commit
- added Chinese documents and fixed some typos in English documents · 18ba66e0
  Fan Cui authored Nov 02, 2021
  
  18ba66e0
01 Nov, 2021 1 commit
- reoder parallelization methods in parallelization documentation · 50982c0b
  ver217 authored Nov 01, 2021
  
  50982c0b
29 Oct, 2021 1 commit
- update documentation · 3c7604ba
  ver217 authored Oct 29, 2021
  
  3c7604ba