- 06 Apr, 2023 1 commit
ver217 authored
- 04 Apr, 2023 1 commit
ver217 authored
* [zero] refactor low-level zero folder structure
* [zero] fix legacy zero import path
* [zero] fix legacy zero import path
* [zero] remove useless import
* [zero] refactor gemini folder structure
* [zero] refactor gemini folder structure
* [zero] refactor legacy zero import path
* [zero] refactor gemini folder structure
* [zero] refactor gemini folder structure
* [zero] refactor gemini folder structure
* [zero] refactor legacy zero import path
* [zero] fix test import path
* [zero] fix test
* [zero] fix circular import
* [zero] update import
- 28 Jan, 2023 1 commit
HELSON authored
* [zero] add strict ddp mode for chunk init
* [gemini] update gpt example
- 09 Jan, 2023 1 commit
HELSON authored
* [gemini] polish code
* [testing] remove code
* [gemini] make more robust
- 26 Dec, 2022 1 commit
HELSON authored
* [testing] add beit model
* [beit] fix bugs
* [beit] fix bugs
* [testing] fix bugs
- 12 Dec, 2022 1 commit
Jiarui Fang authored
- 09 Dec, 2022 1 commit
HELSON authored
* [zero] add L2 gradient clipping
* [testing] add MlpModel
* [zero] add unit test for grad clipping
* fix atol
- 05 Dec, 2022 1 commit
Jiarui Fang authored
- 30 Nov, 2022 5 commits
HELSON authored
* [gemini] fix init bugs for modules
* fix bugs

Jiarui Fang authored

Jiarui Fang authored

HELSON authored

HELSON authored
- 29 Nov, 2022 1 commit
Jiarui Fang authored
* [Gemini] more tests for Gemini
* polish code
- 24 Nov, 2022 1 commit
Jiarui Fang authored
- 16 Nov, 2022 1 commit
Jiarui Fang authored
- 02 Nov, 2022 1 commit
HELSON authored
* [hotfix] fix zero's incompatibility with checkpoint in torch-1.12
* [zero] add cpu shard init
* [zero] add tiny example test
* [colo_tensor] fix bugs for torch-1.11
- 18 Oct, 2022 1 commit
HELSON authored
* add chunk manager init function
* fix unit tests
* add comment
* add flush=True
- 14 Oct, 2022 1 commit
HELSON authored
* fixes a memory leak when a parameter is in fp16 during ZeroDDP init
* bans chunk release in CUDA; a chunk may be released only when it is about to be offloaded
* adds a constant placement policy, which lets users allocate a reserved caching memory space for parameters
- 09 Oct, 2022 1 commit
HELSON authored
- 26 Sep, 2022 1 commit
Jiarui Fang authored
This reverts commit 5be118f4.
- 24 Sep, 2022 1 commit
HELSON authored