Commits · 73bff1128892dfcab91690790047ffda4ffa8716 · OpenDAS / ColossalAI

11 Mar, 2022 36 commits
- Added profiler communication operations · 73bff112
  1SAA authored Mar 04, 2022
```
Fixed bug for learning rate scheduler
```
- add badge and contributor list · d275b98b
  binmakeswell authored Mar 04, 2022
  
- [zero] cpu adam kernel (#288) · a3269de5
  LuGY authored Mar 04, 2022
```
* Added CPU Adam

* finished the cpu adam

* updated the license

* delete useless parameters, removed resnet

* modified the method of cpu adam unittest

* deleted some useless codes

* removed useless codes
Co-authored-by: ver217 <lhx0217@gmail.com>
Co-authored-by: Frank Lee <somerlee.9@gmail.com>
Co-authored-by: jiaruifang <fangjiarui123@gmail.com>
```
- [zero] yet another improved sharded param (#311) · 90d3aef6
  Jiarui Fang authored Mar 04, 2022
  
- [zero] polish shard strategy (#310) · c9e7d958
  Jiarui Fang authored Mar 04, 2022
```
* init shard param from shape tuple

* add more unit tests for shard param

* add set_payload method for ShardedParam

* [zero] add sharded tensor class

* polish code

* add shard strategy

* move shard and gather logic to shard strategy from shard tensor.

* polish code
```
- polish code · 3092317b
  ver217 authored Mar 04, 2022
  
- fix sharded param hook and unit test · 36f9a74a
  ver217 authored Mar 04, 2022
  
- impl shard optim v2 and add unit test · 001ca624
  ver217 authored Mar 04, 2022
  
- [zero] a shard strategy in granularity of tensor (#307) · 74f77e31
  Jiarui Fang authored Mar 04, 2022
  
- [zero] sharded tensor (#305) · 80364c76
  Jiarui Fang authored Mar 04, 2022
```
* init shard param from shape tuple

* add more unit tests for shard param

* add set_payload method for ShardedParam

* [zero] add sharded tensor class

* polish code
```
- [profiler] primary memory tracer · d3446892
  Jie Zhu authored Mar 04, 2022
  
- update unit testing CI rules · dfc3fafe
  FrankLeeeee authored Mar 03, 2022
  
- added compatibility CI and options for release ci · bbbfe9b2
  FrankLeeeee authored Feb 28, 2022
  
- added pypi publication CI and remove formatting CI · 115bcc0b
  FrankLeeeee authored Feb 28, 2022
  
- rename sharded adam to sharded optim v2 · b105371a
  ver217 authored Mar 03, 2022
  
- fix master params dtype · 70814dc2
  ver217 authored Mar 03, 2022
  
- add fp32 master params in sharded adam · 795210dd
  ver217 authored Mar 03, 2022
  
- add sharded adam · a109225b
  ver217 authored Mar 03, 2022
  
- polish license (#300) · 8f74fbd9
  Jiarui Fang authored Mar 03, 2022
```
* init shard param from shape tuple

* add more unit tests for shard param
```
- Polish sharded parameter (#297) · e17e92c5
  Jiarui Fang authored Mar 03, 2022
```
* init shard param from shape tuple

* add more unit tests for shard param

* add more unit tests to sharded param
```
- [zero] add sharded grad and refactor grad hooks for ShardedModel (#287) · 7aef75ca
  ver217 authored Mar 02, 2022
  
- fixed typo in ShardParam (#294) · 9afb5c8b
  Frank Lee authored Mar 02, 2022
  
- added unit test for sharded optimizer (#293) · 27155b85
  Frank Lee authored Mar 02, 2022
```
* added unit test for sharded optimizer

* refactor for elegance
```
- added buffer sync to naive amp model wrapper (#291) · e17e54e3
  Frank Lee authored Mar 02, 2022
  
- add a common util for hooks registered on parameter. (#292) · 8d653af4
  Jiarui Fang authored Mar 02, 2022
  
- bug fix: pass hook_list to engine (#273) · f867365a
  Jie Zhu authored Mar 02, 2022
```
* bug fix: pass hook_list to engine

* change parameter name
```
- Feature/zero (#279) · 5a560a06
  Jiarui Fang authored Mar 01, 2022
```
* add zero1 (#209)

* add zero1

* add test zero1

* update zero stage 1 develop (#212)

* Implement naive zero3 (#240)

* naive zero3 works well

* add zero3 param manager

* add TODOs in comments

* add gather full param ctx

* fix sub module streams

* add offload

* fix bugs of hook and add unit tests

* fix bugs of hook and add unit tests (#252)

* add gather full param ctx

* fix sub module streams

* add offload

* fix bugs of hook and add unit tests

* polish code and add state dict hook

* fix bug

* update unit test

* refactor reconstructed zero code

* clip_grad support zero3 and add unit test

* add unit test for Zero3ParameterManager

* [WIP] initialize the shard param class

* [WIP] Yet another sharded model implementation (#274)

* [WIP] initialize the shard param class

* [WIP] Yet another implementation of shardModel. Using a better hook method.

* torch.concat -> torch.cat

* fix test_zero_level_1.py::test_zero_level_1 unit test

* remove deepspeed implementation and refactor for the reconstructed zero module

* polish zero dp unittests
Co-authored-by: ver217 <lhx0217@gmail.com>
Co-authored-by: Frank Lee <somerlee.9@gmail.com>
```
- add community group and update issue template (#271) · 08eccfe6
  binmakeswell authored Feb 28, 2022
  
- update experimental visualization (#253) · 3312d716
  Sze-qq authored Feb 28, 2022
  
- add Chinese README · 753035ed
  binmakeswell authored Feb 18, 2022
  
- Added TPExpert for special situation · 82023779
  1SAA authored Feb 27, 2022
  
- Fixed parameter initialization in FFNExpert (#251) · 36b84772
  HELSON authored Feb 27, 2022
  
- fixed CI dataset directory; fixed import error of 2.5d accuracy (#255) · e13293bb
  アマデウス authored Feb 24, 2022
  
- Optimized MoE layer and fixed some bugs; · 219df6e6
  1SAA authored Feb 18, 2022
```
Decreased moe tests;

Added FFNExperts and ViTMoE model
```
- fixed padding index issue for vocab parallel embedding layers; updated 3D... · 3dba0705
  zbian authored Feb 17, 2022
```
fixed padding index issue for vocab parallel embedding layers; updated 3D linear to be compatible with examples in the tutorial
```
- update setup info (#233) · 24f8583c
  ver217 authored Feb 15, 2022
  
15 Feb, 2022 4 commits
- Automated submodule synchronization · b9f8521f
  github-actions authored Feb 09, 2022
  
- fixed apex import (#227) · f5ca88ec
  Frank Lee authored Feb 14, 2022
  
- updated readme and change log (#224) · eb3fda4c
  Frank Lee authored Feb 14, 2022
  
- update setup and workflow (#222) · 578ea058
  ver217 authored Feb 14, 2022
  