- 29 Mar, 2022 7 commits
  - Liang Bowen authored
  - Jiarui Fang authored
  - Jie Zhu authored
    * add memory trainer hook
    * fix bug
    * add memory trainer hook
    * fix import bug
    * fix import bug
    * add trainer hook
    * fix #370 git log bug
    * modify `to_tensorboard` function to support better output
    * remove useless output
    * change the name of `MemProfiler`
    * complete memory profiler
    * replace error with warning
    * finish trainer hook
    * modify interface of MemProfiler
    * modify `__init__.py` in profiler
    * remove unnecessary pass statement
    * add usage to doc string
    * add usage to trainer hook
    * new location to store temp data file
  - ver217 authored
    * optimize grad offload
    * polish code
    * polish code
  - Jiarui Fang authored
  - ver217 authored
  - Jiarui Fang authored
- 28 Mar, 2022 4 commits
  - HELSON authored
    * only process module's own parameters in Zero context
    * add zero hooks for all modules that contain parameters
    * gather parameters only belonging to the module itself
  - Jiarui Fang authored
  - Jiarui Fang authored
  - Jiarui Fang authored
- 26 Mar, 2022 1 commit
  - ver217 authored
- 25 Mar, 2022 12 commits
  - ver217 authored
  - Jiarui Fang authored
    * [zero] fix grad offload
    * polish code
  - LuGY authored
    * [zero] added hybrid adam, removed loss scale of adam
    * remove useless code
  - Jiarui Fang authored
  - Frank Lee authored
  - Jiarui Fang authored
  - LuGY authored
  - Jiarui Fang authored
  - ver217 authored
  - Jiarui Fang authored
  - fastalgo authored
  - Jiarui Fang authored
- 24 Mar, 2022 10 commits
  - ver217 authored
    * enable rm_torch_payload_on_the_fly
    * polish docstr
  - Jiarui Fang authored
  - HELSON authored
  - Jiarui Fang authored
  - Jiarui Fang authored
  - Frank Lee authored
  - Jiarui Fang authored
  - Frank Lee authored
  - Frank Lee authored
  - binmakeswell authored
- 23 Mar, 2022 2 commits
  - Jiarui Fang authored
  - ver217 authored
    * sharded model supports reuse fp16 shard
    * rename variable
    * polish code
    * polish code
    * polish code
- 22 Mar, 2022 4 commits