Commits · cc3cbe9f6f291af172252f097952bfe247200195 · OpenDAS / ColossalAI

04 Jul, 2023 40 commits
- [workflow] show test duration (#4159) · cc3cbe9f
  Frank Lee authored Jul 04, 2023
  
- [cli] hotfix launch command for multi-nodes (#4165) · 1908caad
  Hongxin Liu authored Jul 04, 2023
  
- fix some typo colossalai/shardformer (#4160) · 2ac24040
  digger yu authored Jul 04, 2023
  
- [format] applied code formatting on changed files in pull request 4152 (#4157) · c77b3b19
  github-actions[bot] authored Jul 04, 2023
```
Co-authored-by: github-actions <github-actions@github.com>
```
- [chat] removed cache file (#4155) · f447ca18
  Frank Lee authored Jul 04, 2023
  
- [shardformer] added development protocol for standardization (#4149) · 89f45eda
  Frank Lee authored Jul 04, 2023
  
- [shardformer] made tensor parallelism configurable (#4144) · 1fb0d95d
  Frank Lee authored Jul 04, 2023
```
* [shardformer] made tensor parallelism configurable

* polish code
```
- [shardformer] refactored some doc and api (#4137) · 74257cb4
  Frank Lee authored Jul 03, 2023
```
* [shardformer] refactored some doc and api

* polish code
```
- [shardformer] write an shardformer example with bert finetuning (#4126) · 7f9b3033
  jiangmingyan authored Jun 30, 2023
```
* [shardformer] add benchmark of shardformer

* [shardformer] add benchmark of shardformer
```
- [shardformer] added embedding gradient check (#4124) · ae035d30
  Frank Lee authored Jun 30, 2023
  
- [shardformer] import huggingface implicitly (#4101) · 44a190e6
  Frank Lee authored Jun 30, 2023
  
- [shardformer] integrate with data parallelism (#4103) · 6a88bae4
  Frank Lee authored Jun 30, 2023
  
- [shardformer] supported fused normalization (#4112) · f3b6aaa6
  Frank Lee authored Jun 30, 2023
  
- [shardformer] supported bloom model (#4098) · b1c29015
  Frank Lee authored Jun 28, 2023
  
- [shardformer] support vision transformer (#4096) · 8af29ee4
  Kun Lin authored Jun 28, 2023
```
* first v of vit shardformer

* keep vit

* update

* vit shard add vitattention vitlayer

* update num head shard para

* finish test for vit

* add new_model_class & postprocess

* add vit readme

* delete old files & fix the conflict

* fix sth
```
- [shardformer] shardformer support opt models (#4091) · ac809371
  jiangmingyan authored Jun 27, 2023
```
* [shardformer] shardformer support opt models

* [shardformer] shardformer support opt models, fix

* [shardformer] shardformer support opt models, fix

* [shardformer] shardformer support opt models, fix
```
- [shardformer] refactored layernorm (#4086) · d33a44e8
  Frank Lee authored Jun 26, 2023
  
- [test] fixed tests failed due to dtensor change (#4082) · c4b1b659
  Frank Lee authored Jun 26, 2023
```
* [test] fixed tests failed due to dtensor change

* polish code
```
- [shardformer] Add layernorm (#4072) · 92f67910
  FoolPlayer authored Jun 23, 2023
```
* add layernorm to bert

* add layernorm test

* add layernorm test with load state dict

* add use_mixedfusedLN in shard config

* refactor policy to support fused_layernorm
```
- [shardformer] supported fused qkv checkpoint (#4073) · 70c58cfd
  Frank Lee authored Jun 23, 2023
  
- [shardformer] add linearconv1d test (#4067) · 0803a614
  FoolPlayer authored Jun 22, 2023
```
* add linearconv1d test

* add linearconv1d test
```
- [shardformer] support module saving and loading (#4062) · 8eb09a4c
  Frank Lee authored Jun 22, 2023
```
* [shardformer] support module saving and loading

* polish code
```
- support kit use for bert/gpt test (#4055) · 7740c55c
  FoolPlayer authored Jun 22, 2023
```
* support kit use for bert test

* support kit test for gpt2
```
- [shardformer] refactored the shardformer layer structure (#4053) · f22ddace
  Frank Lee authored Jun 21, 2023
  
- [shardformer] adapted T5 and LLaMa test to use kit (#4049) · 58df7205
  Frank Lee authored Jun 21, 2023
```
* [shardformer] adapted T5 and LLaMa test to use kit

* polish code
```
- [shardformer] add gpt2 test and layer class refactor (#4041) · 4021b9a8
  FoolPlayer authored Jun 20, 2023
```
* add gpt2 test and layer class refactor

* add dropout in gpt2 policy
```
- [shardformer] supported T5 and its variants (#4045) · d857f3db
  Frank Lee authored Jun 19, 2023
  
- [shardformer] adapted llama to the new API (#4036) · c1d5453e
  Frank Lee authored Jun 19, 2023
  
- [shardformer] fix bert and gpt downstream with new api (#4024) · 74d176c8
  FoolPlayer authored Jun 19, 2023
```
* fix bert downstream with new api

* remove comment line
```
- [shardformer] updated doc (#4016) · e253a070
  Frank Lee authored Jun 16, 2023
  
- support bert with new api · df018fc3
  FoolPlayer authored Jun 16, 2023
  
- add vocabembedding layer · 507c0ad3
  FoolPlayer authored Jun 16, 2023
  
- [shardformer] removed inplace tensor sharding (#4018) · 45d93843
  Frank Lee authored Jun 16, 2023
  
- [shardformer] refactored embedding and dropout to parallel module (#4013) · 3893fa1a
  Frank Lee authored Jun 16, 2023
```
* [shardformer] refactored embedding and dropout to parallel module

* polish code
```
- integrate with dist layer (#4011) · dfca9678
  FoolPlayer authored Jun 16, 2023
  
- [shardformer] integrated linear 1D with dtensor (#3996) · 015af592
  Frank Lee authored Jun 15, 2023
```
* [shardformer] integrated linear 1D with dtensor

* polish code
```
- [shardformer] Refactor shardformer api (#4001) · d3bc5308
  FoolPlayer authored Jun 15, 2023
```
* fix an error in readme

* simplify code

* refactor shardformer

* add todo

* remove slicer

* resolve code review
```
- [device] support init device mesh from process group (#3990) · 61197124
  Frank Lee authored Jun 15, 2023
  
- [shardformer] fix an error in readme (#3988) · a2f9af81
  FoolPlayer authored Jun 15, 2023
```
* fix an error in readme

* simplify code
```
- [Shardformer] Downstream bert (#3979) · f7774ec0
  FoolPlayer authored Jun 15, 2023
```
* add dist dropout in model

* update docstring and bert policy with dropout

* refactor basepolicy and sharded, update bert

* update format

* update gpt2 policy

* update bert policy

* remove unused code

* update readme for new policy usage

* add downstream model of bert

* remove unused code
```