17 Jul, 2023 (1 commit)
- Jianghai authored
12 Jul, 2023 (1 commit)
- github-actions[bot] authored
  Co-authored-by: github-actions <github-actions@github.com>
10 Jul, 2023 (1 commit)
- Frank Lee authored
  * [docker] fixed ninja build command
  * polish code
07 Jul, 2023 (2 commits)
- Baizhou Zhang authored
  * [checkpointio] unsharded optimizer checkpoint for Gemini plugin
  * [checkpointio] unsharded optimizer checkpoint for Gemini using all_gather
- Frank Lee authored
04 Jul, 2023 (35 commits)
- Frank Lee authored
- Frank Lee authored
- Hongxin Liu authored
- digger yu authored
- github-actions[bot] authored
  Co-authored-by: github-actions <github-actions@github.com>
- Frank Lee authored
- Frank Lee authored
- Frank Lee authored
  * [shardformer] made tensor parallelism configurable
  * polish code
- Frank Lee authored
  * [shardformer] refactored some doc and api
  * polish code
- jiangmingyan authored
  * [shardformer] add benchmark of shardformer
- Frank Lee authored
- Frank Lee authored
- Frank Lee authored
- Frank Lee authored
- Frank Lee authored
- Kun Lin authored
  * first v of vit shardformer
  * keep vit
  * update
  * vit shard add vitattention vitlayer
  * update num head shard para
  * finish test for vit
  * add new_model_class & postprocess
  * add vit readme
  * delete old files & fix the conflict
  * fix sth
- jiangmingyan authored
  * [shardformer] shardformer support opt models
  * [shardformer] shardformer support opt models, fix
- Frank Lee authored
- Frank Lee authored
  * [test] fixed tests failed due to dtensor change
  * polish code
- FoolPlayer authored
  * add layernorm to bert
  * add layernorm test
  * add layernorm test with load state dict
  * add use_mixedfusedLN in shard config
  * refactor policy to support fused_layernorm
- Frank Lee authored
- FoolPlayer authored
  * add linearconv1d test
- Frank Lee authored
  * [shardformer] support module saving and loading
  * polish code
- FoolPlayer authored
  * support kit use for bert test
  * support kit test for gpt2
- Frank Lee authored
- Frank Lee authored
  * [shardformer] adapted T5 and LLaMa test to use kit
  * polish code
- FoolPlayer authored
  * add gpt2 test and layer class refactor
  * add dropout in gpt2 policy
- Frank Lee authored
- Frank Lee authored
- FoolPlayer authored
  * fix bert downstream with new api
  * remove comment line
- Frank Lee authored
- FoolPlayer authored
- FoolPlayer authored
- Frank Lee authored
- Frank Lee authored
  * [shardformer] refactored embedding and dropout to parallel module
  * polish code