Commits · 07c2e3d09cd6bf42f280f20f0cc2ba2eb47677cc · OpenDAS / ColossalAI

20 Sep, 2023 2 commits
- Merge pull request #4757 from ppt0011/main · 07c2e3d0
  ppt0011 authored Sep 20, 2023
```
[doc] explain suitable use case for each plugin
```
- [doc] put native colossalai plugins first in description section · 4d7537ba
  Pengtai Xu authored Sep 20, 2023
  
19 Sep, 2023 4 commits
- [doc] add model examples for each plugin · e10d9f08
  Pengtai Xu authored Sep 19, 2023
  
- [doc] put individual plugin explanation in front · a04337bf
  Pengtai Xu authored Sep 19, 2023
  
- [doc] explain suitable use case for each plugin · 10513f20
  Pengtai Xu authored Sep 19, 2023
  
- [misc] update pre-commit and run all files (#4752) · 079bf3cb
  Hongxin Liu authored Sep 19, 2023
```
* [misc] update pre-commit

* [misc] run pre-commit

* [misc] remove useless configuration files

* [misc] ignore cuda for clang-format
```
18 Sep, 2023 3 commits

[format] applied code formatting on changed files in pull request 4743 (#4750) · 3c6b831c
github-actions[bot] authored Sep 18, 2023
```
Co-authored-by: github-actions <github-actions@github.com>
```

[legacy] clean up legacy code (#4743) · b5f9e37c

Hongxin Liu authored Sep 18, 2023

* [legacy] remove outdated codes of pipeline (#4692)

* [legacy] remove cli of benchmark and update optim (#4690)

* [legacy] remove cli of benchmark and update optim

* [doc] fix cli doc test

* [legacy] fix engine clip grad norm

* [legacy] remove outdated colo tensor (#4694)

* [legacy] remove outdated colo tensor

* [test] fix test import

* [legacy] move outdated zero to legacy (#4696)

* [legacy] clean up utils (#4700)

* [legacy] clean up utils

* [example] update examples

* [legacy] clean up amp

* [legacy] fix amp module

* [legacy] clean up gpc (#4742)

* [legacy] clean up context

* [legacy] clean core, constants and global vars

* [legacy] refactor initialize

* [example] fix examples ci

* [example] fix examples ci

* [legacy] fix tests

* [example] fix gpt example

* [example] fix examples ci

* [devops] fix ci installation

* [example] fix examples ci


[kernel] update triton init #4740 (#4740) · 32e7f994
Xuanlei Zhao authored Sep 18, 2023


15 Sep, 2023 13 commits

[doc] explanation of loading large pretrained models (#4741) · d151dcab
Baizhou Zhang authored Sep 15, 2023


[example] llama2 add fine-tune example (#4673) · 4c4482f3

flybird11111 authored Sep 15, 2023

* [shardformer] update shardformer readme

[shardformer] update shardformer readme

[shardformer] update shardformer readme

* [shardformer] update llama2/opt finetune example and shardformer update to llama2

* [shardformer] update llama2/opt finetune example and shardformer update to llama2

* [shardformer] update llama2/opt finetune example and shardformer update to llama2

* [shardformer] change dataset

* [shardformer] change dataset

* [shardformer] fix CI

* [shardformer] fix

* [shardformer] fix

* [shardformer] fix

* [shardformer] fix

* [shardformer] fix

[example] update opt example

[example] resolve comments

fix

fix

* [example] llama2 add finetune example

* [example] llama2 add finetune example

* [example] llama2 add finetune example

* [example] llama2 add finetune example

* fix

* update llama2 example

* update llama2 example

* fix

* update llama2 example

* update llama2 example

* update llama2 example

* update llama2 example

* update llama2 example

* update llama2 example

* Update requirements.txt

* update llama2 example

* update llama2 example

* update llama2 example


[shardformer] add custom policy in hybrid parallel plugin (#4718) · ac279799
Xuanlei Zhao authored Sep 15, 2023
```
* add custom policy

* update assert
```
[doc] polish shardformer doc (#4735) · 451c3465
Baizhou Zhang authored Sep 15, 2023
```
* arrange position of chapters

* fix typos in seq parallel doc
```
Merge pull request #4738 from ppt0011/main · 73eb3e88
ppt0011 authored Sep 15, 2023
```
[legacy] remove deterministic data loader test
```

[example] add gpt2 HybridParallelPlugin example (#4653) · 608cffae

Bin Jia authored Sep 15, 2023

* add gpt2 HybridParallelPlugin example

* update readme and testci

* update test ci

* fix test_ci bug

* update requirements

* add requirements

* update requirements

* add requirement

* rename file


[shardformer] update seq parallel document (#4730) · 6a03c933
Bin Jia authored Sep 15, 2023
```
* update doc of seq parallel

* fix typo
```
[legacy] remove deterministic data loader test · cd4e61d1
Pengtai Xu authored Sep 15, 2023


[shardformer] update pipeline parallel document (#4725) · 46162632

flybird11111 authored Sep 15, 2023

* [shardformer] update pipeline parallel document

* [shardformer] update pipeline parallel document

* [shardformer] update pipeline parallel document

* [shardformer] update pipeline parallel document

* [shardformer] update pipeline parallel document

* [shardformer] update pipeline parallel document

* [shardformer] update pipeline parallel document

* [shardformer] update pipeline parallel document


Fixed some syntax errors in the documentation and code under applications/ (#4127) · e4fc57c3
digger yu authored Sep 15, 2023
```
Co-authored-by: flybird11111 <1829166702@qq.com>
```
[doc] add shardformer support matrix/update tensor parallel documents (#4728) · 50e5602c
Baizhou Zhang authored Sep 15, 2023
```
* add compatibility matrix for shardformer doc

* update tp doc
```
[format] applied code formatting on changed files in pull request 4726 (#4727) · 8c2dda74
github-actions[bot] authored Sep 15, 2023
```
Co-authored-by: github-actions <github-actions@github.com>
```

[doc] Add user document for Shardformer (#4702) · f911d5b0

Baizhou Zhang authored Sep 15, 2023

* create shardformer doc files

* add docstring for seq-parallel

* update ShardConfig docstring

* add links to llama example

* add outdated message

* finish introduction & supporting information

* finish 'how shardformer works'

* finish shardformer.md English doc

* fix doctest fail

* add Chinese document


14 Sep, 2023 3 commits
- [doc] fix llama2 code link (#4726) · ce97790e
  binmakeswell authored Sep 14, 2023
```
* [doc] fix llama2 code link

* [doc] fix llama2 code link

* [doc] fix llama2 code link
```
- [shardformer] fix whisper test failure caused by significant accuracy differences (#4710) · 20190b49
  flybird11111 authored Sep 14, 2023
```
* [shardformer] fix whisper test failed

* [shardformer] fix whisper test failed

* [shardformer] fix whisper test failed

* [shardformer] fix whisper test failed
```
- [hotfix] Fix import error: colossal.kernel without triton installed (#4722) · e2c0e7f9
  Yuanheng Zhao authored Sep 14, 2023
```
* [hotfix] remove triton kernels from kernel init

* revise bloom/llama kernel imports for infer
```
13 Sep, 2023 2 commits
- [shardformer] fix GPT2DoubleHeadsModel (#4703) · c7d6975d
  flybird11111 authored Sep 13, 2023
  
- [doc] add potential solution for OOM in llama2 example (#4699) · 068372a7
  Baizhou Zhang authored Sep 13, 2023
  
12 Sep, 2023 4 commits

fix some typos in colossalai/device, colossalai/tensor, etc. (#4171) · 9c2feb2f
digger yu authored Sep 12, 2023
```
Co-authored-by: flybird11111 <1829166702@qq.com>
```
[hotfix] fix typo in hybrid parallel io (#4697) · d8ceeac1
Baizhou Zhang authored Sep 12, 2023


[shardformer] update shardformer readme (#4689) · 8844691f

flybird11111 authored Sep 12, 2023

* [shardformer] update shardformer readme

* [shardformer] update shardformer readme

* [shardformer] update shardformer readme

* [shardformer] update shardformer readme

* [shardformer] update shardformer readme


[doc] Update booster user documents. (#4669) · 1d454733

Baizhou Zhang authored Sep 12, 2023

* update booster_api.md

* update booster_checkpoint.md

* update booster_plugins.md

* move transformers importing inside function

* fix Dict typing

* fix autodoc bug

* small fix


11 Sep, 2023 4 commits

[Feature] The first PR to Add TP inference engine, kv-cache manager and... · bce0f167

Cuiqing Li authored Sep 12, 2023


[Feature] The first PR to Add TP inference engine, kv-cache manager and related kernels for our inference system (#4577)

* [infer] Infer/llama demo (#4503)

* add

* add infer example

* finish

* finish

* stash

* fix

* [Kernels]  add inference token attention kernel (#4505)

* add token forward

* fix tests

* fix comments

* add try import triton

* add adapted license

* add tests check

* [Kernels] add necessary kernels (llama & bloom) for attention forward and kv-cache manager  (#4485)

* added _vllm_rms_norm

* change place

* added tests

* added tests

* modify

* adding kernels

* added tests:

* adding kernels

* modify

* added

* updating kernels

* adding tests

* added tests

* kernel change

* submit

* modify

* added

* edit comments

* change name

* change comments and fix import

* add

* added

* combine codes (#4509)

* [feature] add KV cache manager for llama & bloom inference (#4495)

* add kv cache memory manager

* add stateinfo during inference

* format

* format

* rename file

* add kv cache test

* revise on BatchInferState

* file dir change

* [Bug Fix] import llama context ops fix (#4524)

* added _vllm_rms_norm

* change place

* added tests

* added tests

* modify

* adding kernels

* added tests:

* adding kernels

* modify

* added

* updating kernels

* adding tests

* added tests

* kernel change

* submit

* modify

* added

* edit comments

* change name

* change comments and fix import

* add

* added

* fix

* add ops into init.py

* add

* [Infer] Add TPInferEngine and fix file path (#4532)

* add engine for TP inference

* move file path

* update path

* fix TPInferEngine

* remove unused file

* add engine test demo

* revise TPInferEngine

* fix TPInferEngine, add test

* fix

* Add Inference test for llama (#4508)

* add kv cache memory manager

* add stateinfo during inference

* add

* add infer example

* finish

* finish

* format

* format

* rename file

* add kv cache test

* revise on BatchInferState

* add inference test for llama

* fix conflict

* feature: add some new features for llama engine

* adapt colossalai triton interface

* Change the parent class of llama  policy

* add nvtx

* move llama inference code to tensor_parallel

* fix __init__.py

* rm tensor_parallel

* fix: fix bugs in auto_policy.py

* fix:rm some unused codes

* mv colossalai/tpinference to colossalai/inference/tensor_parallel

* change __init__.py

* save change

* fix engine

* Bug fix: Fix hang

* remove llama_infer_engine.py

---------
Co-authored-by: yuanheng-zhao <jonathan.zhaoyh@gmail.com>
Co-authored-by: CjhHa1 <cjh18671720497@outlook.com>

* [infer] Add Bloom inference policy and replaced methods (#4512)

* add bloom inference methods and policy

* enable pass BatchInferState from model forward

* revise bloom infer layers/policies

* add engine for inference (draft)

* add test for bloom infer

* fix bloom infer policy and flow

* revise bloom test

* fix bloom file path

* remove unused codes

* fix bloom modeling

* fix dir typo

* fix trivial

* fix policy

* clean pr

* trivial fix

* Revert "[infer] Add Bloom inference policy and replaced methods (#4512)" (#4552)

This reverts commit 17cfa5714083a81a505c097f1c411cd28162d922.

* [Doc] Add colossal inference doc (#4549)

* create readme

* add readme.md

* fix typos

* [infer] Add Bloom inference policy and replaced methods (#4553)

* add bloom inference methods and policy

* enable pass BatchInferState from model forward

* revise bloom infer layers/policies

* add engine for inference (draft)

* add test for bloom infer

* fix bloom infer policy and flow

* revise bloom test

* fix bloom file path

* remove unused codes

* fix bloom modeling

* fix dir typo

* fix trivial

* fix policy

* clean pr

* trivial fix

* trivial

* Fix Bugs In Llama Model Forward (#4550)

* add kv cache memory manager

* add stateinfo during inference

* add

* add infer example

* finish

* finish

* format

* format

* rename file

* add kv cache test

* revise on BatchInferState

* add inference test for llama

* fix conflict

* feature: add some new features for llama engine

* adapt colossalai triton interface

* Change the parent class of llama  policy

* add nvtx

* move llama inference code to tensor_parallel

* fix __init__.py

* rm tensor_parallel

* fix: fix bugs in auto_policy.py

* fix:rm some unused codes

* mv colossalai/tpinference to colossalai/inference/tensor_parallel

* change __init__.py

* save change

* fix engine

* Bug fix: Fix hang

* remove llama_infer_engine.py

* bug fix: fix bugs about infer_state.is_context_stage

* remove policies

* fix: delete unused code

* fix: delete unused code

* remove unused code

* fix conflict

---------
Co-authored-by: yuanheng-zhao <jonathan.zhaoyh@gmail.com>
Co-authored-by: CjhHa1 <cjh18671720497@outlook.com>

* [doc] add colossal inference fig (#4554)

* create readme

* add readme.md

* fix typos

* upload fig

* [NFC] fix docstring for colossal inference (#4555)

Fix docstring and comments in kv cache manager and bloom modeling

* fix docstring in llama modeling (#4557)

* [Infer] check import vllm (#4559)

* change import vllm

* import apply_rotary_pos_emb

* change import location

* [DOC] add installation req (#4561)

* add installation req

* fix

* slight change

* remove empty

* [Feature] rms-norm transfer into inference llama.py  (#4563)

* add installation req

* fix

* slight change

* remove empty

* add rmsnorm policy

* add

* clean codes

* [infer] Fix tp inference engine (#4564)

* fix engine prepare data

* add engine test

* use bloom for testing

* revise on test

* revise on test

* reset shardformer llama (#4569)

* [infer] Fix engine - tensors on different devices (#4570)


* fix diff device in engine

* [codefactor] Feature/colossal inference (#4579)

* code factors

* remove

* change coding (#4581)

* [doc] complete README of colossal inference (#4585)

* complete fig

* Update README.md

* [doc]update readme (#4586)

* update readme

* Update README.md

* bug fix: fix bugs in llama and bloom (#4588)

* [BUG FIX]Fix test engine in CI and non-vllm kernels llama forward  (#4592)

* fix tests

* clean

* clean

* fix bugs

* add

* fix llama non-vllm kernels bug

* modify

* clean codes

* [Kernel]Rmsnorm fix (#4598)

* fix tests

* clean

* clean

* fix bugs

* add

* fix llama non-vllm kernels bug

* modify

* clean codes

* add triton rmsnorm

* delete vllm kernel flag

* [Bug Fix]Fix bugs in llama (#4601)

* fix tests

* clean

* clean

* fix bugs

* add

* fix llama non-vllm kernels bug

* modify

* clean codes

* bug fix: remove rotary_positions_ids

---------
Co-authored-by: cuiqing.li <lixx3527@gmail.com>

* [kernel] Add triton layer norm & replace norm for bloom (#4609)

* add layernorm for inference

* add test for layernorm kernel

* add bloom layernorm replacement policy

* trivial: path

* [Infer] Bug fix rotary embedding in llama (#4608)

* fix rotary embedding

* delete print

* fix init seq len bug

* rename pytest

* add benchmark for llama

* refactor codes

* delete useless code

* [bench] Add bloom inference benchmark (#4621)

* add bloom benchmark

* readme - update benchmark res

* trivial - uncomment for testing (#4622)

* [Infer] add check triton and cuda version for tests (#4627)

* fix rotary embedding

* delete print

* fix init seq len bug

* rename pytest

* add benchmark for llama

* refactor codes

* delete useless code

* add check triton and cuda

* Update sharder.py (#4629)

* [Inference] Hot fix some bugs and typos (#4632)

* fix

* fix test

* fix conflicts

* [typo]Comments fix (#4633)

* fallback

* fix comments

* bug fix: fix some bugs in test_llama and test_bloom (#4635)

* [Infer] delete benchmark in tests and fix bug for llama and bloom (#4636)

* fix rotary embedding

* delete print

* fix init seq len bug

* rename pytest

* add benchmark for llama

* refactor codes

* delete useless code

* add check triton and cuda

* delete benchmark and fix infer bugs

* delete benchmark for tests

* delete useless code

* delete benchmark function in utils

* [Fix] Revise TPInferEngine, inference tests and benchmarks (#4642)

* [Fix] revise TPInferEngine methods and inference tests

* fix llama/bloom infer benchmarks

* fix infer tests

* trivial fix: benchmarks

* trivial

* trivial: rm print

* modify utils filename for infer ops test (#4657)

* [Infer] Fix TPInferEngine init & inference tests, benchmarks (#4670)

* fix engine funcs

* TPInferEngine: receive shard config in init

* benchmarks: revise TPInferEngine init

* benchmarks: remove pytest decorator

* trivial fix

* use small model for tests

* [NFC] use args for infer benchmarks (#4674)

* revise infer default (#4683)

* [Fix] optimize/shard model in TPInferEngine init (#4684)

* remove using orig model in engine

* revise inference tests

* trivial: rename

---------
Co-authored-by: Jianghai <72591262+CjhHa1@users.noreply.github.com>
Co-authored-by: Xu Kai <xukai16@foxmail.com>
Co-authored-by: Yuanheng Zhao <54058983+yuanheng-zhao@users.noreply.github.com>
Co-authored-by: yuehuayingxueluo <867460659@qq.com>
Co-authored-by: yuanheng-zhao <jonathan.zhaoyh@gmail.com>
Co-authored-by: CjhHa1 <cjh18671720497@outlook.com>


[shardformer]fix gpt2 double head (#4663) · eedaa3e1

flybird11111 authored Sep 11, 2023

* [shardformer]fix gpt2 test

[shardformer]fix gpt2 test

[shardformer]fix gpt2 test

* fix

* [shardformer] add todo

* [shardformer] add todo


[legacy] move communication and nn to legacy and refactor logger (#4671) · 554aa959

Hongxin Liu authored Sep 11, 2023

* [legacy] move communication to legacy (#4640)

* [legacy] refactor logger and clean up legacy codes (#4654)

* [legacy] make logger independent to gpc

* [legacy] make optim independent to registry

* [legacy] move test engine to legacy

* [legacy] move nn to legacy (#4656)

* [legacy] move nn to legacy

* [checkpointio] fix save hf config

* [test] remove useless rpc pp test

* [legacy] fix nn init

* [example] skip tutorial hybrid parallel example

* [devops] test doc check

* [devops] test doc check


[devops] fix concurrency group (#4667) · 536397cc
Hongxin Liu authored Sep 11, 2023


09 Sep, 2023 1 commit

[shardformer] update llama2/opt finetune example and fix llama2 policy (#4645) · 7486ed7d

flybird11111 authored Sep 09, 2023

* [shardformer] update shardformer readme

[shardformer] update shardformer readme

[shardformer] update shardformer readme

* [shardformer] update llama2/opt finetune example and shardformer update to llama2

* [shardformer] update llama2/opt finetune example and shardformer update to llama2

* [shardformer] update llama2/opt finetune example and shardformer update to llama2

* [shardformer] change dataset

* [shardformer] change dataset

* [shardformer] fix CI

* [shardformer] fix

* [shardformer] fix

* [shardformer] fix

* [shardformer] fix

* [shardformer] fix

[example] update opt example

[example] resolve comments

fix

fix


08 Sep, 2023 1 commit

[devops] fix concurrency group and compatibility test (#4665) · a686f9dd

Hongxin Liu authored Sep 08, 2023

* [devops] fix concurrency group

* [devops] fix compatibility test

* [devops] fix tensornvme install

* [devops] fix tensornvme install

* [devops] fix colossalai install


07 Sep, 2023 3 commits

[example] update vit example for hybrid parallel plugin (#4641) · 295b38fe

Baizhou Zhang authored Sep 07, 2023

* update vit example for hybrid plugin

* reset tp/pp size

* fix dataloader iteration bug

* update optimizer passing in evaluation/add grad_accum

* change criterion

* wrap tqdm

* change grad_accum to grad_checkpoint

* fix pbar


[pipeline] set optimizer to optional in execute_pipeline (#4630) · 660eed91

Baizhou Zhang authored Sep 07, 2023

* set optimizer to optional in execute_pipeline

* arrange device and mixed precision in booster init

* fix execute_pipeline in booster.py


[shardformer] Support customized policy for llamav2 based model with HybridParallelPlugin (#4624) · c3d5fa3b

eric8607242 authored Sep 07, 2023



* Enable policy assignment in HybridPlugin and enable llama policy for llamav2

* Remove Policy from Plugin

* revert changes of plugin HybridParallelModule

* revert changes in plugin

* upgrade transformers

* revert transformers version

---------
Co-authored-by: flybird11111 <1829166702@qq.com>
