Commits · 611a5a80cac780e3a6e30c0f81f4ceb2a3b3f2c0 · OpenDAS / ColossalAI

16 Oct, 2023 1 commit

[inference] Add smmoothquant for llama (#4904) · 611a5a80

Xu Kai authored Oct 16, 2023

* [inference] add int8 rotary embedding kernel for smoothquant (#4843)

* [inference] add smoothquant llama attention (#4850)

* add smoothquant llama attention

* remove uselss code

* remove useless code

* fix import error

* rename file name

* [inference] add silu linear fusion for smoothquant llama mlp  (#4853)

* add silu linear

* update skip condition

* catch smoothquant cuda lib exception

* prcocess exception for tests

* [inference] add llama mlp for smoothquant (#4854)

* add llama mlp for smoothquant

* fix down out scale

* remove duplicate lines

* add llama mlp check

* delete useless code

* [inference] add smoothquant llama (#4861)

* add smoothquant llama

* fix attention accuracy

* fix accuracy

* add kv cache and save pretrained

* refactor example

* delete smooth

* refactor code

* [inference] add smooth function and delete useless code for smoothquant (#4895)

* add smooth function and delete useless code

* update datasets

* remove duplicate import

* delete useless file

* refactor codes (#4902)

* rafactor code

* add license

* add torch-int and smoothquant license

611a5a80

13 Oct, 2023 2 commits

[feature] support no master weights option for low level zero plugin (#4816) · a0684e7b

Zhongkai Zhao authored Oct 13, 2023

* [feature] support no master weights for low level zero plugin

* [feature] support no master weights for low level zero plugin, remove data copy when no master weights

* remove data copy and typecasting when no master weights

* not load weights to cpu when using no master weights

* fix grad: use fp16 grad when no master weights

* only do not update working param when no master weights

* fix: only do not update working param when no master weights

* fix: passing params in dict format in hybrid plugin

* fix: remove extra params (tp_process_group) in hybrid_parallel_plugin

a0684e7b

[inference] add llama2 support (#4898) · 77a93283
Xu Kai authored Oct 13, 2023
```
* add llama2 support

* fix multi group bug
```
77a93283

12 Oct, 2023 4 commits

[hotfix] fix lr scheduler bug in torch 2.0 (#4864) · 39f2582e
Baizhou Zhang authored Oct 12, 2023

39f2582e

[feature] Add clip_grad_norm for hybrid_parallel_plugin (#4837) · 83b52c56

littsk authored Oct 12, 2023

* Add clip_grad_norm for hibrid_parallel_plugin

* polish code

* add unittests

* Move tp to a higher-level optimizer interface.

* bug fix

* polish code

83b52c56

[gemini] support amp o3 for gemini (#4872) · df635641

Hongxin Liu authored Oct 12, 2023

* [gemini] support no reuse fp16 chunk

* [gemini] support no master weight for optim

* [gemini] support no master weight for gemini ddp

* [test] update gemini tests

* [test] update gemini tests

* [plugin] update gemini plugin

* [test] fix gemini checkpointio test

* [test] fix gemini checkpoint io

df635641

Merge pull request #4889 from ppt0011/main · c1fab951
ppt0011 authored Oct 12, 2023
```
[doc] add reminder for issue encountered with hybrid adam
```
c1fab951

11 Oct, 2023 4 commits

[hotfix] fix bug in sequence parallel test (#4887) · ffd9a3cb
littsk authored Oct 11, 2023

ffd9a3cb
[doc] add reminder for issue encountered with hybrid adam · 1dcaf249
ppt0011 authored Oct 11, 2023

1dcaf249
fix test llama (#4884) · fdec650b
Xu Kai authored Oct 11, 2023

fdec650b

[Pipeline Inference] Sync pipeline inference branch to main (#4820) · 08a9f76b

Bin Jia authored Oct 11, 2023

* [pipeline inference] pipeline inference (#4492)

* add pp stage manager as circle stage

* fix a bug when create process group

* add ppinfer basic framework

* add micro batch manager and support kvcache-pp gpt2 fwd

* add generate schedule

* use mb size to control mb number

* support generate with kv cache

* add output, remove unused code

* add test

* reuse shardformer to build model

* refactor some code and use the same attribute name of hf

* fix review and add test for generation

* remove unused file

* fix CI

* add cache clear

* fix code error

* fix typo

* [Pipeline inference] Modify to tieweight (#4599)

* add pp stage manager as circle stage

* fix a bug when create process group

* add ppinfer basic framework

* add micro batch manager and support kvcache-pp gpt2 fwd

* add generate schedule

* use mb size to control mb number

* support generate with kv cache

* add output, remove unused code

* add test

* reuse shardformer to build model

* refactor some code and use the same attribute name of hf

* fix review and add test for generation

* remove unused file

* modify the way of saving newtokens

* modify to tieweight

* modify test

* remove unused file

* solve review

* add docstring

* [Pipeline inference] support llama pipeline inference (#4647)

* support llama pipeline inference

* remove tie weight operation

* [pipeline inference] Fix the blocking of communication when ppsize is 2 (#4708)

* add benchmark verbose

* fix export tokens

* fix benchmark verbose

* add P2POp style to do p2p communication

* modify schedule as p2p type when ppsize is 2

* remove unused code and add docstring

* [Pipeline inference] Refactor code, add docsting, fix bug (#4790)

* add benchmark script

* update argparse

* fix fp16 load

* refactor code style

* add docstring

* polish code

* fix test bug

* [Pipeline inference] Add pipeline inference docs (#4817)

* add readme doc

* add a ico

* Add performance

* update table of contents

* refactor code (#4873)

08a9f76b

10 Oct, 2023 5 commits
- Update README.md · 652adc22
  Camille Zhong authored Oct 10, 2023
  
  652adc22
- Update README.md · afe10a85
  Camille Zhong authored Oct 10, 2023
  
  afe10a85
- Update main README.md · d6c4b9b3
  Camille Zhong authored Oct 10, 2023
```
add modelscope model link
```
  d6c4b9b3
- Update modelscope link in README.md · 3043d5d6
  Camille Zhong authored Oct 10, 2023
```
add modelscope link
```
  3043d5d6
- [doc] update advanced tutorials, training gpt with hybrid parallelism (#4866) · 6a21f96a
  flybird11111 authored Oct 10, 2023
```
* [doc]update advanced tutorials, training gpt with hybrid parallelism

* [doc]update advanced tutorials, training gpt with hybrid parallelism

* update vit tutorials

* update vit tutorials

* update vit tutorials

* update vit tutorials

* update en/train_vit_with_hybrid_parallel.py

* fix

* resolve comments

* fix
```
  6a21f96a
07 Oct, 2023 5 commits
- [nfc] fix minor typo in README (#4846) · 8aed02b9
  Blagoy Simandoff authored Oct 07, 2023
  
  8aed02b9
- [NFC] polish code style (#4799) · cd6a962e
  Camille Zhong authored Sep 27, 2023
  
  cd6a962e
- [NFC] polish colossalai/inference/quant/gptq/cai_gptq/__init__.py code style (#4792) · 07ed155e
  Michelle authored Sep 27, 2023
  
  07ed155e
- polish code for gptq (#4793) · eef96e08
  littsk authored Sep 25, 2023
  
  eef96e08
- [checkpointio] hotfix torch 2.0 compatibility (#4824) · cb3a25a0
  Hongxin Liu authored Oct 07, 2023
  
  cb3a25a0
06 Oct, 2023 2 commits
- Merge pull request #4856 from KKZ20/test/model_support_for_low_level_zero · ad23460c
  ppt0011 authored Oct 06, 2023
```
[test] remove the redundant code of model output transformation in torchrec
```
  ad23460c
- Merge pull request #4858 from Shawlleyw/main · 81ee91f2
  ppt0011 authored Oct 06, 2023
```
[doc]: typo in document of booster low_level_zero plugin
```
  81ee91f2
05 Oct, 2023 2 commits
- fix: typo in comment of low_level_zero plugin · c97a3523
  shaoyuw authored Oct 05, 2023
  
  c97a3523
- [test] modify model supporting part of low_level_zero plugin (including correspoding docs) · db40e086
  Zhongkai Zhao authored Oct 05, 2023
  
  db40e086
04 Oct, 2023 2 commits
- [infer] fix test bug (#4838) · d1fcc0fa
  Xu Kai authored Oct 04, 2023
```
* fix test bug

* delete useless code

* fix typo
```
  d1fcc0fa
- [inference]fix import bug and delete down useless init (#4830) · 013a4bed
  Jianghai authored Oct 04, 2023
```
* fix import bug and release useless init

* fix

* fix

* fix
```
  013a4bed
02 Oct, 2023 2 commits

[Infer] Serving example w/ ray-serve (multiple GPU case) (#4841) · 573f2705

Yuanheng Zhao authored Oct 02, 2023

* fix imports

* add ray-serve with Colossal-Infer tp

* trivial: send requests script

* add README

* fix worker port

* fix readme

* use app builder and autoscaling

* trivial: input args

* clean code; revise readme

* testci (skip example test)

* use auto model/tokenizer

* revert imports fix (fixed in other PRs)

573f2705

[Infer] Colossal-Inference serving example w/ TorchServe (single GPU case) (#4771) · 3a74eb4b

Yuanheng Zhao authored Oct 02, 2023

* add Colossal-Inference serving example w/ TorchServe

* add dockerfile

* fix dockerfile

* fix dockerfile: fix commit hash, install curl

* refactor file structure

* revise readme

* trivial

* trivial: dockerfile format

* clean dir; revise readme

* fix comments: fix imports and configs

* fix formats

* remove unused requirements

3a74eb4b

28 Sep, 2023 2 commits
- update Colossal (#4832) · ed06731e
  Tong Li authored Sep 28, 2023
  
  ed06731e
- add autotune (#4822) · c3bef204
  Xu Kai authored Sep 28, 2023
  
  c3bef204
27 Sep, 2023 8 commits

[doc] update slack link (#4823) · 822051d8
binmakeswell authored Sep 27, 2023

822051d8
Update Qwen-7B results (#4821) · 1fa8c5e0
Yuanchen authored Sep 27, 2023
```
Co-authored-by: Xu Yuanchen <yuanchen.xu00@gmail.com>
```
1fa8c5e0

[chat] fix gemini strategy (#4698) · be400a09

flybird11111 authored Sep 27, 2023

* [chat] fix gemini strategy

* [chat] fix gemini strategy

* [chat] fix gemini strategy

* [chat] fix gemini strategy

* g# This is a combination of 2 commits.

[chat] fix gemini strategy

fox

* [chat] fix gemini strategy

update llama2 example

[chat] fix gemini strategy

* [fix] fix gemini strategy

* [fix] fix gemini strategy

* [fix] fix gemini strategy

* [fix] fix gemini strategy

* [fix] fix gemini strategy

* [fix] fix gemini strategy

* [fix] fix gemini strategy

* [fix] fix gemini strategy

* [fix] fix gemini strategy

* [fix] fix gemini strategy

* fix

* fix

* fix

* fix

* fix

* Update train_prompts.py

be400a09

fix format (#4815) · bbbcac26
Tong Li authored Sep 27, 2023

bbbcac26
[format] applied code formatting on changed files in pull request 4595 (#4602) · fb46d05c
github-actions[bot] authored Sep 27, 2023
```
Co-authored-by: github-actions <github-actions@github.com>
```
fb46d05c
[hotfix] Correct several erroneous code comments (#4794) · 11f1e426
littsk authored Sep 27, 2023

11f1e426
[hotfix] fix norm type error in zero optimizer (#4795) · 54b3ad89
littsk authored Sep 27, 2023

54b3ad89
[doc] add lazy init docs (#4808) · da15fdb9
Hongxin Liu authored Sep 27, 2023

da15fdb9

26 Sep, 2023 1 commit
- [misc] add last_epoch in CosineAnnealingWarmupLR (#4778) · a2270633
  Yan haixu authored Sep 26, 2023
  
  a2270633