1. 27 Sep, 2023 6 commits
  2. 26 Sep, 2023 8 commits
  3. 25 Sep, 2023 2 commits
  4. 24 Sep, 2023 2 commits
  5. 22 Sep, 2023 4 commits
    • [release] update version (#4775) · 4146f1c0
      Hongxin Liu authored
      * [release] update version
      
      * [doc] revert versions
    • [inference] chatglm2 infer demo (#4724) · ce7ade38
      Jianghai authored
      * add chatglm2
      
      * add
      
      * gather needed kernels
      
      * fix some bugs
      
      * finish context forward
      
      * finish context stage
      
      * fix
      
      * add
      
      * pause
      
      * add
      
      * fix bugs
      
      * finish chatglm
      
      * fix bug
      
      * change some logic
      
      * fix bugs
      
      * change some logics
      
      * add
      
      * add
      
      * add
      
      * fix
      
      * fix tests
      
      * fix
    • [feature] add gptq for inference (#4754) · 946ab56c
      Xu Kai authored
      * [gptq] add gptq kernel (#4416)
      
      * add gptq
      
      * refactor code
      
      * fix tests
      
      * replace auto-gptq
      
      * rename inference/quant
      
      * refactor test
      
      * add auto-gptq as an option
      
      * reset requirements
      
      * change assert and check auto-gptq
      
      * add import warnings
      
      * change test flash attn version
      
      * remove example
      
      * change requirements of flash_attn
      
      * modify tests
      
      * [skip ci] change requirements-test
      
      * [gptq] faster gptq cuda kernel (#4494)
      
      * [skip ci] add cuda kernels
      
      * add license
      
      * [skip ci] fix max_input_len
      
      * format files & change test size
      
      * [skip ci]
      
      * [gptq] add gptq tensor parallel (#4538)
      
      * add gptq tensor parallel
      
      * add gptq tp
      
      * delete print
      
      * add test gptq check
      
      * add test auto gptq check
      
      * [gptq] combine gptq and kv cache manager (#4706)
      
      * combine gptq and kv cache manager
      
      * add init bits
      
      * delete useless code
      
      * add model path
      
      * delete useless print and update test
      
      * delete useless import
      
      * move option gptq to shard config
      
      * change replace linear to shardformer
      
      * update bloom policy
      
      * delete useless code
      
      * fix import bug and delete useless code
      
      * change colossalai/gptq to colossalai/quant/gptq
      
      * update import linear for tests
      
      * delete useless code and mv gptq_kernel to kernel directory
      
      * fix triton kernel
      
      * add triton import
    • [bug] Fix the version check bug in colossalai run when generating the cmd. (#4713) · 1e0e0808
      littsk authored
      * Fix the version check bug in colossalai run when generating the cmd.
      
      * polish code
  6. 21 Sep, 2023 5 commits
  7. 20 Sep, 2023 4 commits
    • [shardformer] fix master param sync for hybrid plugin/rewrite unwrapping logic (#4758) · c0a03370
      Baizhou Zhang authored
      * fix master param sync for hybrid plugin
      
      * rewrite unwrap for ddp/fsdp
      
      * rewrite unwrap for zero/gemini
      
      * rewrite unwrap for hybrid plugin
      
      * fix gemini unwrap
      
      * fix bugs
    • [chat]: update rm, add wandb and fix bugs (#4471) · 7b9b8644
      Wenhao Chen authored
      
      
      * feat: modify forward fn of critic and reward model
      
      * feat: modify calc_action_log_probs
      
      * to: add wandb in sft and rm trainer
      
      * feat: update train_sft
      
      * feat: update train_rm
      
      * style: modify type annotation and add warning
      
      * feat: pass tokenizer to ppo trainer
      
      * to: modify trainer base and maker base
      
      * feat: add wandb in ppo trainer
      
      * feat: pass tokenizer to generate
      
      * test: update generate fn tests
      
      * test: update train tests
      
      * fix: remove action_mask
      
      * feat: remove unused code
      
      * fix: fix wrong ignore_index
      
      * fix: fix mock tokenizer
      
      * chore: update requirements
      
      * revert: modify make_experience
      
      * fix: fix inference
      
      * fix: add padding side
      
      * style: modify _on_learn_batch_end
      
      * test: use mock tokenizer
      
      * fix: use bf16 to avoid overflow
      
      * fix: fix workflow
      
      * [chat] fix gemini strategy
      
      * [chat] fix
      
      * sync: update colossalai strategy
      
      * fix: fix args and model dtype
      
      * fix: fix checkpoint test
      
      * fix: fix requirements
      
      * fix: fix missing import and wrong arg
      
      * fix: temporarily skip gemini test in stage 3
      
      * style: apply pre-commit
      
      * fix: temporarily skip gemini test in stage 1&2
      
      ---------
      Co-authored-by: Mingyan Jiang <1829166702@qq.com>
    • Merge pull request #4757 from ppt0011/main · 07c2e3d0
      ppt0011 authored
      [doc] explain suitable use case for each plugin
  8. 19 Sep, 2023 4 commits
  9. 18 Sep, 2023 3 commits
    • [legacy] clean up legacy code (#4743) · b5f9e37c
      Hongxin Liu authored
      * [legacy] remove outdated codes of pipeline (#4692)
      
      * [legacy] remove cli of benchmark and update optim (#4690)
      
      * [legacy] remove cli of benchmark and update optim
      
      * [doc] fix cli doc test
      
      * [legacy] fix engine clip grad norm
      
      * [legacy] remove outdated colo tensor (#4694)
      
      * [legacy] remove outdated colo tensor
      
      * [test] fix test import
      
      * [legacy] move outdated zero to legacy (#4696)
      
      * [legacy] clean up utils (#4700)
      
      * [legacy] clean up utils
      
      * [example] update examples
      
      * [legacy] clean up amp
      
      * [legacy] fix amp module
      
      * [legacy] clean up gpc (#4742)
      
      * [legacy] clean up context
      
      * [legacy] clean core, constants and global vars
      
      * [legacy] refactor initialize
      
      * [example] fix examples ci
      
      * [example] fix examples ci
      
      * [legacy] fix tests
      
      * [example] fix gpt example
      
      * [example] fix examples ci
      
      * [devops] fix ci installation
      
      * [example] fix examples ci
    • [kernel] update triton init #4740 (#4740) · 32e7f994
      Xuanlei Zhao authored
  10. 15 Sep, 2023 2 commits
    • [example] llama2 add fine-tune example (#4673) · 4c4482f3
      flybird11111 authored
      * [shardformer] update shardformer readme
      
      [shardformer] update shardformer readme
      
      [shardformer] update shardformer readme
      
      * [shardformer] update llama2/opt finetune example and shardformer update to llama2
      
      * [shardformer] update llama2/opt finetune example and shardformer update to llama2
      
      * [shardformer] update llama2/opt finetune example and shardformer update to llama2
      
      * [shardformer] change dataset
      
      * [shardformer] change dataset
      
      * [shardformer] fix CI
      
      * [shardformer] fix
      
      * [shardformer] fix
      
      * [shardformer] fix
      
      * [shardformer] fix
      
      * [shardformer] fix
      
      [example] update opt example
      
      [example] resolve comments
      
      fix
      
      fix
      
      * [example] llama2 add finetune example
      
      * [example] llama2 add finetune example
      
      * [example] llama2 add finetune example
      
      * [example] llama2 add finetune example
      
      * fix
      
      * update llama2 example
      
      * update llama2 example
      
      * fix
      
      * update llama2 example
      
      * update llama2 example
      
      * update llama2 example
      
      * update llama2 example
      
      * update llama2 example
      
      * update llama2 example
      
      * Update requirements.txt
      
      * update llama2 example
      
      * update llama2 example
      
      * update llama2 example