- 21 Sep, 2023 1 commit
-
-
Baizhou Zhang authored
-
- 20 Sep, 2023 1 commit
-
-
Wenhao Chen authored
* feat: modify forward fn of critic and reward model * feat: modify calc_action_log_probs * to: add wandb in sft and rm trainer * feat: update train_sft * feat: update train_rm * style: modify type annotation and add warning * feat: pass tokenizer to ppo trainer * to: modify trainer base and maker base * feat: add wandb in ppo trainer * feat: pass tokenizer to generate * test: update generate fn tests * test: update train tests * fix: remove action_mask * feat: remove unused code * fix: fix wrong ignore_index * fix: fix mock tokenizer * chore: update requirements * revert: modify make_experience * fix: fix inference * fix: add padding side * style: modify _on_learn_batch_end * test: use mock tokenizer * fix: use bf16 to avoid overflow * fix: fix workflow * [chat] fix gemini strategy * [chat] fix * sync: update colossalai strategy * fix: fix args and model dtype * fix: fix checkpoint test * fix: fix requirements * fix: fix missing import and wrong arg * fix: temporarily skip gemini test in stage 3 * style: apply pre-commit * fix: temporarily skip gemini test in stage 1&2 --------- Co-authored-by:Mingyan Jiang <1829166702@qq.com>
-
- 19 Sep, 2023 1 commit
-
-
Hongxin Liu authored
* [misc] update pre-commit * [misc] run pre-commit * [misc] remove useless configuration files * [misc] ignore cuda for clang-format
-
- 18 Sep, 2023 2 commits
-
-
github-actions[bot] authored
Co-authored-by:github-actions <github-actions@github.com>
-
Hongxin Liu authored
* [legacy] remove outdated codes of pipeline (#4692) * [legacy] remove cli of benchmark and update optim (#4690) * [legacy] remove cli of benchmark and update optim * [doc] fix cli doc test * [legacy] fix engine clip grad norm * [legacy] remove outdated colo tensor (#4694) * [legacy] remove outdated colo tensor * [test] fix test import * [legacy] move outdated zero to legacy (#4696) * [legacy] clean up utils (#4700) * [legacy] clean up utils * [example] update examples * [legacy] clean up amp * [legacy] fix amp module * [legacy] clean up gpc (#4742) * [legacy] clean up context * [legacy] clean core, constants and global vars * [legacy] refactor initialize * [example] fix examples ci * [example] fix examples ci * [legacy] fix tests * [example] fix gpt example * [example] fix examples ci * [devops] fix ci installation * [example] fix examples ci
-
- 15 Sep, 2023 1 commit
-
-
Bin Jia authored
* add gpt2 HybridParallelPlugin example * update readme and testci * update test ci * fix test_ci bug * update requirements * add requirements * update requirements * add requirement * rename file
-
- 11 Sep, 2023 1 commit
-
-
Hongxin Liu authored
* [legacy] move communication to legacy (#4640) * [legacy] refactor logger and clean up legacy codes (#4654) * [legacy] make logger independent to gpc * [legacy] make optim independent to registry * [legacy] move test engine to legacy * [legacy] move nn to legacy (#4656) * [legacy] move nn to legacy * [checkpointio] fix save hf config * [test] remove useledd rpc pp test * [legacy] fix nn init * [example] skip tutorial hybriad parallel example * [devops] test doc check * [devops] test doc check
-
- 05 Sep, 2023 2 commits
-
-
Hongxin Liu authored
-
Hongxin Liu authored
* [legacy] move trainer to legacy * [doc] update docs related to trainer * [test] ignore legacy test
-
- 24 Aug, 2023 1 commit
-
-
Hongxin Liu authored
* [gemini] remove distributed-related part from colotensor (#4379) * [gemini] remove process group dependency * [gemini] remove tp part from colo tensor * [gemini] patch inplace op * [gemini] fix param op hook and update tests * [test] remove useless tests * [test] remove useless tests * [misc] fix requirements * [test] fix model zoo * [test] fix model zoo * [test] fix model zoo * [test] fix model zoo * [test] fix model zoo * [misc] update requirements * [gemini] refactor gemini optimizer and gemini ddp (#4398) * [gemini] update optimizer interface * [gemini] renaming gemini optimizer * [gemini] refactor gemini ddp class * [example] update gemini related example * [example] update gemini related example * [plugin] fix gemini plugin args * [test] update gemini ckpt tests * [gemini] fix checkpoint io * [example] fix opt example requirements * [example] fix opt example * [example] fix opt example * [example] fix opt example * [gemini] add static placement policy (#4443) * [gemini] add static placement policy * [gemini] fix param offload * [test] update gemini tests * [plugin] update gemini plugin * [plugin] update gemini plugin docstr * [misc] fix flash attn requirement * [test] fix gemini checkpoint io test * [example] update resnet example result (#4457) * [example] update bert example result (#4458) * [doc] update gemini doc (#4468) * [example] update gemini related examples (#4473) * [example] update gpt example * [example] update dreambooth example * [example] update vit * [example] update opt * [example] update palm * [example] update vit and opt benchmark * [hotfix] fix bert in model zoo (#4480) * [hotfix] fix bert in model zoo * [test] remove chatglm gemini test * [test] remove sam gemini test * [test] remove vit gemini test * [hotfix] fix opt tutorial example (#4497) * [hotfix] fix opt tutorial example * [hotfix] fix opt tutorial example
-
- 28 Jun, 2023 1 commit
-
-
digger yu authored
-
- 26 Jun, 2023 1 commit
-
-
Baizhou Zhang authored
-
- 19 Jun, 2023 1 commit
-
-
LuGY authored
-
- 08 Jun, 2023 1 commit
-
-
digger yu authored
-
- 30 May, 2023 1 commit
-
-
jiangmingyan authored
* [example]update gemini examples * [example]update gemini examples
-
- 18 May, 2023 1 commit
-
-
binmakeswell authored
-
- 26 Apr, 2023 1 commit
-
-
digger-yu authored
* Fixed several spelling errors under colossalai * Fix the spelling error in colossalai and docs directory * Cautious Changed the spelling error under the example folder * Update runtime_preparation_pass.py revert autograft to autograd * Update search_chunk.py utile to until * Update check_installation.py change misteach to mismatch in line 91 * Update 1D_tensor_parallel.md revert to perceptron * Update 2D_tensor_parallel.md revert to perceptron in line 73 * Update 2p5D_tensor_parallel.md revert to perceptron in line 71 * Update 3D_tensor_parallel.md revert to perceptron in line 80 * Update README.md revert to resnet in line 42 * Update reorder_graph.py revert to indice in line 7 * Update p2p.py revert to megatron in line 94 * Update initialize.py revert to torchrun in line 198 * Update routers.py change to detailed in line 63 * Update routers.py change to detailed in line 146 * Update README.md revert random number in line 402
-
- 06 Apr, 2023 1 commit
-
-
Frank Lee authored
* [test] added spawn decorator * polish code * polish code * polish code * polish code * polish code * polish code
-
- 04 Apr, 2023 1 commit
-
-
ver217 authored
* [zero] refactor low-level zero folder structure * [zero] fix legacy zero import path * [zero] fix legacy zero import path * [zero] remove useless import * [zero] refactor gemini folder structure * [zero] refactor gemini folder structure * [zero] refactor legacy zero import path * [zero] refactor gemini folder structure * [zero] refactor gemini folder structure * [zero] refactor gemini folder structure * [zero] refactor legacy zero import path * [zero] fix test import path * [zero] fix test * [zero] fix circular import * [zero] update import
-
- 23 Mar, 2023 1 commit
-
-
Yan Fang authored
-
- 21 Mar, 2023 1 commit
-
-
Zihao authored
* add auto-offload feature * polish code * fix syn offload runtime pass bug * add offload example * fix offload testing bug * fix example testing bug
-
- 07 Mar, 2023 1 commit
-
-
Ziyue Jiang authored
* add alpa dp split * add alpa dp split * use fwd+bwd instead of fwd only --------- Co-authored-by:Ziyue Jiang <ziyue.jiang@gmail.com>
-
- 27 Feb, 2023 2 commits
-
-
github-actions[bot] authored
Co-authored-by:github-actions <github-actions@github.com>
-
binmakeswell authored
-
- 22 Feb, 2023 1 commit
-
-
dawei-wang authored
Fix hpcaitech/ColossalAI#2851
-
- 15 Feb, 2023 1 commit
-
-
cloudhuang authored
-
- 09 Feb, 2023 1 commit
-
-
Jiatong (Julius) Han authored
* [tutorial] polish readme.md * [example] Update README.md
-
- 31 Jan, 2023 1 commit
-
-
HELSON authored
-
- 30 Jan, 2023 1 commit
-
-
HELSON authored
-
- 28 Jan, 2023 1 commit
-
-
HELSON authored
* [zero] add strict ddp mode for chunk init * [gemini] update gpt example
-
- 20 Jan, 2023 1 commit
-
-
HELSON authored
* [zero] add strict ddp mode * [polish] add comments for strict ddp mode * [zero] fix test error
-
- 18 Jan, 2023 3 commits
-
-
Jiarui Fang authored
-
jiaruifang authored
-
jiaruifang authored
-
- 17 Jan, 2023 1 commit
-
-
binmakeswell authored
-
- 16 Jan, 2023 5 commits
-
-
Jiarui Fang authored
-
jiaruifang authored
-
jiaruifang authored
-
jiaruifang authored
-
jiaruifang authored
-