- 19 Apr, 2023 4 commits
-
-
digger-yu authored
Code optimization: change "vairable" to "variable"
-
digger-yu authored
Code optimization: change "requries" to "requires"
-
Hongxin Liu authored
* [gemini] save state dict support fp16
* [gemini] save state dict shard support fp16
* [gemini] fix state dict
* [gemini] fix state dict
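The fp16 state-dict commits above boil down to casting each floating-point entry of a model's state dict to half precision before serialization. A minimal sketch of the idea, using NumPy arrays as stand-ins for tensors (the function name is ours; Gemini's actual implementation in ColossalAI differs):

```python
import numpy as np

def state_dict_to_fp16(state_dict):
    """Cast every floating-point entry of a state dict to fp16 before saving.

    Non-float entries (e.g. integer step counters) are left untouched.
    """
    converted = {}
    for name, tensor in state_dict.items():
        if np.issubdtype(tensor.dtype, np.floating):
            converted[name] = tensor.astype(np.float16)
        else:
            converted[name] = tensor
    return converted

# Example: a tiny fake model state dict.
sd = {
    "linear.weight": np.ones((4, 4), dtype=np.float32),
    "step": np.array(10, dtype=np.int64),
}
half = state_dict_to_fp16(sd)
```

Saving in fp16 halves checkpoint size at the cost of precision, which is usually acceptable for inference-oriented checkpoints.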
-
github-actions[bot] authored
Co-authored-by: github-actions <github-actions@github.com>
-
- 18 Apr, 2023 6 commits
-
-
digger-yu authored
Code cleanup: the source code itself has not been modified; only a few spelling errors in the comments were fixed.
-
binmakeswell authored
-
Yuanchen authored
Co-authored-by: Yuanchen Xu <yuanchen.xu00@gmail.com>
-
Hongxin Liu authored
* [meta] fix torch 1.13.1
* [meta] fix torch 2.0.0
* [meta] fix torch 1.13.0
* [meta] polish code
-
Camille Zhong authored
Add RoBERTa for RLHF Stage 2 & 3:
1. add a roberta folder under the model folder
2. add a roberta option in train_reward_model.py
3. add tests to the CI
Also includes many CI iterations on test_ci.sh, run_chatgpt_examples.yml, run_chatgpt_unit_tests.yml, and gpt_critic.py, an update of roberta to the coati naming, and reverts of in-testing commits 06741d894dcbe958acd4e10d771f22275e20e368, 9c7352b81766f3177d31eeec0ec178a301df966a, and 17ae7ae01fa752bd3289fc39069868fde99cf846.
-
digger-yu authored
Adjusted the style of the Community Examples heading to be consistent with the other titles.
-
- 17 Apr, 2023 10 commits
-
-
Hongxin Liu authored
* [gemini] support state dict shard
* [gemini] add test state dict shard
* [gemini] polish docstr
* [gemini] fix merge
* [gemini] polish code
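The state-dict sharding work above amounts to splitting one large state dict into size-capped pieces that can be saved as separate files. A hedged sketch of the general technique (the function name and size cap are illustrative, not Gemini's actual API):

```python
import numpy as np

def shard_state_dict(state_dict, max_shard_bytes):
    """Greedily pack state-dict entries into shards no larger than max_shard_bytes.

    A single entry larger than the cap still gets its own shard.
    """
    shards, current, current_size = [], {}, 0
    for name, tensor in state_dict.items():
        size = tensor.nbytes
        if current and current_size + size > max_shard_bytes:
            shards.append(current)        # flush the full shard
            current, current_size = {}, 0
        current[name] = tensor
        current_size += size
    if current:
        shards.append(current)            # flush the remainder
    return shards

# Four float32 entries of 256*256*4 = 262144 bytes each; a ~600 KB cap
# fits two entries per shard, so we expect two shards.
sd = {f"layer{i}.weight": np.zeros((256, 256), dtype=np.float32) for i in range(4)}
shards = shard_state_dict(sd, max_shard_bytes=600_000)
```

In real sharded checkpoints an index file maps each parameter name to the shard that contains it, so loaders can fetch only the shards they need.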
-
tingfeng cao authored
-
digger-yu authored
Docs fix: two extra `$` characters had been entered here; they have been deleted.
-
Fazzie-Maqianli authored
-
binmakeswell authored
-
csric authored
* run the base
* working on dist ppo
* sync
* detached trainer
* update detached trainer; no maker update function
* facing init problem
* 1 maker 1 trainer detached run, but no model update
* facing cuda problem
* fix save functions
* verified maker update
* add ignore
* analyze loss issue
* remove some debug code
* facing 2m1t stuck issue
* 2m1t verified
* do not use torchrun
* working on 2m2t
* initialize strategy in ray actor env
* facing actor's init order issue
* facing ddp model update issue (need to unwrap ddp)
* unwrap ddp actor
* checking 1m2t stuck problem
* set timeout for trainer choosing; it solves the stuck problem!
* delete some debug output
* rename to sync with upstream
* coati rename
* detach the replay buffer from the trainer and make it a Ray actor, for two benefits: 1. support TP trainer; 2. asynchronous buffer operations
* experience_maker_holder performs target-revolving _send_experience() instead of length comparison
* move code to ray subfolder
* working on pipeline inference
* apply comments
Co-authored-by: csric <richcsr256@gmail.com>
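The design change described above, detaching the replay buffer from the trainer so that the experience maker and the trainer interact only through a shared buffer, can be sketched without Ray. This is a minimal thread-and-queue analogue of that producer/consumer decoupling (in the actual PR the buffer is a Ray actor; all names here are ours):

```python
import queue
import threading

# Stand-in for the detached replay buffer: the experience maker pushes
# rollouts while the trainer pops them, with no direct coupling between them.
buffer = queue.Queue(maxsize=8)

def maker(n):
    for i in range(n):
        buffer.put({"rollout": i})   # produce experience asynchronously

def trainer(n, out):
    for _ in range(n):
        out.append(buffer.get())     # consume whenever experience is ready

collected = []
t1 = threading.Thread(target=maker, args=(5,))
t2 = threading.Thread(target=trainer, args=(5, collected))
t1.start(); t2.start()
t1.join(); t2.join()
```

Making the buffer its own actor is what enables the two benefits listed in the commit: a tensor-parallel trainer can pull from it like any other client, and buffer operations no longer block the maker or the trainer.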
-
YH authored
-
digger-yu authored
Display format optimization, same as fix #3562; the en version was modified at the same time.
-
Hongxin Liu authored
* [misc] add print verbose
* [gemini] add print verbose
* [zero] add print verbose for low level
* [misc] add print verbose for op builder
-
Hongxin Liu authored
-
- 14 Apr, 2023 2 commits
-
-
digger-yu authored
Display format optimization, fixing bug #3562. Specific changes:
1. Translate "This is called a column-parallel fashion" into Chinese.
2. Use the ```math code block syntax to display math expressions as blocks; the formula content itself was not modified.
Please check that the math formulas display correctly. If OK, I will change the format of the formulas in the English version in parallel.
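For reference, the ```math fence mentioned above is GitHub-flavored Markdown's syntax for rendering a display equation; the formula below is only an illustration, not one from the document being edited:

````markdown
```math
Y = XA^T + b
```
````

Unlike inline `$...$` notation, the fenced form renders the expression as its own centered block, which is why it suits the parallelism formulas referenced in this fix.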
-
binmakeswell authored
-
- 13 Apr, 2023 4 commits
-
-
MisterLin1995 authored
* [feat][chatgpt] train prompts on ray example
* [fix] simplify code
* [fix] remove deprecated parameter
* [fix] add dependencies
* [fix] method calling
* [fix] experience maker
* [fix] missing loss function
* [fix] init optimizer
* [feat] add usage comment
* [fix] rename files
* [fix] add readme
* [fix] file path
* [fix] move directory
Co-authored-by: jiangwen <zxl265370@antgroup.com>
-
binmakeswell authored
* [chat] clean up duplicate tutorial
-
digger-yu authored
Format optimization: add [] around DeepSpeed.
-
digger-yu authored
Remove extra ")" characters.
-
- 12 Apr, 2023 5 commits
-
-
digger-yu authored
Fix document link errors by using absolute paths.
-
natalie_cao authored
-
Hongxin Liu authored
* [gemini] fix nvme optimizer init
* [gemini] gemini supports lazy init
* [gemini] add init example
* [gemini] add fool model
* [zero] update gemini ddp
* [zero] update init example
* add chunk method
* [lazyinit] fix lazy tensor tolist
* [gemini] fix buffer materialization
* [misc] remove useless file
* [booster] update gemini plugin
* [test] update gemini plugin test
* [test] fix gemini plugin test
* [gemini] fix import
* [lazyinit] use new metatensor
* [lazyinit] fix __set__ method
-
jiangmingyan authored
[checkpoint] Sharded saved checkpoints need to be compatible with the naming format of hf checkpoint files (#3479)
* [checkpoint] support huggingface-style sharded checkpoints, to be compatible with the hf file naming format
* [checkpoint] sharded saved checkpoints: add a 'variant' field to customize the filename
Co-authored-by: luchen <luchen@luchendeMacBook-Pro.local>
Co-authored-by: luchen <luchen@luchendeMBP.lan>
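The hf naming format referred to above is Hugging Face's `pytorch_model-00001-of-00002.bin` shard pattern, with the `variant` field (e.g. `fp16`) inserted after the base name. A small sketch of that convention (the helper function is ours; the filename pattern itself is Hugging Face's):

```python
def shard_filename(index, total, variant=None, base="pytorch_model", ext="bin"):
    """Build a Hugging Face-style shard filename.

    Without a variant:   pytorch_model-00001-of-00002.bin
    With variant 'fp16': pytorch_model.fp16-00001-of-00002.bin
    """
    stem = base if variant is None else f"{base}.{variant}"
    return f"{stem}-{index:05d}-of-{total:05d}.{ext}"

names = [shard_filename(i + 1, 2) for i in range(2)]
fp16_name = shard_filename(1, 2, variant="fp16")
```

Matching this format lets checkpoints saved by ColossalAI be loaded by tooling that expects Hugging Face's shard index and filenames.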
-
Yuanchen authored
Co-authored-by: Yuanchen Xu <yuanchen.xu00@gmail.com>
-
- 11 Apr, 2023 1 commit
-
-
zhang-yi-chi authored
-
- 10 Apr, 2023 5 commits
-
-
ver217 authored
-
binmakeswell authored
- [ ] Stable Diffusion
- [ ] Dreambooth
It's easy for users to think that we don't support them yet. Add them back after migrating them from examples to applications: https://github.com/hpcaitech/ColossalAI/tree/main/examples/images
-
binmakeswell authored
* [doc] add requirement and highlight application
* [doc] link example and application
-
NatalieC323 authored
* Update README.md
Co-authored-by: Fazzie-Maqianli <55798671+Fazziekey@users.noreply.github.com>
-
YH authored
-
- 07 Apr, 2023 3 commits
-
-
gongenlei authored
* mv LlamaForCausalLM to LlamaModel
* rm unused imports
Co-authored-by: gongenlei <gongenlei@baidu.com>
-
mandoxzhang authored
* update roberta example
* modify conflict & update roberta
-
mandoxzhang authored
* update roberta example
-