- 13 Apr, 2023 2 commits
- 12 Apr, 2023 5 commits
- digger-yu authored
  Fixing document link errors using absolute paths
- natalie_cao authored
- Hongxin Liu authored
  * [gemini] fix nvme optimizer init
  * [gemini] gemini supports lazy init
  * [gemini] add init example
  * [gemini] add fool model
  * [zero] update gemini ddp
  * [zero] update init example
  * add chunk method
  * [lazyinit] fix lazy tensor tolist
  * [gemini] fix buffer materialization
  * [misc] remove useless file
  * [booster] update gemini plugin
  * [test] update gemini plugin test
  * [test] fix gemini plugin test
  * [gemini] fix import
  * [lazyinit] use new metatensor
  * [lazyinit] fix __set__ method
- jiangmingyan authored
  [checkpoint] Sharded checkpoints need to be compatible with the naming format of HF checkpoint files (#3479)
  * [checkpoint] support Hugging Face style sharded checkpoints, to be compatible with the HF file naming format
  * [checkpoint] add a 'variant' field to sharded checkpoint saving to customize the filename
  Co-authored-by: luchen <luchen@luchendeMacBook-Pro.local>
  Co-authored-by: luchen <luchen@luchendeMBP.lan>
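The HF-compatible shard filenames this commit targets follow a fixed pattern; here is a minimal sketch in plain Python. This is a hypothetical helper, not the actual ColossalAI code; the `variant` handling mirrors how `transformers` inserts the variant before the file extension.

```python
from typing import Optional

def shard_filename(index: int, total: int, variant: Optional[str] = None) -> str:
    """Build a Hugging Face style shard filename such as
    pytorch_model-00001-of-00002.bin. An optional variant (e.g. "fp16")
    is inserted before the extension. Illustrative sketch only."""
    name = f"pytorch_model-{index:05d}-of-{total:05d}.bin"
    if variant is not None:
        stem, ext = name.rsplit(".", 1)
        name = f"{stem}.{variant}.{ext}"
    return name

print(shard_filename(1, 2))           # pytorch_model-00001-of-00002.bin
print(shard_filename(2, 2, "fp16"))   # pytorch_model-00002-of-00002.fp16.bin
```

A loader that understands this pattern can discover all shards of a checkpoint from the index file alone, which is why matching the naming format matters for interoperability with `transformers`.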
- Yuanchen authored
  Co-authored-by: Yuanchen Xu <yuanchen.xu00@gmail.com>
- 11 Apr, 2023 1 commit
- zhang-yi-chi authored
- 10 Apr, 2023 5 commits
- ver217 authored
- binmakeswell authored
  - [ ] Stable Diffusion
  - [ ] Dreambooth
  It is easy for users to think that we do not support them yet. Add them after migrating them from example to application: https://github.com/hpcaitech/ColossalAI/tree/main/examples/images
- binmakeswell authored
  * [doc] add requirement and highlight application
  * [doc] link example and application
- NatalieC323 authored
  * Update README.md
  Co-authored-by: Fazzie-Maqianli <55798671+Fazziekey@users.noreply.github.com>
- YH authored
- 07 Apr, 2023 3 commits
- gongenlei authored
  * mv LlamaForCausalLM to LlamaModel
  * rm unused imports
  Co-authored-by: gongenlei <gongenlei@baidu.com>
- mandoxzhang authored
  * update roberta example
  * modify conflict & update roberta
- mandoxzhang authored
  * update roberta example
- 06 Apr, 2023 14 commits
- NatalieC323 authored
- binmakeswell authored
- NatalieC323 authored
  * Update requirements.txt
  * Update environment.yaml
  * Update README.md
  * Delete requirements_colossalai.txt
- Frank Lee authored
- jiangmingyan authored
  * [checkpoint] support Hugging Face style sharded checkpoints
  Co-authored-by: luchen <luchen@luchendeMBP.lan>
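A Hugging Face style sharded checkpoint is accompanied by an index file (`pytorch_model.bin.index.json`) that maps each parameter name to the shard file storing it. The sketch below shows that layout with stdlib `json` only; the parameter names and size are made up for illustration and are not taken from ColossalAI.

```python
import json

# Illustrative index file for a two-shard checkpoint: metadata plus a
# weight_map from parameter name to shard filename. Values are invented.
index = {
    "metadata": {"total_size": 16_000_000},
    "weight_map": {
        "embed.weight": "pytorch_model-00001-of-00002.bin",
        "lm_head.weight": "pytorch_model-00002-of-00002.bin",
    },
}
doc = json.dumps(index, indent=2)
print(doc)
```

When loading, a reader consults `weight_map` to open only the shards it needs, so a sharded save that follows this format stays loadable by tooling that expects the HF layout.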
- Fazzie-Maqianli authored
- Frank Lee authored
  * [test] added spawn decorator
  * polish code
- YY Lin authored
  * Update ppo.py: fix the bug of fetching the wrong batch data
  * Add PEFT model support in SFT and prompt training: in stage 1 and stage 3, PEFT model support is added, so the trained artifacts are only small LoRA additions instead of the whole set of files
  * Delete test_prompts.txt
  * Delete test_pretrained.txt
  * Move the PEFT-related files to a community folder
  * Move the demo SFT to community
  * Delete dirty files
  * Add instructions to install PEFT from source
  * Remove Chinese comments
- Dr-Corgi authored
  The function save_model should be a part of PPOTrainer.
- kingkingofall authored
  * fix stage 2
  * add torch
- Frank Lee authored
- YH authored
- ver217 authored
- Camille Zhong authored
  * Add RoBERTa for RLHF Stage 2 & 3 (still in testing)
  * Revert "Add RoBERTa for RLHF Stage 2 & 3 (test)" (reverts commit 06741d894dcbe958acd4e10d771f22275e20e368)
  * Add RoBERTa for RLHF stage 2 & 3:
    1. add roberta folder under model folder
    2. add roberta option in train_reward_model.py
    3. add some tests in testci
  * Update test_ci.sh
  * Revert "Update test_ci.sh" (reverts commit 9c7352b81766f3177d31eeec0ec178a301df966a)
  * update roberta with coati
  * chat ci update
  * Revert "chat ci update" (reverts commit 17ae7ae01fa752bd3289fc39069868fde99cf846)
  * [Chat] fix the tokenizer "int too big to convert" error in SFT training using Bloom and OPT
- 05 Apr, 2023 2 commits
- Hakjin Lee authored
- Yuanchen authored
  Co-authored-by: Yuanchen Xu <yuanchen.xu00@gmail.com>
- 04 Apr, 2023 6 commits
- YuliangLiu0306 authored
  * [autoparallel] integrate new analyzer in module level
  * unify the profiling method
  * polish
  * fix no codegen bug
  * fix pass bug
  * fix liveness test
  * polish
- ver217 authored
  * [zero] update legacy import
  * [zero] update examples
  * [example] fix opt tutorial
  * [example] fix import
- Yuanchen authored
  Co-authored-by: Yuanchen Xu <yuanchen.xu00@gmail.com>
- Frank Lee authored
  * [checkpoint] refactored the API and added safetensors support
  * polish code
- ver217 authored
  * [zero] refactor low-level zero folder structure
  * [zero] fix legacy zero import path
  * [zero] remove useless import
  * [zero] refactor gemini folder structure
  * [zero] refactor legacy zero import path
  * [zero] fix test import path
  * [zero] fix test
  * [zero] fix circular import
  * [zero] update import
- Yuanchen authored
  Fix SFT training for BLOOM, GPT and OPT
- 03 Apr, 2023 2 commits
- Frank Lee authored
  * [test] fixed gemini plugin test
  * polish code
- Camille Zhong authored
  * Add RoBERTa for RLHF Stage 2 & 3 (still in testing)
  * Revert "Add RoBERTa for RLHF Stage 2 & 3 (test)" (reverts commit 06741d894dcbe958acd4e10d771f22275e20e368)
  * Add RoBERTa for RLHF stage 2 & 3:
    1. add roberta folder under model folder
    2. add roberta option in train_reward_model.py
    3. add some tests in testci
  * add test for reward model training
  * Update test_ci.sh
  * Revert "Update test_ci.sh" (reverts commit 9c7352b81766f3177d31eeec0ec178a301df966a)
  * update roberta with coati