- 25 Mar, 2024 1 commit
Wenhao Chen authored
* fix: simplify merge_batch
* fix: use return_outputs=False to eliminate extra memory consumption
* feat: add return_outputs warning
* style: remove `return_outputs=False` as it is the default value
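The memory saving mentioned above comes from not keeping the model outputs alive once the loss is computed. A minimal sketch of the pattern, assuming a generic HuggingFace-style training step (the `train_step` helper and its signature are illustrative, not the project's actual API):

```python
def train_step(model, batch, criterion, return_outputs=False):
    """Run one forward/backward step.

    With return_outputs=False (the default), the logits are dropped as soon
    as the loss is computed, so the large (batch x seq_len x vocab) tensor
    can be freed right after backward instead of surviving in the caller.
    """
    outputs = model(**batch)
    loss = criterion(outputs, batch["labels"])
    loss.backward()
    if return_outputs:
        return loss.detach(), outputs  # keeping outputs costs extra memory
    return loss.detach()
```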
- 18 Mar, 2024 1 commit
flybird11111 authored
* pad vocab_size when using pipeline parallelism
* fix gather output
* fix resize embedding
* revert
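Vocabulary padding here serves to make the embedding table divide evenly across parallel ranks. A minimal sketch of the idea (the helper below is illustrative, not the actual patch):

```python
import torch.nn as nn

def pad_vocab_size(vocab_size: int, divisor: int) -> int:
    """Round vocab_size up to the nearest multiple of `divisor` (e.g. the
    tensor parallel world size) so every rank gets an equal embedding shard."""
    return ((vocab_size + divisor - 1) // divisor) * divisor

# Example: a 50257-token vocab padded for 8-way parallelism -> 50264 rows.
embedding = nn.Embedding(pad_vocab_size(50257, 8), 4096)  # padded rows are never indexed
```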
- 13 Mar, 2024 1 commit
Hongxin Liu authored
* [devops] fix compatibility
* [hotfix] update compatibility test on pr
* [devops] record duration during comp test
* [test] decrease test duration
* fix falcon
- 04 Mar, 2024 1 commit
flybird11111 authored
* benchmark gpt2
* [doc] fix typo in Colossal-LLaMA-2/README.md (#5247)
* [workflow] fixed build CI (#5240)
* [ci] fixed booster test (#5251)
* [ci] fixed ddp test (#5254)
* fix typo in applications/ColossalEval/README.md (#5250)
* [ci] fix shardformer tests: revert p2p and add enable_metadata_cache option (#5255)
* [doc] fix doc typo: fix annotation display and llama2 doc (#5256)
* [hotfix] add pp sanity check and fix mbs arg (#5268)
* [workflow] fixed incomplete bash command (#5272)
* [workflow] fixed oom tests (#5275)
* [ci] fix test_hybrid_parallel_plugin_checkpoint_io.py (#5276)
* [shardformer] hybridparallelplugin support gradients accumulation (#5246)
* [hotfix] fix ShardFormer test execution path when using sequence parallelism (#5230)
* fix auto loading gpt2 tokenizer (#5279)
* [doc] add llama2-13B display (#5285)
* fix llama pretrain (#5287)
* update shardformer.py

Co-authored-by: Wenhao Chen <cwher@outlook.com>
Co-authored-by: digger yu <digger-yu@outlook.com>
Co-authored-by: Frank Lee <somerlee.9@gmail.com>
Co-authored-by: binmakeswell <binmakeswell@gmail.com>
Co-authored-by: Zhongkai Zhao <kanezz620@gmail.com>
Co-authored-by: Michelle <97082656+MichelleMa8@users.noreply.github.com>
Co-authored-by: Desperado-Jia <502205863@qq.com>
- 27 Feb, 2024 1 commit
QinLuo authored
- 08 Feb, 2024 1 commit
ver217 authored
- 07 Feb, 2024 1 commit
Xuanlei Zhao authored
- 06 Feb, 2024 1 commit
Hongxin Liu authored
- 02 Feb, 2024 1 commit
Wenhao Chen authored
* fix: remove unnecessary assert
* test: add more 3d plugin tests
* fix: add warning
- 01 Feb, 2024 1 commit
Hongxin Liu authored
* [checkpointio] fix hybrid parallel optim checkpoint
* [extension] fix cuda extension
* [checkpointio] fix gemini optimizer checkpoint
* polish code
- 25 Jan, 2024 1 commit
Frank Lee authored
* [feat] refactored extension module
* polish
- 22 Jan, 2024 1 commit
Hongxin Liu authored
- 17 Jan, 2024 3 commits
Zhongkai Zhao authored
flybird11111 authored
* support gradients accumulation in HybridParallelPlugin
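Gradient accumulation in the plugin follows the standard pattern: scale each micro-batch loss, accumulate into `.grad`, and step only every N batches. A generic PyTorch sketch of the behavior (assuming a HuggingFace-style model that returns `.loss`; this is not the plugin's internal code):

```python
def train_with_accumulation(model, optimizer, dataloader, accumulation_steps=4):
    optimizer.zero_grad()
    for step, batch in enumerate(dataloader):
        loss = model(**batch).loss / accumulation_steps  # scale so grads average out
        loss.backward()                                  # grads accumulate in param.grad
        if (step + 1) % accumulation_steps == 0:
            optimizer.step()
            optimizer.zero_grad()
```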
flybird11111 authored
* fix ci
* fix test
* revert: revert p2p
* feat: add enable_metadata_cache option
* revert: enable t5 tests
* fix

Co-authored-by: Wenhao Chen <cwher@outlook.com>
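The `enable_metadata_cache` option above lets pipeline peers skip re-exchanging tensor shape/dtype metadata once activation shapes are stable. A hedged usage sketch, assuming a distributed environment already initialized via `colossalai.launch` (sizes are illustrative):

```python
from colossalai.booster import Booster
from colossalai.booster.plugin import HybridParallelPlugin

# Caching p2p metadata avoids a blocking shape/dtype handshake per micro-batch;
# it is safe when activation shapes do not change between steps.
plugin = HybridParallelPlugin(tp_size=2, pp_size=2, enable_metadata_cache=True)
booster = Booster(plugin=plugin)
```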
- 16 Jan, 2024 1 commit
Frank Lee authored
* [workflow] fixed oom tests
* polish
- 15 Jan, 2024 1 commit
Wenhao Chen authored
* fix: fix misleading mbs arg
* feat: add pp sanity check
* fix: fix 1f1b sanity check
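The pp sanity check amounts to verifying that the micro-batch settings tile the global batch exactly. A sketch of the kind of check involved (function and argument names are illustrative):

```python
def check_pp_batch_config(global_batch_size: int, microbatch_size: int,
                          num_microbatches: int, dp_size: int) -> None:
    """Fail fast if micro-batches cannot tile the per-DP-rank batch."""
    if global_batch_size % dp_size != 0:
        raise ValueError("global batch must divide evenly across DP ranks")
    per_rank = global_batch_size // dp_size
    if per_rank != microbatch_size * num_microbatches:
        raise ValueError(
            f"per-rank batch {per_rank} != microbatch_size * num_microbatches "
            f"({microbatch_size} * {num_microbatches})"
        )
```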
- 11 Jan, 2024 3 commits
flybird11111 authored
* fix ci
* revert: revert p2p
* feat: add enable_metadata_cache option
* revert: enable t5 tests

Co-authored-by: Wenhao Chen <cwher@outlook.com>
Frank Lee authored
* [ci] fixed ddp test
* polish
Frank Lee authored
* [ci] fixed booster test
- 10 Jan, 2024 1 commit
Frank Lee authored
* [workflow] fixed build CI
* polish
- 09 Jan, 2024 1 commit
Hongxin Liu authored
* update accelerator
* fix timer
* fix amp
* add error raise
* fix autocast
* fix set device
* remove doc accelerator
* update doc
* use nullcontext
* update cpu
* change time limit for example
* [npu] polish accelerator code

Co-authored-by: Xuanlei Zhao <xuanlei.zhao@gmail.com>
Co-authored-by: zxl <43881818+oahzxl@users.noreply.github.com>
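The accelerator layer routes device-specific calls (CUDA vs. NPU vs. CPU) through one interface. A hedged sketch of how such an abstraction is used; `get_accelerator` matches the API in later Colossal-AI releases, but treat the exact names as assumptions:

```python
import torch.nn as nn
from colossalai.accelerator import get_accelerator

accelerator = get_accelerator()            # resolves to a CUDA, NPU, or CPU backend
device = accelerator.get_current_device()
model = nn.Linear(16, 16).to(device)
accelerator.synchronize()                  # device-agnostic torch.cuda.synchronize()
```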
- 08 Jan, 2024 2 commits
Elsa Granger authored
* A more general _communicate
* feat: finish tree_flatten version p2p
* fix: update p2p api calls

Co-authored-by: Wenhao Chen <cwher@outlook.com>
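A "tree_flatten version" of p2p flattens an arbitrary nested Python structure into a list of tensor leaves plus a tree spec, sends the leaves, and rebuilds the structure on the receiving side. A simplified sketch of the idea using PyTorch's pytree utilities (not the project's `_communicate` itself; metadata exchange is elided):

```python
import torch.distributed as dist
from torch.utils._pytree import tree_flatten, tree_unflatten

def send_object(obj, dst: int):
    """Flatten a nested structure of tensors and send the leaves in order."""
    leaves, spec = tree_flatten(obj)
    for t in leaves:
        dist.send(t, dst=dst)
    return spec  # in practice the spec and shapes are also communicated

def recv_object(spec, buffers, src: int):
    """Receive leaves into pre-allocated buffers and rebuild the structure."""
    for t in buffers:
        dist.recv(t, src=src)
    return tree_unflatten(buffers, spec)
```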
Xuanlei Zhao authored
* update extension
* update cpu adam
* add doc for cpu adam
* update kernel
* update flash attention loader
* update memory efficient attention
* update api
* update doc
* update example time limit
* reverse change
* remove useless kernel
* do not use warning
- 03 Jan, 2024 1 commit
Wenhao Chen authored
* fix: add fallback order option and update 1f1b
* fix: fix deadlock comm in interleaved pp
* test: modify p2p test
- 22 Dec, 2023 1 commit
Wenhao Chen authored
* test: add more p2p tests
* fix: remove send_forward_recv_forward as the p2p op list needs to use the same group
* fix: make send and receive atomic
* feat: update P2PComm fn
* feat: add metadata cache in 1f1b
* feat: add metadata cache in interleaved pp
* feat: modify is_xx_stage fn
* revert: add _broadcast_object_list
* feat: add interleaved pp in llama policy
* feat: set NCCL_BUFFSIZE in HybridParallelPlugin
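`NCCL_BUFFSIZE` is a standard NCCL environment variable controlling the per-channel buffer size; it only takes effect if set before the first NCCL communicator is created. A sketch of the pattern (the value is illustrative, not necessarily what the plugin sets):

```python
import os

# Must run before torch.distributed.init_process_group("nccl").
os.environ.setdefault("NCCL_BUFFSIZE", str(128 * 1024 * 1024))  # 128 MiB, illustrative
```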
- 12 Dec, 2023 1 commit
flybird11111 authored
* fix
* test ci
* llama support dist-cross
* [Colossal-Llama-2] Add finetuning Colossal-Llama-2 example and support NEFTuning (#4878)
* Add inference example and refine neftune
* Modify readme file and update the imports
* fix ci

Co-authored-by: Yuanchen <70520919+chengeharrison@users.noreply.github.com>
Co-authored-by: Xu Yuanchen <yuanchen.xu00@gmail.com>
Co-authored-by: Camille Zhong <44392324+Camille7777@users.noreply.github.com>
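NEFTune (the NEFTuning mentioned above) injects uniform noise into embedding outputs during finetuning, scaled by alpha / sqrt(seq_len * hidden_dim). A minimal sketch of the published technique, not this example's exact code:

```python
import torch

def neftune_hook(module, inputs, output, alpha: float = 5.0):
    """Forward hook adding NEFTune noise to embedding outputs while training."""
    if module.training:
        dims = output.size(-2) * output.size(-1)  # seq_len * hidden_dim
        scale = alpha / dims ** 0.5
        output = output + torch.zeros_like(output).uniform_(-scale, scale)
    return output

# Hypothetical usage on a model's input embedding layer:
# embedding.register_forward_hook(lambda m, i, o: neftune_hook(m, i, o, alpha=5.0))
```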
- 08 Dec, 2023 1 commit
flybird11111 authored
* fix
* test ci
* fix ci
- 30 Nov, 2023 1 commit
flybird11111 authored
* fix 3d checkpoint load when booster boost without optimizer
* test ci
* revert ci
- 29 Nov, 2023 1 commit
github-actions[bot] authored
Co-authored-by: github-actions <github-actions@github.com>
- 28 Nov, 2023 1 commit
Wenhao Chen authored
* [shardformer] implement policy for all GPT-J models and test
* [shardformer] support interleaved pipeline parallel for bert finetune
* [shardformer] shardformer support falcon (#4883)
* [shardformer]: fix interleaved pipeline for bert model (#5048)
* [hotfix]: disable seq parallel for gptj and falcon, and polish code (#5093)
* Add Mistral support for Shardformer (#5103)
* [shardformer] add tests to mistral (#5105)

Co-authored-by: Pengtai Xu <henryxu880@gmail.com>
Co-authored-by: ppt0011 <143150326+ppt0011@users.noreply.github.com>
Co-authored-by: flybird11111 <1829166702@qq.com>
Co-authored-by: eric8607242 <e0928021388@gmail.com>
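ShardFormer applies a per-architecture policy (GPT-J, Falcon, Mistral, ...) to rewrite a HuggingFace model for tensor/pipeline parallelism. A hedged usage sketch; the API shape follows the project's documentation, but verify it against your version:

```python
from colossalai.shardformer import ShardConfig, ShardFormer
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("EleutherAI/gpt-j-6b")
shard_config = ShardConfig(enable_tensor_parallelism=True)
shard_former = ShardFormer(shard_config=shard_config)
# The matching policy (here GPT-J's) is selected automatically from the model class.
sharded_model, shared_params = shard_former.optimize(model)
```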
- 22 Nov, 2023 1 commit
Zhongkai Zhao authored
* hotfix/Fix get model policy strategy in ShardFormer
* fix bug in auto policy
- 20 Nov, 2023 3 commits
Xu Kai authored
* update examples and engine
* fix choices
* update example
Bin Jia authored
Hongxin Liu authored
* [npu] setup device utils (#5047)
  * [npu] add npu device support
  * [npu] support low level zero
  * [test] update npu zero plugin test
  * [hotfix] fix import
  * [test] recover tests
* [npu] gemini support npu (#5052)
  * [npu] refactor device utils
  * [gemini] support npu
  * [example] llama2+gemini support npu
* [kernel] add arm cpu adam kernel (#5065)
  * [kernel] add arm cpu adam
  * [optim] update adam optimizer
  * [kernel] arm cpu adam remove bf16 support
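The ARM CPU Adam kernel computes the same update rule as stock Adam; for reference, a plain-PyTorch rendition of the step a fused kernel performs (a reference sketch, not the kernel code):

```python
def adam_step(p, grad, m, v, step, lr=1e-3, beta1=0.9, beta2=0.999,
              eps=1e-8, weight_decay=0.0):
    """One Adam update on tensors p (param), m and v (moment buffers)."""
    if weight_decay:
        grad = grad + weight_decay * p
    m.mul_(beta1).add_(grad, alpha=1 - beta1)            # first moment
    v.mul_(beta2).addcmul_(grad, grad, value=1 - beta2)  # second moment
    m_hat = m / (1 - beta1 ** step)                      # bias correction
    v_hat = v / (1 - beta2 ** step)
    p.add_(-lr * m_hat / (v_hat.sqrt() + eps))
```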
- 19 Nov, 2023 1 commit
Xu Kai authored
* [inference] support only TP (#4998)
* add support for bloom (#5008)
* [refactor] refactor gptq and smoothquant llama (#5012)
* [Inference Refactor] Merge chatglm2 with pp and tp (#5023)
* [Refactor] remove useless inference code (#5022)
* [Refactor] refactor policy search and quant type controlling in inference (#5035)
* [inference] update readme (#5051)
* [inference] update example (#5053)
* fix rebase bug and add requirements-infer

Co-authored-by: Bin Jia <45593998+FoolPlayer@users.noreply.github.com>
Co-authored-by: Zhongkai Zhao <kanezz620@gmail.com>
- 17 Nov, 2023 1 commit
Wenhao Chen authored
* feat: modify create_ep_hierarchical_group args
* test: add ep tests
* fix: remove get_process_group_ranks
* fix: fix src_rank
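A hierarchical expert-parallel group splits all-to-all traffic into an intra-node stage and an inter-node stage. A simplified sketch of how such groups are typically built with torch.distributed (the grouping and names are illustrative, not the actual `create_ep_hierarchical_group`):

```python
import torch.distributed as dist

def create_hierarchical_groups(ranks_per_node: int):
    """Build one intra-node group per node plus an inter-node group of the
    first rank on every node. new_group is collective: all ranks call it."""
    world_size, rank = dist.get_world_size(), dist.get_rank()
    intra_group = None
    for node in range(world_size // ranks_per_node):
        node_ranks = list(range(node * ranks_per_node, (node + 1) * ranks_per_node))
        group = dist.new_group(node_ranks)
        if rank in node_ranks:
            intra_group = group
    inter_group = dist.new_group(list(range(0, world_size, ranks_per_node)))
    return intra_group, inter_group
```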
- 16 Nov, 2023 2 commits
flybird11111 authored
* support ddp
* simplify tests
* fix
Cuiqing Li (李崔卿) authored
* update flash-context-attention
* add kernels and build script
* add llama2 example
* add colossal-llama2 test
* fall back test setting
* fix test file and clean up

Co-authored-by: cuiqing.li <lixx336@gmail.com>
- 10 Nov, 2023 1 commit
Zhongkai Zhao authored
* [refactor]: replace inference args with extra_kwargs in ShardConfig
* modify shardconfig
* polish code
* fix policy bug in llama
* fix bug in auto policy
* remove setattr in ShardConfig
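Moving inference switches into `extra_kwargs` keeps ShardConfig's first-class fields parallelism-only, while policies that need extra flags read them from the dict. A hedged sketch of the resulting usage (the keys shown are illustrative, not the refactor's actual key names):

```python
from colossalai.shardformer import ShardConfig

# Inference-only flags travel in extra_kwargs and are read by the policies
# that understand them, instead of being named ShardConfig arguments.
shard_config = ShardConfig(
    enable_tensor_parallelism=True,
    extra_kwargs={"inference_only": True, "quant": "gptq"},  # illustrative keys
)
```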