Commits · d882d18c6544d024dd181c04fbb8c10893d3a653 · OpenDAS / ColossalAI

27 Feb, 2024 1 commit
- [example] reuse flash attn patch (#5400) · d882d18c
  Hongxin Liu authored Feb 27, 2024
  
15 Jan, 2024 1 commit
- [hotfix]: add pp sanity check and fix mbs arg (#5268) · ef4f0ee8
  Wenhao Chen authored Jan 15, 2024
```
* fix: fix misleading mbs arg

* feat: add pp sanity check

* fix: fix 1f1b sanity check
```
09 Jan, 2024 1 commit

[npu] change device to accelerator api (#5239) · d202cc28

Hongxin Liu authored Jan 09, 2024



* update accelerator

* fix timer

* fix amp

* update

* fix

* update bug

* add error raise

* fix autocast

* fix set device

* remove doc accelerator

* update doc

* update doc

* update doc

* use nullcontext

* update cpu

* update null context

* change time limit for example

* update

* update

* update

* update

* [npu] polish accelerator code

---------
Co-authored-by: Xuanlei Zhao <xuanlei.zhao@gmail.com>
Co-authored-by: zxl <43881818+oahzxl@users.noreply.github.com>


22 Dec, 2023 1 commit

[pipeline]: fix p2p comm, add metadata cache and support llama interleaved pp (#5134) · 4fa689fc

Wenhao Chen authored Dec 22, 2023

* test: add more p2p tests

* fix: remove send_forward_recv_forward as p2p op list needs to use the same group

* fix: make send and receive atomic

* feat: update P2PComm fn

* feat: add metadata cache in 1f1b

* feat: add metadata cache in interleaved pp

* feat: modify is_xx_stage fn

* revert: add _broadcast_object_list

* feat: add interleaved pp in llama policy

* feat: set NCCL_BUFFSIZE in HybridParallelPlugin


08 Dec, 2023 1 commit
- [gemini] hotfix NaN loss while using Gemini + tensor_parallel (#5150) · 21aa5de0
  flybird11111 authored Dec 08, 2023
```
* fix

aaa

fix

fix

fix

* fix

* fix

* test ci

* fix ci

fix
```
22 Nov, 2023 1 commit
- [npu] add npu support for hybrid plugin and llama (#5090) · 3acbf6d4
  Xuanlei Zhao authored Nov 22, 2023
```
* llama 3d

* update

* fix autocast
```
20 Nov, 2023 2 commits

[format] applied code formatting on changed files in pull request 5067 (#5072) · 8921a73c
github-actions[bot] authored Nov 20, 2023
```
Co-authored-by: github-actions <github-actions@github.com>
```

[npu] add npu support for gemini and zero (#5067) · e5ce4c8e

Hongxin Liu authored Nov 20, 2023

* [npu] setup device utils (#5047)

* [npu] add npu device support

* [npu] support low level zero

* [test] update npu zero plugin test

* [hotfix] fix import

* [test] recover tests

* [npu] gemini support npu (#5052)

* [npu] refactor device utils

* [gemini] support npu

* [example] llama2+gemini support npu

* [kernel] add arm cpu adam kernel (#5065)

* [kernel] add arm cpu adam

* [optim] update adam optimizer

* [kernel] arm cpu adam remove bf16 support


16 Nov, 2023 1 commit

[pipeline,shardformer] Fix p2p efficiency in pipeline, allow skipping loading weight not in weight_map when `strict=False`, fix llama flash attention forward, add flop estimation by megatron in llama benchmark (#5017) · b2ad0d9e

Elsa Granger authored Nov 16, 2023

* Use p2p

* Cannot bidirectional send p2p

* Refactor tensor creation and serialization in P2P communication

* Fix llama forward args in flash attention

* Add flop estimate from megatron

* Support loading weight not in weight_map when strict=False in hybrid_parallel

* Use send_forward_recv_backward, etc in 1f1b

* Use dataclass for metadata
Remove torch.cuda.synchronize() as suggested

* Add comment about the torch.cuda.synchronize for potential error

* Typo

* Update hybrid_parallel_checkpoint_io.py

* Update p2p.py

* Update one_f_one_b.py

* Update p2p.py

---------
Co-authored-by: flybird11111 <1829166702@qq.com>


19 Sep, 2023 1 commit

[misc] update pre-commit and run all files (#4752) · 079bf3cb

Hongxin Liu authored Sep 19, 2023

* [misc] update pre-commit

* [misc] run pre-commit

* [misc] remove useless configuration files

* [misc] ignore cuda for clang-format


28 Aug, 2023 1 commit

[example] add llama2 example (#4527) · 0b00def8

Hongxin Liu authored Aug 28, 2023

* [example] transfer llama-1 example

* [example] fit llama-2

* [example] refactor scripts folder

* [example] fit new gemini plugin

* [cli] fix multinode runner

* [example] fit gemini optim checkpoint

* [example] refactor scripts

* [example] update requirements

* [example] update requirements

* [example] rename llama to llama2

* [example] update readme and pretrain script

* [example] refactor scripts
