1. 01 Jul, 2024 1 commit
    • JDKWangGuan's avatar
      Fix KeyError handling for non-existing key in state_dict.pop() (#898) · 0d810cfb
      JDKWangGuan authored
      Handle missing keys in state_dict.pop() when loading checkpoints.
      Changed state_dict.pop(f"h.{d}.attn.bias") to state_dict.pop(f"h.{d}.attn.bias", None) so that checkpoints without these keys no longer raise a KeyError.
      
      
      The following code reproduces the issue:
      ```python
      from transformers import GPT2Config, GPT2Model
      from flash_attn.models.gpt import GPTLMHeadModel, GPTModel
      
      # >>> transformers.__version__
      # '4.38.2'
      
      model_path = 'gpt2'
      output_model_path = 'gpt2_model'
      config = GPT2Config.from_pretrained(model_path, output_hidden_states=True)
      model = GPT2Model.from_pretrained(model_path, from_tf=False, config=config)
      
      # ... fine-tune the model here ...
      
      # dump the fine-tuned model
      model.save_pretrained(output_model_path)
      
      # load the fine-tuned model with flash-attn's GPT classes
      config = GPT2Config.from_pretrained(output_model_path, output_hidden_states=True)
      model = GPTModel.from_pretrained(output_model_path, config=config, strict=True)  # fails with KeyError: 'h.0.attn.bias'
      model = GPTLMHeadModel.from_pretrained(output_model_path, config=config, strict=True)  # fails with KeyError: 'h.0.attn.bias'
      ```
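The fix relies on dict.pop's optional default argument. A minimal standalone sketch (plain dict, illustrative key names, not the repo's actual loading code):

```python
# A state dict that, like the fine-tuned GPT2 dump above,
# has no "h.0.attn.bias" entry.
state_dict = {"h.0.attn.weight": 1.0}

# state_dict.pop("h.0.attn.bias") would raise KeyError here.
# With a default, the missing key is skipped silently:
value = state_dict.pop("h.0.attn.bias", None)
assert value is None

# Existing keys are still popped and returned as usual.
assert state_dict.pop("h.0.attn.weight", None) == 1.0
```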
      0d810cfb
  2. 31 Jan, 2024 1 commit
  3. 05 Jan, 2024 1 commit
  4. 25 Dec, 2023 4 commits
  5. 23 Dec, 2023 1 commit
  6. 22 Dec, 2023 1 commit
  7. 20 Dec, 2023 1 commit
  8. 21 Sep, 2023 2 commits
  9. 20 Sep, 2023 1 commit
  10. 13 Sep, 2023 1 commit
  11. 11 Sep, 2023 1 commit
  12. 09 Sep, 2023 1 commit
  13. 04 Sep, 2023 2 commits
  14. 03 Sep, 2023 1 commit
  15. 30 Aug, 2023 2 commits
  16. 27 Aug, 2023 1 commit
  17. 24 Aug, 2023 1 commit
  18. 21 Aug, 2023 1 commit
  19. 20 Aug, 2023 1 commit
  20. 19 Aug, 2023 1 commit
  21. 18 Aug, 2023 4 commits
  22. 17 Aug, 2023 1 commit
  23. 15 Aug, 2023 1 commit
    • Xuechen Li's avatar
      enable loading hf llama checkpoints for training (#446) · 0f7853c6
      Xuechen Li authored
      * prelim.
      
      * add hf conversion fn.
      
      * mlp.
      
      * change name.
      
      * fix bug.
      
      * inverse permute.
      
      * change comment.
      
      * revert style changes.
      
      * fix.
      
      * add doc.
      
      * revert.
      
      * enable load safe.
      
      * fix safe load.
      
      * fix import.
      
      * fix typing-related lints.
      
      * fix ckpt loading logic.
      
      * make single gpu work.
      
      * test with parallel.
      
      * ckpt format.
      
      * enable pretrained state dict.
      
      * remove unused imports.
      
      * remove unused.
      
      * mark idea related.
      0f7853c6
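The "inverse permute" step above refers to undoing the head-wise permutation that the HF llama conversion applies to the q/k projection weights (to match HF's rotary-embedding layout). A minimal numpy sketch, with illustrative shapes and function names that are assumptions, not the repo's actual API:

```python
import numpy as np

# Illustrative sizes (not taken from the commit): 4 heads of dim 8.
n_heads, head_dim = 4, 8
dim = n_heads * head_dim

def permute(w):
    # Reorder each head's rows into HF llama's rotary layout:
    # split head rows into (half, pair) blocks and swap those axes.
    return (w.reshape(n_heads, head_dim // 2, 2, dim)
             .transpose(0, 2, 1, 3)
             .reshape(dim, dim))

def inverse_permute(w):
    # Undo the permutation when loading an HF checkpoint back
    # into the non-HF weight layout.
    return (w.reshape(n_heads, 2, head_dim // 2, dim)
             .transpose(0, 2, 1, 3)
             .reshape(dim, dim))

w = np.arange(dim * dim, dtype=np.float32).reshape(dim, dim)
# Round trip recovers the original weight matrix exactly.
assert np.array_equal(inverse_permute(permute(w)), w)
```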
  24. 29 Jul, 2023 1 commit
  25. 26 Jul, 2023 1 commit
  26. 23 Jul, 2023 5 commits
  27. 30 May, 2023 1 commit