- 01 Jul, 2024 1 commit
-
-
JDKWangGuan authored
Update handling for KeyError in state_dict.pop() for non-existing keys.

Changed state_dict.pop(f"h.{d}.attn.bias") to state_dict.pop(f"h.{d}.attn.bias", None) to prevent KeyError exceptions.

The following code reproduces the issue:
```
from transformers import AutoTokenizer, GPT2Model, GPT2Config
from flash_attn.models.gpt import GPTLMHeadModel, GPTModel

# >>> transformers.__version__
# '4.38.2'
model_path = 'gpt2'
output_model_path = 'gpt2_model'
config = GPT2Config.from_pretrained(model_path, output_hidden_states=True)
model = GPT2Model.from_pretrained(model_path, from_tf=False, config=config)
'''
model fine-tuning here
'''
# dump the fine-tuned model
model.save_pretrained(output_model_path)

# load the fine-tuned model
config = GPT2Config.from_pretrained(output_model_path, output_hidden_states=True)
model = GPTModel.from_pretrained(output_model_path, config=config, strict=True)  # failed due to KeyError: 'h.0.attn.bias'
model = GPTLMHeadModel.from_pretrained(output_model_path, config=config, strict=True)  # failed due to KeyError: 'h.0.attn.bias'
```
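The difference between the two `pop()` forms can be shown in isolation. This is a minimal sketch, not the library's actual remapping code: the helper name `drop_attn_bias` and the toy `state_dict` contents are hypothetical, and plain integers stand in for tensors.

```python
def drop_attn_bias(state_dict, n_layer):
    """Remove per-layer attention bias buffers if they are present.

    Hypothetical helper for illustration only. Passing None as the default
    to dict.pop() makes a missing key a no-op instead of a KeyError.
    """
    for d in range(n_layer):
        state_dict.pop(f"h.{d}.attn.bias", None)
    return state_dict


# Toy checkpoint: layer 0 still carries the buffer, layer 1 does not.
state_dict = {
    "h.0.attn.bias": 0,
    "h.0.attn.c_attn.weight": 1,
    "h.1.attn.c_attn.weight": 2,
}

cleaned = drop_attn_bias(dict(state_dict), n_layer=2)

# Without the default, the missing layer-1 key raises KeyError.
try:
    dict(state_dict).pop("h.1.attn.bias")
    raised = False
except KeyError:
    raised = True
```

With the default in place, the same loop works whether or not the saved checkpoint contains the `attn.bias` entries.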
-
- 31 Jan, 2024 1 commit
-
-
Tri Dao authored
-
- 05 Jan, 2024 1 commit
-
-
Tri Dao authored
-
- 25 Dec, 2023 4 commits
- 23 Dec, 2023 1 commit
-
-
Tri Dao authored
-
- 22 Dec, 2023 1 commit
-
-
Tri Dao authored
-
- 20 Dec, 2023 1 commit
-
-
Tri Dao authored
-
- 21 Sep, 2023 2 commits
-
-
Yuchao Dai authored
-
Tri Dao authored
-
- 20 Sep, 2023 1 commit
-
-
Kevin Hu authored
* Fix llama MQA
* Fix permute shape
* Update llama.py
-
- 13 Sep, 2023 1 commit
-
-
Tri Dao authored
-
- 11 Sep, 2023 1 commit
-
-
Kevin Hu authored
-
- 09 Sep, 2023 1 commit
-
-
Kevin Hu authored
-
- 04 Sep, 2023 2 commits
- 03 Sep, 2023 1 commit
-
-
Tri Dao authored
-
- 30 Aug, 2023 2 commits
-
-
dan_the_3rd authored
Co-authored-by: danthe3rd <danthe3rd>
-
dan_the_3rd authored
Co-authored-by: danthe3rd <danthe3rd>
-
- 27 Aug, 2023 1 commit
-
-
Tri Dao authored
-
- 24 Aug, 2023 1 commit
-
-
Aman Gupta Karmani authored
-
- 21 Aug, 2023 1 commit
-
-
GAOXinyu authored
-
- 20 Aug, 2023 1 commit
-
-
Xuechen Li authored
* q
* add comment.
-
- 19 Aug, 2023 1 commit
-
-
Xuechen Li authored
* fix name.
* set inv function.
* add map back function.
* handle gqa.
* add type annotation to avoid confusion.
* fix docstr.
* test inverse remap logic.
-
- 18 Aug, 2023 4 commits
-
-
Tri Dao authored
-
Xuechen Li authored
* uneql rank.
* trim.
* enable passing in number of heads for each rank.
* simplify.
* simplify.
* cleanup.
* fix col parallel.
* fix bug with row parallel.
* fit out proj.
* refac.
* fix sharding logic.
* refac sharding.
* refac.
* support multiple of.
* make fn reuseable.
* fix bug in dimensions.
* scaffold.
* test uneven heads.
* fix test by adding barrier.
* refac.
* reuse code.
* clean up.
-
Tri Dao authored
-
Tri Dao authored
-
- 17 Aug, 2023 1 commit
-
-
Tri Dao authored
-
- 15 Aug, 2023 1 commit
-
-
Xuechen Li authored
* prelim.
* add hf convertion fn.
* mlp.
* change name.
* fix bug.
* inverse permute.
* change comment.
* revert style changes.
* fix.
* add doc.
* revert.
* enable load safe.
* fix safe load.
* fix import.
* fix typing-related lints.
* fix ckpt loading logic.
* make single gpu work.
* test with parallel.
* ckpt format.
* enable pretrained state dict.
* remove unused imports.
* remove unused.
* mark idea related.
-
- 29 Jul, 2023 1 commit
-
-
Tri Dao authored
-
- 26 Jul, 2023 1 commit
-
-
Haodong Lyu authored
-
- 23 Jul, 2023 5 commits
-
-
Tri Dao authored
-
Kiarash Jamali authored
-
Tri Dao authored
-
Tri Dao authored
-
Tri Dao authored
-
- 30 May, 2023 1 commit
-
-
Tri Dao authored
-