- 20 Mar, 2023 (1 commit)
Sylvain Gugger authored
* Update LLaMA conversion script
* Doc
* Fix the weight size for the 13B checkpoint
* Update src/transformers/models/llama/convert_llama_weights_to_hf.py

Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>
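For context, the script touched here converts the original LLaMA checkpoints into Hugging Face format. A minimal sketch of the workflow, assuming the script's `--input_dir`/`--model_size`/`--output_dir` flags and placeholder paths:

```python
# Hedged sketch: converting the original weights, then loading the result.
# The shell invocation below assumes the script's documented flags; all
# paths are placeholders.
#
#   python src/transformers/models/llama/convert_llama_weights_to_hf.py \
#       --input_dir /path/to/downloaded/llama/weights \
#       --model_size 13B \
#       --output_dir /path/to/hf/llama-13b

from transformers import LlamaForCausalLM, LlamaTokenizer

# Load the converted 13B checkpoint (the size whose weight shape this commit fixes).
tokenizer = LlamaTokenizer.from_pretrained("/path/to/hf/llama-13b")
model = LlamaForCausalLM.from_pretrained("/path/to/hf/llama-13b")
```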
- 17 Mar, 2023 (3 commits)
lewtun authored
* Add LlamaForSequenceClassification
* Update src/transformers/models/llama/modeling_llama.py (review suggestions co-authored by Younes Belkada)
* Add docstring
* Add test
* Add input embedding getter and setter
* Remove dead code

Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
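A minimal usage sketch of the class this commit adds; the checkpoint path and `num_labels=2` are placeholder assumptions:

```python
import torch
from transformers import LlamaForSequenceClassification, LlamaTokenizer

# The classification head is freshly initialized on top of a converted
# LLaMA checkpoint; path and label count are assumptions for illustration.
tokenizer = LlamaTokenizer.from_pretrained("/path/to/hf/llama-7b")
model = LlamaForSequenceClassification.from_pretrained(
    "/path/to/hf/llama-7b", num_labels=2
)

inputs = tokenizer("The movie was great!", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits  # shape: (batch, num_labels)
predicted_class = logits.argmax(dim=-1).item()
```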
Sylvain Gugger authored
* LLaMA house-keeping
* Doc links
Kevin Turner authored
- 16 Mar, 2023 (1 commit)
Jason Phang authored
* LLaMA
* sharding and docs
* tweak
* black
* inits
* ruff
* LLAMA_PRETRAINED_CONFIG_ARCHIVE_MAP
* init
* no checkpoint
* docs
* ruff
* type_vocab_size
* tokenizer fixes
* tokenizer fixes
* Update tokenization_llama.py
* Update tokenization_llama.py
* Update configuration_llama.py
* Update modeling_llama.py
* tokenizer add_bos by default
* licenses
* remove decoder
* norms and mlp
* rope overhaul
* tweaks
* black
* mention OPT implementation
* off-by-one naming
* typo
* fix
* tokenization fix and slicing bug
* padding config
* cleanup
* black
* update tests
* undo typo
* fix vocab caching logic
* ruff
* docbuilder
* attn fix from BlackSamorez
* initial feedback
* typo
* docs
* llama case
* llama case
* load checkpoint docs
* comment about tokenizer
* tokenizer defaults
* clear past_key_values if use_cache=False
* last tweaks

Co-authored-by: Stella Biderman <stellabiderman@gmail.com>
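A minimal sketch of the `LlamaForCausalLM`/`LlamaTokenizer` pair this commit introduces; the checkpoint path and generation settings are placeholder assumptions:

```python
from transformers import LlamaForCausalLM, LlamaTokenizer

# Paths are placeholders for a locally converted checkpoint.
# Per the commit, the tokenizer prepends BOS by default.
tokenizer = LlamaTokenizer.from_pretrained("/path/to/hf/llama-7b")
model = LlamaForCausalLM.from_pretrained("/path/to/hf/llama-7b")

inputs = tokenizer("The capital of France is", return_tensors="pt")
# use_cache=True (the default) reuses past_key_values across decoding
# steps; per the commit, past_key_values are cleared when use_cache=False.
outputs = model.generate(**inputs, max_new_tokens=20, use_cache=True)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```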