• LLaMA Implementation (#21955) · 0041be5b
    Jason Phang authored
    
    
    * LLaMA
    
    * sharding and docs
    
    * tweak
    
    * black
    
    * inits
    
    * ruff
    
    * LLAMA_PRETRAINED_CONFIG_ARCHIVE_MAP
    
    * init
    
    * no checkpoint
    
    * docs
    
    * ruff
    
    * type_vocab_size
    
    * tokenizer fixes
    
    * tokenizer fixes
    
    * Update tokenization_llama.py
    
    * Update tokenization_llama.py
    
    * Update configuration_llama.py
    
    * Update modeling_llama.py
    
    * tokenizer add_bos by default
    
    * licenses
    
    * remove decoder
    
    * norms and mlp
    
    * rope overhaul
    
    * tweaks
    
    * black
    
    * mention OPT implementation
    
    * off-by-one naming
    
    * typo
    
    * fix
    
    * tokenization fix and slicing bug
    
    * padding config
    
    * cleanup
    
    * black
    
    * update tests
    
    * undo typo
    
    * fix vocab caching logic
    
    * ruff
    
    * docbuilder
    
    * attn fix from BlackSamorez
    
    * initial feedback
    
    * typo
    
    * docs
    
    * llama case
    
    * llama case
    
    * load checkpoint docs
    
    * comment about tokenizer
    
    * tokenizer defaults
    
    * clear past_key_values if use_cache=False
    
    * last tweaks
    
    * last tweaks
    
    * last tweaks
    
    * last tweaks
    
    ---------
    Co-authored-by: Stella Biderman <stellabiderman@gmail.com>
README_es.md 83.7 KB