1. 16 Mar, 2023 7 commits
    • Update tiny model creation script (#22202) · 4c5c0af7
      Yih-Dar authored
      
      
      * Update UNCONVERTIBLE_MODEL_ARCHITECTURES
      
      * Deal with 2 model tester classes in single test file
      
      * Deal with 2 model tester classes in single test file
      
      * Deal with 2 model tester classes in single test file
      
      * make style and quality
      
      ---------
      Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
    • LLaMA Implementation (#21955) · 464d4207
      Jason Phang authored
      
      
      * LLaMA
      
      * sharding and docs
      
      * tweak
      
      * black
      
      * inits
      
      * ruff
      
      * LLAMA_PRETRAINED_CONFIG_ARCHIVE_MAP
      
      * init
      
      * no checkpoint
      
      * docs
      
      * ruff
      
      * type_vocab_size
      
      * tokenizer fixes
      
      * tokenizer fixes
      
      * Update tokenization_llama.py
      
      * Update tokenization_llama.py
      
      * Update configuration_llama.py
      
      * Update modeling_llama.py
      
      * tokenizer add_bos by default
      
      * licenses
      
      * remove decoder
      
      * norms and mlp
      
      * rope overhaul
      
      * tweaks
      
      * black
      
      * mention OPT implementation
      
      * off-by-one naming
      
      * typo
      
      * fix
      
      * tokenization fix and slicing bug
      
      * padding config
      
      * cleanup
      
      * black
      
      * update tests
      
      * undo typo
      
      * fix vocab caching logic
      
      * ruff
      
      * docbuilder
      
      * attn fix from BlackSamorez
      
      * initial feedback
      
      * typo
      
      * docs
      
      * llama case
      
      * llama case
      
      * load checkpoint docs
      
      * comment about tokenizer
      
      * tokenizer defaults
      
      * clear past_key_values if use_cache=False
      
      * last tweaks
      
      * last tweaks
      
      * last tweaks
      
      * last tweaks
      
      ---------
      Co-authored-by: Stella Biderman <stellabiderman@gmail.com>
    • LLaMA Implementation (#21955) · 0041be5b
      Jason Phang authored
      
      
      * LLaMA
      
      * sharding and docs
      
      * tweak
      
      * black
      
      * inits
      
      * ruff
      
      * LLAMA_PRETRAINED_CONFIG_ARCHIVE_MAP
      
      * init
      
      * no checkpoint
      
      * docs
      
      * ruff
      
      * type_vocab_size
      
      * tokenizer fixes
      
      * tokenizer fixes
      
      * Update tokenization_llama.py
      
      * Update tokenization_llama.py
      
      * Update configuration_llama.py
      
      * Update modeling_llama.py
      
      * tokenizer add_bos by default
      
      * licenses
      
      * remove decoder
      
      * norms and mlp
      
      * rope overhaul
      
      * tweaks
      
      * black
      
      * mention OPT implementation
      
      * off-by-one naming
      
      * typo
      
      * fix
      
      * tokenization fix and slicing bug
      
      * padding config
      
      * cleanup
      
      * black
      
      * update tests
      
      * undo typo
      
      * fix vocab caching logic
      
      * ruff
      
      * docbuilder
      
      * attn fix from BlackSamorez
      
      * initial feedback
      
      * typo
      
      * docs
      
      * llama case
      
      * llama case
      
      * load checkpoint docs
      
      * comment about tokenizer
      
      * tokenizer defaults
      
      * clear past_key_values if use_cache=False
      
      * last tweaks
      
      * last tweaks
      
      * last tweaks
      
      * last tweaks
      
      ---------
      Co-authored-by: Stella Biderman <stellabiderman@gmail.com>
    • Italian Translation of migration.mdx (#22183) · 09922da4
      Baelish03 authored
      * Translation Italian: migration
      
      * Update migration.mdx
      
      minor fixes
      
      * Update _toctree.yml
      
      * Delete migration.mdx
      
      * Add italian translation of migration.mdx
      
      * Update of migration.mdx translation and toctree
    • 52a57f7c
      Yih-Dar authored
    • Fix typo in Align docs (#22199) · 1485bd9c
      Alara Dirik authored
      Fix align docs typo
    • Fix DeepSpeed CI (#22194) · 1c4a9acc
      Yih-Dar authored
      
      
      * Deal with torch-tensorrt
      
      ---------
      Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
  2. 15 Mar, 2023 5 commits
  3. 14 Mar, 2023 13 commits
  4. 13 Mar, 2023 15 commits