1. 06 Aug, 2024 15 commits
    • 🌐 [i18n-KO] Translated `mask_generation.md` to Korean (#32257) · 5301b981
      timdalxx authored
      
      
      * docs: ko: tasks/mask_generation.md
      
      * feat: nmt draft
      
      * fix: toc local

      * fix: manual edits

      * fix: ko-toctree
      
      * fix: resolve suggestions
      Co-authored-by: boyunJang <gobook1234@naver.com>
      Co-authored-by: Chaewon Song <chaewon1019@ewhain.net>
      
      * fix: resolve suggestions
      Co-authored-by: boyunJang <gobook1234@naver.com>
      Co-authored-by: Chaewon Song <chaewon1019@ewhain.net>
      
      * fix: resolve suggestions
      
      * fix: resolve suggestions
      
      * fix: resolve suggestions
      
      ---------
      Co-authored-by: boyunJang <gobook1234@naver.com>
      Co-authored-by: Chaewon Song <chaewon1019@ewhain.net>
      5301b981
    • Revert "fixes to properly shard FSDP across cpu and meta for... · ac2707e8
      Matthew Douglas authored
      Revert "fixes to properly shard FSDP across cpu and meta for cpu_effcient_loading for prequantized 4bit (#32276)" (#32477)
      
      * Revert "fixes to properly shard FSDP across cpu and meta for cpu_efficient_loading for prequantized 4bit (#32276)"
      
      This reverts commit 62c60a30.
      
      We uncovered an issue with this change that caused our training runs to hang.
      
      * `is_torchdynamo_compiling` -- cast a wide exception net (#32476)
      
      * cast a wide net
      
      * make fix-copies with a few manual changes
      
      * add copied from
      
      ---------
      Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
      ac2707e8
    • `is_torchdynamo_compiling` -- cast a wide exception net (#32476) · 4fdc7020
      Joao Gante authored
      * cast a wide net
      
      * make fix-copies with a few manual changes
      
      * add copied from
      4fdc7020
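      A minimal sketch of the "wide exception net" idea from the commit above, assuming nothing beyond public torch APIs; the function body is illustrative, not the exact library code:

      ```python
      import torch


      def is_torchdynamo_compiling() -> bool:
          # Probe the compiler APIs defensively: any failure (missing module, missing
          # attribute, or anything else an unusual torch build raises) means
          # "not compiling" rather than an error at import or run time.
          try:
              return torch.compiler.is_compiling()
          except Exception:
              try:
                  import torch._dynamo as dynamo  # older fallback location

                  return dynamo.is_compiling()
              except Exception:  # the "wide net": never let the probe itself fail
                  return False
      ```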
    • dev version 4.45.0 · 26a9443d
      Arthur Zucker authored
      26a9443d
    • Documentation: BOS token_id deprecation change for NLLB (#32443) · 50c3ba88
      Chris Toukmaji authored
      Update nllb.md
      50c3ba88
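      For context on the documented change, a short NLLB translation sketch that looks up the target-language BOS id with `convert_tokens_to_ids` rather than the deprecated `lang_code_to_id` mapping; the checkpoint and language codes are example choices, not taken from the commit:

      ```python
      from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

      checkpoint = "facebook/nllb-200-distilled-600M"  # example checkpoint
      tokenizer = AutoTokenizer.from_pretrained(checkpoint, src_lang="eng_Latn")
      model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)

      inputs = tokenizer("Life is like a box of chocolates.", return_tensors="pt")
      translated = model.generate(
          **inputs,
          # look the BOS id up by token string instead of lang_code_to_id
          forced_bos_token_id=tokenizer.convert_tokens_to_ids("fra_Latn"),
          max_length=40,
      )
      print(tokenizer.batch_decode(translated, skip_special_tokens=True)[0])
      ```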
    • Migrate import checks to not need accelerate, and be more clear on min versions (#32292) · 194cf1f3
      Zach Mueller authored
      * Migrate import checks to secondary accelerate calls
      
      * better errs too
      
      * Revert, just keep the import checks + remove accelerate-specific things
      
      * Rm extra'
      
      * Empty commit for ci
      
      * Small nits
      
      * Final
      194cf1f3
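      A hedged sketch of the import-check pattern described above: query the installed accelerate version with `importlib.metadata` instead of importing the package, and state the required minimum explicitly. The helper name and the minimum version string are illustrative assumptions, not values from the commit:

      ```python
      import importlib.metadata
      import importlib.util

      from packaging import version

      ACCELERATE_MIN_VERSION = "0.26.0"  # illustrative value, not taken from the commit


      def require_min_accelerate(min_version: str = ACCELERATE_MIN_VERSION) -> None:
          # Check availability without importing the package itself.
          if importlib.util.find_spec("accelerate") is None:
              raise ImportError(f"This feature requires `accelerate>={min_version}`: run `pip install accelerate`.")
          installed = importlib.metadata.version("accelerate")
          if version.parse(installed) < version.parse(min_version):
              raise ImportError(
                  f"`accelerate>={min_version}` is required, but {installed} is installed. "
                  "Run `pip install -U accelerate`."
              )
      ```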
    • Add codestral mamba2 (#32080) · 80b90e7b
      Pablo Montalvo authored
      * add new model like
      
      * draft cuda forward - mismatched keys (sharding on conv1)
      
      * match keys successfully
      
      * fix split
      
      * get generation/forward running (wrong gens, norm?)
      
      * update
      
      * some refactoring
      
      * fixes
      
      * works up until copy to cache
      
      * fix
      
      * update
      
      * NON WORKING VERSION
      
      * version that works?
      
      * nit
      
      * fix config
      
      * fix conversion script
      
      * working cuda forward
      
      * nit
      
      * update
      
      * simplification
      
      * make mamba slow simple work
      
      * no einops
      
      * todo
      
      * fix style
      
      * no einops
      
      * update fix no einsum
      
      * nit
      
      * remove einops
      
      * bug: scan_output differs strongly
      
      * add rms norm option
      
      * fix fast + slow generation with and w/o cache

      * draft integration tests
      
      * remove a big chunk of the einsum
      
      * fix slow, fast generations, without any einsum
      
      * fix copies
      
      * fix structure
      
      * fix up modeling and tests
      
      * fix tests
      
      * clamping is indeed worse
      
      * recover mamba2 cache test
      
      * fix copies
      
      * no cache position (yet)
      
      * fix tf tests
      
      * fix matmul for generate
      
      * fixup
      
      * skip cache tests for now
      
      * [run-slow]mamba2
      
      * tune out hidden states for padding
      
      * test batched generation
      
      * propagate attention mask changes
      
      * fix past length
      
      * fix integration test
      
      * style
      
      * address comments
      
      * update readme
      
      * add mamba2 version check
      
      * fix tests
      
      * [run-slow]mamba2
      
      * skip edge tests
      
      * [run-slow]mamba2
      
      * last fixup
      
      * [run-slow]mamba2
      
      * update README
      
      ---------
      Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com>
      80b90e7b
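      A minimal usage sketch for the newly added Mamba2 architecture; the checkpoint id below is an assumed example, not specified by the commit:

      ```python
      import torch
      from transformers import AutoModelForCausalLM, AutoTokenizer

      model_id = "mistralai/Mamba-Codestral-7B-v0.1"  # assumed example checkpoint
      tokenizer = AutoTokenizer.from_pretrained(model_id)
      model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16, device_map="auto")

      inputs = tokenizer("def fibonacci(n):", return_tensors="pt").to(model.device)
      # use_cache exercises the Mamba2 cache path worked on in the commits above
      outputs = model.generate(**inputs, max_new_tokens=64, use_cache=True)
      print(tokenizer.decode(outputs[0], skip_special_tokens=True))
      ```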
    • Generate: fix end to end compilation (#32465) · 3d8bd119
      Joao Gante authored
      3d8bd119
    • Add Nemotron HF Support (#31699) · 6a03942d
      Ao Tang authored
      * Add nemotron support
      
      * fix inference
      
      * add unit test
      
      * add layernorm1p as a class to avoid meta device mismatch
      
      * test fixed
      
      * Add copied_from statements
      
      * remove pretraining_tp args
      
      * remove nemotronlayernorm
      
      * force LN computation done in FP32
      
      * remove nemotrontokenizer and use llamatokenizer
      
      * license update
      
      * add option for kv_channels for minitron8b
      
      * remove assert
      
      * o_proj fixed
      
      * o_proj reshape
      
      * add gated_proj option
      
      * typo
      
      * remove todos
      
      * fix broken test after merging latest main
      
      * remove nezha/nat after merging main

      * change default config to 15b model
      
      * add nemo conversion script
      
      * rename conversion script
      
      * remove gate_proj option
      
      * pr comment resolved
      
      * fix unit test
      
      * rename kv_channels to head_dim
      
      * resolve PR issue
      
      * add nemotron md
      
      * fix broken tests
      
      * refactor rope for nemotron
      
      * test fix
      
      * remove linearscaling
      
      * whitespace and import
      
      * fix some copied-from
      
      * code style fix
      
      * reformatted
      
      * add position_embedding to nemotronattention
      
      * rope refactor to only use config, copied-from fix
      
      * format
      
      * Run make fix-copies
      
      * nemotron md with autodoc
      
      * doc fix
      
      * fix order
      
      * pass check_config_docstrings.py
      
      * fix config_attributes
      
      * remove all llama BC related code
      
      * Use PreTrainedTokenizerFast
      
      * ruff check examples
      
      * conversion script update
      
      * add nemotron to toctree
      6a03942d
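      A hedged sketch of the two LayerNorm details called out in this commit (the "1p" weight-plus-one parameterization and forcing the normalization math into FP32); the class name is illustrative and this is not the exact modeling code:

      ```python
      import torch
      from torch import nn


      class LayerNorm1P(nn.LayerNorm):
          def forward(self, x: torch.Tensor) -> torch.Tensor:
              input_dtype = x.dtype
              # run the normalization in float32 for numerical stability
              out = nn.functional.layer_norm(
                  x.float(),
                  self.normalized_shape,
                  (self.weight + 1).float(),  # "1p": the stored weight is an offset from 1
                  None if self.bias is None else self.bias.float(),
                  self.eps,
              )
              return out.to(input_dtype)
      ```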
    • Dependencies: fix typo (#32389) · 36fd35e1
      Joao Gante authored
      deps_2
      36fd35e1
    • Update kwargs validation for `preprocess` with decorator (#32024) · fb66ef81
      Pavel Iakubovskii authored
      * BLIP preprocess
      
      * BIT preprocess
      
      * BRIDGETOWER preprocess
      
      * CHAMELEON preprocess
      
      * CHINESE_CLIP preprocess
      
      * CONVNEXT preprocess
      
      * DEIT preprocess
      
      * DONUT preprocess
      
      * DPT preprocess
      
      * FLAVA preprocess
      
      * EFFICIENTNET preprocess
      
      * FUYU preprocess
      
      * GLPN preprocess
      
      * IMAGEGPT preprocess
      
      * INSTRUCTBLIPVIDEO preprocess
      
      * VIVIT preprocess
      
      * ZOEDEPTH preprocess
      
      * VITMATTE preprocess
      
      * VIT preprocess
      
      * VILT preprocess
      
      * VIDEOMAE preprocess
      
      * VIDEOLLAVA
      
      * TVP processing
      
      * TVP fixup
      
      * SWIN2SR preprocess
      
      * SIGLIP preprocess
      
      * SAM preprocess
      
      * RT-DETR preprocess
      
      * PVT preprocess
      
      * POOLFORMER preprocess
      
      * PERCEIVER preprocess
      
      * OWLVIT preprocess
      
      * OWLV2 preprocess
      
      * NOUGAT preprocess
      
      * MOBILEVIT preprocess
      
      * MOBILENETV2 preprocess
      
      * MOBILENETV1 preprocess
      
      * LEVIT preprocess
      
      * LAYOUTLMV2 preprocess
      
      * LAYOUTLMV3 preprocess
      
      * Add test
      
      * Update tests
      fb66ef81
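      A hedged sketch of the decorator-based kwargs validation applied to the `preprocess` methods above: inspect the wrapped function's signature and warn about, then drop, any keyword argument it does not accept. The decorator name is illustrative, not the library's exact utility:

      ```python
      import inspect
      import warnings
      from functools import wraps


      def validate_preprocess_kwargs(func):
          """Warn about and drop kwargs that ``func`` does not declare in its signature."""
          valid = set(inspect.signature(func).parameters)

          @wraps(func)
          def wrapper(*args, **kwargs):
              unexpected = set(kwargs) - valid
              if unexpected:
                  warnings.warn(f"Unused or unrecognized kwargs: {sorted(unexpected)}.")
                  kwargs = {k: v for k, v in kwargs.items() if k in valid}
              return func(*args, **kwargs)

          return wrapper
      ```

      Applied to an image processor's `preprocess`, a call such as `preprocess(images, size=..., unknown_flag=True)` would then warn about and discard `unknown_flag` instead of silently ignoring it.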
    • add the missing flash attention test marker (#32419) · e85d8639
      Fanli Lin authored
      * add flash attention check
      
      * fix
      
      * fix
      
      * add the missing marker
      
      * bug fix
      
      * add one more
      
      * remove order
      
      * add one more
      e85d8639
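      An illustrative example of the flash attention test marker being added here; the test name and body are placeholders, while `require_flash_attn` and `require_torch_gpu` are markers from `transformers.testing_utils`:

      ```python
      from transformers.testing_utils import require_flash_attn, require_torch_gpu


      @require_flash_attn
      @require_torch_gpu
      def test_flash_attn_2_inference_equivalence():
          # Test body elided: the markers make the test skip cleanly when
          # flash-attn or a CUDA GPU is unavailable instead of erroring out.
          ...
      ```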
    • Llava: fix checkpoint_doc (#32458) · 0aa83282
      Prakarsh Kaushik authored
      fix: add new llava like model bug
      0aa83282
    • Cache: create docs (#32150) · 37c5ca5e
      Raushan Turganbay authored
      
      
      * draft
      
      * updates
      
      * works?
      
      * try adding python example in hidden section
      
      * another try
      
      * how do I render python
      
      * format as html code?
      
      * Update docs/source/en/kv_cache.md
      Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

      * Update docs/source/en/kv_cache.md
      Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

      * Update docs/source/en/kv_cache.md
      Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

      * Update docs/source/en/kv_cache.md
      Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

      * Update docs/source/en/kv_cache.md
      Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
      
      * one more small update
      
      * should render hidden section now
      
      * add outputs
      
      * fix links
      
      * check links
      
      * update all links
      
      * update with offloaded cache
      
      * all cache classes are importable, so they appear in docs
      
      * fix copies
      
      * docstring...
      
      ---------
      Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
      37c5ca5e
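      A short sketch in the spirit of the new cache docs: pass an explicit, importable cache object (here `DynamicCache`) to `generate`. The checkpoint id is an example of a model supporting the Cache API, not taken from the commit:

      ```python
      import torch
      from transformers import AutoModelForCausalLM, AutoTokenizer, DynamicCache

      model_id = "mistralai/Mistral-7B-Instruct-v0.1"  # example checkpoint supporting the Cache API
      tokenizer = AutoTokenizer.from_pretrained(model_id)
      model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16, device_map="auto")

      inputs = tokenizer("The key-value cache stores", return_tensors="pt").to(model.device)
      past_key_values = DynamicCache()  # cache classes are importable from the top level, as noted above
      outputs = model.generate(**inputs, past_key_values=past_key_values, max_new_tokens=20, use_cache=True)
      print(tokenizer.decode(outputs[0], skip_special_tokens=True))
      ```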
  2. 05 Aug, 2024 10 commits
  3. 03 Aug, 2024 2 commits
    • MixtralFlashAttention2: put "plus 1" inside parentheses when calculating... · 621fb3c0
      Xueshen Liu authored
      MixtralFlashAttention2: put "plus 1" inside parentheses when calculating rotary_seq_len, allowing None position_ids input. (#31500)
      
      * Mixtral: remove unnecessary plus 1 when calculating rotary_seq_len, allowing position_ids=None (no auto position_ids generation could be unsafe)
      
      * fix typo [:-1] to [:, -1]
      
      * to meet formatting requirement
      
      * to meet formatting requirement
      
      * remove white space
      
      * MixtralFlashAttention2: put "+ 1" inside parentheses when calculating rotary_seq_len, allowing None position_ids input. Fix format/style issue.
      
      * propagate to starcoder2, phi3, mixtral and qwen2
      
      * update qwen2_moe
      621fb3c0
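      A hedged sketch of the `rotary_seq_len` computation this commit changes: the "+ 1" moves inside the `max(...)` so it only applies to the `position_ids` branch, and the value falls back to `kv_seq_len` when `position_ids` is `None`. The helper name is illustrative, not the verbatim modeling code:

      ```python
      import torch


      def compute_rotary_seq_len(kv_seq_len, position_ids=None):
          # Fall back to kv_seq_len when no position_ids are provided (the case
          # the old expression could not handle).
          if position_ids is None:
              return kv_seq_len
          # position_ids[:, -1] (not [:-1], per the typo fix above) holds the last
          # position of each sequence in the batch; + 1 turns it into a length.
          return max(kv_seq_len, position_ids[:, -1].max().item() + 1)
      ```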
    • fix: (issue #32124) Exception raised when running... · 7c31d05b
      Shaopeng Fu authored
      fix: (issue #32124) Exception raised when running `transformers/examples/flax/language-modeling/t5_tokenizer_model.py`. (#32157)
      
      fix: Exception raised when running `transformers/examples/flax/language-modeling/t5_tokenizer_model.py`.
      7c31d05b
  4. 02 Aug, 2024 3 commits
  5. 01 Aug, 2024 10 commits