1. 24 Jun, 2024 1 commit
  2. 14 Jun, 2024 1 commit
  3. 23 May, 2024 1 commit
  4. 21 May, 2024 1 commit
  5. 15 May, 2024 1 commit
  6. 07 May, 2024 1 commit
  7. 06 May, 2024 1 commit
  8. 02 May, 2024 1 commit
    • mobicham's avatar
      Add HQQ quantization support (#29637) · 59952994
      mobicham authored
      
      
      * update HQQ transformers integration
      
      * push import_utils.py
      
      * add force_hooks check in modeling_utils.py
      
      * fix | with Optional
      
      * force bias as param
      
      * check bias is Tensor
      
      * force forward for multi-gpu
      
      * review fixes pass
      
      * remove torch grad()
      
      * if any key in linear_tags fix
      
      * add cpu/disk check
      
      * isinstance return
      
      * add multigpu test + refactor tests
      
      * clean hqq_utils imports in hqq.py
      
      * clean hqq_utils imports in quantizer_hqq.py
      
      * delete hqq_utils.py
      
      * Delete src/transformers/utils/hqq_utils.py
      
      * ruff init
      
      * remove torch.float16 from __init__ in test
      
      * refactor test
      
      * isinstance -> type in quantizer_hqq.py
      
      * cpu/disk device_map check in quantizer_hqq.py
      
      * remove type(module) nn.linear check in quantizer_hqq.py
      
      * add BaseQuantizeConfig import inside HqqConfig init
      
      * remove hqq import in hqq.py
      
      * remove accelerate import from test_hqq.py
      
      * quant config.py doc update
      
      * add hqqconfig to main_classes doc
      
      * make style
      
      * __init__ fix
      
      * ruff __init__
      
      * skip_modules list
      
      * hqqconfig format fix
      
      * hqqconfig doc fix
      
      * hqqconfig doc fix
      
      * hqqconfig doc fix
      
      * hqqconfig doc fix
      
      * hqqconfig doc fix
      
      * hqqconfig doc fix
      
      * hqqconfig doc fix
      
      * hqqconfig doc fix
      
      * hqqconfig doc fix
      
      * test_hqq.py remove mistral comment
      
      * remove self.using_multi_gpu is False
      
      * torch_dtype default val set and logger.info
      
      * hqq.py isinstance fix
      
      * remove torch=None
      
      * torch_device test_hqq
      
      * rename test_hqq
      
      * MODEL_ID in test_hqq
      
      * quantizer_hqq setattr fix
      
      * quantizer_hqq typo fix
      
      * imports quantizer_hqq.py
      
      * isinstance quantizer_hqq
      
      * hqq_layer.bias reformat quantizer_hqq
      
      * Step 2 as comment in quantizer_hqq
      
      * prepare_for_hqq_linear() comment
      
      * keep_in_fp32_modules fix
      
      * HqqHfQuantizer reformat
      
      * quantization.md hqqconfig
      
      * quantization.md model example reformat
      
      * quantization.md # space
      
      * quantization.md space   })
      
      * quantization.md space   })
      
      * quantization_config fix doc
      Co-authored-by: default avataramyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * axis value check in quantization_config
      
      * format
      
      * dynamic config explanation
      
      * quant config method in quantization.md
      
      * remove shard-level progress
      
      * .cuda fix modeling_utils
      
      * test_hqq fixes
      
      * make fix-copies
      
      ---------
      Co-authored-by: default avataramyeroberts <22614925+amyeroberts@users.noreply.github.com>
      59952994
  9. 22 Apr, 2024 1 commit
  10. 11 Apr, 2024 1 commit
  11. 28 Mar, 2024 1 commit
  12. 27 Mar, 2024 1 commit
    • huismiling's avatar
      add Cambricon MLUs support (#29627) · 75769744
      huismiling authored
      * add Cambricon MLUs support
      
      * fix mlu device rng state
      
      * up for quality check
      
      * up mlu to support fp16
      
      * fix mlu device dependency error
      
      * fix mlu device dependency error
      
      * enable mlu device for bf16
      
      * fix mlu device memory tracker
      75769744
  13. 26 Mar, 2024 1 commit
  14. 19 Mar, 2024 1 commit
  15. 15 Mar, 2024 2 commits
  16. 11 Mar, 2024 1 commit
  17. 05 Mar, 2024 1 commit
    • Arthur's avatar
      [`Add Mamba`] Adds support for the `Mamba` models (#28094) · fb1c62e9
      Arthur authored
      
      
      * initial-commit
      
      * start cleaning
      
      * small nits
      
      * small nits
      
      * current updates
      
      * add kernels
      
      * small refactoring little step
      
      * add comments
      
      * styling
      
      * nit
      
      * nits
      
      * Style
      
      * Small changes
      
      * Push dummy mambda simple slow
      
      * nit
      
      * Use original names
      
      * Use original names and remove norm
      
      * Updates for inference params
      
      * Style nd updates
      
      * nits
      
      * Match logits
      
      * Add a test
      
      * Add expected generated text
      
      * nits doc, imports and styling
      
      * style
      
      * oups
      
      * dont install kernels, invite users to install the required kernels
      
      * let use use the original packages
      
      * styling
      
      * nits
      
      * fix some copieds
      
      * update doc
      
      * fix-copies
      
      * styling done
      
      * nits
      
      * fix import check
      
      * run but wrong cuda ress
      
      * mamba CUDA works :)
      
      * fix the fast path
      
      * config naming nits
      
      * conversion script is not required at this stage
      
      * finish fixing the fast path: generation make sense now!
      
      * nit
      
      * Let's start working on the CIs
      
      * style
      
      * better style
      
      * more nits
      
      * test nit
      
      * quick fix for now
      
      * nits
      
      * nit
      
      * nit
      
      * nit
      
      * nits
      
      * update test rest
      
      * fixup
      
      * update test
      
      * nit
      
      * some fixes
      
      * nits
      
      * update test values
      
      * fix styling
      
      * nit
      
      * support peft
      
      * integrations tests require torchg
      
      * also add slow markers
      
      * styling
      
      * chose forward wisely
      
      * nits
      
      * update tests
      
      * fix gradient checkpointing
      
      * fixup
      
      * nit
      
      * fix doc
      
      * check copies
      
      * fix the docstring
      
      * fix some more tests
      
      * style
      
      * fix beam search
      
      * add init schene
      
      * update
      
      * nit
      
      * fix
      
      * fixup the doc
      
      * fix the doc
      
      * fixup
      
      * tentative update but slow is no longer good
      
      * nit
      
      * should we always use float32?
      
      * nits
      
      * revert wrong changes
      
      * res in float32
      
      * cleanup
      
      * skip fmt for now
      
      * update generation values
      
      * update test values running original model
      
      * fixup
      
      * update tests + rename inference_params to cache_params + make sure training does not use cache_params
      
      * small nits
      
      * more nits
      
      * fix final CIs
      
      * style
      
      * nit doc
      
      * I hope final doc nits
      
      * nit
      
      * 馃珷
      
      * final touch!
      
      * fix torch import
      
      * Apply suggestions from code review
      Co-authored-by: default avatarLysandre Debut <hi@lysand.re>
      
      * Apply suggestions from code review
      
      * fix fix and fix
      
      * fix base model prefix!
      
      * nit
      
      * Update src/transformers/models/mamba/__init__.py
      
      * Update docs/source/en/model_doc/mamba.md
      Co-authored-by: default avatarLysandre Debut <hi@lysand.re>
      
      * nit
      
      ---------
      Co-authored-by: default avatarLysandre Debut <hi@lysand.re>
      fb1c62e9
  18. 04 Mar, 2024 1 commit
  19. 26 Feb, 2024 2 commits
  20. 14 Feb, 2024 1 commit
  21. 13 Feb, 2024 1 commit
  22. 01 Feb, 2024 1 commit
  23. 11 Jan, 2024 1 commit
  24. 08 Jan, 2024 1 commit
  25. 03 Jan, 2024 1 commit
    • Connor Henderson's avatar
      Add FastSpeech2Conformer (#23439) · d83ff5ee
      Connor Henderson authored
      * start - docs, SpeechT5 copy and rename
      
      * add relevant code from FastSpeech2 draft, have tests pass
      
      * make it an actual conformer, demo ex.
      
      * matching inference with original repo, includes debug code
      
      * refactor nn.Sequentials, start more desc. var names
      
      * more renaming
      
      * more renaming
      
      * vocoder scratchwork
      
      * matching vocoder outputs
      
      * hifigan vocoder conversion script
      
      * convert model script, rename some config vars
      
      * replace postnet with speecht5's implementation
      
      * passing common tests, file cleanup
      
      * expand testing, add output hidden states and attention
      
      * tokenizer + passing tokenizer tests
      
      * variety of updates and tests
      
      * g2p_en pckg setup
      
      * import structure edits
      
      * docstrings and cleanup
      
      * repo consistency
      
      * deps
      
      * small cleanup
      
      * forward signature param order
      
      * address comments except for masks and labels
      
      * address comments on attention_mask and labels
      
      * address second round of comments
      
      * remove old unneeded line
      
      * address comments part 1
      
      * address comments pt 2
      
      * rename auto mapping
      
      * fixes for failing tests
      
      * address comments part 3 (bart-like, train loss)
      
      * make style
      
      * pass config where possible
      
      * add forward method + tests to WithHifiGan model
      
      * make style
      
      * address arg passing and generate_speech comments
      
      * address Arthur comments
      
      * address Arthur comments pt2
      
      * lint  changes
      
      * Sanchit comment
      
      * add g2p-en to doctest deps
      
      * move up self.encoder
      
      * onnx compatible tensor method
      
      * fix is symbolic
      
      * fix paper url
      
      * move models to espnet org
      
      * make style
      
      * make fix-copies
      
      * update docstring
      
      * Arthur comments
      
      * update docstring w/ new updates
      
      * add model architecture images
      
      * header size
      
      * md wording update
      
      * make style
      d83ff5ee
  26. 11 Dec, 2023 1 commit
  27. 08 Dec, 2023 1 commit
    • fxmarty's avatar
      F.scaled_dot_product_attention support (#26572) · 80377eb0
      fxmarty authored
      
      
      * add sdpa
      
      * wip
      
      * cleaning
      
      * add ref
      
      * yet more cleaning
      
      * and more :)
      
      * wip llama
      
      * working llama
      
      * add output_attentions=True support
      
      * bigcode sdpa support
      
      * fixes
      
      * gpt-bigcode support, require torch>=2.1.1
      
      * add falcon support
      
      * fix conflicts falcon
      
      * style
      
      * fix attention_mask definition
      
      * remove output_attentions from attnmaskconverter
      
      * support whisper without removing any Copied from statement
      
      * fix mbart default to eager renaming
      
      * fix typo in falcon
      
      * fix is_causal in SDPA
      
      * check is_flash_attn_2_available in the models init as well in case the model is not initialized through from_pretrained
      
      * add warnings when falling back on the manual implementation
      
      * precise doc
      
      * wip replace _flash_attn_enabled by config.attn_implementation
      
      * fix typo
      
      * add tests
      
      * style
      
      * add a copy.deepcopy on the config in from_pretrained, as we do not want to modify it inplace
      
      * obey to config.attn_implementation if a config is passed in from_pretrained
      
      * fix is_torch_sdpa_available when torch is not installed
      
      * remove dead code
      
      * Update src/transformers/modeling_attn_mask_utils.py
      Co-authored-by: default avatarArthur <48595927+ArthurZucker@users.noreply.github.com>
      
      * Update src/transformers/modeling_attn_mask_utils.py
      Co-authored-by: default avatarArthur <48595927+ArthurZucker@users.noreply.github.com>
      
      * Update src/transformers/modeling_attn_mask_utils.py
      Co-authored-by: default avatarArthur <48595927+ArthurZucker@users.noreply.github.com>
      
      * Update src/transformers/modeling_attn_mask_utils.py
      Co-authored-by: default avatarArthur <48595927+ArthurZucker@users.noreply.github.com>
      
      * Update src/transformers/modeling_attn_mask_utils.py
      Co-authored-by: default avatarArthur <48595927+ArthurZucker@users.noreply.github.com>
      
      * Update src/transformers/models/bart/modeling_bart.py
      Co-authored-by: default avatarArthur <48595927+ArthurZucker@users.noreply.github.com>
      
      * remove duplicate pretraining_tp code
      
      * add dropout in llama
      
      * precise comment on attn_mask
      
      * add fmt: off for _unmask_unattended docstring
      
      * precise num_masks comment
      
      * nuke pretraining_tp in LlamaSDPAAttention following Arthur's suggestion
      
      * cleanup modeling_utils
      
      * backward compatibility
      
      * fix style as requested
      
      * style
      
      * improve documentation
      
      * test pass
      
      * style
      
      * add _unmask_unattended tests
      
      * skip meaningless tests for idefics
      
      * hard_check SDPA requirements when specifically requested
      
      * standardize the use if XXX_ATTENTION_CLASSES
      
      * fix SDPA bug with mem-efficient backend on CUDA when using fp32
      
      * fix test
      
      * rely on SDPA is_causal parameter to handle the causal mask in some cases
      
      * fix FALCON_ATTENTION_CLASSES
      
      * remove _flash_attn_2_enabled occurences
      
      * fix test
      
      * add OPT to the list of supported flash models
      
      * improve test
      
      * properly test on different SDPA backends, on different dtypes & properly handle separately the pad tokens in the test
      
      * remove remaining _flash_attn_2_enabled occurence
      
      * Update src/transformers/modeling_utils.py
      Co-authored-by: default avatarArthur <48595927+ArthurZucker@users.noreply.github.com>
      
      * Update src/transformers/modeling_utils.py
      Co-authored-by: default avatarArthur <48595927+ArthurZucker@users.noreply.github.com>
      
      * Update src/transformers/modeling_utils.py
      Co-authored-by: default avatarArthur <48595927+ArthurZucker@users.noreply.github.com>
      
      * Update src/transformers/modeling_attn_mask_utils.py
      Co-authored-by: default avatarArthur <48595927+ArthurZucker@users.noreply.github.com>
      
      * Update docs/source/en/perf_infer_gpu_one.md
      Co-authored-by: default avatarArthur <48595927+ArthurZucker@users.noreply.github.com>
      
      * remove use_attn_implementation
      
      * fix docstring & slight bug
      
      * make attn_implementation internal (_attn_implementation)
      
      * typos
      
      * fix tests
      
      * deprecate use_flash_attention_2=True
      
      * fix test
      
      * add back llama that was removed by mistake
      
      * fix tests
      
      * remove _flash_attn_2_enabled occurences bis
      
      * add check & test that passed attn_implementation is valid
      
      * fix falcon torchscript export
      
      * fix device of mask in tests
      
      * add tip about torch.jit.trace and move bt doc below sdpa
      
      * fix parameterized.expand order
      
      * move tests from test_modeling_attn_mask_utils to test_modeling_utils as a relevant test class is already there
      
      * update sdpaattention class with the new cache
      
      * Update src/transformers/configuration_utils.py
      Co-authored-by: default avatarArthur <48595927+ArthurZucker@users.noreply.github.com>
      
      * Update src/transformers/models/bark/modeling_bark.py
      
      * address review comments
      
      * WIP torch.jit.trace fix. left: test both eager & sdpa
      
      * add test for torch.jit.trace for both eager/sdpa
      
      * fix falcon with torch==2.0 that needs to use sdpa
      
      * fix doc
      
      * hopefully last fix
      
      * fix key_value_length that has no default now in mask converter
      
      * is it flacky?
      
      * fix speculative decoding bug
      
      * tests do pass
      
      * fix following #27907
      
      ---------
      Co-authored-by: default avatarArthur <48595927+ArthurZucker@users.noreply.github.com>
      80377eb0
  28. 04 Dec, 2023 1 commit
  29. 24 Nov, 2023 1 commit
  30. 14 Nov, 2023 1 commit
  31. 02 Nov, 2023 1 commit
  32. 01 Nov, 2023 1 commit
  33. 26 Oct, 2023 1 commit
  34. 24 Oct, 2023 1 commit
    • Alex McKinney's avatar
      Device agnostic testing (#25870) · 9da45171
      Alex McKinney authored
      
      
      * adds agnostic decorators and availability fns
      
      * renaming decorators and fixing imports
      
      * updating some representative example tests
      bloom, opt, and reformer for now
      
      * wip device agnostic functions
      
      * lru cache to device checking functions
      
      * adds `TRANSFORMERS_TEST_DEVICE_SPEC`
      if present, imports the target file and updates device to function
      mappings
      
      * comments `TRANSFORMERS_TEST_DEVICE_SPEC` code
      
      * extra checks on device name
      
      * `make style; make quality`
      
      * updates default functions for agnostic calls
      
      * applies suggestions from review
      
      * adds `is_torch_available` guard
      
      * Add spec file to docs, rename function dispatch names to backend_*
      
      * add backend import to docs example for spec file
      
      * change instances of  to
      
      * Move register backend to before device check as per @statelesshz changes
      
      * make style
      
      * make opt test require fp16 to run
      
      ---------
      Co-authored-by: default avatararsalanu <arsalanu@graphcore.ai>
      Co-authored-by: default avatararsalanu <hzji210@gmail.com>
      9da45171
  35. 17 Oct, 2023 1 commit
  36. 13 Oct, 2023 1 commit
  37. 11 Oct, 2023 1 commit
  38. 26 Sep, 2023 1 commit
    • NielsRogge's avatar
      Add Nougat (#25942) · ace74d16
      NielsRogge authored
      
      
      * Add conversion script
      
      * Add NougatImageProcessor
      
      * Add crop margin
      
      * More improvements
      
      * Add docs, READMEs
      
      * Remove print statements
      
      * Include model_max_length
      
      * Add NougatTokenizerFast
      
      * Fix imports
      
      * Improve postprocessing
      
      * Improve image processor
      
      * Fix image processor
      
      * Improve normalize method
      
      * More improvements
      
      * More improvements
      
      * Add processor, improve docs
      
      * Simplify fast tokenizer
      
      * Remove test file
      
      * Fix docstrings
      
      * Use NougatProcessor in conversion script
      
      * Add is_levensthein_available
      
      * Add tokenizer tests
      
      * More improvements
      
      * Use numpy instead of opencv
      
      * Add is_cv2_available
      
      * Fix cv2_available
      
      * Add is_nltk_available
      
      * Add image processor tests, improve crop_margin
      
      * Add integration tests
      
      * Improve integration test
      
      * Use do_rescale instead of hacks, thanks Amy
      
      * Remove random_padding
      
      * Address comments
      
      * Address more comments
      
      * Add import
      
      * Address more comments
      
      * Address more comments
      
      * Address comment
      
      * Address comment
      
      * Set max_model_input_sizes
      
      * Add tests
      
      * Add requires_backends
      
      * Add Nougat to exotic tests
      
      * Use to_pil_image
      
      * Address comment regarding nltk
      
      * Add NLTK
      
      * Improve variable names, integration test
      
      * Add test
      
      * refactor, document, and test regexes
      
      * remove named capture groups, add comments
      
      * format
      
      * add non-markdown fixed tokenization
      
      * format
      
      * correct flakyness of args parse
      
      * add regex comments
      
      * test functionalities for crop_image, align long axis and expected output
      
      * add regex tests
      
      * remove cv2 dependency
      
      * test crop_margin equality between cv2 and python
      
      * refactor table regexes to markdown
      
      add newline
      
      * change print to log, improve doc
      
      * fix high count tables correction
      
      * address PR comments: naming, linting, asserts
      
      * Address comments
      
      * Add copied from
      
      * Update conversion script
      
      * Update conversion script to convert both small and base versions
      
      * Add inference example
      
      * Add more info
      
      * Fix style
      
      * Add require annotators to test
      
      * Define all keyword arguments explicitly
      
      * Move cv2 annotator
      
      * Add tokenizer init method
      
      * Transfer checkpoints
      
      * Add reference to Donut
      
      * Address comments
      
      * Skip test
      
      * Remove cv2 method
      
      * Add copied from statements
      
      * Use cached_property
      
      * Fix docstring
      
      * Add file to not doctested
      
      ---------
      Co-authored-by: default avatarPablo Montalvo <pablo.montalvo.leroux@gmail.com>
      ace74d16