"tests/models/gpt_bigcode/__init__.py" did not exist on "5b396457e5035a8b16ddee14b205c098598fe6bb"
  1. 27 Mar, 2024 1 commit
    • Add Qwen2MoE (#29377) · 1c39974a
      Bo Zheng authored
      
      
      * add support for qwen2 MoE models
      
      * update docs
      
      * add support for qwen2 MoE models
      
      * update docs
      
      * update model name & test
      
      * update readme
      
      * update class names & readme & model_doc of Qwen2MoE.
      
      * update architecture name
      
      * fix qwen2_moe tests
      
      * use Qwen2Tokenizer instead of Qwen2MoeTokenizer
      
      * update modeling_qwen2_moe.py
      
      * fix model architecture
      
      * fix qwen2_moe tests
      
      * use Qwen2Tokenizer instead of Qwen2MoeTokenizer
      
      * update modeling_qwen2_moe.py
      
      * fix model architecture
      
      * fix style
      
      * fix test when there are sparse and non sparse layers
      
      * fixup
      
      * Update README.md
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
      
      * fixup
      
      * fixup
      
      * add archive back
      
      * add support for qwen2 MoE models
      
      * update docs
      
      * update model name & test
      
      * update readme
      
      * update class names & readme & model_doc of Qwen2MoE.
      
      * update architecture name
      
      * fix qwen2_moe tests
      
      * use Qwen2Tokenizer instead of Qwen2MoeTokenizer
      
      * update modeling_qwen2_moe.py
      
      * fix model architecture
      
      * fixup
      
      * fix qwen2_moe tests
      
      * use Qwen2Tokenizer instead of Qwen2MoeTokenizer
      
      * fix style
      
      * fix test when there are sparse and non sparse layers
      
      * fixup
      
      * add archive back
      
      * fix integration test
      
      * fixup
      
      ---------
      Co-authored-by: bozheng-hit <dsoul0621@gmail.com>
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
  2. 26 Mar, 2024 1 commit
  3. 18 Mar, 2024 1 commit
    • Add MusicGen Melody (#28819) · c43b380e
      Yoach Lacombe authored
      
      
      * first modeling code
      
      * make repository
      
      * still WIP
      
      * update model
      
      * add tests
      
      * add latest change
      
      * clean docstrings and copied from
      
      * update docstrings md and readme
      
      * correct chroma function
      
      * correct copied from and remove unrelated test
      
      * add doc to toctree
      
      * correct imports
      
      * add convert script to notdoctested
      
      * Add suggestion from Sanchit
      Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
      
      * correct get_uncoditional_inputs docstrings
      
      * modify README according to SANCHIT feedback
      
      * add chroma to audio utils
      
      * clean librosa and torchaudio hard dependencies
      
      * fix FE
      
      * refactor audio decoder -> audio encoder for consistency with previous musicgen
      
      * refactor conditional -> encoder
      
      * modify sampling rate logics
      
      * modify license at the beginning
      
      * refactor all_self_attns->all_attentions
      
      * remove ignore copy from causallm generate
      
      * add copied from for from_sub_models
      
      * fix make copies
      
      * add warning if audio is truncated
      
      * add copied from where relevant
      
      * remove artefact
      
      * fix convert script
      
      * fix torchaudio and FE
      
      * modify chroma method according to feedback-> better naming
      
      * refactor input_values->input_features
      
      * refactor input_values->input_features and fix import fe
      
      * add input_features to docstrings
      
      * correct inputs_embeds logics
      
      * remove dtype conversion
      
      * refactor _prepare_conditional_hidden_states_kwargs_for_generation ->_prepare_encoder_hidden_states_kwargs_for_generation
      
      * change warning for chroma length
      
      * Update src/transformers/models/musicgen_melody/convert_musicgen_melody_transformers.py
      Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
      
      * change way to save wav, using soundfile
      
      * correct docs and change to soundfile
      
      * fix import
      
      * fix init proj layers
      
      * remove line breaks from md
      
      * fix issue with docstrings
      
      * add FE suggestions
      
      * improve is in logics and remove useless imports
      
      * remove custom from_pretrained
      
      * simplify docstring code
      
      * add suggestions for modeling tests
      
      * make style
      
      * update converting script with sanity check
      
      * remove encoder attention mask from conditional generation
      
      * replace musicgen melody checkpoints with official orga
      
      * rename ylacombe->facebook in checkpoints
      
      * fix copies
      
      * remove unnecessary warning
      
      * add shape in code docstrings
      
      * add files to slow doc tests
      
      * fix md bug and add md to not_tested
      
      * make fix-copies
      
      * fix hidden states test and batching
      
      ---------
      Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
  4. 15 Mar, 2024 1 commit
    • Cohere Model Release (#29622) · 0e4a1c34
      Saurabh Dash authored
      
      
      * Cohere Model Release (#1)
      
      Cohere Model Release
      
      * Remove unnecessary files and code (#2)
      
      Some cleanup
      
      * Delete cohere-model directory (#3)
      
      * Make Fix (#5)
      
      * Pr fixes (#6)
      
      * fixes for pr
      
      * pr fixes for the format
      
      * pr fixes for the format
      
      * src/transformers/models/auto/tokenization_auto.py
      
      * Tokenizer test (#8)
      
      * tokenizer test
      
      * format fix
      
      * Adding Docs and other minor changes (#7)
      
      * Add modeling tests (#9)
      
      * Smol Fix (#11)
      
      * tokenization tests are fixed
      
      * format fixes
      
      * fix pr doc tests
      
      * fix pr doc tests
      
      * fix pr doc tests
      
      * fix pr style check
      
      * small changes in cohere.md
      
      * FIX: Address final comments for transformers integration (#13)
      
      * fix modeling final nits and add proper test file
      
      * for now leave empty tests
      
      * add integration test
      
      * push new test
      
      * fix modeling cohere (#14)
      
      * Update chat templates to use the new API (#15)
      
      ---------
      Co-authored-by: ahmetustun <ahmetustun89@gmail.com>
      Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
      Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
  5. 13 Mar, 2024 1 commit
    • Add PvT-v2 Model (#26812) · 1fc505b8
      Nate Cibik authored
      
      
      * Added pytests for pvt-v2, all passed
      
      * Added pvt_v2 to docs/source/end/model_doc
      
      * Ran fix-copies and fixup. All checks passed
      
      * Added additional ReLU for linear attention mode
      
      * pvt_v2_b2_linear converted and working
      
      * copied models/pvt to adapt to pvt_v2
      
      * First commit of pvt_v2
      
      * PvT-v2 now works in AutoModel
      
      * Reverted batch eval changes for PR
      
      * Expanded type support for Pvt-v2 config
      
      * Fixed config docstring. Added channels property
      
      * Fixed model names in tests
      
      * Fixed config backbone compat. Added additional type support for image size in config
      
      * Fixed config backbone compat
      
      * Allowed for batching of eval metrics
      
      * copied models/pvt to adapt to pvt_v2
      
      * First commit of pvt_v2
      
      * Set key and value layers to use separate linear modules. Fixed pruning function
      
      * Set AvgPool to 7
      
      * Fixed issue in init
      
      * PvT-v2 now works in AutoModel
      
      * Successful conversion of pretrained weights for PVT-v2
      
      * Successful conversion of pretrained weights for PVT-v2 models
      
      * Added pytests for pvt-v2, all passed
      
      * Ran fix-copies and fixup. All checks passed
      
      * Added additional ReLU for linear attention mode
      
      * pvt_v2_b2_linear converted and working
      
      * Allowed for batching of eval metrics
      
      * copied models/pvt to adapt to pvt_v2
      
      * First commit of pvt_v2
      
      * Set key and value layers to use separate linear modules. Fixed pruning function
      
      * Set AvgPool to 7
      
      * Fixed issue in init
      
      * PvT-v2 now works in AutoModel
      
      * Successful conversion of pretrained weights for PVT-v2
      
      * Successful conversion of pretrained weights for PVT-v2 models
      
      * Added pytests for pvt-v2, all passed
      
      * Ran fix-copies and fixup. All checks passed
      
      * Added additional ReLU for linear attention mode
      
      * pvt_v2_b2_linear converted and working
      
      * Reverted batch eval changes for PR
      
      * Updated index.md
      
      * Expanded type support for Pvt-v2 config
      
      * Fixed config docstring. Added channels property
      
      * Fixed model names in tests
      
      * Fixed config backbone compat
      
      * Ran fix-copies
      
      * Fixed PvtV2Backbone tests
      
      * Added TFRegNet to OBJECTS_TO_IGNORE in check_docstrings.py
      
      * Fixed backbone stuff and fixed tests: all passing
      
      * Ran make fixup
      
      * Made modifications for code checks
      
      * Remove ONNX config from configuration_pvt_v2.py
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

      * Use explicit image size dict in test_modeling_pvt_v2.py
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

      * Make image_size optional in test_modeling_pvt_v2.py
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

      * Remove _ntuple use in modeling_pvt_v2.py
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * Remove reference to fp16_enabled
      
      * Model modules now take config as first argument even when not used
      
      * Replaced abbreviations for "SR" and "AP" with explicit "spatialreduction" and "averagepooling"
      
      * All LayerNorm now instantiates with config.layer_norm_eps
      
      * Added docstring for depth-wise conv layer
      
      * PvtV2Config now only takes Union[int, Tuple[int, int]] for image size
      
      * Refactored PVTv2 in prep for gradient checkpointing
      
      * Gradient checkpointing ready to test
      
      * Removed override of _set_gradient_checkpointing
      
      * Cleaned out old code
      
      * Applied code fixup
      
      * Applied code fixup
      
      * Began debug of pvt_v2 tests
      
      * Leave handling of num_labels to base pretrained config class
      
      * Deactivated gradient checkpointing tests until it is fixed
      
      * Removed PvtV2ImageProcessor which duped PvtImageProcessor
      
      * Allowed for batching of eval metrics
      
      * copied models/pvt to adapt to pvt_v2
      
      * First commit of pvt_v2
      
      * Set key and value layers to use separate linear modules. Fixed pruning function
      
      * Set AvgPool to 7
      
      * Fixed issue in init
      
      * PvT-v2 now works in AutoModel
      
      * Successful conversion of pretrained weights for PVT-v2
      
      * Successful conversion of pretrained weights for PVT-v2 models
      
      * Added pytests for pvt-v2, all passed
      
      * Added pvt_v2 to docs/source/end/model_doc
      
      * Ran fix-copies and fixup. All checks passed
      
      * Added additional ReLU for linear attention mode
      
      * pvt_v2_b2_linear converted and working
      
      * copied models/pvt to adapt to pvt_v2
      
      * First commit of pvt_v2
      
      * PvT-v2 now works in AutoModel
      
      * Reverted batch eval changes for PR
      
      * Expanded type support for Pvt-v2 config
      
      * Fixed config docstring. Added channels property
      
      * Fixed model names in tests
      
      * Fixed config backbone compat. Added additional type support for image size in config
      
      * Fixed config backbone compat
      
      * Allowed for batching of eval metrics
      
      * copied models/pvt to adapt to pvt_v2
      
      * First commit of pvt_v2
      
      * Set key and value layers to use separate linear modules. Fixed pruning function
      
      * Set AvgPool to 7
      
      * Fixed issue in init
      
      * PvT-v2 now works in AutoModel
      
      * Successful conversion of pretrained weights for PVT-v2
      
      * Successful conversion of pretrained weights for PVT-v2 models
      
      * Added pytests for pvt-v2, all passed
      
      * Ran fix-copies and fixup. All checks passed
      
      * Added additional ReLU for linear attention mode
      
      * pvt_v2_b2_linear converted and working
      
      * Allowed for batching of eval metrics
      
      * copied models/pvt to adapt to pvt_v2
      
      * First commit of pvt_v2
      
      * Set key and value layers to use separate linear modules. Fixed pruning function
      
      * Set AvgPool to 7
      
      * Fixed issue in init
      
      * PvT-v2 now works in AutoModel
      
      * Successful conversion of pretrained weights for PVT-v2
      
      * Successful conversion of pretrained weights for PVT-v2 models
      
      * Added pytests for pvt-v2, all passed
      
      * Ran fix-copies and fixup. All checks passed
      
      * Added additional ReLU for linear attention mode
      
      * pvt_v2_b2_linear converted and working
      
      * Reverted batch eval changes for PR
      
      * Expanded type support for Pvt-v2 config
      
      * Fixed config docstring. Added channels property
      
      * Fixed model names in tests
      
      * Fixed config backbone compat
      
      * Ran fix-copies
      
      * Fixed PvtV2Backbone tests
      
      * Added TFRegNet to OBJECTS_TO_IGNORE in check_docstrings.py
      
      * Fixed backbone stuff and fixed tests: all passing
      
      * Ran make fixup
      
      * Made modifications for code checks
      
      * Remove ONNX config from configuration_pvt_v2.py
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

      * Use explicit image size dict in test_modeling_pvt_v2.py
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

      * Make image_size optional in test_modeling_pvt_v2.py
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

      * Remove _ntuple use in modeling_pvt_v2.py
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * Remove reference to fp16_enabled
      
      * Model modules now take config as first argument even when not used
      
      * Replaced abbreviations for "SR" and "AP" with explicit "spatialreduction" and "averagepooling"
      
      * All LayerNorm now instantiates with config.layer_norm_eps
      
      * Added docstring for depth-wise conv layer
      
      * PvtV2Config now only takes Union[int, Tuple[int, int]] for image size
      
      * Refactored PVTv2 in prep for gradient checkpointing
      
      * Gradient checkpointing ready to test
      
      * Removed override of _set_gradient_checkpointing
      
      * Cleaned out old code
      
      * Applied code fixup
      
      * Applied code fixup
      
      * Allowed for batching of eval metrics
      
      * copied models/pvt to adapt to pvt_v2
      
      * First commit of pvt_v2
      
      * PvT-v2 now works in AutoModel
      
      * Ran fix-copies and fixup. All checks passed
      
      * copied models/pvt to adapt to pvt_v2
      
      * First commit of pvt_v2
      
      * PvT-v2 now works in AutoModel
      
      * Reverted batch eval changes for PR
      
      * Fixed config docstring. Added channels property
      
      * Fixed config backbone compat
      
      * Allowed for batching of eval metrics
      
      * copied models/pvt to adapt to pvt_v2
      
      * First commit of pvt_v2
      
      * PvT-v2 now works in AutoModel
      
      * Ran fix-copies and fixup. All checks passed
      
      * Allowed for batching of eval metrics
      
      * copied models/pvt to adapt to pvt_v2
      
      * First commit of pvt_v2
      
      * PvT-v2 now works in AutoModel
      
      * Fixed config backbone compat
      
      * Ran fix-copies
      
      * Began debug of pvt_v2 tests
      
      * Leave handling of num_labels to base pretrained config class
      
      * Deactivated gradient checkpointing tests until it is fixed
      
      * Removed PvtV2ImageProcessor which duped PvtImageProcessor
      
      * Fixed issue from rebase
      
      * Fixed issue from rebase
      
      * Set tests for gradient checkpointing to skip those using reentrant since it isn't supported
      
      * Fixed issue from rebase
      
      * Fixed issue from rebase
      
      * Changed model name in docs
      
      * Removed duplicate PvtV2Backbone
      
      * Work around type switching issue in tests
      
      * Fix model name in config comments
      
      * Update docs/source/en/model_doc/pvt_v2.md
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
      
      * Changed name of variable from 'attn_reduce' to 'sr_type'
      
      * Changed name of variable from 'attn_reduce' to 'sr_type'
      
      * Changed from using 'sr_type' to 'linear_attention' for clarity
      
      * Update src/transformers/models/pvt_v2/modeling_pvt_v2.py
      
      Removed old code
      
      * Changed from using 'sr_type' to 'linear_attention' for clarity
      
      * Fixed Class names to be more descriptive
      
      * Update src/transformers/models/pvt_v2/modeling_pvt_v2.py
      
      Removed outdated code
      
      * Moved paper abstract to single line in pvt_v2.md
      
      * Added usage tips to pvt_v2.md
      
      * Simplified module inits by passing layer_idx
      
      * Fixed typing for hidden_act in PvtV2Config
      
      * Removed unused import
      
      * Add pvt_v2 to docs/source/en/_toctree.yml
      
      * Updated documentation in docs/source/en/model_doc/pvt_v2.md to be more comprehensive.
      
      * Updated documentation in docs/source/en/model_doc/pvt_v2.md to be more comprehensive.
      
      * Update src/transformers/models/pvt_v2/modeling_pvt_v2.py
      
      Move function parameters to single line
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

      * Update src/transformers/models/pvt_v2/modeling_pvt_v2.py

      Update year of copyright to 2024
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

      * Update src/transformers/models/pvt_v2/modeling_pvt_v2.py

      Make code more explicit
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

      * Updated sr_ratio to be more explicit spatial_reduction_ratio

      * Removed excess type hints in modeling_pvt_v2.py
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

      * Move params to single line in modeling_pvt_v2.py
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

      * Removed needless comment in modeling_pvt_v2.py
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

      * Update copyright date in pvt_v2.md
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

      * Moved params to single line in modeling_pvt_v2.py
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

      * Updated copyright date in configuration_pvt_v2.py
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

      * Cleaned comments in modeling_pvt_v2.py
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * Renamed spatial_reduction Conv2D operation
      
      * Revert "Update src/transformers/models/pvt_v2/modeling_pvt_v2.py"
      
      This reverts commit c4a04416dde8f3475ab405d1feb368600e0f8538.
      
      * Updated conversion script to reflect module name change
      
      * Deprecated reshape_last_stage option in config
      
      * Removed unused imports
      
      * Code formatting
      
      * Fixed outdated decorators on test_inference_fp16
      
      * Added "Copied from" comments in test_modeling_pvt_v2.py
      
      * Fixed import listing
      
      * Updated model name
      
      * Force empty commit for PR refresh
      
      * Fixed linting issue
      
      * Removed # Copied from comments
      
      * Added PVTv2 to README_fr.md
      
      * Ran make fix-copies
      
      * Replace all FoamoftheSea hub references with OpenGVLab
      
      * Fixed out_indices and out_features logic in configuration_pvt_v2.py
      
      * Made ImageNet weight conversion verification optional in convert_pvt_v2_to_pytorch.py
      
      * Ran code fixup
      
      * Fixed order of parent classes in PvtV2Config to fix the to_dict method override
      
      ---------
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
  6. 05 Mar, 2024 1 commit
    • [`Add Mamba`] Adds support for the `Mamba` models (#28094) · fb1c62e9
      Arthur authored
      
      
      * initial-commit
      
      * start cleaning
      
      * small nits
      
      * small nits
      
      * current updates
      
      * add kernels
      
      * small refactoring little step
      
      * add comments
      
      * styling
      
      * nit
      
      * nits
      
      * Style
      
      * Small changes
      
      * Push dummy mamba simple slow
      
      * nit
      
      * Use original names
      
      * Use original names and remove norm
      
      * Updates for inference params
      
      * Style and updates
      
      * nits
      
      * Match logits
      
      * Add a test
      
      * Add expected generated text
      
      * nits doc, imports and styling
      
      * style
      
      * oups
      
      * dont install kernels, invite users to install the required kernels
      
      * let us use the original packages
      
      * styling
      
      * nits
      
      * fix some copieds
      
      * update doc
      
      * fix-copies
      
      * styling done
      
      * nits
      
      * fix import check
      
      * run but wrong cuda ress
      
      * mamba CUDA works :)
      
      * fix the fast path
      
      * config naming nits
      
      * conversion script is not required at this stage
      
      * finish fixing the fast path: generation make sense now!
      
      * nit
      
      * Let's start working on the CIs
      
      * style
      
      * better style
      
      * more nits
      
      * test nit
      
      * quick fix for now
      
      * nits
      
      * nit
      
      * nit
      
      * nit
      
      * nits
      
      * update test rest
      
      * fixup
      
      * update test
      
      * nit
      
      * some fixes
      
      * nits
      
      * update test values
      
      * fix styling
      
      * nit
      
      * support peft
      
      * integration tests require torch
      
      * also add slow markers
      
      * styling
      
      * chose forward wisely
      
      * nits
      
      * update tests
      
      * fix gradient checkpointing
      
      * fixup
      
      * nit
      
      * fix doc
      
      * check copies
      
      * fix the docstring
      
      * fix some more tests
      
      * style
      
      * fix beam search
      
      * add init scheme
      
      * update
      
      * nit
      
      * fix
      
      * fixup the doc
      
      * fix the doc
      
      * fixup
      
      * tentative update but slow is no longer good
      
      * nit
      
      * should we always use float32?
      
      * nits
      
      * revert wrong changes
      
      * res in float32
      
      * cleanup
      
      * skip fmt for now
      
      * update generation values
      
      * update test values running original model
      
      * fixup
      
      * update tests + rename inference_params to cache_params + make sure training does not use cache_params
      
      * small nits
      
      * more nits
      
      * fix final CIs
      
      * style
      
      * nit doc
      
      * I hope final doc nits
      
      * nit
      
      * 🫠
      
      * final touch!
      
      * fix torch import
      
      * Apply suggestions from code review
      Co-authored-by: Lysandre Debut <hi@lysand.re>
      
      * Apply suggestions from code review
      
      * fix fix and fix
      
      * fix base model prefix!
      
      * nit
      
      * Update src/transformers/models/mamba/__init__.py
      
      * Update docs/source/en/model_doc/mamba.md
      Co-authored-by: Lysandre Debut <hi@lysand.re>
      
      * nit
      
      ---------
      Co-authored-by: Lysandre Debut <hi@lysand.re>
  7. 28 Feb, 2024 1 commit
  8. 27 Feb, 2024 1 commit
  9. 21 Feb, 2024 1 commit
    • [ `gemma`] Adds support for Gemma 💎 (#29167) · 594c1277
      Arthur authored
      * initial commit
      
      * update
      
      * update conversion checkpoint
      
      * update conversion script
      
      * nits
      
      * some fixes
      
      * nits
      
      * merge
      
      * fix permute
      
      * nits
      
      * fix
      
      * nits
      
      * nits
      
      * nits
      
      * fix rope
      
      * fix both rope
      
      * nits
      
      * style
      
      * make sure flax works
      
      * fix flax init code
      
      * fix forward
      
      * nits
      
      * print flax generation out
      
      * current code
      
      * nits
      
      * SIIIIIIIIIIIIIIIIIII
      
      * update
      
      * add new tokenizer
      
      * correct fast tokenizer
      
      * fix conversion
      
      * more comments
      
      * fix modeling and conversion
      
      * nits and nits
      
      * nits testing
      
      * add some tokenization tests
      
      * add some edge cases
      
      * add slow tests and fix them
      
      * fixup
      
      * fix copies for modeling
      
      * fix copies
      
      * add 7B slow tests
      
      * fix
      
      * fix
      
      * fix tests
      
      * make tokenizer cis go green
      
      * styling
      
      * last tokenizer nits
      
      * update jax tests
      
      * fix flax for 7b
      
      * add jit testing
      
      
      
      * cleanups
      
      * isolated nit, inv_freq for rotary_emb.inv_freq
      
      * propagate to jax
      
      * Apply suggestions from code review
      Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
      
      * adjust test
      
      * fix conversion script
      
      * change name
      
      * correct file names
      
      * update conversion script
      
      * Fix bos and eos token ids in the model configuration (#3)
      
      * update modelling
      
      * update conversion script
      
      * add static cache for gemma
      
      * fix sdpa generate
      
      * fix batched
      
      * multiple fixes
      
      * fix FA2
      
      * final fix
      
      * Rename a few missing strings and filenames (#4)
      
      * merge with upstream main
      
      * fix copies
      
      * fix copies
      
      * fix fixup
      
      * fix fixup
      
      * fix
      
      * fix
      
      * final tests
      
      * fix fx gemma tests
      
      * fix fx bf16/fp16 tests
      
      * update slow fx tests
      
      * fx slow tests: one logits, one generation
      
      * move jit test standalone
      
      * Apply suggestions from code review
      
      * nits
      
      * tokenizer updates
      
      * more tokenization updates: custom GemmaSentencepieceExtrator
      
      * style
      
      * Update src/transformers/cache_utils.py
      
      * Update src/transformers/models/gemma/__init__.py
      
      * Update tests/models/gemma/test_modeling_flax_gemma.py
      
      * small nits
      
      * style
      
      * update tokenization test
      
      * fix the rotary embedding
      
      * with style
      
      * fix slow tests
      
      * WARNING this commit might be very important for precisions
      
      * Update tests/models/gemma/test_modeling_flax_gemma.py
      
      * Update src/transformers/models/gemma/configuration_gemma.py
      Co-authored-by: Lysandre Debut <hi@lysand.re>

      * Update src/transformers/models/gemma/modeling_flax_gemma.py
      Co-authored-by: Lysandre Debut <hi@lysand.re>
      
      * small nits here and there!
      
      * forgotten nit
      
      * remove on the fly computation of inv_freq
      
      * revert previous change, let's be safe and for now re-compute freq cis to make sure it's in float
      
      * Apply suggestions from code review
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

      * Update src/transformers/models/gemma/convert_gemma_weights_to_hf.py
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

      * Update src/transformers/models/gemma/convert_gemma_weights_to_hf.py
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

      * Update tests/models/gemma/test_modeling_gemma.py
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

      * Update tests/models/gemma/test_modeling_gemma.py
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

      * Update tests/models/gemma/test_modeling_gemma.py
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

      * Update tests/models/gemma/test_modeling_flax_gemma.py
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

      * Update tests/models/gemma/test_modeling_gemma.py
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

      * Update tests/models/gemma/test_modeling_gemma.py
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

      * Update tests/models/gemma/test_tokenization_gemma.py
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

      * Update tests/models/gemma/test_tokenization_gemma.py
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

      * Update tests/models/gemma/test_tokenization_gemma.py
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

      * Update tests/models/gemma/test_tokenization_gemma.py
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

      * Update tests/models/gemma/test_modeling_gemma.py
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

      * Update tests/models/gemma/test_modeling_gemma.py
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

      * Update tests/models/gemma/test_modeling_gemma.py
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

      * Update tests/models/gemma/test_modeling_gemma.py
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

      * Update tests/models/gemma/test_modeling_gemma.py
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
      
      * nit conversion script link
      
      * fix some tests
      
      * add not doctest and pr doctest
      
      * repo consistency
      
      * fix last CIs 🚀
      
      
      
      * update all readmes
      
      ---------
      Co-authored-by: younesbelkada <younesbelkada@gmail.com>
      Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
      Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
      Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
      Co-authored-by: sanchit-gandhi <sanchit@huggingface.co>
      Co-authored-by: Lysandre Debut <hi@lysand.re>
      594c1277
  10. 19 Feb, 2024 1 commit
    • fix the post-processing link (#29091) · 593230f0
      Winton Davies authored
      The link in evaluation was missing a hyphen between post and processing. I fixed this, for English only. Someone with the ability to do a global search/replace should fix the other languages (if indeed they have this issue).
  11. 16 Feb, 2024 1 commit
  12. 14 Feb, 2024 3 commits
  13. 12 Feb, 2024 2 commits
  14. 08 Feb, 2024 1 commit
  15. 06 Feb, 2024 2 commits
  16. 02 Feb, 2024 1 commit
    • [Docs] Fix spelling and grammar mistakes (#28825) · 721ee783
      Klaus Hipp authored
      * Fix typos and grammar mistakes in docs and examples
      
      * Fix typos in docstrings and comments
      
      * Fix spelling of `tokenizer` in model tests
      
      * Remove erroneous spaces in decorators
      
      * Remove extra spaces in Markdown link texts
  17. 01 Feb, 2024 1 commit
    • Adding [T5/MT5/UMT5]ForTokenClassification (#28443) · 0d26abdd
      JB (Don) authored
      * Adding [T5/MT5/UMT5]ForTokenClassification
      
      * Add auto mappings for T5ForTokenClassification and variants
      
      * Adding ForTokenClassification to the list of models
      
      * Adding attention_mask param to the T5ForTokenClassification test
      
      * Remove outdated comment in test
      
      * Adding EncoderOnly and Token Classification tests for MT5 and UMT5
      
      * Fix typo in umt5 string
      
      * Add tests for all the existing MT5 models
      
      * Fix wrong comment in dependency_versions_table
      
      * Reverting change to common test for _keys_to_ignore_on_load_missing
      
      The test is correctly picking up redundant keys in _keys_to_ignore_on_load_missing.
      
      * Removing _keys_to_ignore_on_missing from MT5 since the key is not used in the model
      
      * Add fix-copies to MT5ModelTest
  18. 26 Jan, 2024 1 commit
  19. 25 Jan, 2024 2 commits
    • Update question_answering.md (#28694) · 24f1a00e
      Yusuf authored
      fix typo:
      
      from:
      model = TFAutoModelForQuestionAnswering("distilbert-base-uncased")
      
      to:
      model = TFAutoModelForQuestionAnswering.from_pretrained("distilbert-base-uncased")
    • Add Depth Anything (#28654) · 963db81a
      NielsRogge authored
      * First draft
      
      * More improvements
      
      * More improvements
      
      * More improvements
      
      * More improvements
      
      * Add docs
      
      * Remove file
      
      * Add copied from
      
      * Address comments
      
      * Address comments
      
      * Address comments
      
      * Fix style
      
      * Update docs
      
      * Convert all checkpoints, add integration test
      
      * Rename checkpoints
      
      * Add pretrained backbone attributes
      
      * Fix default config
      
      * Address comment
      
      * Add figure to docs
      
      * Fix bug thanks to @xenova
      
      * Update conversion script
      
      * Fix integration test
  20. 18 Jan, 2024 1 commit
    • Add new meta w2v2-conformer BERT-like model (#28165) · d2cdefb9
      Yoach Lacombe authored
      
      
      * first commit
      
      * correct default value non causal
      
      * update config and modeling code
      
      * update converting checkpoint
      
      * clean modeling and fix tests
      
      * make style
      
      * add new config parameters to docstring
      
      * fix copied from statements
      
      * Apply suggestions from code review
      Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
      
      * make position_embeddings_type docstrings clearer
      
      * clean converting script
      
      * remove function not used
      
      * clean modeling file
      
      * apply suggestion for test file + add convert script to not_doctested
      
      * modify tests according to review - cleaner logic and more tests
      
      * Apply nit suggestions from code review
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * add checker of valid position embeddings type
      
      * instantiate new layer norm layer with the right eps
      
      * fix freeze_feature_encoder since it can be None in some cases
      
      * add test same output in convert script
      
      * restore wav2vec2conformer and add new model
      
      * create processor and FE + clean
      
      * add new model code
      
      * fix convert script and set default config parameters
      
      * correct model id paths
      
      * make style
      
      * make fix-copies and cleaning files
      
      * fix copied from statements
      
      * complete .md and fix copies
      
      * clean convert script argument defaults
      
      * fix config parameters docstrings
      
      * fix config docstring
      
      * add copied from and enrich FE tests
      
      * fix copied from and repo-consistency
      
      * add autotokenizer
      
      * make test input length shorter and change docstring code
      
      * fix docstrings and copied from
      
      * add add_adapter to ASR training example
      
      * make testing of adapters more robust
      
      * adapt to multi adapter layers
      
      * refactor input_values->input_features and remove w2v2-bert feature extractor
      
      * remove pretraining model
      
      * remove deprecated features and useless lines
      
      * add copied from and ignore statements to modeling tests
      
      * remove pretraining model #2
      
      * change import in convert script
      
      * change default in convert script
      
      * update readme and remove useless line
      
      * Update tests/models/wav2vec2_bert/test_processor_wav2vec2_bert.py
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * refactor BERT to Bert for consistency
      
      * remove useless ignore copy statement
      
      * add persistent to buffer in rotary
      
      * add eps in LayerNorm init and remove copied from
      
      * add adapter activation parameters and add copied from statements
      
      * Fix copied statements and add unittest.skip reasons
      
      * add copied statement in test_processor
      
      * refactor processor
      
      * make style
      
      * replace numpy random by torch rand
      
      * remove expected output CTC
      
      * improve converting script with processor class
      
      * Apply suggestions from code review
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * remove gumbel class
      
      * remove tests related to previously deleted class
      
      * Update src/transformers/models/wav2vec2_bert/configuration_wav2vec2_bert.py
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * correct typos
      
      * remove unused parameters
      
      * update processor to take both text and audio
      
      * update checkpoints
      
      * update expected output and add ctc expected output
      
      * add label_attention_mask
      
      * replace pt with np in processor tests
      
      * fix typo
      
      * revert to behaviour with labels_attention_mask
      
      ---------
      Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
  21. 17 Jan, 2024 1 commit
    • Add qwen2 (#28436) · d6ffe74d
      Junyang Lin authored
      
      
      * add config, modeling, and tokenization
      
      * add auto and init
      
      * update readme
      
      * update readme
      
      * update team name
      
      * fixup
      
      * fixup
      
      * update config
      
      * update code style
      
      * update for fixup
      
      * update for fixup
      
      * update for fixup
      
      * update for testing
      
      * update for testing
      
      * fix bug for config and tokenization
      
      * fix bug for bos token
      
      * not doctest
      
      * debug tokenizer
      
      * not doctest
      
      * debug tokenization
      
      * debug init for tokenizer
      
      * fix style
      
      * update init
      
      * delete if in token auto
      
      * add tokenizer doc
      
      * add tokenizer in init
      
      * Update dummy_tokenizers_objects.py
      
      * update
      
      * update
      
      * debug
      
      * Update tokenization_qwen2.py
      
      * debug
      
      * Update convert_slow_tokenizer.py
      
      * add copies
      
      * add copied from and make style
      
      * update files map
      
      * update test
      
      * fix style
      
      * fix merge reading and update tests
      
      * fix tests
      
      * fix tests
      
      * fix style
      
      * debug a variable in readme
      
      * Update src/transformers/models/qwen2/configuration_qwen2.py
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
      
      * update test and copied from
      
      * fix style
      
      * update qwen2 tokenization  and tests
      
      * Update tokenization_qwen2.py
      
      * delete the copied from after property
      
      * fix style
      
      * update tests
      
      * update tests
      
      * add copied from
      
      * fix bugs
      
      * update doc
      
      * add warning for sliding window attention
      
      * update qwen2 tokenization
      
      * fix style
      
      * Update src/transformers/models/qwen2/modeling_qwen2.py
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
      
      * fix tokenizer fast
      
      ---------
      Co-authored-by: Ren Xuancheng <jklj077@users.noreply.github.com>
      Co-authored-by: renxuancheng.rxc <renxuancheng.rxc@alibaba-inc.com>
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
  22. 03 Jan, 2024 2 commits
    • Add FastSpeech2Conformer (#23439) · d83ff5ee
      Connor Henderson authored
      * start - docs, SpeechT5 copy and rename
      
      * add relevant code from FastSpeech2 draft, have tests pass
      
      * make it an actual conformer, demo ex.
      
      * matching inference with original repo, includes debug code
      
      * refactor nn.Sequentials, start more desc. var names
      
      * more renaming
      
      * more renaming
      
      * vocoder scratchwork
      
      * matching vocoder outputs
      
      * hifigan vocoder conversion script
      
      * convert model script, rename some config vars
      
      * replace postnet with speecht5's implementation
      
      * passing common tests, file cleanup
      
      * expand testing, add output hidden states and attention
      
      * tokenizer + passing tokenizer tests
      
      * variety of updates and tests
      
      * g2p_en pckg setup
      
      * import structure edits
      
      * docstrings and cleanup
      
      * repo consistency
      
      * deps
      
      * small cleanup
      
      * forward signature param order
      
      * address comments except for masks and labels
      
      * address comments on attention_mask and labels
      
      * address second round of comments
      
      * remove old unneeded line
      
      * address comments part 1
      
      * address comments pt 2
      
      * rename auto mapping
      
      * fixes for failing tests
      
      * address comments part 3 (bart-like, train loss)
      
      * make style
      
      * pass config where possible
      
      * add forward method + tests to WithHifiGan model
      
      * make style
      
      * address arg passing and generate_speech comments
      
      * address Arthur comments
      
      * address Arthur comments pt2
      
      * lint  changes
      
      * Sanchit comment
      
      * add g2p-en to doctest deps
      
      * move up self.encoder
      
      * onnx compatible tensor method
      
      * fix is symbolic
      
      * fix paper url
      
      * move models to espnet org
      
      * make style
      
      * make fix-copies
      
      * update docstring
      
      * Arthur comments
      
      * update docstring w/ new updates
      
      * add model architecture images
      
      * header size
      
      * md wording update
      
      * make style
    • fix documentation for zero_shot_object_detection (#28267) · 6eba901d
      lain authored
      remove broken space
  23. 22 Dec, 2023 1 commit
  24. 18 Dec, 2023 1 commit
  25. 11 Dec, 2023 3 commits
  26. 07 Dec, 2023 1 commit
  27. 04 Dec, 2023 1 commit
  28. 30 Nov, 2023 1 commit
    • Add SeamlessM4T v2 (#27779) · 29f1aee3
      Yoach Lacombe authored
      
      
      * add working conversion script
      
      * first non-working version of modeling code
      
      * update modeling code (working)
      
      * make style
      
      * make fix-copies
      
      * add config docstrings
      
      * add config to ignore docstrings formatting due to unconventional markdown
      
      * fix copies
      
      * fix generation num_return_sequences
      
      * enrich docs
      
      * add and fix tests beside integration tests
      
      * update integration tests
      
      * update repo id
      
      * add tie weights and make style
      
      * correct naming in .md
      
      * fix imports and so on
      
      * correct docstrings
      
      * fix fp16 speech forward
      
      * fix speechencoder attention
      
      * make style
      
      * fix copied from
      
      * rename SeamlessM4Tv2-v2 to SeamlessM4Tv2
      
      * Apply suggestions on configuration
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
      
      * remove useless public models
      
      * fix private models + better naming for T2U models
      
      * clean speech encoder relative position embeddings
      
      * refactor chunk attention
      
      * add docstrings to chunk attention method
      
      * improve naming and docstrings
      
      * rename some attention variables + add temperature sampling in T2U model
      
      * rename DOCSTRINGS variable names
      
      * make style + remove 2 useless config parameters
      
      * enrich model card
      
      * remove any attention_head reference + fix temperature in T2U
      
      * new fmt and make style
      
      * Apply suggestions from code review
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
      
      * rename spkr_id->speaker_id and change docstrings of get_char_input_ids
      
      * simplify v2attention
      
      * make style
      
      * Update seamless_m4t_v2.md
      
      * update code and tests with last update
      
      * update repo ids
      
      * fill article name, abstract and authors
      
      * update not_doctested and slow_doc tests
      
      ---------
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
  29. 24 Nov, 2023 1 commit
  30. 23 Nov, 2023 1 commit
  31. 17 Nov, 2023 1 commit
  32. 16 Nov, 2023 1 commit
    • [`Styling`] stylify using ruff (#27144) · 651408a0
      Arthur authored
      
      
      * try to stylify using ruff
      
      * might need to remove these changes?
      
      * use ruff format and ruff check
      
      * use isinstance instead of type comparison
      
      * use # fmt: skip
      
      * use # fmt: skip
      
      * nits
      
      * some styling changes
      
      * update ci job
      
      * nits isinstance
      
      * more files update
      
      * nits
      
      * more nits
      
      * small nits
      
      * check and format
      
      * revert wrong changes
      
      * actually use formatter instead of checker
      
      * nits
      
      * well docbuilder is overwriting this commit
      
      * revert notebook changes
      
      * try to nuke docbuilder
      
      * style
      
      * fix feature extraction test
      
      * remove `indent-width = 4`
      
      * fixup
      
      * more nits
      
      * update the ruff version that we use
      
      * style
      
      * nuke docbuilder styling
      
      * leave the print for detected changes
      
      * nits
      
      * Remove file I/O
      Co-authored-by: charliermarsh <charlie.r.marsh@gmail.com>
      
      * style
      
      * nits
      
      * revert notebook changes
      
      * Add # fmt skip when possible
      
      * Add # fmt skip when possible
      
      * Fix
      
      * More `  # fmt: skip` usage
      
      * More `  # fmt: skip` usage
      
      * More `  # fmt: skip` usage
      
      * Nits
      
      * more fixes
      
      * fix tapas
      
      * Another way to skip
      
      * Recommended way
      
      * Fix two more files
      
      * Remove asynch
      
      ---------
      Co-authored-by: charliermarsh <charlie.r.marsh@gmail.com>