- 27 Mar, 2024 2 commits
-
-
Anton Vlasjuk authored
* FIX: Cached slow forward in mamba - additionally added mamba cached test - added unused test (mamba causal lm forward and backward) - fixed typo: "causl" --> "causal" * formatting * fix: use real `slow_forward` call instead of torch module's * add shape assertion for mixer block test * adjust shape assertion
-
Bo Zheng authored
* add support for qwen2 MoE models * update docs * add support for qwen2 MoE models * update docs * update model name & test * update readme * update class names & readme & model_doc of Qwen2MoE. * update architecture name * fix qwen2_moe tests * use Qwen2Tokenizer instead of Qwen2MoeTokenizer * update modeling_qwen2_moe.py * fix model architecture * fix qwen2_moe tests * use Qwen2Tokenizer instead of Qwen2MoeTokenizer * update modeling_qwen2_moe.py * fix model architecture * fix style * fix test when there are sparse and non sparse layers * fixup * Update README.md Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * fixup * fixup * add archive back * add support for qwen2 MoE models * update docs * update model name & test * update readme * update class names & readme & model_doc of Qwen2MoE. * update architecture name * fix qwen2_moe tests * use Qwen2Tokenizer instead of Qwen2MoeTokenizer * update modeling_qwen2_moe.py * fix model architecture * fixup * fix qwen2_moe tests * use Qwen2Tokenizer instead of Qwen2MoeTokenizer * fix style * fix test when there are sparse and non sparse layers * fixup * add archive back * fix integration test * fixup --------- Co-authored-by:
bozheng-hit <dsoul0621@gmail.com> Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com>
-
- 25 Mar, 2024 1 commit
-
-
Lysandre Debut authored
* [test_all] Remove static pretrained maps from the library's internals * Deprecate archive maps instead of removing them * Revert init changes * [test_all] Deprecate instead of removing * [test_all] PVT v2 support * [test_all] Tests should all pass * [test_all] Style * Address review comments * Update src/transformers/models/deprecated/_archive_maps.py Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/models/deprecated/_archive_maps.py Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * [test_all] trigger tests * [test_all] LLAVA * [test_all] Bad rebase --------- Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com>
-
- 22 Mar, 2024 1 commit
-
-
fxmarty authored
* correct llava mask * fix vipllava as wlel * mask out embedding for padding tokens * add test * fix style * add setter * fix test on suggestion
-
- 21 Mar, 2024 1 commit
-
-
Raushan Turganbay authored
* prepend "bos" to blip generation * minor changes * Update src/transformers/models/blip_2/modeling_blip_2.py Co-authored-by:
Joao Gante <joaofranciscocardosogante@gmail.com> * Update src/transformers/models/instructblip/modeling_instructblip.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * add generation tester mixin --------- Co-authored-by:
Joao Gante <joaofranciscocardosogante@gmail.com> Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
- 20 Mar, 2024 5 commits
-
-
Arthur authored
* attempt to fix * the actual fix that works with compilation! * this? * temporary update * nit? * dispatcg to memory efficient? * update both models that have static cache support * fix copies fix compile * make sure fix * fix cohere and gemma * fix beams? * nit * slipped through the cracks * nit * nits * update * fix-copies * skip failing tests * nits
-
NielsRogge authored
* First draft * Fix tests, add docs * Improve docstrings * Fix test * Address comments * Address comments * Remove vocab_size attribute * Remove batch_size * Address comment * Add image processor tests * Support fx * Update docstring * Add support for 34b * Convert 34b model * Add integration tests * Update checkpoints * Convert vicuna-13b, remove doc tests * Remove script * Remove file * Address comments * Improve docstrings * Deprecate vocab_size * Remove aspect_ratio_setting * Address comments * Update READMEs * Add tips about chat templates * Fix tests * Deprecate vocab_size safely * Update tests --------- Co-authored-by:Amy Roberts <22614925+amyeroberts@users.noreply.github.com>
-
Matt authored
* Add correct batched handling for apply_chat_template * Fix warning method * Add error for incompatible options * expand tests * Add a skip for markuplm * Add skips for other layout models * Skip for LayoutLMv2 * Slightly update the warning message * Update src/transformers/tokenization_utils_base.py Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/tokenization_utils_base.py Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/tokenization_utils_base.py Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/tokenization_utils_base.py Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/tokenization_utils_base.py Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/tokenization_utils_base.py Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * typo fix * Update docstring for conversation kwarg * Update return docstring * Remove the warning, improve error message * Update src/transformers/tokenization_utils_base.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/tokenization_utils_base.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update tests/test_tokenization_common.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update tests/test_tokenization_common.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Remove return_dict=None * Fix up some merge cruft * More merge cruft * Add another skip * Add another skip --------- Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
amyeroberts authored
-
Joao Gante authored
* make fix-copies * some tests fixed * tests fixed
-
- 19 Mar, 2024 3 commits
-
-
Raushan Turganbay authored
* clean-up tests * refine comments * fix musicgen tests * make style * remove slow decorator from a test * more clean-up * fix other failing tests
-
StevenBucaille authored
* Added SuperPoint docs * Added tests * Removed commented part * Commit to create and fix add_superpoint branch with a new branch * Fixed dummy_pt_objects * Committed missing files * Fixed README.md * Apply suggestions from code review Fixed small changes Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Moved ImagePointDescriptionOutput from modeling_outputs.py to modeling_superpoint.py * Removed AutoModelForKeypointDetection and related stuff * Fixed inconsistencies in image_processing_superpoint.py * Moved infer_on_model logic simply in test_inference * Fixed bugs, added labels to forward method with checks whether it is properly a None value, also added tests about this logic in test_modeling_superpoint.py * Added tests to SuperPointImageProcessor to ensure that images are properly converted to grayscale * Removed remaining mentions of MODEL_FOR_KEYPOINT_DETECTION_MAPPING * Apply suggestions from code review Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Fixed from (w, h) to (h, w) as input for tests * Removed unnecessary condition * Moved last_hidden_state to be the first returned * Moved last_hidden_state to be the first returned (bis) * Moved last_hidden_state to be the first returned (ter) * Switched image_width and image_height in tests to match recent changes * Added config as first SuperPointConvBlock init argument * Reordered README's after merge * Added missing first config argument to SuperPointConvBlock instantiations * Removed formatting error * Added SuperPoint to README's de, pt-br, ru, te and vi * Checked out README_fr.md * Fixed README_fr.md * Test fix README_fr.md * Test fix README_fr.md * Last make fix-copies ! * Updated checkpoint path * Removed unused SuperPoint doc * Added missing image * Update src/transformers/models/superpoint/modeling_superpoint.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Removed unnecessary import * Update src/transformers/models/superpoint/modeling_superpoint.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Added SuperPoint to _toctree.yml --------- Co-authored-by:
steven <steven.bucaillle@gmail.com> Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> Co-authored-by:
Steven Bucaille <steven.bucaille@buawei.com>
-
Arthur authored
* use user_defined_symbols * fixup * nit * add a very robust test * make sure all models are tested with the `pretrained_tokenizer_to_test` * should we make sure we test all of them? * merge * remove the id * fix test * update * ousies * oups * fixup * fix copies check * remove `pretrained_tokenizer_to_test`
-
- 18 Mar, 2024 1 commit
-
-
Yoach Lacombe authored
* first modeling code * make repository * still WIP * update model * add tests * add latest change * clean docstrings and copied from * update docstrings md and readme * correct chroma function * correct copied from and remove unreleated test * add doc to toctree * correct imports * add convert script to notdoctested * Add suggestion from Sanchit Co-authored-by:
Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * correct get_uncoditional_inputs docstrings * modify README according to SANCHIT feedback * add chroma to audio utils * clean librosa and torchaudio hard dependencies * fix FE * refactor audio decoder -> audio encoder for consistency with previous musicgen * refactor conditional -> encoder * modify sampling rate logics * modify license at the beginning * refactor all_self_attns->all_attentions * remove ignore copy from causallm generate * add copied from for from_sub_models * fix make copies * add warning if audio is truncated * add copied from where relevant * remove artefact * fix convert script * fix torchaudio and FE * modify chroma method according to feedback-> better naming * refactor input_values->input_features * refactor input_values->input_features and fix import fe * add input_features to docstrigs * correct inputs_embeds logics * remove dtype conversion * refactor _prepare_conditional_hidden_states_kwargs_for_generation ->_prepare_encoder_hidden_states_kwargs_for_generation * change warning for chroma length * Update src/transformers/models/musicgen_melody/convert_musicgen_melody_transformers.py Co-authored-by:
Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * change way to save wav, using soundfile * correct docs and change to soundfile * fix import * fix init proj layers * remove line breaks from md * fix issue with docstrings * add FE suggestions * improve is in logics and remove useless imports * remove custom from_pretrained * simplify docstring code * add suggestions for modeling tests * make style * update converting script with sanity check * remove encoder attention mask from conditional generation * replace musicgen melody checkpoints with official orga * rename ylacombe->facebook in checkpoints * fix copies * remove unecessary warning * add shape in code docstrings * add files to slow doc tests * fix md bug and add md to not_tested * make fix-copies * fix hidden states test and batching --------- Co-authored-by:
Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
-
- 15 Mar, 2024 2 commits
-
-
Yoach Lacombe authored
* fix speech_to_test generation tests * Add details to comment * Update tests/models/speech_to_text/test_modeling_speech_to_text.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Co-authored-by:
Yih-Dar <2521628+ydshieh@users.noreply.github.com> Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
Saurabh Dash authored
* Cohere Model Release (#1) Cohere Model Release * Remove unnecessary files and code (#2) Some cleanup * Delete cohere-model directory (#3) * Make Fix (#5) * Pr fixes (#6) * fixes for pr * pr fixes for the format * pr fixes for the format * src/transformers/models/auto/tokenization_auto.py * Tokenizer test (#8) * tokenizer test * format fix * Adding Docs and other minor changes (#7) * Add modeling tests (#9) * Smol Fix (#11) * tokenization tests are fixed * format fixes * fix pr doc tests * fix pr doc tests * fix pr doc tests * fix pr style check * small changes in cohere.md * FIX: Address final comments for transformers integration (#13) * fix modeling final nits and add proper test file * for now leave empty tests * add integration test * push new test * fix modeling cohere (#14) * Update chat templates to use the new API (#15) --------- Co-authored-by:
ahmetustun <ahmetustun89@gmail.com> Co-authored-by:
Younes Belkada <49240599+younesbelkada@users.noreply.github.com> Co-authored-by:
Matt <Rocketknight1@users.noreply.github.com>
-
- 14 Mar, 2024 2 commits
-
-
Yih-Dar authored
* update --------- Co-authored-by:ydshieh <ydshieh@users.noreply.github.com>
-
Yih-Dar authored
* add arg --------- Co-authored-by:ydshieh <ydshieh@users.noreply.github.com>
-
- 13 Mar, 2024 5 commits
-
-
Nate Cibik authored
* Added pytests for pvt-v2, all passed * Added pvt_v2 to docs/source/end/model_doc * Ran fix-copies and fixup. All checks passed * Added additional ReLU for linear attention mode * pvt_v2_b2_linear converted and working * copied models/pvt to adapt to pvt_v2 * First commit of pvt_v2 * PvT-v2 now works in AutoModel * Reverted batch eval changes for PR * Expanded type support for Pvt-v2 config * Fixed config docstring. Added channels property * Fixed model names in tests * Fixed config backbone compat. Added additional type support for image size in config * Fixed config backbone compat * Allowed for batching of eval metrics * copied models/pvt to adapt to pvt_v2 * First commit of pvt_v2 * Set key and value layers to use separate linear modules. Fixed pruning function * Set AvgPool to 7 * Fixed issue in init * PvT-v2 now works in AutoModel * Successful conversion of pretrained weights for PVT-v2 * Successful conversion of pretrained weights for PVT-v2 models * Added pytests for pvt-v2, all passed * Ran fix-copies and fixup. All checks passed * Added additional ReLU for linear attention mode * pvt_v2_b2_linear converted and working * Allowed for batching of eval metrics * copied models/pvt to adapt to pvt_v2 * First commit of pvt_v2 * Set key and value layers to use separate linear modules. Fixed pruning function * Set AvgPool to 7 * Fixed issue in init * PvT-v2 now works in AutoModel * Successful conversion of pretrained weights for PVT-v2 * Successful conversion of pretrained weights for PVT-v2 models * Added pytests for pvt-v2, all passed * Ran fix-copies and fixup. All checks passed * Added additional ReLU for linear attention mode * pvt_v2_b2_linear converted and working * Reverted batch eval changes for PR * Updated index.md * Expanded type support for Pvt-v2 config * Fixed config docstring. Added channels property * Fixed model names in tests * Fixed config backbone compat * Ran fix-copies * Fixed PvtV2Backbone tests * Added TFRegNet to OBJECTS_TO_IGNORE in check_docstrings.py * Fixed backbone stuff and fixed tests: all passing * Ran make fixup * Made modifications for code checks * Remove ONNX config from configuration_pvt_v2.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Use explicit image size dict in test_modeling_pvt_v2.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Make image_size optional in test_modeling_pvt_v2.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Remove _ntuple use in modeling_pvt_v2.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Remove reference to fp16_enabled * Model modules now take config as first argument even when not used * Replaced abbreviations for "SR" and "AP" with explicit "spatialreduction" and "averagepooling" * All LayerNorm now instantiates with config.layer_norm_eps * Added docstring for depth-wise conv layer * PvtV2Config now only takes Union[int, Tuple[int, int]] for image size * Refactored PVTv2 in prep for gradient checkpointing * Gradient checkpointing ready to test * Removed override of _set_gradient_checkpointing * Cleaned out old code * Applied code fixup * Applied code fixup * Began debug of pvt_v2 tests * Leave handling of num_labels to base pretrained config class * Deactivated gradient checkpointing tests until it is fixed * Removed PvtV2ImageProcessor which duped PvtImageProcessor * Allowed for batching of eval metrics * copied models/pvt to adapt to pvt_v2 * First commit of pvt_v2 * Set key and value layers to use separate linear modules. Fixed pruning function * Set AvgPool to 7 * Fixed issue in init * PvT-v2 now works in AutoModel * Successful conversion of pretrained weights for PVT-v2 * Successful conversion of pretrained weights for PVT-v2 models * Added pytests for pvt-v2, all passed * Added pvt_v2 to docs/source/end/model_doc * Ran fix-copies and fixup. All checks passed * Added additional ReLU for linear attention mode * pvt_v2_b2_linear converted and working * copied models/pvt to adapt to pvt_v2 * First commit of pvt_v2 * PvT-v2 now works in AutoModel * Reverted batch eval changes for PR * Expanded type support for Pvt-v2 config * Fixed config docstring. Added channels property * Fixed model names in tests * Fixed config backbone compat. Added additional type support for image size in config * Fixed config backbone compat * Allowed for batching of eval metrics * copied models/pvt to adapt to pvt_v2 * First commit of pvt_v2 * Set key and value layers to use separate linear modules. Fixed pruning function * Set AvgPool to 7 * Fixed issue in init * PvT-v2 now works in AutoModel * Successful conversion of pretrained weights for PVT-v2 * Successful conversion of pretrained weights for PVT-v2 models * Added pytests for pvt-v2, all passed * Ran fix-copies and fixup. All checks passed * Added additional ReLU for linear attention mode * pvt_v2_b2_linear converted and working * Allowed for batching of eval metrics * copied models/pvt to adapt to pvt_v2 * First commit of pvt_v2 * Set key and value layers to use separate linear modules. Fixed pruning function * Set AvgPool to 7 * Fixed issue in init * PvT-v2 now works in AutoModel * Successful conversion of pretrained weights for PVT-v2 * Successful conversion of pretrained weights for PVT-v2 models * Added pytests for pvt-v2, all passed * Ran fix-copies and fixup. All checks passed * Added additional ReLU for linear attention mode * pvt_v2_b2_linear converted and working * Reverted batch eval changes for PR * Expanded type support for Pvt-v2 config * Fixed config docstring. Added channels property * Fixed model names in tests * Fixed config backbone compat * Ran fix-copies * Fixed PvtV2Backbone tests * Added TFRegNet to OBJECTS_TO_IGNORE in check_docstrings.py * Fixed backbone stuff and fixed tests: all passing * Ran make fixup * Made modifications for code checks * Remove ONNX config from configuration_pvt_v2.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Use explicit image size dict in test_modeling_pvt_v2.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Make image_size optional in test_modeling_pvt_v2.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Remove _ntuple use in modeling_pvt_v2.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Remove reference to fp16_enabled * Model modules now take config as first argument even when not used * Replaced abbreviations for "SR" and "AP" with explicit "spatialreduction" and "averagepooling" * All LayerNorm now instantiates with config.layer_norm_eps * Added docstring for depth-wise conv layer * PvtV2Config now only takes Union[int, Tuple[int, int]] for image size * Refactored PVTv2 in prep for gradient checkpointing * Gradient checkpointing ready to test * Removed override of _set_gradient_checkpointing * Cleaned out old code * Applied code fixup * Applied code fixup * Allowed for batching of eval metrics * copied models/pvt to adapt to pvt_v2 * First commit of pvt_v2 * PvT-v2 now works in AutoModel * Ran fix-copies and fixup. All checks passed * copied models/pvt to adapt to pvt_v2 * First commit of pvt_v2 * PvT-v2 now works in AutoModel * Reverted batch eval changes for PR * Fixed config docstring. Added channels property * Fixed config backbone compat * Allowed for batching of eval metrics * copied models/pvt to adapt to pvt_v2 * First commit of pvt_v2 * PvT-v2 now works in AutoModel * Ran fix-copies and fixup. All checks passed * Allowed for batching of eval metrics * copied models/pvt to adapt to pvt_v2 * First commit of pvt_v2 * PvT-v2 now works in AutoModel * Fixed config backbone compat * Ran fix-copies * Began debug of pvt_v2 tests * Leave handling of num_labels to base pretrained config class * Deactivated gradient checkpointing tests until it is fixed * Removed PvtV2ImageProcessor which duped PvtImageProcessor * Fixed issue from rebase * Fixed issue from rebase * Set tests for gradient checkpointing to skip those using reentrant since it isn't supported * Fixed issue from rebase * Fixed issue from rebase * Changed model name in docs * Removed duplicate PvtV2Backbone * Work around type switching issue in tests * Fix model name in config comments * Update docs/source/en/model_doc/pvt_v2.md Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * Changed name of variable from 'attn_reduce' to 'sr_type' * Changed name of variable from 'attn_reduce' to 'sr_type' * Changed from using 'sr_type' to 'linear_attention' for clarity * Update src/transformers/models/pvt_v2/modeling_pvt_v2.py Removed old code * Changed from using 'sr_type' to 'linear_attention' for clarity * Fixed Class names to be more descriptive * Update src/transformers/models/pvt_v2/modeling_pvt_v2.py Removed outdated code * Moved paper abstract to single line in pvt_v2.md * Added usage tips to pvt_v2.md * Simplified module inits by passing layer_idx * Fixed typing for hidden_act in PvtV2Config * Removed unusued import * Add pvt_v2 to docs/source/en/_toctree.yml * Updated documentation in docs/source/en/model_doc/pvt_v2.md to be more comprehensive. * Updated documentation in docs/source/en/model_doc/pvt_v2.md to be more comprehensive. * Update src/transformers/models/pvt_v2/modeling_pvt_v2.py Move function parameters to single line Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/pvt_v2/modeling_pvt_v2.py Update year of copyright to 2024 Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/pvt_v2/modeling_pvt_v2.py Make code more explicit Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Updated sr_ratio to be more explicit spatial_reduction_ratio * Removed excess type hints in modeling_pvt_v2.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Move params to single line in modeling_pvt_v2.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Removed needless comment in modeling_pvt_v2.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update copyright date in pvt_v2.md Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Moved params to single line in modeling_pvt_v2.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Updated copyright date in configuration_pvt_v2.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Cleaned comments in modeling_pvt_v2.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Renamed spatial_reduction Conv2D operation * Revert "Update src/transformers/models/pvt_v2/modeling_pvt_v2.py " This reverts commit c4a04416dde8f3475ab405d1feb368600e0f8538. * Updated conversion script to reflect module name change * Deprecated reshape_last_stage option in config * Removed unused imports * Code formatting * Fixed outdated decorators on test_inference_fp16 * Added "Copied from" comments in test_modeling_pvt_v2.py * Fixed import listing * Updated model name * Force empty commit for PR refresh * Fixed linting issue * Removed # Copied from comments * Added PVTv2 to README_fr.md * Ran make fix-copies * Replace all FoamoftheSea hub references with OpenGVLab * Fixed out_indices and out_features logic in configuration_pvt_v2.py * Made ImageNet weight conversion verification optional in convert_pvt_v2_to_pytorch.py * Ran code fixup * Fixed order of parent classes in PvtV2Config to fix the to_dict method override --------- Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com>
-
Yih-Dar authored
update Co-authored-by:ydshieh <ydshieh@users.noreply.github.com>
-
Raushan Turganbay authored
* fix batchinng tests for new models * Update tests/models/seggpt/test_modeling_seggpt.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
Lysandre Debut authored
* Adds pretrained IDs directly in the tests * Fix tests * Fix tests * Review!
-
bytebarde authored
* initial implementation of flash attention for gptj * modify flash attention and overwrite test_flash_attn_2_generate_padding_right * update flash attention support list * remove the copy line in the `CodeGenBlock` * address copy mechanism * Update src/transformers/models/gptj/modeling_gptj.py Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * Add GPTJ attention classes * add expected outputs in the gptj test * Ensure repo consistency with 'make fix-copies' --------- Co-authored-by:
Younes Belkada <49240599+younesbelkada@users.noreply.github.com> Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com>
-
- 12 Mar, 2024 2 commits
-
-
Raushan Turganbay authored
* add tests for batching support * Update src/transformers/models/fastspeech2_conformer/modeling_fastspeech2_conformer.py Co-authored-by:
Joao Gante <joaofranciscocardosogante@gmail.com> * Update src/transformers/models/fastspeech2_conformer/modeling_fastspeech2_conformer.py Co-authored-by:
Joao Gante <joaofranciscocardosogante@gmail.com> * Update tests/test_modeling_common.py Co-authored-by:
Joao Gante <joaofranciscocardosogante@gmail.com> * Update tests/test_modeling_common.py Co-authored-by:
Joao Gante <joaofranciscocardosogante@gmail.com> * Update tests/test_modeling_common.py Co-authored-by:
Joao Gante <joaofranciscocardosogante@gmail.com> * fixes and comments * use cosine distance for conv models * skip mra model testing * Update tests/models/vilt/test_modeling_vilt.py Co-authored-by:
Joao Gante <joaofranciscocardosogante@gmail.com> * finzalize and make style * check model type by input names * Update tests/models/vilt/test_modeling_vilt.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * fixed batch size for all testers * Revert "fixed batch size for all testers" This reverts commit 525f3a0a058f069fbda00352cf202b728d40df99. * add batch_size for all testers * dict from model output * do not skip layoutlm * bring back some code from git revert * Update tests/test_modeling_common.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update tests/test_modeling_common.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * clean-up * where did minus go in tolerance * make whisper happy * deal with consequences of losing minus * deal with consequences of losing minus * maskformer needs its own test for happiness * fix more models * tag flaky CV models from Amy's approval * make codestyle --------- Co-authored-by:
Joao Gante <joaofranciscocardosogante@gmail.com> Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
Yih-Dar authored
* update * update * update --------- Co-authored-by:ydshieh <ydshieh@users.noreply.github.com>
-
- 11 Mar, 2024 1 commit
-
-
Arthur authored
* post merge update * nit * oups
-
- 08 Mar, 2024 2 commits
-
-
Fanli Lin authored
* add sacremoses check * fix style * for FlaubertTokenizer * HerbertTokenizer fix * add typeHint * Update src/transformers/testing_utils.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * make less skipped * make quality * remove import --------- Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
Joao Gante authored
* left-padding test revisited * Apply suggestions from code review Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> --------- Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com>
-
- 07 Mar, 2024 3 commits
-
-
Nick DeGroot authored
*
🐛 Fix vision encoder decoder positional arg *✅ Add test for VisionEncoderDecoder with LayoutLMv3 encoder --------- Co-authored-by:Nick DeGroot <1966472+nickthegroot@users.noreply.github.com>
-
Raushan Turganbay authored
* flava multimodal add attn mask * make style * check mask is not None
-
regisss authored
* Enable BLIP for auto VQA * Make style * Add VQA to BLIP pipeline tests
-
- 05 Mar, 2024 5 commits
-
-
Arthur authored
* initial-commit * start cleaning * small nits * small nits * current updates * add kernels * small refactoring little step * add comments * styling * nit * nits * Style * Small changes * Push dummy mambda simple slow * nit * Use original names * Use original names and remove norm * Updates for inference params * Style nd updates * nits * Match logits * Add a test * Add expected generated text * nits doc, imports and styling * style * oups * dont install kernels, invite users to install the required kernels * let use use the original packages * styling * nits * fix some copieds * update doc * fix-copies * styling done * nits * fix import check * run but wrong cuda ress * mamba CUDA works :) * fix the fast path * config naming nits * conversion script is not required at this stage * finish fixing the fast path: generation make sense now! * nit * Let's start working on the CIs * style * better style * more nits * test nit * quick fix for now * nits * nit * nit * nit * nits * update test rest * fixup * update test * nit * some fixes * nits * update test values * fix styling * nit * support peft * integrations tests require torchg * also add slow markers * styling * chose forward wisely * nits * update tests * fix gradient checkpointing * fixup * nit * fix doc * check copies * fix the docstring * fix some more tests * style * fix beam search * add init schene * update * nit * fix * fixup the doc * fix the doc * fixup * tentative update but slow is no longer good * nit * should we always use float32? * nits * revert wrong changes * res in float32 * cleanup * skip fmt for now * update generation values * update test values running original model * fixup * update tests + rename inference_params to cache_params + make sure training does not use cache_params * small nits * more nits * fix final CIs * style * nit doc * I hope final doc nits * nit * 🫠 * final touch! * fix torch import * Apply suggestions from code review Co-authored-by:
Lysandre Debut <hi@lysand.re> * Apply suggestions from code review * fix fix and fix * fix base model prefix! * nit * Update src/transformers/models/mamba/__init__.py * Update docs/source/en/model_doc/mamba.md Co-authored-by:
Lysandre Debut <hi@lysand.re> * nit --------- Co-authored-by:
Lysandre Debut <hi@lysand.re>
-
Arthur authored
* fix udop imports * sort imports
-
Arthur authored
* style * revert with RP * nit * exact revert
-
Arthur Zucker authored
-
Arthur authored
* update * ... * nits * arf * 🧼 * beat the last guy * style everyone
-
- 04 Mar, 2024 3 commits
-
-
NielsRogge authored
* First draft * More improvements * More improvements * More fixes * Fix copies * More improvements * More fixes * More improvements * Convert checkpoint * More improvements, set up tests * Fix more tests * Add UdopModel * More improvements * Fix equivalence test * More fixes * Redesign model * Extend conversion script * Use real inputs for conversion script * Add image processor * Improve conversion script * Add UdopTokenizer * Add fast tokenizer * Add converter * Update README's * Add processor * Add fully fledged tokenizer * Add fast tokenizer * Use processor in conversion script * Add tokenizer tests * Fix one more test * Fix more tests * Fix tokenizer tests * Enable fast tokenizer tests * Fix more tests * Fix additional_special_tokens of fast tokenizer * Fix tokenizer tests * Fix more tests * Fix equivalence test * Rename image to pixel_values * Rename seg_data to bbox * More renamings * Remove vis_special_token * More improvements * Add docs * Fix copied from * Update slow tokenizer * Update fast tokenizer design * Make text input optional * Add first draft of processor tests * Fix more processor tests * Fix decoder_start_token_id * Fix test_initialization * Add integration test * More improvements * Improve processor, add test * Add more copied from * Add more copied from * Add more copied from * Add more copied from * Remove print statement * Update README and auto mapping * Delete files * Delete another file * Remove code * Fix test * Fix docs * Remove asserts * Add doc tests * Include UDOP in exotic model tests * Add expected tesseract decodings * Add sentencepiece * Use same design as T5 * Add UdopEncoderModel * Add UdopEncoderModel to tests * More fixes * Fix fast tokenizer * Fix one more test * Remove parallelisable attribute * Fix copies * Remove legacy file * Copy from T5Tokenizer * Fix rebase * More fixes, copy from T5 * More fixes * Fix init * Use ArthurZ/udop for tests * Make all model tests pass * Remove UdopForConditionalGeneration from auto mapping * Fix more tests * fixups * more fixups * fix the tokenizers * remove un-necessary changes * nits * nits * replace truncate_sequences_boxes with truncate_sequences for fix-copies * nit current path * add a test for input ids * ids that we should get taken from c9f7a32f57440d90ff79890270d376a1cc0acb68 * nits converting * nits * apply ruff * nits * nits * style * fix slow order of addition * fix udop fast range as well * fixup * nits * Add docstrings * Fix gradient checkpointing * Update code examples * Skip tests * Update integration test * Address comment * Make fixup * Remove extra ids from tokenizer * Skip test * Apply suggestions from code review Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update year * Address comment * Address more comments * Address comments * Add copied from * Update CI * Rename script * Update model id * Add AddedToken, skip tests * Update CI * Fix doc tests * Do not use Tesseract for the doc tests * Remove kwargs * Add original inputs * Update casting * Fix doc test * Update question * Update question * Use LayoutLMv3ImageProcessor * Update organization * Improve docs * Update forward signature * Make images optional * Remove deprecated device argument * Add comment, add add_prefix_space * More improvements * Remove kwargs --------- Co-authored-by:
ArthurZucker <arthur.zucker@gmail.com> Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com>
-
Donggeun Yu authored
* Update ms_deform_attn_cuda.cu * Update ms_deform_attn_cuda.cuh * Update modeling_deformable_detr.py * Update src/transformers/models/deformable_detr/modeling_deformable_detr.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update modeling_deformable_detr.py * python utils/check_copies.py --fix_and_overwrite * Fix dtype missmatch error * Update test_modeling_deformable_detr.py * Update test_modeling_deformable_detr.py * Update modeling_deformable_detr.py * Update modeling_deformable_detr.py * Support DeformableDETR with bfloat16 * Add test code * Use AT_DISPATCH_FLOATING_TYPES_AND2 Use AT_DISPATCH_FLOATING_TYPES_AND2 * Update tests/models/deformable_detr/test_modeling_deformable_detr.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update tests/models/deformable_detr/test_modeling_deformable_detr.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Fix not found require_torch_bf16 function --------- Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
Nick DeGroot authored
*
🐛 Fix oneformer instance post processing when using panoptic task type *✅ Add unit test for oneformer instance post processing panoptic bug --------- Co-authored-by:Nick DeGroot <1966472+nickthegroot@users.noreply.github.com>
-
- 01 Mar, 2024 1 commit
-
-
Marc Sun authored
-