- 28 Mar, 2024 1 commit
Minseo Kang authored
- 25 Mar, 2024 1 commit
Lysandre Debut authored

* [test_all] Remove static pretrained maps from the library's internals
* Deprecate archive maps instead of removing them
* Revert init changes
* [test_all] Deprecate instead of removing
* [test_all] PVT v2 support
* [test_all] Tests should all pass
* [test_all] Style
* Address review comments
* Update src/transformers/models/deprecated/_archive_maps.py
* Update src/transformers/models/deprecated/_archive_maps.py
* [test_all] trigger tests
* [test_all] LLAVA
* [test_all] Bad rebase

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
- 16 Feb, 2024 1 commit
Lysandre Debut authored

* Script & Manual edition
* Update
- 01 Feb, 2024 1 commit
JB (Don) authored

* Adding [T5/MT5/UMT5]ForTokenClassification
* Add auto mappings for T5ForTokenClassification and variants
* Adding ForTokenClassification to the list of models
* Adding attention_mask param to the T5ForTokenClassification test
* Remove outdated comment in test
* Adding EncoderOnly and Token Classification tests for MT5 and UMT5
* Fix typo in umt5 string
* Add tests for all the existing MT5 models
* Fix wrong comment in dependency_versions_table
* Reverting change to common test for _keys_to_ignore_on_load_missing: the test is correctly picking up redundant keys in _keys_to_ignore_on_load_missing
* Removing _keys_to_ignore_on_missing from MT5 since the key is not used in the model
* Add fix-copies to MT5ModelTest
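For orientation, a minimal usage sketch of the new head; the checkpoint name, label count, and example sentence are illustrative choices, not part of the commit:

```python
# Hypothetical example: "t5-small" and num_labels=2 are illustrative choices.
from transformers import AutoTokenizer, T5ForTokenClassification

tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = T5ForTokenClassification.from_pretrained("t5-small", num_labels=2)

inputs = tokenizer("T5 now supports token classification", return_tensors="pt")
logits = model(**inputs).logits  # one row of label logits per input token
```

The head is encoder-only, so plain `input_ids` (plus `attention_mask`) are all the forward call needs.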
- 02 Nov, 2023 1 commit
Pietro Lesci authored

* remove redundant code
* update
* add typecasting
* make `attention_mask` float again
- 01 Nov, 2023 1 commit
Lysandre Debut authored

Fix disk offload tests + weight sharing issues
- 27 Oct, 2023 1 commit
Younes Belkada authored

* fix
* more fixes
* fix other models
* fix long t5
* use `gradient_checkpointing_func` instead
* fix copies
* set `gradient_checkpointing_func` as a private attribute and retrieve previous behaviour
* Update src/transformers/modeling_utils.py
* replace it with `is_gradient_checkpointing_set`
* remove default
* Update src/transformers/modeling_utils.py
* fixup

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
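A hedged sketch of the pattern this refactor converges on; the `TinyEncoder` class is illustrative, not transformers code. The model stores one checkpointing callable and layer calls are routed through it, replacing the per-model `create_custom_forward` closures removed in the 25 Oct commit below:

```python
import torch
from functools import partial
from torch.utils.checkpoint import checkpoint


class TinyEncoder(torch.nn.Module):
    """Illustrative stand-in for a transformer encoder stack."""

    def __init__(self):
        super().__init__()
        self.layers = torch.nn.ModuleList(torch.nn.Linear(8, 8) for _ in range(2))
        self.gradient_checkpointing = False
        self._gradient_checkpointing_func = None

    def gradient_checkpointing_enable(self):
        # The checkpointing callable is stored once on the model ...
        self.gradient_checkpointing = True
        self._gradient_checkpointing_func = partial(checkpoint, use_reentrant=False)

    def forward(self, hidden_states):
        for layer in self.layers:
            if self.gradient_checkpointing and self.training:
                # ... and every layer invocation is routed through it.
                hidden_states = self._gradient_checkpointing_func(layer.__call__, hidden_states)
            else:
                hidden_states = layer(hidden_states)
        return hidden_states
```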
- 25 Oct, 2023 1 commit
Younes Belkada authored

* v1
* fix
* remove `create_custom_forward`
* fixup
* fixup
* add test and fix all failing GC tests
* remove all remaining `create_custom_forward` methods
* fix idefics bug
* fixup
* replace with `__call__`
* add comment
* quality
- 12 Oct, 2023 1 commit
Tom Aarsen authored

Add missing spaces in adjacent strings
- 11 Oct, 2023 1 commit
Billy Bradley authored

In assisted decoding, pass model_kwargs to model's forward call (fix prepare_inputs_for_generation in all models) (#25242)

* In assisted decoding, pass model_kwargs to model's forward call

  Previously, assisted decoding would ignore any additional kwargs that it doesn't explicitly handle. This was inconsistent with other generation methods, which pass the model_kwargs through prepare_inputs_for_generation and forward the returned dict to the model's forward call. The prepare_inputs_for_generation method needs to be amended in all models, as previously it only kept the last input ID when past_key_values was passed.

* Improve variable names in _extend_attention_mask
* Refactor extending token_type_ids into a function
* Replace deepcopy with copy to optimize performance
* Update new persimmon model with llama changes for assisted generation
* Update new mistral model for assisted generation with prepare_inputs_for_generation
* Update position_ids creation in falcon prepare_inputs_for_generation to support assisted generation
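A simplified sketch, not the actual transformers code, of the convention this fix brings assisted decoding in line with: inputs go through `prepare_inputs_for_generation`, and the returned dict, extra kwargs included, is forwarded wholesale.

```python
# Illustrative signature; real models return model-specific keys here.
def prepare_inputs_for_generation(input_ids, past_key_values=None, **model_kwargs):
    if past_key_values is not None:
        # With a cache, only the not-yet-processed tokens are fed to the model.
        input_ids = input_ids[:, -1:]
    return {"input_ids": input_ids, "past_key_values": past_key_values, **model_kwargs}


# Every generation method then does the equivalent of:
#   model_inputs = prepare_inputs_for_generation(input_ids, **model_kwargs)
#   outputs = model(**model_inputs)
```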
- 28 Sep, 2023 1 commit
fleance authored

Do not warn about unexpected decoder weights when loading T5EncoderModel and LongT5EncoderModel (#26211)

Ignore decoder weights when using T5EncoderModel and LongT5EncoderModel. Both classes have no decoder layers, so loading a pretrained checkpoint such as t5-small warns about keys found in the checkpoint that are not in the model itself. To silence this warning, r"decoder" has been added to _keys_to_ignore_on_load_unexpected for both T5EncoderModel and LongT5EncoderModel.
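The mechanism, sketched with an abbreviated class body (not the full definition): a model class lists regex patterns for checkpoint keys that `from_pretrained` should drop without warning.

```python
from transformers.models.t5.modeling_t5 import T5PreTrainedModel


class T5EncoderModel(T5PreTrainedModel):  # body abbreviated for illustration
    # Checkpoint keys matching these patterns (here: every decoder weight)
    # are skipped silently instead of triggering "unexpected keys" warnings.
    _keys_to_ignore_on_load_unexpected = [r"decoder"]
```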
- 25 Jul, 2023 1 commit
Sebastian Husch Lee authored

* Initial addition of t5forsequenceclassification
* Adding imports and adding tests
* Formatting
* Running make fix-copies
* Adding mt5forseq
* Formatting
* run make fix-copies
* Adding to docs
* Add model_parallel
* Fix bug
* Fix
* Remove TODO
* Fixing tests for T5ForSequenceClassification
* Undo changes to dependency_versions_table.py
* Change classification head to work with T5Config directly
* Change seq length to let tests pass
* PR comments for formatting
* Formatting
* Initial addition of UMT5ForSequenceClassification
* Adding to inits and formatting
* run make fix-copies
* Add doc for UMT5ForSeqClass
* Update UMT5 config
* Fix docs
* Skip torch fx test for SequenceClassification
* Formatting
* Add skip to UMT5 tests as well
* Fix umt5 tests
* Running make fix-copies
* PR comments
* Fix for change to sentence_representation
* Rename seq_len to hidden_size since that's what it is
* Use base_model to follow format of the rest of the library
* Update docs
* Extract the decoder_input_ids changes and make one liner
* Make one-liner
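A minimal usage sketch for the new head; the checkpoint, label count, and input text are illustrative choices. The sketch assumes that when no `decoder_input_ids` are supplied, the model derives them from `input_ids`, so a plain tokenized sentence is enough:

```python
# Hypothetical example: "t5-small" and num_labels=3 are illustrative choices.
from transformers import AutoTokenizer, T5ForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = T5ForSequenceClassification.from_pretrained("t5-small", num_labels=3)

inputs = tokenizer("A sentence to classify", return_tensors="pt")
logits = model(**inputs).logits  # shape: (batch_size, num_labels)
```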
- 10 Jul, 2023 1 commit
Sebastian Husch Lee authored

[`T5`] Adding model_parallel = False to `T5ForQuestionAnswering` and `MT5ForQuestionAnswering` (#24684)

Adding model_parallel = False
- 27 Jun, 2023 2 commits
Sylvain Gugger authored

* Preliminary work on some models
* Fix test load missing and make sure nonpersistent buffers are tested
* Always ignore nonpersistent buffers if in state_dict
* Treat models
* More models
* Treat remaining models
* Fix quality
* Fix tests
* Remove draft
* This test is not needed anymore
* Fix copies
* Fix last test
* Newly added models
* Fix last tests
* Address review comments
Sebastian authored
* Adding T5ForQuestionAnswering
* Changed weight initialization that results in better initial loss when fine-tuning
* Update to class variables
* Running make fixup
* Running make fix-copies
* Remove model_parallel
* Adding MT5ForQuestionAnswering
* Adding docs
* Fix wrong doc
* Update src/transformers/models/mt5/modeling_mt5.py
* Update src/transformers/models/t5/modeling_t5.py
* File formatting
* Undoing change

Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
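A minimal usage sketch for the new extractive QA head; the checkpoint and question/context pair are illustrative, and the sketch assumes the model builds its own `decoder_input_ids` when none are passed:

```python
# Hypothetical example: the checkpoint choice is illustrative.
from transformers import AutoTokenizer, T5ForQuestionAnswering

tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = T5ForQuestionAnswering.from_pretrained("t5-small")

question, context = "Who proposed T5?", "T5 was proposed by Raffel et al."
inputs = tokenizer(question, context, return_tensors="pt")
outputs = model(**inputs)
# Extractive head: logits over input positions for the answer span.
start_logits, end_logits = outputs.start_logits, outputs.end_logits
```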
- 22 Jun, 2023 1 commit
Younes Belkada authored

Revert "Fix gradient checkpointing + fp16 autocast for most models (#24247)"

This reverts commit 285a4801.
- 21 Jun, 2023 1 commit
Younes Belkada authored

* fix gc bug
* continue PoC on OPT
* fixes
* :exploding_head:
* fix tests
* remove pytest.mark
* fixup
* forward contrib credits from discussions
* forward contrib credits from discussions
* reverting changes on untouched files

Co-authored-by: zhaoqf123 <zhaoqf123@users.noreply.github.com>
Co-authored-by: 7eu7d7 <7eu7d7@users.noreply.github.com>
- 13 Jun, 2023 1 commit
Sylvain Gugger authored

* First test
* Add info for all models
* style
* Repo consistency
* Fix last model and cleanup prints
* Repo consistency
* Use consistent function for detecting tied weights
- 12 May, 2023 1 commit
Susnato Dhar authored

replaced assert with raise ValueError for t5, switch_transformers, pix2struct, mt5, longt5, gptsan_japanese (#23273)

* replaced assert with raise ValueError
* one liner
* reverse one liner and cache-decoder check
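The pattern behind the change, sketched with an illustrative helper (not a line from the commit): `assert` statements vanish under `python -O`, so library-side validation raises explicit exceptions instead.

```python
def check_is_decoder(is_decoder: bool) -> None:
    # Before: assert is_decoder, "`past_key_values` can only be used by a decoder"
    if not is_decoder:
        raise ValueError("`past_key_values` can only be used by a decoder")
```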
- 20 Apr, 2023 1 commit
Aashiq Muhamed authored
- 03 Apr, 2023 1 commit
Younes Belkada authored

* enable PP for T5
* make fixup
* fix failing tests
- 15 Mar, 2023 1 commit
Prathik Rao authored

* t5 remove data dependency
* make style
* make fix-copies

Co-authored-by: Prathik Rao <prathikrao@microsoft.com>
- 09 Mar, 2023 1 commit
Nipun Jindal authored

* [21737][T5]: Fix gradient checkpoint bug
* Update src/transformers/models/mt5/modeling_mt5.py
* Update src/transformers/models/t5/modeling_t5.py

Co-authored-by: njindal <njindal@adobe.com>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
- 28 Feb, 2023 1 commit
Younes Belkada authored

* fix torchquant issue
* add tests
- 27 Feb, 2023 1 commit
Stas Bekman authored

* logger.warning_once
* style
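For context, the API this commit switches to; the message text below is illustrative. `warning_once` emits a given message only the first time it is hit, which matters for warnings that would otherwise fire on every forward pass:

```python
from transformers.utils import logging

logger = logging.get_logger(__name__)

for _ in range(3):
    # Printed once, not three times.
    logger.warning_once("`use_cache=True` is incompatible with gradient checkpointing.")
```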
- 07 Feb, 2023 2 commits
Arthur authored

* fix past renamed to past_key_value
* update more `past` that were skipped
* fixup
* remove changes made to rag
* refactor `_reorder_cache` to use `past_key_values`
* fix git `prepare_inputs_for_generation` to pass tests when false is needed in use_cache
Sylvain Gugger authored
* Deprecate parallelize API
* Add documentation
* Fix copies
- 06 Feb, 2023 1 commit
Sylvain Gugger authored

* Result of black 23.1
* Update target to Python 3.7
* Switch flake8 to ruff
* Configure isort
* Configure isort
* Apply isort with line limit
* Put the right black version
* adapt black in check copies
* Fix copies
- 24 Jan, 2023 1 commit
Younes Belkada authored

* attempts to fix:
  - upcast input for `T5DenseActDense`
  - add the condition `self.wo.weight.dtype != torch.int8`
  - added tests on `test/mixed_int8`
  - `make fixup`
* fix ci test
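A sketch of the guard described above, loosely following `T5DenseActDense.forward` (the helper function itself is illustrative): hidden states are cast back to the output projection's weight dtype, except when that weight is int8-quantized and the cast would be invalid.

```python
import torch


def project_out(wo: torch.nn.Linear, hidden_states: torch.Tensor) -> torch.Tensor:
    # Illustrative helper, not the actual transformers code.
    if (
        isinstance(wo.weight, torch.Tensor)
        and hidden_states.dtype != wo.weight.dtype
        and wo.weight.dtype != torch.int8  # the condition this fix adds
    ):
        hidden_states = hidden_states.to(wo.weight.dtype)
    return wo(hidden_states)
```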
- 23 Jan, 2023 1 commit
Sylvain Gugger authored

* Clean all models
* Style
* Last to remove
* address review comments
* Address review comments
- 08 Jan, 2023 1 commit
Arthur authored

* start cleanup
* more updates
* more models are affected
* more updates
* update generation utils
* style
* revert change that removed reorder cache
* update generation utils
* style
* style
* remove reorder cache
- 03 Jan, 2023 1 commit
ivanllt authored

Fix start_docstring for deparallelize method
- 15 Dec, 2022 1 commit
Lars Mennen authored

* Workaround for #20287: FlanT5-XXL 8bit support
* Make fix-copies
* revert unrelated change
* Don't apply to longt5 and switch transformers
- 13 Dec, 2022 1 commit
Younes Belkada authored

* add `keep_in_fp32_modules` support
* pass it as class attribute
* few modifs
  - make tests `slow`
  - fix logic
* better logic
* fix failing test
* `bfloat16` support
* Update src/transformers/modeling_utils.py
* fix
* simplify tests
* simplify tests
* fix test
* modify message
* more checks
* fix failing tests
* add more conditions
  - add `is_accelerate_available`
  - fixes pipeline tests that failed
* add suggestions
* Update src/transformers/modeling_utils.py
* fix failing `bnb` test
* add last safety checker

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
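The mechanism, sketched with an abbreviated class body: a model class lists module-name fragments that `from_pretrained` keeps in float32 even when the rest of the model loads in half precision. For T5 this covers the `wo` projections, which is what the FlanT5-XXL 8-bit workaround above builds on.

```python
from transformers import PreTrainedModel


class T5PreTrainedModel(PreTrainedModel):  # body abbreviated for illustration
    # Any module whose name contains "wo" stays in torch.float32 even when
    # loading with torch_dtype=torch.float16 or torch.bfloat16.
    _keep_in_fp32_modules = ["wo"]
```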
- 06 Dec, 2022 1 commit
Sourab Mangrulkar authored

* updating T5 and BART models to support Prefix Tuning
* `make fix-copies`
* address comments
* address comments
- 09 Nov, 2022 1 commit
Nicolas Patry authored

* Attempting to test automatically the `_keys_to_ignore`.
* Style.
* First fix pass.
* Moving test on its own.
* Another batch.
* Second round removing BatchNorm
* Fixing layoutlmv{2,3} + support older Python.
* Disable miss missing warning.
* Removing dodgy additions.
* Big pass.
* mbart.
* More corrections.
* Fixup.
* Updating test_correct_missing_keys
* Add escape hatch for when the head has no extra params so doesn't need the missing keys check.
* Fixing test.
* Greener.
* Green! (except for weird splinter bug).
* Adding a test about `named_parameters` usage.
* Shorten message.
* Apply suggestions from code review
* After rebase modifications.
* More explicit condition checking.
* Fixing slow tests issues.
* Remove extra pdb.
* Remove print.
* Attempt to make failure consistent + fixing roc_bert.
* Removing the seed (all tests passing with it).

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
- 06 Sep, 2022 2 commits
Ekagra Ranjan authored

* use tokenizer to output tensor
* add preprocessing for decoder_input_ids for bare T5Model
* add preprocessing to tf and flax
* linting
* linting
* Update src/transformers/models/t5/modeling_flax_t5.py
* Update src/transformers/models/t5/modeling_tf_t5.py
* Update src/transformers/models/t5/modeling_t5.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Had authored
* add position bias head masking if heads pruned
* fix pruning function in t5 encoder
* make style
* make fix-copies
* Revert added folder

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
- 27 Jul, 2022 1 commit
Yanming Wang authored
- 06 Jul, 2022 1 commit
ADAning authored

* Add ALL_LAYERNORM_LAYERS for LayerNorm
* fix bug of appending layer norm
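For context, a sketch of what the registration enables, assuming current module paths: utilities that special-case layer norms (for example the Trainer's default weight-decay grouping) consult `ALL_LAYERNORM_LAYERS`, so T5's RMS-style `T5LayerNorm` is appended to that list.

```python
from transformers.pytorch_utils import ALL_LAYERNORM_LAYERS
from transformers.models.t5.modeling_t5 import T5LayerNorm

# After this commit, modeling_t5.py does the equivalent of:
#   ALL_LAYERNORM_LAYERS.append(T5LayerNorm)
assert T5LayerNorm in ALL_LAYERNORM_LAYERS
```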