- 17 Oct, 2022 2 commits
-
-
Matt authored
* Update checkpoints to point to `facebook/`
* make fixup
-
Matt authored
* Partial TF port for ESM model
* Add ESM-TF tests
* Add the various imports for TF-ESM
* TF weight conversion almost ready
* Stop ignoring the decoder weights in PT
* Add tests and lots of fixes
* fix-copies
* Fix imports, add model docs
* Add get_vocab() to tokenizer
* Fix vocab links for pretrained files
* Allow multiple inputs with a sep
* Use EOS as SEP token because ESM vocab lacks SEP
* Correctly return special tokens mask from ESM tokenizer
* make fixup
* Stop testing unsupported embedding resizing
* Handle TF bias correctly
* Skip all models with slow tokenizers in the token classification test
* Fixing the batch/unbatcher of pipelines to accommodate the `None` being passed around.
* Fixing pipeline bug caused by slow tokenizer being different.
* Update src/transformers/models/esm/modeling_tf_esm.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* Update src/transformers/models/esm/modeling_tf_esm.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* Update src/transformers/models/esm/modeling_tf_esm.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* Update set_input_embeddings and the copyright notices

Co-authored-by: Your Name <you@example.com>
Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
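As a quick orientation for this entry, a minimal sketch of what the TF port enables, assuming the small public `facebook/esm2_t6_8M_UR50D` checkpoint (the `facebook/` names come from the checkpoint update above):

```python
from transformers import AutoTokenizer, TFEsmModel

# Illustrative usage of the new TF port. Note the tokenizer uses EOS where a
# SEP token would normally go, since the ESM vocab has no SEP (see the commit
# message above).
tokenizer = AutoTokenizer.from_pretrained("facebook/esm2_t6_8M_UR50D")
model = TFEsmModel.from_pretrained("facebook/esm2_t6_8M_UR50D")

inputs = tokenizer("MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ", return_tensors="tf")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (batch, seq_len, hidden_size)
```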
-
- 30 Sep, 2022 1 commit
-
-
Matt authored
* Rebase ESM PR and update all file formats
* Fix test relative imports
* Add __init__.py to the test dir
* Disable gradient checkpointing
* Remove references to TFESM... FOR NOW >:|
* Remove completed TODOs from tests
* Convert docstrings to mdx, fix-copies from BERT
* fix-copies for the README and index
* Update ESM's __init__.py to the modern format
* Add to _toctree.yml
* Ensure we correctly copy the pad_token_id from the original ESM model
* Ensure we correctly copy the pad_token_id from the original ESM model
* Tiny grammar nitpicks
* Make the layer norm after embeddings an optional flag
* Make the layer norm after embeddings an optional flag
* Update the conversion script to handle other model classes
* Remove token_type_ids entirely, fix attention_masking and add checks to convert_esm.py
* Break the copied from link from BertModel.forward to remove token_type_ids
* Remove debug array saves
* Begin ESM-2 porting
* Add a hacky workaround for the precision issue in original repo
* Code cleanup
* Remove unused checkpoint conversion code
* Remove unused checkpoint conversion code
* Fix copyright notices
* Get rid of all references to the TF weights conversion
* Remove token_type_ids from the tests
* Fix test code
* Update src/transformers/__init__.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/__init__.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update README.md Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Add credit
* Remove _ args and __ kwargs in rotary embedding
* Assertively remove asserts
* Replace einsum with torch.outer()
* Fix docstring formatting
* Remove assertions in tokenization
* Add paper citation to ESMModel docstring
* Move vocab list to single line
* Remove ESMLayer from init
* Add Facebook copyrights
* Clean up RotaryEmbedding docstring
* Fix docstring formatting
* Fix docstring for config object
* Add explanation for new config methods
* make fix-copies
* Rename all the ESM- classes to Esm-
* Update conversion script to allow pushing to hub
* Update tests to point at my repo for now
* Set config properly for tests
* Remove the gross hack that forced loss of precision in inv_freq and instead copy the data from the model being converted
* make fixup
* Update expected values for slow tests
* make fixup
* Remove EsmForCausalLM for now
* Remove EsmForCausalLM for now
* Fix padding idx test
* Updated README and docs with ESM-1b and ESM-2 separately (#19221)
* Updated README and docs with ESM-1b and ESM-2 separately
* Update READMEs, longer entry with 3 citations
* make fix-copies

Co-authored-by: Your Name <you@example.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Tom Sercu <tsercu@fb.com>
Co-authored-by: Your Name <you@example.com>
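One bullet in this entry, replacing einsum with `torch.outer()` in the rotary embeddings, is easy to illustrate; a minimal sketch with made-up tensors showing the two forms agree:

```python
import torch

t = torch.arange(6, dtype=torch.float32)
inv_freq = torch.rand(4)

# Before: einsum builds the rotary-embedding frequency table
freqs_einsum = torch.einsum("i,j->ij", t, inv_freq)
# After: torch.outer() expresses the same outer product more directly
freqs_outer = torch.outer(t, inv_freq)

assert torch.allclose(freqs_einsum, freqs_outer)
```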
-
- 14 Sep, 2022 1 commit
-
-
Sylvain Gugger authored
-
- 13 Sep, 2022 1 commit
-
-
Yih-Dar authored
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
- 05 Sep, 2022 1 commit
-
-
Sofia Oliveira authored
* Add type hints to XLM-Roberta-XL models
* Format
-
- 03 Aug, 2022 1 commit
-
-
LSinev authored
Comparisons like `version.parse(torch.__version__) > version.parse("1.6")` are True for torch==1.6.0+cu101 or torch==1.6.0+cpu; `version.parse(version.parse(torch.__version__).base_version)` is preferred (and available in pytorch_utils.py).
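A minimal demonstration of the pitfall and the preferred form:

```python
from packaging import version

# Local version segments (+cu101, +cpu) sort above the plain release, so the
# naive comparison is True even for builds of 1.6.0 itself:
print(version.parse("1.6.0+cu101") > version.parse("1.6"))  # True

# Stripping to base_version first gives the intended answer:
v = version.parse("1.6.0+cu101")
print(version.parse(v.base_version) > version.parse("1.6"))  # False (1.6.0 == 1.6)
```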
-
- 12 May, 2022 1 commit
-
-
Sylvain Gugger authored
* Black preview
* Fixup too!
* Fix check copies
* Use the same version as the CI
* Bump black
-
- 04 May, 2022 1 commit
-
-
karthikrangasai authored
* Type hint complete Albert model file.
* Update typing.
* Update src/transformers/models/albert/modeling_albert.py Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
-
- 03 May, 2022 1 commit
-
-
Pavel Belevich authored
-
- 12 Apr, 2022 1 commit
-
-
Anmol Joshi authored
* Moved functions to pytorch_utils.py
* isort formatting
* Reverted tf changes
* isort, make fix-copies
* documentation fix
* Fixed Conv1D import
* Reverted research examples file
* backward compatibility for pytorch_utils
* missing import
* isort fix
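A sketch of what the backward-compatibility bullet implies, assuming `modeling_utils` re-exports the moved names (`Conv1D` is named in the bullets above; `apply_chunking_to_forward` is another helper that lives in `pytorch_utils` today):

```python
# New canonical home for torch-only helpers:
from transformers.pytorch_utils import Conv1D, apply_chunking_to_forward

# Old import path, assumed to keep working via the backward-compatibility
# re-exports this PR mentions:
from transformers.modeling_utils import Conv1D as LegacyConv1D

assert Conv1D is LegacyConv1D  # same object, just re-exported
```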
-
- 31 Mar, 2022 1 commit
-
-
chenbohua3 authored
* make tuple annotation more specific to avoid failures during symbolic_trace
* make tuple annotation more specific to avoid failures during symbolic_trace
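Roughly what "more specific" means here, as an illustrative before/after (the real changes touch model signatures, not these toy functions):

```python
from typing import Tuple

import torch

# Before: a bare `Tuple` gives the symbolic tracer nothing to resolve.
def split_heads_old(x: torch.Tensor) -> Tuple:
    return x.chunk(2, dim=-1)

# After: a parameterized annotation the tracer can work with.
def split_heads_new(x: torch.Tensor) -> Tuple[torch.Tensor, torch.Tensor]:
    return x.chunk(2, dim=-1)
```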
-
- 25 Mar, 2022 1 commit
-
-
Sylvain Gugger authored
* Big file_utils cleanup
* This one still needs to be treated separately
-
- 23 Mar, 2022 1 commit
-
-
Sylvain Gugger authored
* Split file_utils in several submodules
* Fixes
* Add back more objects
* More fixes
* Who exactly decided to import that from there?
* Second suggestion to code with code review
* Revert wrong move
* Fix imports
* Adapt all imports
* Adapt all imports everywhere
* Revert this import, will fix in a separate commit
-
- 22 Mar, 2022 1 commit
-
-
Jacob Dineen authored
* undo black autoformat
* minor fix to rembert forward with default
* make fix-copies, make quality
* Adding types to template model
* Removing List from the template types
* Remove `Optional` from a couple of types that don't accept `None`

Co-authored-by: matt <rocketknight1@gmail.com>
-
- 11 Mar, 2022 1 commit
-
-
Matt authored
* Add type annotations for BERT and copies
* make fixup
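The flavor of annotation added across BERT and its copies, as a toy sketch (real forward signatures carry many more arguments and richer return types):

```python
from typing import Optional, Tuple

import torch

class TinyEncoder(torch.nn.Module):
    # Illustrative only: Optional[...] for arguments that default to None,
    # and an explicit return type instead of nothing.
    def forward(
        self,
        input_ids: Optional[torch.Tensor] = None,
        attention_mask: Optional[torch.Tensor] = None,
    ) -> Tuple[torch.Tensor, ...]:
        ...
```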
-
- 07 Feb, 2022 1 commit
-
-
Michael Benayoun authored
* Change the way tracing happens, enabling dynamic axes out of the box
* Update the tests and modeling xlnet
* Add the non-recording of leaf modules to avoid recording more values for the methods to record than what will be seen at tracing time (which would otherwise desynchronize the recorded values and the values that need to be given to the proxies during tracing, causing errors).
* Comments and making tracing work for gpt-j and xlnet
* Refactor things related to num_choices (and batch_size, sequence_length)
* Update fx to work on PyTorch 1.10
* Postpone autowrap_function feature usage for later
* Add copyrights
* Remove unnecessary file
* Fix issue with add_new_model_like
* Apply suggestions
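A sketch of tracing after this change, assuming the post-commit `transformers.utils.fx` API where fixed batch_size/sequence_length arguments are no longer required:

```python
from transformers import BertConfig, BertModel
from transformers.utils.fx import symbolic_trace

# Dynamic axes out of the box: no fixed batch_size/sequence_length needed.
model = BertModel(BertConfig())
traced = symbolic_trace(model, input_names=["input_ids", "attention_mask"])
```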
-
- 29 Jan, 2022 1 commit
-
-
Soonhwan-Kwon authored
* add xlm roberta xl
* add convert xlm xl fairseq checkpoint to pytorch
* fix init and documents for xlm-roberta-xl
* fix indentation
* add test for XLM-R xl,xxl
* fix model hub name
* fix some stuff
* up
* correct init
* fix more
* fix as suggestions
* add torch_device
* fix default values of doc strings
* fix leftovers
* merge to master
* up
* correct hub names
* fix docs
* fix model
* up
* finalize
* last fix
* Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* add copied from
* make style

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
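Once merged, the converted checkpoints load through the usual auto classes; a minimal sketch using the Hub names (`facebook/xlm-roberta-xl`, with `facebook/xlm-roberta-xxl` as the larger sibling):

```python
from transformers import AutoModel, AutoTokenizer

# These are very large checkpoints; loading them needs substantial memory.
tokenizer = AutoTokenizer.from_pretrained("facebook/xlm-roberta-xl")
model = AutoModel.from_pretrained("facebook/xlm-roberta-xl")
```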
-
- 28 Dec, 2021 1 commit
-
-
Sylvain Gugger authored
* Fix bad examples
* Add black formatting to style_doc
* Use first nonempty line
* Put it at the right place
* Don't add spaces to empty lines
* Better templates
* Deal with triple quotes in docstrings
* Result of style_doc
* Enable mdx treatment and fix code examples in MDXs
* Result of doc styler on doc source files
* Last fixes
* Break copy from
-
- 27 Dec, 2021 1 commit
-
-
Sylvain Gugger authored
* New doc styler
* Fix issue with args at the start
* Code sample fixes
* Style code examples in MDX
* Fix more patterns
* Typo
* Typo
* More patterns
* Do without black for now
* Get more info in error
* Docstring style
* Re-enable check
* Quality
* Fix add_end_docstring decorator
* Fix docstring
-
- 21 Dec, 2021 1 commit
-
-
Sylvain Gugger authored
* Convert file_utils docstrings to Markdown
* Test on BERT
* Return block indent
* Temporarily disable doc styler
* Remove from quality checks as well
* Remove doc styler mess
* Remove check from circleCI
* Fix typo
* Convert file_utils docstrings to Markdown
* Test on BERT
* Return block indent
* Temporarily disable doc styler
* Remove from quality checks as well
* Remove doc styler mess
* Remove check from circleCI
* Fix typo
* Let's go on all other model files
* Add templates too
* Styling and quality
-
- 30 Nov, 2021 1 commit
-
-
Thomas Viehmann authored
* use functional interface instead of instantiating module and immediately calling it
* fix torch.nn.functional to nn.functional. Thank you Stas!
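The pattern in question, with softmax as an arbitrary stand-in for the modules that were being instantiated and immediately called:

```python
import torch
from torch import nn

x = torch.randn(2, 5)

# Before: instantiating a module only to call it once
y_module = nn.Softmax(dim=-1)(x)

# After: the functional interface, no throwaway module object
y_functional = nn.functional.softmax(x, dim=-1)

assert torch.allclose(y_module, y_functional)
```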
-
- 18 Nov, 2021 2 commits
-
-
Stas Bekman authored
* fix early device assignment
* more models
-
Sylvain Gugger authored
* Add a post init method to all models
* Fix tests
* Fix last tests
* Fix templates
* Add comment
* Forgot to save
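A sketch of the pattern this commit rolls out, assuming a toy head on top of `BertPreTrainedModel` (the class name `TinyBertClassifier` is made up for illustration):

```python
from torch import nn
from transformers import BertConfig
from transformers.models.bert.modeling_bert import BertPreTrainedModel

class TinyBertClassifier(BertPreTrainedModel):
    def __init__(self, config):
        super().__init__(config)
        self.classifier = nn.Linear(config.hidden_size, 2)
        # Every model's __init__ now ends with post_init(), which centralizes
        # weight initialization and any final setup in one place.
        self.post_init()

model = TinyBertClassifier(BertConfig())
```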
-
- 09 Nov, 2021 1 commit
-
-
Patrick von Platen authored
* [Bert2Bert] allow bert2bert + relative embeddings
* up
* Update README_ko.md
* up
* up
-
- 15 Oct, 2021 1 commit
-
-
Patrick von Platen authored
* up * finish * up * up * finish
-
- 11 Oct, 2021 1 commit
-
-
Lahfa Samy authored
Replace assert by ValueError of src/transformers/models/electra/modeling_{electra,tf_electra}.py and all other models that had copies (#13955)
* Replace all assert by ValueError in src/transformers/models/electra
* Reformat with black to pass check_code_quality test
* Change some assert to ValueError of modeling_bert & modeling_tf_albert
* Change some assert in multiple models
* Change multiple models' assertions to ValueError in order to validate check_code_style test and models template test.
* Black reformat
* Change some more asserts in multiple models
* Change assert to ValueError in modeling_layoutlm.py to fix copy error in code_style_check
* Add proper message to ValueError in modeling_tf_albert.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Simplify logic in models/bert/modeling_bert.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Add ValueError message to models/convbert/modeling_tf_convbert.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Add error message for ValueError to modeling_tf_electra.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Simplify logic in models/tapas/modeling_tapas.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Simplify logic in models/electra/modeling_electra.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Add ValueError message in src/transformers/models/bert/modeling_tf_bert.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Simplify logic in src/transformers/models/rembert/modeling_rembert.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Simplify logic in src/transformers/models/albert/modeling_albert.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
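A representative instance of the pattern; the function and message below are illustrative, modeled on the head-count check these models perform:

```python
def validate_head_count(hidden_size: int, num_attention_heads: int) -> None:
    # Before: `assert hidden_size % num_attention_heads == 0`, which silently
    # vanishes when Python runs with -O. After: an explicit, always-on check
    # with a useful message.
    if hidden_size % num_attention_heads != 0:
        raise ValueError(
            f"The hidden size ({hidden_size}) is not a multiple of the number "
            f"of attention heads ({num_attention_heads})"
        )
```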
-
- 22 Sep, 2021 1 commit
-
-
Sylvain Gugger authored
* Make gradient_checkpointing a training argument
* Update src/transformers/modeling_utils.py Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
* Update src/transformers/configuration_utils.py Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
* Fix tests
* Style
* document Gradient Checkpointing as a performance feature
* Small rename
* PoC for not using the config
* Adapt BC to new PoC
* Forgot to save
* Rollout changes to all other models
* Fix typo

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Stas Bekman <stas@stason.org>
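After this change, gradient checkpointing is toggled on the model at training time rather than persisted in the saved config; a minimal sketch:

```python
from transformers import AutoModelForMaskedLM

model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")
model.gradient_checkpointing_enable()   # trade compute for activation memory
# ... training loop ...
model.gradient_checkpointing_disable()
```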
-
- 31 Aug, 2021 1 commit
-
-
Jongheon Kim authored
Set missing seq_length variable when using inputs_embeds with ALBERT & Remove code duplication (#13152)
* Set seq_length variable when using inputs_embeds
* remove code duplication
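The gist of the fix, as a standalone sketch (the function name is illustrative):

```python
from typing import Optional

import torch

def get_seq_length(
    input_ids: Optional[torch.Tensor] = None,
    inputs_embeds: Optional[torch.Tensor] = None,
) -> int:
    # Derive the shape from whichever input was provided, so seq_length is
    # also defined on the inputs_embeds path.
    if input_ids is not None:
        input_shape = input_ids.size()
    else:
        input_shape = inputs_embeds.size()[:-1]
    return input_shape[1]

print(get_seq_length(inputs_embeds=torch.randn(2, 7, 32)))  # 7
```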
-
- 06 Aug, 2021 1 commit
-
-
Sylvain Gugger authored
* Fix tied weights on TPU
* Manually tie weights in no trainer examples
* Fix for test
* One last missing
* Getting owned by my scripts
* Address review comments
* Fix test
* Fix tests
* Fix reformer tests
-
- 03 Aug, 2021 1 commit
-
-
Sylvain Gugger authored
-
- 26 Jul, 2021 1 commit
-
-
Philip May authored
* add classifier_dropout to Electra
* no type annotations yet Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* add classifier_dropout to Electra
* add classifier_dropout to Electra ForTokenClass.
* add classifier_dropout to bert
* add classifier_dropout to roberta
* add classifier_dropout to big_bird
* add classifier_dropout to mobilebert
* empty commit to trigger CI
* add classifier_dropout to reformer
* add classifier_dropout to ConvBERT
* add classifier_dropout to Albert
* add classifier_dropout to Albert

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
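How the new option is used; when `classifier_dropout` is left unset, these models fall back to the generic `hidden_dropout_prob`:

```python
from transformers import ElectraConfig, ElectraForSequenceClassification

# The classification head gets its own dropout rate, independent of the
# dropout used inside the encoder.
config = ElectraConfig(classifier_dropout=0.2)
model = ElectraForSequenceClassification(config)
```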
-
- 01 Jul, 2021 1 commit
-
-
Stas Bekman authored
* fix lm_head.decoder.weight ignore_key handling
* fix the mutable class variable
* Update src/transformers/models/roberta/modeling_roberta.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* replicate the comment
* make deterministic

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
-
- 22 Jun, 2021 1 commit
-
-
Hamid Shojanazeri authored
* registering a buffer for token_type_ids, to pass the error of device-id getting hardcoded when tracing
* style format
* adding persistent flag to the registered buffers to prevent adding them to the state_dict, addressing the backward compatibility issue
* adding the try/except to the fix, as the persistent flag is only available from PT >1.6
* adding version check
* added the condition to only use the token_type_ids buffer when it is autogenerated, not passed by the user
* adding comments and making the condition where token_type_ids are None use the registered buffer
* taking out position-embedding from the if block
* adding comments
* handling the case if buffer for position_ids was not registered
* reverted the changes on position_ids, fix the issue with size of token_type_ids buffer, moved the modification for generated token_type_ids to BertModel, instead of Embeddings
* reverting the token_type_ids in case of None to the previous version
* reverting changes on position_ids adding back the if block
* changes added by running make fix-copies
* changes added by running make fix-copies and added the import version as it was getting used
* changes added by running make fix-copies
* changes added by running make fix-copies
* fixing the import format
* fixing the import format
* modified to use temp tensor for trimmed and expanded token_type_ids buffer
* changes made by fix-copies after temp tensor modifications
* changes made by fix-copies after temp tensor modifications
* changes made by fix-copies after temp tensor modifications
* clean up
* clean up
* clean up
* clean up
* Nit
* Nit
* Nit
* modified according to support device conversion on traced models
* modified according to support device conversion on traced models
* modified according to support device conversion on traced models
* modified according to support device conversion on traced models
* changes based on latest in master
* Adapt templates
* Add version import

Co-authored-by: Ubuntu <ubuntu@ip-172-31-32-81.us-west-2.compute.internal>
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
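The core idea, as a self-contained sketch (class name and sizes are illustrative):

```python
import torch
from torch import nn

class Embeddings(nn.Module):
    def __init__(self, max_position_embeddings: int = 512):
        super().__init__()
        # Registering token_type_ids as a buffer means it follows the module
        # through .to(device), so traced models no longer hard-code a device
        # id. persistent=False (PyTorch >= 1.6) keeps it out of the
        # state_dict, preserving compatibility with existing checkpoints.
        self.register_buffer(
            "token_type_ids",
            torch.zeros((1, max_position_embeddings), dtype=torch.long),
            persistent=False,
        )

emb = Embeddings()
print(emb.token_type_ids.shape)  # torch.Size([1, 512])
```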
-
- 14 Jun, 2021 1 commit
-
-
Stas Bekman authored
* consistent nn. and nn.functional
* fix glitch
* fix glitch #2
-
- 07 Jun, 2021 1 commit
-
-
François Lagunas authored
* Fixing bug that appears when using distillation (and potentially other uses). During the backward pass PyTorch complains with: RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation. This happens because the QA model code modifies the start_positions and end_positions input tensors using the clamp_ function; as a consequence the teacher and the student both modify the inputs, and the backward pass fails.
* Fixing all models' QA clamp_ bug.
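The bug in miniature:

```python
import torch

start_positions = torch.tensor([3, 900])
ignored_index = 512

# Before: clamp_ mutated the caller's tensor in place, so when a teacher and
# a student model received the same tensors (as in distillation), the second
# backward pass failed with the RuntimeError quoted above.
# After: the out-of-place version leaves the shared input untouched.
start_positions = start_positions.clamp(0, ignored_index)
```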
-
- 01 Jun, 2021 1 commit
-
-
Fan Zhang authored
* modify qa-trainer
* fix flax model
-
- 20 May, 2021 1 commit
-
-
Sylvain Gugger authored
* Fix regression in regression
* Add test
-
- 04 May, 2021 1 commit
-
-
abhishek thakur authored
* add to bert
* review comments
* Update src/transformers/configuration_utils.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/configuration_utils.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* self.config.problem_type
* fix style
* fix
* fin
* fix
* update doc
* fix
* test
* Test more problem types
* Update src/transformers/configuration_utils.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* fix
* remove
* fix
* quality
* make fix-copies
* remove test

Co-authored-by: abhishek thakur <abhishekkrthakur@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
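Usage of the new option, which selects the loss automatically ("regression" uses MSE, "single_label_classification" cross-entropy, and "multi_label_classification" binary cross-entropy with logits):

```python
from transformers import BertConfig, BertForSequenceClassification

config = BertConfig(problem_type="multi_label_classification", num_labels=5)
model = BertForSequenceClassification(config)
```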
-
- 26 Apr, 2021 1 commit
-
-
Patrick von Platen authored
-