- 12 Oct, 2021 6 commits
-
-
Yih-Dar authored
* Add cross attentions to TFGPT2Model
* Add TFEncoderDecoderModel
* Add TFBaseModelOutputWithPoolingAndCrossAttentions
* Add cross attentions to TFBertModel
* Fix past or past_key_values argument issue
* Fix generation
* Fix save and load
* Add some checks and comments
* Clean the code that deals with past keys/values
* Add kwargs to processing_inputs
* Add serving_output to TFEncoderDecoderModel
* Some cleaning + fix use_cache value issue
* Fix tests + add bert2bert/bert2gpt2 tests
* Fix more tests
* Ignore crossattention.bias when loading GPT2 weights into TFGPT2
* Fix return_dict_in_generate in tf generation
* Fix is_token_logit_eos_token bug in tf generation
* Finalize the tests after fixing some bugs
* Fix another is_token_logit_eos_token bug in tf generation
* Add/Update docs
* Add TFBertEncoderDecoderModelTest
* Clean test script
* Add TFEncoderDecoderModel to the library
* Add cross attentions to TFRobertaModel
* Add TFRobertaEncoderDecoderModelTest
* make style
* Change the way of position_ids computation
* bug fix
* Fix copies in tf_albert
* Remove some copied from and apply some fix-copies
* Remove some copied
* Add cross attentions to some other TF models
* Remove encoder_hidden_states from TFLayoutLMModel.call for now
* Make style
* Fix TFRemBertForCausalLM
* Revert the change to longformer + Remove copies
* Revert the change to albert and convbert + Remove copies
* make quality
* make style
* Add TFRembertEncoderDecoderModelTest
* make quality and fix-copies
* test TFRobertaForCausalLM
* Fixes for failed tests
* Fixes for failed tests
* fix more tests
* Fixes for failed tests
* Fix Auto mapping order
* Fix TFRemBertEncoder return value
* fix tf_rembert
* Check copies are OK
* Fix missing TFBaseModelOutputWithPastAndCrossAttentions is not defined
* Add TFEncoderDecoderModelSaveLoadTests
* fix tf weight loading
* check the change of use_cache
* Revert the change
* Add missing test_for_causal_lm for TFRobertaModelTest
* Try cleaning past
* fix _reorder_cache
* Revert some files to original versions
* Keep as many copies as possible
* Apply suggested changes - Use raise ValueError instead of assert
* Move import to top
* Fix wrong require_torch
* Replace more assert by raise ValueError
* Add test_pt_tf_model_equivalence (the test won't pass for now)
* add test for loading/saving
* finish
* finish
* Remove test_pt_tf_model_equivalence
* Update tf modeling template
* Remove pooling, added in the prev. commit, from MainLayer
* Update tf modeling test template
* Move inputs["use_cache"] = False to modeling_tf_utils.py
* Fix torch.Tensor in the comment
* fix use_cache
* Fix missing use_cache in ElectraConfig
* Add a note to from_pretrained
* Fix style
* Change test_encoder_decoder_save_load_from_encoder_decoder_from_pt
* Fix TFMLP (in TFGPT2) activation issue
* Fix None past_key_values value in serving_output
* Don't call get_encoderdecoder_model in TFEncoderDecoderModelTest.test_configuration_tie until we have a TF checkpoint on Hub
* Apply review suggestions - style for cross_attns in serving_output
* Apply review suggestions - change assert + docstrings
* break the error message to respect the char limit
* deprecate the argument past
* fix docstring style
* Update the encoder-decoder rst file
* fix Unknown interpreted text role "method"
* fix typo
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
-
Nicolas Patry authored
384 // 4 < 128 would break `doc_stride`.
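The arithmetic in this note can be checked with a minimal sketch. The `capped_stride` helper, and the idea that a default stride is derived from the sequence length and capped with `min`, are assumptions made for illustration; they are not taken from the commit itself.

```python
def capped_stride(max_seq_len: int, divisor: int, cap: int = 128) -> int:
    """Hypothetical helper: derive a stride from the sequence length, capped at `cap`."""
    return min(max_seq_len // divisor, cap)


quarter = 384 // 4            # integer division
print(quarter)                # 96
print(quarter < 128)          # True: the quotient falls below the 128 cap
print(capped_stride(384, 4))  # 96, so the cap of 128 is never reached
```

Under this assumed formula, dividing by 4 silently yields a stride of 96 instead of 128 for a 384-token window, which is the kind of mismatch the note warns about.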
-
Patrick von Platen authored
* correct long to bool
* up
* correct code
-
Mishig Davaadorj authored
-
Hardian Lawi authored
-
Lysandre Debut authored
-
- 11 Oct, 2021 10 commits
-
-
Patrick von Platen authored
* adapt wav2vec2
* add example
* add files
* adapt
* remove bogus file
* Apply suggestions from code review
* adapt files more
* upload changes
* del old files
* up
* up
* up
* up
* up
* correct gradient checkpointing
* add readme
* finish
* finish
* up
* more fixes
* up
* up
* add demo run to readme
* up
-
Lahfa Samy authored
Replace assert by ValueError of src/transformers/models/electra/modeling_{electra,tf_electra}.py and all other models that had copies (#13955)
* Replace all assert by ValueError in src/transformers/models/electra
* Reformat with black to pass check_code_quality test
* Change some assert to ValueError of modeling_bert & modeling_tf_albert
* Change some assert in multiple models
* Change multiple models' assertions to ValueError in order to validate check_code_style test and models template test.
* Black reformat
* Change some more asserts in multiple models
* Change assert to ValueError in modeling_layoutlm.py to fix copy error in code_style_check
* Add proper message to ValueError in modeling_tf_albert.py
* Simplify logic in models/bert/modeling_bert.py
* Add ValueError message to models/convbert/modeling_tf_convbert.py
* Add error message for ValueError to modeling_tf_electra.py
* Simplify logic in models/tapas/modeling_tapas.py
* Simplify logic in models/electra/modeling_electra.py
* Add ValueError message in src/transformers/models/bert/modeling_tf_bert.py
* Simplify logic in src/transformers/models/rembert/modeling_rembert.py
* Simplify logic in src/transformers/models/albert/modeling_albert.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
Lukas Weiner authored
-
Sylvain Gugger authored
-
Midhun R Nair authored
-
Luis F. Talavera R authored
-
Jungwoo Park authored
-
Patrick von Platen authored
[Gradient checkpointing] Correct disabling `find_unused_parameters` in Trainer when gradient checkpointing is enabled (#13961)
* up
* correct test
-
Sylvain Gugger authored
* Honor existing attention mask in tokenizer.pad
* Fix initialization of attention mask
* Roll the implementation out to all subclasses
* Fix tests
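As a rough illustration of what "honoring an existing attention mask" while padding can mean, here is a minimal pure-Python sketch. The `pad_example` function, its parameters, and the choice of right-side padding are all hypothetical; this is not the library's `pad` implementation.

```python
from typing import Dict, List, Optional


def pad_example(
    input_ids: List[int],
    max_length: int,
    pad_token_id: int = 0,
    attention_mask: Optional[List[int]] = None,
) -> Dict[str, List[int]]:
    """Hypothetical right-padding helper that keeps a caller-supplied mask."""
    # Only build a fresh all-ones mask when the caller did not provide one.
    if attention_mask is None:
        attention_mask = [1] * len(input_ids)
    # A non-positive difference yields empty padding lists.
    difference = max_length - len(input_ids)
    return {
        "input_ids": input_ids + [pad_token_id] * difference,
        "attention_mask": attention_mask + [0] * difference,
    }


# The zero already present in the supplied mask survives padding.
print(pad_example([5, 6, 7], max_length=5, attention_mask=[1, 0, 1]))
```

The point of the sketch is the `if attention_mask is None` guard: overwriting the mask unconditionally would discard positions the caller had already masked out.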
-
Lahfa Samy authored
* Raise ValueError exception instead of assert
* Remove unnecessary f-strings
* Remove unused f-strings
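The assert-to-ValueError pattern that recurs in these commits can be sketched as follows. The function names and the divisibility check are illustrative assumptions, not code from the repository.

```python
def check_heads_with_assert(hidden_size: int, num_heads: int) -> None:
    # Before: assert statements are stripped when Python runs with -O,
    # so this validation can silently disappear.
    assert hidden_size % num_heads == 0, "hidden_size must be a multiple of num_heads"


def check_heads(hidden_size: int, num_heads: int) -> None:
    # After: an explicit exception is always raised and carries a clear message.
    if hidden_size % num_heads != 0:
        raise ValueError(
            f"hidden_size ({hidden_size}) must be a multiple of num_heads ({num_heads})."
        )


check_heads(768, 12)  # valid configuration, returns None

try:
    check_heads(10, 3)
except ValueError as err:
    print(f"Rejected: {err}")
```

Raising `ValueError` keeps input validation active in optimized runs and lets callers catch a specific exception type instead of `AssertionError`.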
-
- 09 Oct, 2021 1 commit
-
-
oraby8 authored
-
- 08 Oct, 2021 11 commits
-
-
Lysandre Debut authored
* Update bug-report.md
* Update .github/ISSUE_TEMPLATE/bug-report.md
* Update .github/ISSUE_TEMPLATE/bug-report.md
* Update .github/ISSUE_TEMPLATE/bug-report.md
* Update .github/ISSUE_TEMPLATE/bug-report.md
Co-authored-by: Suraj Patil <surajp815@gmail.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>
-
Chungman Lee authored
-
Sylvain Gugger authored
-
Sylvain Gugger authored
-
Sylvain Gugger authored
-
Patrick von Platen authored
* up
* Update src/transformers/generation_stopping_criteria.py
* finish
-
Sylvain Gugger authored
-
Adam Kaczmarek authored
-
Stella Biderman authored
* Added `framework` attribute
* Update modeling_utils.py
* Update modeling_flax_utils.py
* Update modeling_tf_utils.py
* Update modeling_utils.py
* Update modeling_tf_utils.py
* Update modeling_tf_utils.py
* Update modeling_flax_utils.py
* Update modeling_tf_utils.py
* Update modeling_utils.py
* Update modeling_utils.py
* Update modeling_tf_utils.py
* Update modeling_flax_utils.py
* string -> str
* Update modeling_tf_utils.py
* string -> str
* fixup
* make flake happy
Co-authored-by: patil-suraj <surajp815@gmail.com>
-
Nicolas Patry authored
* Adding support for tokens being suffixes or part of each other.
* Better test name.
-
Mishig Davaadorj authored
* Implement img seg pipeline
* Update src/transformers/pipelines/image_segmentation.py
* Update src/transformers/pipelines/image_segmentation.py
* Update output shape with individual masks
* Rm dev change
* Remove loops in test
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
-
- 07 Oct, 2021 8 commits
-
-
Stas Bekman authored
* [trainer] memory metrics: add memory at start
* fix for no-gpu
-
Matt authored
* Fix issues with LED model
* Style pass
* Bugfixes
* correct attentions as well
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
-
Mishig Davaadorj authored
-
Patrick von Platen authored
* up
* overwrite hubert
-
Alex Hedges authored
-
Dhananjay Shettigar authored
* #12789 Replace assert statements with exceptions
* fix-copies: made copy changes to utils_qa.py in examples/pytorch/question-answering and examples/tensorflow/question-answering
* minor refactor for clarity
-
Jay Zhang authored
* Add all example files.
* Reformat files by black.
* Style.
* Remove unused imports.
Co-authored-by: Morgan Funtowicz <funtowiczmo@gmail.com>
-
Максим Заякин authored
-
- 06 Oct, 2021 4 commits
-
-
Lysandre authored
-
Anton Lozhkov authored
-
Sylvain Gugger authored
-
Yanming Wang authored
* Fix logging_nan_inf_filter in torch_xla mode
* Update src/transformers/trainer.py
* Fix format
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-