- 18 Oct, 2021 3 commits
-
-
Patrick von Platen authored
-
Anton Lozhkov authored
-
Patrick von Platen authored
* up * up * up * finish
-
- 16 Oct, 2021 1 commit
-
-
Suraj Patil authored
-
- 15 Oct, 2021 1 commit
-
-
Anton Lozhkov authored
* Working encoder
* SEW-D and tests
* Further conv fixes
* Automodels and conv inits
* Update integration tests, add docs
* Docs cleanup, resolve todos
* Conf fix
* Fix docs
* Fix tests, apply suggestions
* Update src/transformers/models/sew/modeling_sew.py
* Model conversion and updated no-mask tests
* Remove copy of feature_proj
* Style
* Update src/transformers/models/auto/feature_extraction_auto.py
* Update src/transformers/models/auto/feature_extraction_auto.py
* Move orgs

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
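A minimal usage sketch for the SEW/SEW-D speech encoders added in the commit above. The checkpoint name is an assumption for illustration, and the CTC head on a pretraining-only checkpoint would be randomly initialized.

```python
# Hedged sketch: CTC-style forward pass with a SEW-D checkpoint.
# The checkpoint name is assumed for illustration.
import torch
from transformers import AutoFeatureExtractor, SEWDForCTC

feature_extractor = AutoFeatureExtractor.from_pretrained("asapp/sew-d-tiny-100k")
model = SEWDForCTC.from_pretrained("asapp/sew-d-tiny-100k")

# One second of 16 kHz audio (placeholder silence).
speech = [0.0] * 16000
inputs = feature_extractor(speech, sampling_rate=16000, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits  # shape: (batch, time, vocab_size)
```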
-
- 14 Oct, 2021 5 commits
-
-
Lysandre Debut authored
* Scatter dummies + skip pipeline tests * Add torch scatter to build docs
-
Patrick von Platen authored
-
Lysandre Debut authored
-
Sylvain Gugger authored
* Add strong test for configuration attributes
* Add fake modif to trigger all tests
* Add a better fake modif
* Ignore is_encoder_decoder
* Fix faulty configs
* Remove fake modif
-
Patrick von Platen authored
-
- 13 Oct, 2021 1 commit
-
-
NielsRogge authored
* First draft
* Update self-attention of RoBERTa as proposition
* Improve conversion script
* Add TrOCR decoder-only model
* More improvements
* Make forward pass with pretrained weights work
* More improvements
* Some more improvements
* More improvements
* Make conversion work
* Clean up print statements
* Add documentation, processor
* Add test files
* Small improvements
* Some more improvements
* Make fix-copies, improve docs
* Make all vision encoder decoder model tests pass
* Make conversion script support other models
* Update URL for OCR image
* Update conversion script
* Fix style & quality
* Add support for the large-printed model
* Fix some issues
* Add print statement for debugging
* Add print statements for debugging
* Make possible fix for sinusoidal embedding
* Further debugging
* Potential fix v2
* Add more print statements for debugging
* Add more print statements for debugging
* Debug more
* Comment out print statements
* Make conversion of large printed model possible, address review comments
* Make it possible to convert the stage1 checkpoints
* Clean up code, apply suggestions from code review
* Apply suggestions from code review, use Microsoft models in tests
* Rename encoder_hidden_size to cross_attention_hidden_size
* Improve docs
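A minimal usage sketch for the TrOCR model introduced above. The checkpoint name and the image path are assumptions for illustration.

```python
# Hedged sketch: OCR on a single line of handwritten text with TrOCR.
# Checkpoint name and image file are assumed for illustration.
from PIL import Image
from transformers import TrOCRProcessor, VisionEncoderDecoderModel

processor = TrOCRProcessor.from_pretrained("microsoft/trocr-base-handwritten")
model = VisionEncoderDecoderModel.from_pretrained("microsoft/trocr-base-handwritten")

image = Image.open("handwritten_line.png").convert("RGB")
pixel_values = processor(images=image, return_tensors="pt").pixel_values

generated_ids = model.generate(pixel_values)
print(processor.batch_decode(generated_ids, skip_special_tokens=True)[0])
```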
-
- 12 Oct, 2021 3 commits
-
-
Yih-Dar authored
* Add cross attentions to TFGPT2Model
* Add TFEncoderDecoderModel
* Add TFBaseModelOutputWithPoolingAndCrossAttentions
* Add cross attentions to TFBertModel
* Fix past or past_key_values argument issue
* Fix generation
* Fix save and load
* Add some checks and comments
* Clean the code that deals with past keys/values
* Add kwargs to processing_inputs
* Add serving_output to TFEncoderDecoderModel
* Some cleaning + fix use_cache value issue
* Fix tests + add bert2bert/bert2gpt2 tests
* Fix more tests
* Ignore crossattention.bias when loading GPT2 weights into TFGPT2
* Fix return_dict_in_generate in tf generation
* Fix is_token_logit_eos_token bug in tf generation
* Finalize the tests after fixing some bugs
* Fix another is_token_logit_eos_token bug in tf generation
* Add/Update docs
* Add TFBertEncoderDecoderModelTest
* Clean test script
* Add TFEncoderDecoderModel to the library
* Add cross attentions to TFRobertaModel
* Add TFRobertaEncoderDecoderModelTest
* make style
* Change the way of position_ids computation
* bug fix
* Fix copies in tf_albert
* Remove some copied from and apply some fix-copies
* Remove some copied
* Add cross attentions to some other TF models
* Remove encoder_hidden_states from TFLayoutLMModel.call for now
* Make style
* Fix TFRemBertForCausalLM
* Revert the change to longformer + Remove copies
* Revert the change to albert and convbert + Remove copies
* make quality
* make style
* Add TFRembertEncoderDecoderModelTest
* make quality and fix-copies
* test TFRobertaForCausalLM
* Fixes for failed tests
* Fixes for failed tests
* fix more tests
* Fixes for failed tests
* Fix Auto mapping order
* Fix TFRemBertEncoder return value
* fix tf_rembert
* Check copies are OK
* Fix missing TFBaseModelOutputWithPastAndCrossAttentions is not defined
* Add TFEncoderDecoderModelSaveLoadTests
* fix tf weight loading
* check the change of use_cache
* Revert the change
* Add missing test_for_causal_lm for TFRobertaModelTest
* Try cleaning past
* fix _reorder_cache
* Revert some files to original versions
* Keep as many copies as possible
* Apply suggested changes - Use raise ValueError instead of assert
* Move import to top
* Fix wrong require_torch
* Replace more assert by raise ValueError
* Add test_pt_tf_model_equivalence (the test won't pass for now)
* add test for loading/saving
* finish
* finish
* Remove test_pt_tf_model_equivalence
* Update tf modeling template
* Remove pooling, added in the prev. commit, from MainLayer
* Update tf modeling test template
* Move inputs["use_cache"] = False to modeling_tf_utils.py
* Fix torch.Tensor in the comment
* fix use_cache
* Fix missing use_cache in ElectraConfig
* Add a note to from_pretrained
* Fix style
* Change test_encoder_decoder_save_load_from_encoder_decoder_from_pt
* Fix TFMLP (in TFGPT2) activation issue
* Fix None past_key_values value in serving_output
* Don't call get_encoderdecoder_model in TFEncoderDecoderModelTest.test_configuration_tie until we have a TF checkpoint on Hub
* Apply review suggestions - style for cross_attns in serving_output
* Apply review suggestions - change assert + docstrings
* break the error message to respect the char limit
* deprecate the argument past
* fix docstring style
* Update the encoder-decoder rst file
* fix Unknown interpreted text role "method"
* fix typo

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
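A minimal usage sketch for the new TFEncoderDecoderModel. Checkpoint names are assumed for illustration; a warm-started decoder gets freshly initialized cross-attention weights.

```python
# Hedged sketch: warm-start a TF encoder-decoder from two BERT checkpoints.
# Checkpoint names are assumed for illustration.
from transformers import BertTokenizer, TFEncoderDecoderModel

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = TFEncoderDecoderModel.from_encoder_decoder_pretrained(
    "bert-base-uncased", "bert-base-uncased"
)

inputs = tokenizer("The weather is nice today.", return_tensors="tf")
outputs = model(input_ids=inputs.input_ids, decoder_input_ids=inputs.input_ids)
print(outputs.logits.shape)  # (batch_size, sequence_length, vocab_size)
```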
-
Patrick von Platen authored
* correct long to bool * up * correct code
-
Mishig Davaadorj authored
-
- 11 Oct, 2021 4 commits
-
-
Patrick von Platen authored
* adapt wav2vec2
* add example
* add files
* adapt
* remove bogus file
* Apply suggestions from code review
* adapt files more
* upload changes
* del old files
* up
* up
* up
* up
* up
* correct gradient checkpointing
* add readme
* finish
* finish
* up
* more fixes
* up
* up
* add demo run to readme
* up
-
Luis F. Talavera R authored
-
Patrick von Platen authored
[Gradient checkpointing] Correct disabling `find_unused_parameters` in Trainer when gradient checkpointing is enabled (#13961) * up * correct test
-
Sylvain Gugger authored
* Honor existing attention mask in tokenizer.pad
* Fix initialization of attention mask
* Roll the implem on all subclasses
* Fix tests
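A small hedged illustration of the behavior above (model name assumed): if the features passed to `tokenizer.pad` already contain an `attention_mask`, it is kept and padded rather than recomputed.

```python
# Hedged illustration: pad pre-tokenized features that already carry an attention mask.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")  # assumed model name

features = [
    {"input_ids": [101, 7592, 102], "attention_mask": [1, 1, 1]},
    {"input_ids": [101, 7592, 2088, 999, 102], "attention_mask": [1, 1, 1, 1, 1]},
]
batch = tokenizer.pad(features, padding=True, return_tensors="pt")
print(batch["attention_mask"])  # existing mask honored, padded positions set to 0
```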
-
- 08 Oct, 2021 3 commits
-
-
Patrick von Platen authored
* up * Update src/transformers/generation_stopping_criteria.py * finish
-
Nicolas Patry authored
* Adding support for tokens being suffixes or part of each other. * Better test name.
-
Mishig Davaadorj authored
* Implement img seg pipeline
* Update src/transformers/pipelines/image_segmentation.py
* Update src/transformers/pipelines/image_segmentation.py
* Update output shape with individual masks
* Rm dev change
* Remove loops in test

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
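A minimal usage sketch for the new pipeline. The checkpoint and the image file are assumptions for illustration.

```python
# Hedged sketch: run the image segmentation pipeline on a local image.
# Model name and image path are assumed for illustration.
from transformers import pipeline

segmenter = pipeline("image-segmentation", model="facebook/detr-resnet-50-panoptic")
predictions = segmenter("street_scene.jpg")

for prediction in predictions:
    # Each prediction carries a label, a confidence score, and an individual mask.
    print(prediction["label"], prediction["score"])
```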
-
- 07 Oct, 2021 2 commits
-
-
Matt authored
* Fix issues with LED model
* Style pass
* Bugfixes
* correct attentions as well

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
-
Patrick von Platen authored
* up * overwrite hubert
-
- 06 Oct, 2021 2 commits
-
-
Nicolas Patry authored
Fixes #13846
-
Nicolas Patry authored
Co-authored-by: Pierre Snell <pierre.snell@botpress.com>
-
- 05 Oct, 2021 5 commits
-
-
Nicolas Patry authored
* Tmp.
* Fixing BC for question answering with long context.
* Capping model_max_length to avoid tf overflow.
* Bad workaround for bugged RoBERTa.
* Fixing name.
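A hedged sketch of the code path being fixed (model name assumed): the question-answering pipeline splits a context longer than the model's maximum length into overlapping spans.

```python
# Hedged sketch: question answering over a context longer than the model's max length.
# Model name is assumed for illustration.
from transformers import pipeline

qa = pipeline("question-answering", model="distilbert-base-cased-distilled-squad")

long_context = " ".join(["Transformers provides thousands of pretrained models."] * 200)
result = qa(question="What does Transformers provide?", context=long_context)
print(result["answer"], result["score"])
```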
-
Zhaofeng Wu authored
* Allow dataset to be an optional argument for (Distributed)LengthGroupedSampler * Fix
-
Michael Benayoun authored
* Symbolic trace dynamic axes support for BERT-like models (albert, bert, distilbert, mobilebert, electra, megatron-bert)
* Sanity checks before tracing that make sure the model to trace is supported
* Adapted to PyTorch 1.9

Co-authored-by: Michael Benayoun <michael@huggingface.co>
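A hedged sketch of invoking the tracer. Only the basic call is shown; the dynamic-shape keyword arguments added in this commit are omitted because their exact names are not given here.

```python
# Hedged sketch: symbolically trace a small BERT model with torch.fx.
# Version-specific keyword arguments may differ from what is shown.
from transformers import BertConfig, BertModel
from transformers.utils.fx import symbolic_trace

model = BertModel(BertConfig())
traced = symbolic_trace(model, input_names=["input_ids", "attention_mask"])
print(traced.graph)  # the captured torch.fx graph
```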
-
Nicolas Patry authored
* Fixing empty prompts for text-generation when BOS exists.
* Fixing odd case with Pegasus.
* Fixing BERT assertion error.
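A hedged illustration of the first fix (model name assumed): with a model whose tokenizer defines a BOS token, the text-generation pipeline can start from an empty prompt.

```python
# Hedged illustration: generate from an empty prompt with a model that has a BOS token.
# Model name is assumed for illustration.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")
print(generator("", max_length=20))
```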
-
Nicolas Patry authored
-
- 04 Oct, 2021 2 commits
-
-
Bram Vanroy authored
* update no_* argument: Changes the order so that the no_* argument is created after the original argument AND sets the default for this no_* argument to False
* import copy
* update test
* make style
* Use kwargs to set default=False
* make style
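A hedged sketch of the behavior described above (the dataclass and field name are invented for illustration): for a boolean dataclass field defaulting to True, HfArgumentParser also registers a no_* flag whose default is False.

```python
# Hedged sketch: a bool field with default=True also gets a --no_<field> flag (default False).
# The dataclass and field name are invented for illustration.
from dataclasses import dataclass, field
from transformers import HfArgumentParser

@dataclass
class ExampleArguments:
    use_cache: bool = field(default=True, metadata={"help": "Whether to use the cache."})

parser = HfArgumentParser(ExampleArguments)
(args,) = parser.parse_args_into_dataclasses(args=["--no_use_cache"])
print(args.use_cache)  # False
```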
-
Sidd Karamcheti authored
* Add layer-wise scaling
* Add reorder & upcasting argument
* Add OpenAI GPT-2 weight initialization scheme
* start `layer_idx` count at zero for consistency
* disentangle attn and reordered and upscaled attn function
* rename `scale_attn_by_layer` to `scale_attn_by_layer_id`
* make autocast from amp compatible with pytorch<1.6
* fix docstring
* style fixes
* Add fixes from PR feedback, style tweaks
* Fix doc whitespace
* Reformat
* First pass scale_attn_by_layer_idx and reorder_and_upcast_attn tests
* Rename scale_attn_by_layer_idx, add tip
* Remove extra newline
* add test for weight initialization
* update code format
* add assert check weights are fp32
* remove assert
* Fix incorrect merge
* Fix shape mismatch in baddbmm
* Add generation test for Mistral flags

Co-authored-by: leandro <leandro.vonwerra@spoud.io>
Co-authored-by: Keshav Santhanam <keshav2@stanford.edu>
Co-authored-by: J38 <jebolton@stanford.edu>
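A hedged sketch of enabling the new GPT-2 stability flags. The final config attribute names are assumptions, since the commit only mentions the rename; GPT2Config accepts them as keyword arguments either way.

```python
# Hedged sketch: turn on the Mistral stability flags on a freshly initialized GPT-2.
# Attribute names are assumed for illustration.
from transformers import GPT2Config, GPT2LMHeadModel

config = GPT2Config(
    scale_attn_by_inverse_layer_idx=True,  # scale attention weights by 1 / (layer_idx + 1)
    reorder_and_upcast_attn=True,          # reorder ops and upcast attention computation to fp32
)
model = GPT2LMHeadModel(config)
print(model.config.reorder_and_upcast_attn)
```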
-
- 30 Sep, 2021 2 commits
-
-
Suraj Patil authored
-
Patrick von Platen authored
* update * add to docs and init * make fix-copies
-
- 29 Sep, 2021 2 commits
-
-
Sylvain Gugger authored
* Fix length of IterableDatasetShard and add test * Add comments
-
Li-Huai (Allan) Lin authored
* Enable readme link synchronization * Style * Reuse regex pattern * Apply suggestions * Update
-
- 26 Sep, 2021 1 commit
-
-
Anton Lozhkov authored
-
- 25 Sep, 2021 1 commit
-
-
Patrick von Platen authored
-
- 24 Sep, 2021 2 commits
-
-
Patrick von Platen authored
-
Nicolas Patry authored
Fixes #13697
-