- 20 May, 2021 2 commits
-
-
yujun authored
* add roformer
* Update docs/source/model_doc/roformer.rst Co-authored-by: Suraj Patil <surajp815@gmail.com>
* Update docs/source/model_doc/roformer.rst Co-authored-by: Suraj Patil <surajp815@gmail.com>
* update
* add TFRoFormerSinusoidalPositionalEmbedding and fix TFMarianSinusoidalPositionalEmbedding
* update docs
* make style and make quality
* roback
* unchanged
* rm copies from , this is an error in TFMarianSinusoidalPositionalEmbedding
* update Copyright year
* move # Add modeling imports here to the correct position
* max_position_embeddings can be set to 1536
* # Copied from transformers.models.bert.modeling_bert.BertOutput with Bert->RoFormer
* # Copied from transformers.models.bert.modeling_bert.BertLayer.__init__ with Bert->RoFormer
* update tokenization_roformer
* make style
* add staticmethod apply_rotary_position_embeddings
* add TF staticmethod apply_rotary_position_embeddings
* update torch apply_rotary_position_embeddings
* fix tf apply_rotary_position_embeddings error
* make style
* add pytorch RoFormerSelfAttentionRotaryPositionEmbeddingTest
* add TF rotary_position_embeddings test
* update test_modeling_rofomer
* Update docs/source/model_doc/roformer.rst Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/__init__.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/__init__.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/__init__.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/__init__.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/roformer/convert_roformer_original_tf_checkpoint_to_pytorch.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/roformer/modeling_roformer.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/roformer/modeling_roformer.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/roformer/modeling_tf_roformer.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* refact roformer tokenizer
* add RoFormerTokenizerFast
* add RoFormerTokenizationTest
* add require_jieba
* update Copyright
* update tokenizer & add copy from
* add option rotary_value
* use rust jieba
* use rjieba
* use rust jieba
* fix test_alignement_methods
* slice normalized_string is too slow
* add config.embedding_size when embedding_size!=hidden_size
* fix pickle tokenizer
* Update docs/source/model_doc/roformer.rst Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* make style and make quality
Co-authored-by: Suraj Patil <surajp815@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
-
Lysandre Debut authored
-
- 19 May, 2021 3 commits
-
-
Albert Villanova del Moral authored
-
Patrick von Platen authored
* refactor
* update
* update
* update
* refactor run mlm
* finalize
* refactor more
* fix typo
* update
* finish refactor
* modify run mlm
* Apply suggestions from code review
* Apply suggestions from code review
* Apply suggestions from code review
* small fixes
* upload
* upload
* finish run mlm script
Co-authored-by: Patrick von Platen <patrick@huggingface.co>
-
Patrick von Platen authored
* fix_torch_device_generate_test * remove @
-
- 18 May, 2021 10 commits
-
-
Daniel Stancl authored
* Add missing head masking for generate() function
* Add head_mask, decoder_head_mask and cross_attn_head_mask into prepare_inputs_for_generation for generate() function for multiple encoder-decoder models.
* Add test_genereate_with_head_masking
* [WIP] Update the new test and handle special cases
* make style
* Omit ProphetNet test so far
* make fix-copies
-
Suraj Patil authored
* flax gpt2
* combine masks
* handle shared embeds
* add causal LM sample
* style
* add tests
* style
* fix imports, docs, quality
* don't use cache
* add cache
* add cache 1st version
* make use cache work
* start adding test for generation
* finish generation loop compilation
* rewrite test
* finish
* update
* update
* apply sylvains suggestions
* update
* refactor
* fix typo
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
-
Tomy Hsieh authored
-
Avital Oliver authored
* Add Flax Examples README
* Apply suggestions from code review
* Update examples/flax/README.md
* add nice table
* fix
* fix
* apply suggestions
* upload
* finish flax readme.md
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
-
Philipp Schmid authored
* add `dataset_name` to data_args and added accuracy metric * added documentation for dataset_name * spelling correction
-
Vyom Pathak authored
* Fixed: Better names for nlp variables in pipelines' tests and docs. * Fixed: Better variable names
-
Patrick von Platen authored
* add headers to main doc * Apply suggestions from code review * update * upload
-
Tommy Chiang authored
-
Sylvain Gugger authored
-
Nicolas Patry authored
* [TokenClassification] Label realignment for subword aggregation
  Tentative to replace https://github.com/huggingface/transformers/pull/11622/files
  - Added `AggregationStrategy`
  - `ignore_subwords` and `grouped_entities` arguments are now fused into `aggregation_strategy`. It makes more sense anyway because `ignore_subwords=True` with `grouped_entities=False` did not have a meaning anyway.
  - Added 2 new ways to aggregate which are MAX, and AVERAGE
  - AVERAGE requires a bit more information than the others, for now this case is slightly specific, we should keep that in mind for future changes.
  - Testing has been modified to reflect new argument, and to check the correct deprecation and the new aggregation_strategy.
  - Put the testing argument and testing results for aggregation_strategy, close together, so that readers can understand what is supposed to happen.
  - `aggregate` is now only tested on a small model as it does not mean anything to test it globally for all models.
  - Previous tests are unchanged in desired output.
  - Added a new test case that showcases better the difference between the FIRST, MAX and AVERAGE strategies.
* Wrong framework.
* Addressing three issues.
  1- Tags might not follow B-, I- convention, so any tag should work now (assumed as B-TAG)
  2- Fixed an issue with average that leads to a substantial code change.
  3- The testing suite was not checking for the "index" key for "none" strategy. This is now fixed.
  The issue is that "O" could not be chosen by AVERAGE strategy because those tokens were filtered out beforehand, so their relative scores were not counted in the average. Now filtering on ignore_labels will happen at the very end of the pipeline fixing that issue. It's a bit hard to make sure this stays like that because we do not have an end-to-end test for that behavior
* Formatting.
* Adding formatting to code + cleaner handling of B-, I- tags.
  Co-authored-by: Francesco Rubbo <rubbo.francesco@gmail.com>
  Co-authored-by: elk-cloner <rezakakhki.rk@gmail.com>
* Typo.
Co-authored-by: Francesco Rubbo <rubbo.francesco@gmail.com>
Co-authored-by: elk-cloner <rezakakhki.rk@gmail.com>
-
- 17 May, 2021 7 commits
-
-
Patrick von Platen authored
-
Sylvain Gugger authored
-
Patrick von Platen authored
* improve tests * remove bogus file * make style Co-authored-by: Patrick von Platen <patrick@huggingface.co>
-
Michael Benayoun authored
Co-authored-by: Michael Benayoun <michael@huggingface.co>
-
Julien Chaumond authored
* Update README.md * Update index.rst
-
Julien Chaumond authored
-
Marc van Zee authored
* Add Cloud details to README * Flax script and readme updates * Some simplifications of Flax script
-
- 14 May, 2021 4 commits
-
-
Michael Benayoun authored
Symbolic tracing feature for BERT, ELECTRA and T5
Co-authored-by: Michael Benayoun <michael@huggingface.co>
Co-authored-by: Stas Bekman <stas@stason.org>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
Marc van Zee authored
* Add Cloud details to README * Flax script and readme updates
-
Patrick von Platen authored
-
Oyvind Tafjord authored
-
- 13 May, 2021 8 commits
-
-
Volodymyr Byno authored
-
Sylvain Gugger authored
-
Sylvain Gugger authored
-
lexhuismans authored
* Add 3D attention mask to T5 model (#9643)
  Added code for 3D attention mask in T5 model. Similar to BERT model.
* Add test for 3D attention mask
  Added test for 3D attention mask: test_decoder_model_past_with_3d_attn_mask() 3D attention mask of the shape [Batch_size, Seq_length, Seq_length] both for attention mask and decoder attention mask. Test is passing.
-
Vasudev Gupta authored
-
Patrick von Platen authored
* fix some stuff * fix roberta & electra as well * del run bug Co-authored-by: Patrick von Platen <patrick@huggingface.co>
-
Lysandre Debut authored
-
Philip May authored
* improve slow class tok usage at xlm rob
* add subword regularization for barthez
* improve barthez tok. test
* fix tokenizer tests
* add subword regularization for camembert
* add subword regularization for deberta v2 tokenizer
* add more doc to deberta v2 tokenizer
* add subword regularization for speech to text tok.
* fix sp_model_kwargs type in speech 2 text tok.
* add subword regularization for M2M100 tok.
* add more concrete type hints
* fix tests for m2m100 and s2t tok.
* add missing Any import
* fix syntax error in m2m100 tok.
* fix unpickle of m2m100 and s2t tok.
* fix test of m2m100 and s2t tok.
* improve unpickle of deberta v2 tok.
* add test for pickle of barthez & camembert
* fix pickle of barthez & camembert
* add test for deberta v2 tok. pickle
* fix m2m100 tok. pickle
* fix s2t tok. pickle
* add subword regularization to albert tok.
* refactor subword reg. test into TokenizerTesterMixin
  improve albert tok. test
  remove sample argument from albert tok.
  check subword reg. using TokenizerTesterMixin
  improve tok. tests
  improve xlm roberta tok. tests
  improve xlm roberta tok. tests
* add subword regularization for big bird t.
* improve xlm roberta tok. test
* add subword regularization for mbart50 tok.
* add subword regularization for pegasus tok.
* add subword regularization for reformer tok.
* add subword regularization for T5 tok.
* fix t5 tok. test formatting
* add subword regularization for xlm_proph. tok.
* add subword regularization for xlnet tok.
* add subword regularization for bert_gen tok.
* add typing to tokenizers
* add typing to xlm rob. tok
* add subword regularization for marian tok.
* add reverse tok. test
* fix marian tok test
* fix marian tok test
* fix casing in tok. tests
* fix style of tok. common test
* fix deberta v2 tok test
* add type annotations to tok. tests
* add type annotations to tok. __init__
* add typing to tokenizer
* add type annotations to tok. __init__
* don't specify the default when it's None
* fix barthez tok. doc
* move sentencepiece tok. tests to TokenizerTesterMixin
* fix unused imports
* fix albert tok. test
* add comment to sentencepiece test options
* fix Any import at big bird tok.
* fix Any import at xlm prophetnet tok.
* empty commit to trigger CI
-
- 12 May, 2021 6 commits
-
-
NielsRogge authored
* Improve docs of DeiT and ViT, add community notebook * Add gitignore for test_samples * Add notebook with Trainer Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
-
Lysandre authored
-
Lysandre authored
-
Patrick von Platen authored
* fix encoder-decoder & RAG
* finalize
* Update src/transformers/models/encoder_decoder/modeling_encoder_decoder.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Update src/transformers/models/rag/modeling_rag.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Patrick von Platen <patrick@huggingface.co>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
-
Suraj Patil authored
-
Philip May authored
-