- 20 Jan, 2021 8 commits
-
acul3 authored
* Update run_mlm.py
* add t5 model to transformers-cli convert
* update run_mlm.py same as master
* update converting model docs
* update converting model docs
* Update convert.py
* Trigger notification
* update import sorted
* fix typo t5
-
Julien Plu authored
-
Julien Plu authored
* Create new embeddings + add to BERT
* Add Albert
* Add DistilBert
* Add Albert + Electra + Funnel
* Add Longformer + Lxmert
* Add last models
* Apply style
* Update the template
* Remove unused imports
* Rename attribute
* Import embeddings in their own model file
* Replace word_embeddings per weight
* fix naming
* Fix Albert
* Fix Albert
* Fix Longformer
* Fix Lxmert Mobilebert and MPNet
* Fix copy
* Fix template
* Update the get weights function
* Update src/transformers/modeling_tf_utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/models/electra/modeling_tf_electra.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* address Sylvain's comments

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
Julien Plu authored
* Fix label datatype
* Apply style
-
Sylvain Gugger authored
-
Sylvain Gugger authored
-
LSinev authored
-
Sylvain Gugger authored
* Restrain tokenizer.model_max_length default
* Fix indent
-
- 19 Jan, 2021 13 commits
-
Sylvain Gugger authored
* Fix model templates and use less than 119 chars
* Missing new line
-
Daniel Stancl authored
* Add decoder_head_mask for PyTorch T5 model
* Add decoder_head_mask args into T5Model and T5ForConditionalGeneration
* Slightly change the order of input args to be in accordance with the convention from BART-based models introduced within the PR #9569.
* Make style for modeling_t5.py
* Add decoder_head_mask for TF T5 models
* Separate head_mask and decoder_head_mask args in TF T5 models
* Slightly change the order of input args to follow the convention of BART-based models updated in PR #9569
* Update test_forward_signature in tests/test_modeling_tf_common.py w.r.t. the changed order of input args
* Add FutureWarnings for T5 and TFT5 models, warning a user that the input argument `head_mask` was split into two arguments - `head_mask` and `decoder_head_mask`
* Add default behaviour - `decoder_head_mask` is set to copy `head_mask`
* Fix T5 modeling and FutureWarning
* Make proper usage of head_mask and decoder_head_mask in cross_attention
* Fix conditions for raising FutureWarning
* Reformat FutureWarning in T5 modeling
* Refactor the warning message
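The backward-compatibility behaviour described above - `decoder_head_mask` defaulting to a copy of `head_mask` with a `FutureWarning` - can be sketched roughly as follows. The function and argument handling here are illustrative, not the actual T5 code:

```python
import warnings

def resolve_head_masks(head_mask=None, decoder_head_mask=None):
    """Illustrative sketch: if only `head_mask` is given, copy it to
    `decoder_head_mask` and warn that the argument was split in two."""
    if head_mask is not None and decoder_head_mask is None:
        warnings.warn(
            "`head_mask` was separated into two input args - `head_mask` "
            "and `decoder_head_mask`. Currently `decoder_head_mask` "
            "defaults to a copy of `head_mask`.",
            FutureWarning,
        )
        decoder_head_mask = head_mask
    return head_mask, decoder_head_mask
```

Passing `decoder_head_mask` explicitly suppresses the warning, so existing callers keep working while new callers can control the two masks independently.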
-
Sylvain Gugger authored
* New run_seq2seq script
* Add tests
* Mark as slow
* Update examples/seq2seq/run_seq2seq.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Update src/transformers/data/data_collator.py
Co-authored-by: Suraj Patil <surajp815@gmail.com>
* Update src/transformers/data/data_collator.py
Co-authored-by: Suraj Patil <surajp815@gmail.com>
* Address review comments

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Suraj Patil <surajp815@gmail.com>
-
Julien Plu authored
* Fix Flaubert and XLM
* Fix Flaubert and XLM
* Apply style
-
max yue authored
File "/share/apps/anaconda3/envs/my_env/lib/python3.7/site-packages/transformers/integrations.py", line 419, in __init__
    self._SummaryWriter = SummaryWriter
UnboundLocalError: local variable 'SummaryWriter' referenced before assignment
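The traceback above is the classic conditional-binding pitfall: a name assigned anywhere inside a function body is a local name for the whole function, so referencing it when the assigning branch was skipped raises `UnboundLocalError`. A minimal stand-in reproduction (not the actual `integrations.py` code):

```python
def get_summary_writer(tensorboard_available):
    # `SummaryWriter` is bound only inside the branch, but Python treats
    # it as a local name for the whole function. When the branch is
    # skipped, referencing it raises UnboundLocalError.
    if tensorboard_available:
        class SummaryWriter:  # stand-in for torch.utils.tensorboard.SummaryWriter
            pass
    return SummaryWriter

get_summary_writer(True)     # fine
# get_summary_writer(False)  # UnboundLocalError, as in the traceback
```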
Yusuke Mori authored
* Update past_key_values in gpt2 (#9391)
* Update generation_utils, and rename some items
* Update modeling_gpt2 to avoid an error in gradient_checkpointing
* Remove 'reorder_cache' from util and add variations to XLNet, TransfoXL, GPT-2
* Change the location of '_reorder_cache' in modeling files
* Add '_reorder_cache' in modeling_ctrl
* Fix a bug of my last commit in CTRL
* Add '_reorder_cache' to GPT2DoubleHeadsModel
* Manage 'use_cache' in config of test_modeling_gpt2
* Clean up the doc string
* Update src/transformers/models/gpt2/modeling_gpt2.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Fix the doc string (GPT-2, CTRL)
* improve gradient_checkpointing behavior

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
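For context, `_reorder_cache` exists because beam search permutes the batch dimension at every generation step, so each model's cached key/value states must be permuted the same way. A minimal NumPy sketch of the idea - the real implementations operate on framework tensors and differ per model:

```python
import numpy as np

def reorder_cache(past, beam_idx):
    """Reorder every cached state along the batch dimension (dim 0)
    to match the newly selected beams."""
    return tuple(
        tuple(state[beam_idx] for state in layer_past)
        for layer_past in past
    )

# Two layers, each caching a (batch=3, 1) key and value array.
past = tuple(
    (np.arange(3)[:, None] + 10 * i, np.arange(3)[:, None] - 10 * i)
    for i in range(2)
)
# Beams 2, 2, and 0 were selected for the next step.
reordered = reorder_cache(past, np.array([2, 2, 0]))
```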
-
Sylvain Gugger authored
-
Sylvain Gugger authored
-
Sylvain Gugger authored
-
Patrick von Platen authored
-
Patrick von Platen authored
-
Sergey Mkrtchyan authored
* Fix the attention_mask in DPRReaderTokenizer
* Add an integration test for DPRReader inference
* Run make style
-
Patrick von Platen authored
-
- 18 Jan, 2021 3 commits
-
Daniel Stancl authored
* Add head_mask/decoder_head_mask for BART

This branch implements head_mask and decoder_head_mask for BART-based models. Full list below:
- BART
- MBart
- Blenderbot
- BlenderbotSmall
- Marian
- Pegasus

Everything is accompanied with updated testing.

* Fix test_headmasking for BART models
* Fix test_headmasking for BART-like models which have only 2 layers in each module. The condition

```
self.assertNotEqual(attentions[1][..., 0, :, :].flatten().sum().item(), 0.0)
```

is, therefore, invalid for encoder-decoder models considering the `head_mask`

```
head_mask = torch.ones(
    self.model_tester.num_hidden_layers,
    self.model_tester.num_attention_heads,
    device=torch_device,
)
head_mask[0, 0] = 0
head_mask[-1, :-1] = 0
```

specified in the `test_headmasking` test/function.

* Adjust test_modeling_common.py to reflect T5 input args
* Update tests/test_modeling_common.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* make style
* make fix-copies

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
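For context, a `head_mask` of shape `(num_layers, num_heads)` multiplies each layer's attention probabilities per head, so a 0 entry silences that head entirely - which is why the test can assert that masked heads produce all-zero attention. A NumPy sketch of the masking step; the real models apply this on framework tensors inside each attention layer:

```python
import numpy as np

num_layers, num_heads, seq_len = 2, 4, 5

# head_mask as built in the test: mask head 0 of the first layer and
# all but the last head of the last layer.
head_mask = np.ones((num_layers, num_heads))
head_mask[0, 0] = 0
head_mask[-1, :-1] = 0

def apply_head_mask(attn_probs, layer_head_mask):
    # attn_probs: (num_heads, seq_len, seq_len); broadcast the per-head
    # mask over both sequence dimensions.
    return attn_probs * layer_head_mask[:, None, None]

attn = np.random.rand(num_heads, seq_len, seq_len)
masked = apply_head_mask(attn, head_mask[0])  # head 0 is now all zeros
```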
-
Devrim authored
-
Anthony MOI authored
-
- 15 Jan, 2021 5 commits
-
Stas Bekman authored
-
Lysandre Debut authored
* Ignore lm_head decoder bias warning
* Revert "Ignore lm_head decoder bias warning"

This reverts commit f25177a9da6ca898e351f46c8b1515971de5c670.

* predictions -> lm_head
-
Julien Plu authored
* Add warning
* Remove unused import
* Fix missing call
* Fix missing call
* Completely remove token_type_ids
* Apply style
* Remove unused import
* Update src/transformers/models/mpnet/modeling_tf_mpnet.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
-
Patrick von Platen authored
* fix tf led
* remove loop file
-
Kiyoung Kim authored
This reverts commit 3f40070c.
-
- 14 Jan, 2021 11 commits
-
Stas Bekman authored
* [doc] install + 1-gpu deployment
* Apply suggestions from code review
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* improvements

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
Sylvain Gugger authored
* Upstream (and rename) sortish sampler
* Use proper sampler
* Update src/transformers/trainer_pt_utils.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
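A sortish sampler groups examples of similar length into the same batch (cutting padding waste) while keeping some randomness: indices are shuffled globally, chopped into large megabatches, and each megabatch is sorted by length. A rough pure-Python sketch of the idea, not the trainer_pt_utils implementation:

```python
import random

def sortish_indices(lengths, batch_size, megabatch_mult=50, seed=0):
    """Return indices shuffled globally but sorted by length inside
    each megabatch of `batch_size * megabatch_mult` examples."""
    rng = random.Random(seed)
    idx = list(range(len(lengths)))
    rng.shuffle(idx)  # global randomness between megabatches
    mb = batch_size * megabatch_mult
    megabatches = [idx[i:i + mb] for i in range(0, len(idx), mb)]
    # within a megabatch, longest examples first
    return [
        j
        for megabatch in megabatches
        for j in sorted(megabatch, key=lambda k: lengths[k], reverse=True)
    ]
```

Consecutive batches drawn from this ordering then contain examples of similar length, so far less padding is needed.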
-
Kiyoung Kim authored
* gradient accumulation for tftrainer
* label naming
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* label naming
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
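Gradient accumulation simulates a larger batch size: gradients from several micro-batches are summed, and the optimizer step is applied only once per `accumulation_steps` batches. A framework-free sketch of the loop with a scalar parameter, not the TFTrainer code:

```python
def train_with_accumulation(grads_per_batch, lr, accumulation_steps):
    """grads_per_batch: one scalar gradient per micro-batch.
    Returns the parameter after applying accumulated updates."""
    param = 0.0
    accum = 0.0
    for step, g in enumerate(grads_per_batch, start=1):
        accum += g  # accumulate instead of stepping every batch
        if step % accumulation_steps == 0:
            # apply the averaged gradient once per accumulation window
            param -= lr * (accum / accumulation_steps)
            accum = 0.0
    return param
```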
-
Lysandre authored
-
Lysandre Debut authored
-
Lysandre Debut authored
* conda build -> conda-build
* Syntax error
* conda build -> conda-build + 4.2.0
* Prepare to merge in `master`
-
Stas Bekman authored
* note on how to get to deps from shell
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* fix text

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
Julien Plu authored
-
Julien Plu authored
* Compliancy with tf-nightly
* Add more version + restore min version check
-
Sylvain Gugger authored
* Switch metrics in run_ner to datasets
* Add flag to return all metrics
* Upstream (and rename) sortish_sampler
* Revert "Upstream (and rename) sortish_sampler"

This reverts commit e07d0dcf650c2bae36da011dd76c77a8bb4feb0d.
-
Sylvain Gugger authored
* Fix Trainer with a parallel model
* More clean up
-