- 22 Sep, 2020 12 commits
-
-
Lysandre authored
-
Sylvain Gugger authored
-
Julien Plu authored
* Create an XLA parameter and fix mixed precision creation (sketch below)
* Fix issue brought by intellisense
* Complete docstring
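A minimal sketch of what this enables, assuming the new flag is exposed on `TFTrainingArguments` as `xla` (the exact argument name is an assumption from the commit title, not verified against the final API):

```python
from transformers import TFTrainingArguments

# Hypothetical usage of the new XLA flag; `xla=True` is assumed to compile
# the training step with XLA, and `fp16` covers the mixed precision setup
# whose creation this commit also fixes.
args = TFTrainingArguments(
    output_dir="out",
    xla=True,
    fp16=True,
)
```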
-
Sylvain Gugger authored
-
Sylvain Gugger authored
* Add possibility to evaluate every epoch (example below)
* Remove multitype arg
* Remove needless import
* Use a proper enum
* Apply suggestions from @LysandreJik
* One else and formatting

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
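A short example of the new option; the `"epoch"` value matches the enum-based setting described in the commit (the string is assumed to be converted internally to the new enum):

```python
from transformers import TrainingArguments

# Run evaluation at the end of every epoch instead of every `eval_steps`.
args = TrainingArguments(
    output_dir="out",
    evaluation_strategy="epoch",
)
```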
-
Sylvain Gugger authored
* is_pretokenized -> is_split_into_words (example below)
* Fix tests
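For callers, only the keyword changes; the behavior is the same:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

words = ["Hello", "world", "!"]
# Before this change: tokenizer(words, is_pretokenized=True)
encoding = tokenizer(words, is_split_into_words=True)
```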
-
Julien Plu authored
* Fix #7277
* Apply style
* Add a full training pipeline test
* Apply style
-
Minghao Li authored
* first version (usage sketch below)
* finish test docs readme model/config/tokenization class
* apply make style and make quality
* fix layoutlm GitHub link
* fix conflict in index.rst and add layoutlm to pretrained_models.rst
* fix bug in test_parents_and_children_in_mappings
* reformat modeling_auto.py and tokenization_auto.py
* fix bug in test_modeling_layoutlm.py
* Update docs/source/model_doc/layoutlm.rst
* remove inh, add tokenizer fast, and update some doc
* copy and rename necessary class from modeling_bert to modeling_layoutlm
* Update src/transformers/configuration_layoutlm.py
* Update src/transformers/modeling_layoutlm.py
* add mish to activations.py, import ACT2FN and import logging from utils

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
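A minimal sketch of loading the new model; the checkpoint name and the zeroed `bbox` tensor are illustrative only (LayoutLM expects real per-token bounding boxes from an OCR step):

```python
import torch
from transformers import LayoutLMModel, LayoutLMTokenizer

tokenizer = LayoutLMTokenizer.from_pretrained("microsoft/layoutlm-base-uncased")
model = LayoutLMModel.from_pretrained("microsoft/layoutlm-base-uncased")

inputs = tokenizer("Invoice total: $42", return_tensors="pt")
# LayoutLM additionally takes one bounding box per token; zeros are a placeholder.
bbox = torch.zeros(inputs["input_ids"].shape + (4,), dtype=torch.long)
outputs = model(**inputs, bbox=bbox)
```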
-
Sylvain Gugger authored
-
Lysandre Debut authored
-
Stas Bekman authored
-
Sylvain Gugger authored
* Copy code from Bert to Roberta and add safeguard script (marker sketch below)
* Fix docstring
* Comment code
* Formatting
* Update src/transformers/modeling_roberta.py
* Add test and fix bugs
* Fix style and make new command

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
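The safeguard works by marking duplicated code so a script can verify it stays in sync with the original; a sketch of the marker convention (the exact comment format is an assumption):

```python
from torch import nn

# The check script looks for markers like the one below and verifies that
# the class body still matches the Bert original, modulo the name swap.
# Copied from transformers.modeling_bert.BertSelfAttention with Bert->Roberta
class RobertaSelfAttention(nn.Module):
    ...
```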
-
- 21 Sep, 2020 16 commits
-
-
Sam Shleifer authored
* save hostname
-
Thomas Winters authored
* Added RobBERT-v2 model card
* Minor tweaks

Co-authored-by: Julien Chaumond <chaumond@gmail.com>
-
jjacampos authored
This PR includes the model card for the IXAmBERT model, which was recently uploaded to the Hugging Face model repository.
-
Stas Bekman authored
-
Stas Bekman authored
fix to match `distillation.py: self.alpha_encoder_loss`
-
Stas Bekman authored
-
Suraj Patil authored
-
Sylvain Gugger authored
-
Raphaël Bournhonesque authored
-
Stas Bekman authored
[fsmt] rewrite SinusoidalPositionalEmbedding + USE_CUDA test fixes + new TranslationPipeline test (#7224)
* fix USE_CUDA, add pipeline
* USE_CUDA fix
* recode SinusoidalPositionalEmbedding into an nn.Embedding subclass, which was needed for torchscript to work; the table is now part of the state_dict, so these keys have to be removed during save_pretrained (sketch below)
* back out (ci debug)
* restore
* slow last?
* facilitate not saving certain keys and test
* remove no longer used keys
* style
* fix logging import
* cleanup
* Update src/transformers/modeling_utils.py
* fix bug in max_positional_embeddings
* rename keys to keys_to_never_save per suggestion, improve the setup

Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
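A minimal sketch of the recoded embedding, assuming the usual sinusoidal formulation and an even embedding dimension; class and helper names are illustrative, not the exact library code:

```python
import torch
from torch import nn

class SinusoidalPositionalEmbedding(nn.Embedding):
    """Positional embedding as an nn.Embedding subclass, so torchscript sees
    a plain embedding lookup and the table lives in the state_dict."""

    def __init__(self, num_positions: int, embedding_dim: int):
        super().__init__(num_positions, embedding_dim)
        self._fill_with_sinusoids(self.weight)

    @staticmethod
    @torch.no_grad()
    def _fill_with_sinusoids(weight: torch.Tensor) -> None:
        n_pos, dim = weight.shape  # dim assumed even here
        position = torch.arange(n_pos, dtype=torch.float).unsqueeze(1)
        div_term = torch.pow(10000.0, torch.arange(0, dim, 2, dtype=torch.float) / dim)
        weight[:, 0::2] = torch.sin(position / div_term)
        weight[:, 1::2] = torch.cos(position / div_term)
        weight.requires_grad = False  # positions are fixed, never trained
```

Because the table now sits in the state_dict, the commit pairs this with `keys_to_never_save` so the deterministic weights can be skipped in save_pretrained.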
-
Dat Quoc Nguyen authored
Two new pre-trained models, "vinai/bertweet-covid19-base-cased" and "vinai/bertweet-covid19-base-uncased", obtained by further pre-training "vinai/bertweet-base" on a corpus of 23M COVID-19 English tweets for 40 epochs.
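Loading one of the new checkpoints (a sketch; `BertweetTokenizer` is used directly here since Auto support for it landed separately):

```python
from transformers import AutoModel, BertweetTokenizer

tokenizer = BertweetTokenizer.from_pretrained("vinai/bertweet-covid19-base-cased")
model = AutoModel.from_pretrained("vinai/bertweet-covid19-base-cased")

inputs = tokenizer("Stay home, stay safe #COVID19", return_tensors="pt")
outputs = model(**inputs)
```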
-
Patrick von Platen authored
-
Lysandre authored
-
Suraj Patil authored
* fix compute_metrics_fn (example below)
* p.predictions -> preds
* apply suggestions
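The fix concerns the metrics callback, which receives an `EvalPrediction`; a typical implementation matching the renamed variable:

```python
import numpy as np
from transformers import EvalPrediction

def compute_metrics(p: EvalPrediction) -> dict:
    # p.predictions holds the raw logits; reduce them to class ids first.
    preds = np.argmax(p.predictions, axis=1)
    return {"accuracy": float((preds == p.label_ids).mean())}
```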
-
guillaume-be authored
-
Nadir El Manouzi authored
-
- 20 Sep, 2020 3 commits
-
-
Stas Bekman authored
-
Manuel Romero authored
-
Stas Bekman authored
Found an issue where `@slow` gets ignored when it isn't the last decorator, so documenting the significance of decorator order.
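In practice this means keeping `@slow` closest to the function definition when stacking decorators; a sketch using decorators from `transformers.testing_utils`:

```python
import unittest
from transformers.testing_utils import require_torch, slow

class ExampleIntegrationTest(unittest.TestCase):
    @require_torch
    @slow  # keep @slow last (innermost); otherwise it can be silently ignored
    def test_large_model(self):
        ...
```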
-
- 19 Sep, 2020 4 commits
-
-
Stas Bekman authored
-
Stas Bekman authored
-
Manuel Romero authored
-
Manuel Romero authored
-
- 18 Sep, 2020 5 commits
-
-
Sam Shleifer authored
-
Dat Quoc Nguyen authored
* Add BERTweet and PhoBERT models (tokenizer example below)
* Update modeling_auto.py: re-add `bart` to LM_MAPPING
* Update tokenization_auto.py: re-add `from .configuration_mobilebert import MobileBertConfig` (not sure why it was replaced by `from transformers.configuration_mobilebert import MobileBertConfig`)
* Add BERTweet and PhoBERT to pretrained_models.rst
* Update tokenization_auto.py: remove BertweetTokenizer and PhobertTokenizer from tokenization_auto.py (they are currently not supported by AutoTokenizer)
* Update BertweetTokenizer - without nltk
* Update model card for BERTweet
* PhoBERT - with Auto mode - without import fastBPE
* BERTweet - with Auto mode - without import fastBPE
* Add PhoBERT and BERTweet to TF modeling auto
* Improve docstrings for PhobertTokenizer and BertweetTokenizer
* Update PhoBERT and BERTweet model cards
* Fixed a merge conflict in tokenization_auto
* Used black to reformat BERTweet- and PhoBERT-related files
* Used isort to reformat BERTweet- and PhoBERT-related files
* Reformatted BERTweet- and PhoBERT-related files based on flake8
* Updated test files
* Updated tf test files
* Update commits from huggingface
* Delete unnecessary files
* Add tokenizers to auto and init files
* Add test files for tokenizers
* Revised model cards
* Update save_vocabulary function in BertweetTokenizer and PhobertTokenizer and test files
* Revised test files
* Update orders of Phobert and Bertweet tokenizers in auto tokenization file
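Basic usage of the two new tokenizers (a sketch; the `normalization` flag enabling the built-in tweet normalizer is assumed from the commit's "without nltk" note):

```python
from transformers import BertweetTokenizer, PhobertTokenizer

# Tweet-aware tokenizer; normalization=True is assumed to turn on the
# built-in tweet normalization mentioned in the commit.
bertweet_tok = BertweetTokenizer.from_pretrained("vinai/bertweet-base", normalization=True)
phobert_tok = PhobertTokenizer.from_pretrained("vinai/phobert-base")

print(bertweet_tok.tokenize("SC has first two presumptive cases of coronavirus"))
print(phobert_tok.tokenize("Tôi là sinh viên"))
```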
-
Patrick von Platen authored
-
Patrick von Platen authored
-
Patrick von Platen authored
-