Commits · b86a71ea381531bedc38aa23ad8e2f6667bc0f41 · chenpangpang / transformers

19 Oct, 2020 1 commit
- [tests] fix slow bart cnn test, faster marian tests (#7888) · b86a71ea
  Sam Shleifer authored Oct 18, 2020
  
  b86a71ea
18 Oct, 2020 1 commit

[Dependencies|tokenizers] Make both SentencePiece and Tokenizers optional dependencies (#7659) · ba8c4d0a

Thomas Wolf authored Oct 18, 2020

* splitting fast and slow tokenizers [WIP]

* [WIP] splitting sentencepiece and tokenizers dependencies

* update dummy objects

* add name_or_path to models and tokenizers

* prefix added to file names

* prefix

* styling + quality

* splitting all the tokenizer files - sorting sentencepiece based ones

* update tokenizer version up to 0.9.0

* remove hard dependency on sentencepiece 🎉

* and removed hard dependency on tokenizers 🎉



* update conversion script

* update missing models

* fixing tests

* move test_tokenization_fast to main tokenization tests - fix bugs

* bump up tokenizers

* fix bert_generation

* update and fix several tokenizers

* keep sentencepiece in deps for now

* fix funnel and deberta tests

* fix fsmt

* fix marian tests

* fix layoutlm

* fix squeezebert and gpt2

* fix T5 tokenization

* fix xlnet tests

* style

* fix mbart

* bump up tokenizers to 0.9.2

* fix model tests

* fix tf models

* fix seq2seq examples

* fix tests without sentencepiece

* fix slow => fast conversion without sentencepiece

* update auto and bert generation tests

* fix mbart tests

* fix auto and common test without tokenizers

* fix tests without tokenizers

* clean up tests, lighten them up when tokenizers + sentencepiece are both off

* style quality and tests fixing

* add sentencepiece to doc/examples reqs

* leave sentencepiece on for now

* style quality split herbert and fix pegasus

* WIP Herbert fast

* add sample_text_no_unicode and fix herbert tokenization

* skip FSMT example test for now

* fix style

* fix fsmt in example tests

* update following Lysandre and Sylvain's comments

* Update src/transformers/testing_utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/testing_utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/tokenization_utils_base.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/tokenization_utils_base.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

ba8c4d0a
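
The optional-dependency idea in the commit above can be sketched as follows. This is a minimal illustration, not the code from the commit: `is_package_available`, `DummyObject`, and `load_optional` are hypothetical names, and the only assumption is that a missing backend should fail at use time with a clear message rather than at import time.

```python
import importlib
import importlib.util


def is_package_available(name: str) -> bool:
    """Return True if the top-level package `name` can be imported."""
    return importlib.util.find_spec(name) is not None


class DummyObject:
    """Hypothetical stand-in for a module whose backend is not installed."""

    def __init__(self, backend: str):
        self._backend = backend

    def _fail(self):
        raise ImportError(
            f"This feature requires the `{self._backend}` package, "
            f"which is not installed."
        )

    def __call__(self, *args, **kwargs):
        self._fail()

    def __getattr__(self, attribute):
        # Only reached for attributes that are not defined on the dummy itself.
        self._fail()


def load_optional(name: str):
    """Import `name` if present, otherwise return a dummy that fails on use."""
    if is_package_available(name):
        return importlib.import_module(name)
    return DummyObject(name)
```

With this pattern, importing the library never fails because of a missing backend; only code paths that actually touch the backend raise.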

17 Oct, 2020 3 commits
- Remove duplicated mish activation function (#7856) · c65863ce
  Raza Habib authored Oct 17, 2020
```
* Remove duplicated mish activation function

* Update activations.py
```
  c65863ce
- Fix Rag example docstring (#7872) · f5c45a19
  Patrick von Platen authored Oct 17, 2020
```
* fix rag examples

* fix token generate example
```
  f5c45a19
- [s2s testing] turn all to unittests, use auto-delete temp dirs (#7859) · 9f7b2b24
  Stas Bekman authored Oct 17, 2020
  
  9f7b2b24
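
A rough sketch of a unittest that uses an auto-deleting temporary directory. It relies only on the standard-library `tempfile` module, which is an assumption made for illustration; the helper actually used in the commit is not shown in this log, and `TempDirExampleTest` is a hypothetical name.

```python
import os
import tempfile
import unittest


class TempDirExampleTest(unittest.TestCase):
    def test_output_is_written_and_cleaned_up(self):
        # The directory and everything in it are removed when the block exits.
        with tempfile.TemporaryDirectory() as tmp_dir:
            output_path = os.path.join(tmp_dir, "predictions.txt")
            with open(output_path, "w") as handle:
                handle.write("hello")
            self.assertTrue(os.path.exists(output_path))
        self.assertFalse(os.path.exists(tmp_dir))
```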
16 Oct, 2020 12 commits

Fix typo in sequence model card · dc552b9b
Patrick von Platen authored Oct 16, 2020

dc552b9b
[seq2seq testing] improve readability (#7845) · 1652ddad
Stas Bekman authored Oct 16, 2020

1652ddad
Fix missing reference titles in retrieval evaluation of RAG (#7817) · 466115b2
Quentin Lhoest authored Oct 16, 2020

466115b2

[testing] disable FutureWarning in examples tests (#7842) · 464b53f5

Stas Bekman authored Oct 16, 2020

* [testing] disable FutureWarning in examples tests

same as tests/conftest.py, we can't resolve those warnings, so turn the noise off.

* fix

464b53f5
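
Silencing `FutureWarning` can be sketched with the standard-library `warnings` module. The functions below are hypothetical and only illustrate the filter; the exact lines added to the examples test configuration are not reproduced in this log.

```python
import warnings


def legacy_helper():
    """Hypothetical function that emits a FutureWarning when called."""
    warnings.warn("legacy_helper will change in a future release", FutureWarning)
    return 42


def call_without_future_warnings(func):
    """Run `func` with FutureWarning ignored; return its result and any other warnings."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        # Filters added later take precedence, so FutureWarning is dropped.
        warnings.simplefilter("ignore", category=FutureWarning)
        result = func()
    return result, caught
```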

Small fixes to HP search (#7839) · eb186bc1
Sylvain Gugger authored Oct 16, 2020

eb186bc1
fix/hide warnings (#7837) · d8ca57d2
Stas Bekman authored Oct 16, 2020
```
s
```
d8ca57d2
Remove masked_lm_labels from returned dictionary (#7818) · c6e865ac
vblagoje authored Oct 16, 2020

c6e865ac
[cleanup] assign todos, faster bart-cnn test (#7835) · 96e47d92
Sam Shleifer authored Oct 16, 2020
```
* 2 beam output

* unassign/remove TODOs

* remove one more
```
96e47d92

Herbert polish model (#7798) · 7b13bd01

rmroczkowski authored Oct 16, 2020



* HerBERT transformer model for Polish language understanding.

* HerbertTokenizerFast generated with HerbertConverter

* Herbert base and large model cards

* Herbert model cards with tags

* Herbert tensorflow models

* Herbert model tests based on Bert test suite

* src/transformers/tokenization_herbert.py edited online with Bitbucket

* src/transformers/tokenization_herbert.py edited online with Bitbucket

* docs/source/model_doc/herbert.rst edited online with Bitbucket

* Herbert tokenizer tests and bug fixes

* src/transformers/configuration_herbert.py edited online with Bitbucket

* Copyrights and tests for TFHerbertModel

* model_cards/allegro/herbert-base-cased/README.md edited online with Bitbucket

* model_cards/allegro/herbert-large-cased/README.md edited online with Bitbucket

* Bug fixes after testing

* Reformat modified_only_fixup

* Proper order of configuration

* Herbert proper documentation formatting

* Formatting with make modified_only_fixup

* Dummies fixed

* Adding missing models to documentation

* Removing HerBERT model as it is a simple extension of BERT

* Update model_cards/allegro/herbert-base-cased/README.md
Co-authored-by: Julien Chaumond <chaumond@gmail.com>

* Update model_cards/allegro/herbert-large-cased/README.md
Co-authored-by: Julien Chaumond <chaumond@gmail.com>

* HerbertTokenizer deprecated configuration removed
Co-authored-by: Julien Chaumond <chaumond@gmail.com>

7b13bd01

[Pipelines] Fix links to model lists (#7826) · 99898dcd
Julien Chaumond authored Oct 16, 2020

99898dcd
Fix DeBERTa integration tests (#7729) · 52c9e842
Lysandre Debut authored Oct 16, 2020

52c9e842
[seq2seq] get_git_info fails gracefully (#7843) · 2255c2c7
Stas Bekman authored Oct 15, 2020
```
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
```
2255c2c7
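
"Fails gracefully" can be sketched like this. The version below shells out to the `git` command through `subprocess`, which is an assumption made to keep the example dependency-free; the dictionary keys are illustrative and the real helper may be implemented differently.

```python
import subprocess


def get_git_info(path="."):
    """Return the commit SHA and branch for `path`, or None values if unavailable."""

    def run_git(*args):
        return subprocess.check_output(
            ["git", *args], cwd=path, stderr=subprocess.DEVNULL, text=True
        ).strip()

    try:
        return {
            "repo_sha": run_git("rev-parse", "HEAD"),
            "repo_branch": run_git("rev-parse", "--abbrev-ref", "HEAD"),
        }
    except (subprocess.CalledProcessError, OSError):
        # Not a git repository, git not installed, or the path does not exist.
        return {"repo_sha": None, "repo_branch": None}
```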

15 Oct, 2020 13 commits

Typo and fix the input of labels to `cross_entropy` (#7841) · dfa4c26b
Katarina Slama authored Oct 15, 2020
```
The current version caused some errors. The changes fixed it for me. Hope this is helpful!
```
dfa4c26b

fix DeprecationWarning (#7834) · a5a8eeb7

Stas Bekman authored Oct 15, 2020

in `tests/test_utils_check_copies.py` I was getting intermittently:
```
utils/check_copies.py:52
  /mnt/nvme1/code/transformers-comet/utils/check_copies.py:52: DeprecationWarning: invalid escape sequence \s
    while line_index < len(lines) and re.search(f"^{indent}(class|def)\s+{name}", lines[line_index]) is None:
```
So this should fix it.

a5a8eeb7
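
One common way to avoid this warning is to make the f-string a raw string, so that `\s` reaches the regex engine untouched. The sketch below is an illustration under that assumption; `find_definition` is a hypothetical helper, and `re.escape` is an addition that is not in the quoted line.

```python
import re


def find_definition(lines, name, indent=""):
    """Return the index of the first `class`/`def` line defining `name`, or None."""
    # The `rf` prefix keeps `\s` as a regex escape instead of a string escape.
    pattern = rf"^{indent}(class|def)\s+{re.escape(name)}"
    for index, line in enumerate(lines):
        if re.search(pattern, line) is not None:
            return index
    return None
```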

model card for bert-base-NER (#7799) · 9c71cca3

David S. Lim authored Oct 15, 2020



* model card for bert-base-NER

* add meta data up top
Co-authored-by: Julien Chaumond <chaumond@gmail.com>

9c71cca3

fix wandb/comet problems (#7830) · 4dbca500

Stas Bekman authored Oct 15, 2020



* fix wandb/comet problems

* simplify

* Update src/transformers/integrations.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

4dbca500

[model_cards] facebook/bart-large-mnli: register ZSC for the inference API · e7aa6483
Julien Chaumond authored Oct 15, 2020
```
cc @Narsil @mfuntowicz @joeddav
```
e7aa6483
Small fixes to NotebookProgressCallback (#7813) · 2ce3ddab
Sylvain Gugger authored Oct 15, 2020

2ce3ddab
[model_cards] Fix yaml for Facebook/wmt19-* · 6f45dd2f
Julien Chaumond authored Oct 15, 2020
```
see d99ed7ad
```
6f45dd2f
[model_cards] Facebook: add thumbnail · d99ed7ad
Julien Chaumond authored Oct 15, 2020

d99ed7ad
Set XLA example time to 500s · 2485b8b0
Lysandre authored Oct 15, 2020

2485b8b0
Notebook catch all errors · 2dba7d57
Lysandre authored Oct 15, 2020

2dba7d57

Upgrading TFAutoModelWithLMHead to (#7730) · 9ade8e74

Nicolas Patry authored Oct 15, 2020

- TFAutoModelForCausalLM
- TFAutoModelForMaskedLM
- TFAutoModelForSeq2SeqLM

as per deprecation warning. No tests as it simply removes current
warnings from tests.

9ade8e74

Add specific notebook ProgressCallback (#7793) · 62b5622e
Sylvain Gugger authored Oct 15, 2020

62b5622e

Improving Pipelines by defaulting to framework='tf' when pytorch seems unavailable. (#7728) · 0911b6bd

Nicolas Patry authored Oct 15, 2020

* Improving Pipelines by defaulting to framework='tf' when pytorch seems unavailable.

* Actually changing the default resolution order to account for model defaults

Adding a new test for each pipeline to check that pipeline(task) also works
without manually specifying the framework.

0911b6bd
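
The fallback described above can be sketched as a small resolution function. `infer_framework` and its `is_available` parameter are hypothetical; the sketch covers only the "use TensorFlow when PyTorch is missing" part and not the model-default resolution order mentioned in the commit.

```python
import importlib.util


def _package_installed(name):
    return importlib.util.find_spec(name) is not None


def infer_framework(framework=None, is_available=_package_installed):
    """Pick 'pt' or 'tf': an explicit choice wins, then PyTorch, then TensorFlow."""
    if framework is not None:
        return framework
    if is_available("torch"):
        return "pt"
    if is_available("tensorflow"):
        return "tf"
    raise RuntimeError("Neither PyTorch nor TensorFlow appears to be installed.")
```

Passing `is_available` as a parameter makes the behaviour easy to check without installing or uninstalling either framework.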

14 Oct, 2020 10 commits
- Fix TF savedmodel in Roberta (#7795) · 3a134f7c
  Julien Plu authored Oct 14, 2020
```
* Remove wrong parameter.

* Same in Longformer
```
  3a134f7c
- Model Card (#7752) · 3032de93
  Nils Reimers authored Oct 14, 2020
```
* Create README.md

* Update model_cards/sentence-transformers/LaBSE/README.md
Co-authored-by: Julien Chaumond <chaumond@gmail.com>
```
  3032de93
- [model_cards] sarahlintang/IndoBERT (#7748) · 3fdbeba8
  sarahlintang authored Oct 15, 2020
```
* Create README.md

* Update model_cards/sarahlintang/IndoBERT/README.md
Co-authored-by: Julien Chaumond <chaumond@gmail.com>
```
  3fdbeba8
- [model_cards] rename to correct model name · ba654270
  Julien Chaumond authored Oct 14, 2020
  
  ba654270
- Create README.md (#7722) · 08978487
  Zhuosheng Zhang authored Oct 15, 2020
  
  08978487
- added evaluation results for classification task (#7790) · 35575091
  Sagor Sarker authored Oct 14, 2020
  
  35575091
- Don't use `store_xxx` on optional bools (#7786) · bb9559a7
  Sylvain Gugger authored Oct 14, 2020
```
* Don't use `store_xxx` on optional bools

* Refine test

* Refine test
```
  bb9559a7
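
Why `store_true`/`store_false` does not suit an optional bool: such a flag can only yield two states, while an optional bool has three (`True`, `False`, unset). A minimal `argparse` sketch of one alternative follows; `str_to_bool` and the `--evaluate` flag are hypothetical and not taken from the commit.

```python
import argparse


def str_to_bool(value):
    """Convert common textual booleans to bool, rejecting anything else."""
    lowered = value.lower()
    if lowered in ("yes", "true", "1"):
        return True
    if lowered in ("no", "false", "0"):
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean value, got {value!r}")


def build_parser():
    parser = argparse.ArgumentParser()
    # default=None keeps "not specified" distinct from an explicit False.
    parser.add_argument("--evaluate", type=str_to_bool, default=None)
    return parser
```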
- Add predict step accumulation (#7767) · a1d1b332
  Sylvain Gugger authored Oct 14, 2020
```
* Add eval_accumulation_step and clean distributed eval

* Add TPU test

* Add TPU stuff

* Fix arg name

* Fix Seq2SeqTrainer

* Fix total_size

* Update src/transformers/trainer_pt_utils.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Doc and add test to TPU

* Add unit test

* Adapt name
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
```
  a1d1b332
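
The accumulation idea can be sketched in plain Python: predictions are buffered for a fixed number of steps and then moved to a longer-lived store, which bounds how much is held in the buffer at once. The names below are hypothetical, and the lists stand in for device and host tensors; this is not the Trainer implementation.

```python
def predict_with_accumulation(batches, predict_fn, accumulation_steps=None):
    """Run `predict_fn` on each batch, flushing the buffer every `accumulation_steps`.

    Returns the collected outputs and the number of flushes performed.
    """
    collected = []  # stands in for host (CPU) storage
    buffer = []     # stands in for outputs still held on the accelerator
    flushes = 0

    for step, batch in enumerate(batches, start=1):
        buffer.append(predict_fn(batch))
        if accumulation_steps is not None and step % accumulation_steps == 0:
            collected.extend(buffer)
            buffer = []
            flushes += 1

    if buffer:
        # Flush whatever is left after the last full group of steps.
        collected.extend(buffer)
        flushes += 1

    return collected, flushes
```

With `accumulation_steps=None` everything is kept in the buffer until the end, which matches the behaviour of a loop with no intermediate flushing.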
- fix examples/rag imports, tests (#7712) · 8feb0cc9
  Sam Shleifer authored Oct 14, 2020
  
  8feb0cc9
- [model_cards] TinyBERT (HUAWEI Noah's Ark Lab) (#7775) · 890e790e
  XiaoqiJiao authored Oct 14, 2020
  
  890e790e