Commits · 143289dcf759a663c03317e30167e89ee6d86588 · chenpangpang / transformers

04 Jan, 2021 13 commits

[test_model_parallelization] multiple fixes (#9354) · 143289dc
Stas Bekman authored Jan 04, 2021

143289dc

Improve documentation coverage for Bertweet (#9379) · 086718ac

Qbiwan authored Jan 05, 2021

* bertweet docs coverage

* style doc max len 119

* maxlen style rst

* run main() from style_doc

* changed according to comments

086718ac

replace apex.normalization.FusedLayerNorm with torch.nn.LayerNorm (#9386) · 47ca0eaa
Stas Bekman authored Jan 04, 2021

47ca0eaa
correct docs (#9378) · 75ff5305
Patrick von Platen authored Jan 04, 2021

75ff5305

Fix TF DPR (#9283) · ec54d70e

Julien Plu authored Jan 04, 2021

* Fix DPR

* Keep usual models

* Apply style

* Address Sylvain's comments

ec54d70e

Fix open (#9368) · de29ff9b
Julien Plu authored Jan 04, 2021

de29ff9b

[trainer] parametrize default output_dir (#9352) · d018afce

Stas Bekman authored Jan 04, 2021

This PR:

* fixes trainer so that the logger agrees with the actual default `output_dir`, by setting it in one place and passing it as an argument to both places

@sgugger

d018afce
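
The idea behind the `output_dir` change above, defining the default once and handing the same value to both consumers, can be sketched as follows. This is a minimal illustration and not the Trainer code: `DEFAULT_OUTPUT_DIR`, `ToyArgs`, and `build_trainer` are hypothetical names, and the default value `tmp_trainer` is an assumption.

```python
import logging
from dataclasses import dataclass
from typing import Optional, Tuple

logger = logging.getLogger(__name__)

# Hypothetical default, defined once so the log message and the
# arguments object cannot drift apart.
DEFAULT_OUTPUT_DIR = "tmp_trainer"


@dataclass
class ToyArgs:
    """Stand-in for a training-arguments object (not the real class)."""

    output_dir: str


def build_trainer(args: Optional[ToyArgs] = None) -> Tuple[ToyArgs, str]:
    """Return the arguments in effect and the message that was logged."""
    message = ""
    if args is None:
        output_dir = DEFAULT_OUTPUT_DIR
        # The same variable feeds both the log message and the arguments.
        message = f"No arguments passed, using output_dir={output_dir}"
        logger.info(message)
        args = ToyArgs(output_dir=output_dir)
    return args, message
```

Because both places read the same variable, changing the default later cannot make the log message disagree with the directory that is actually used.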

Fix Flaubert (#9292) · d735b074
Julien Plu authored Jan 04, 2021

d735b074

Bump notebook from 6.1.4 to 6.1.5 in /examples/research_projects/lxmert (#9402) · 5dd389d1

dependabot[bot] authored Jan 04, 2021

Bumps [notebook](https://github.com/jupyter/jupyterhub) from 6.1.4 to 6.1.5.
- [Release notes](https://github.com/jupyter/jupyterhub/releases)
- [Changelog](https://github.com/jupyterhub/jupyterhub/blob/master/CHECKLIST-Release.md)
- [Commits](https://github.com/jupyter/jupyterhub/commits)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

5dd389d1

Put back LXMert example (#9401) · 23a71449
Sylvain Gugger authored Jan 04, 2021

23a71449
Fix CTRL (#9291) · 6c03d4ac
Julien Plu authored Jan 04, 2021

6c03d4ac

Add utility function for retrieving locally cached models (#8836) · c581d8af

Charles authored Jan 04, 2021



* add get_cached_models function

* add List type to import

* fix code quality

* Update src/transformers/file_utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/file_utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/file_utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/file_utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/file_utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Fix style
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

c581d8af

simplify marian distillation script (#9394) · 8eb7f26d
Sam Shleifer authored Jan 04, 2021

8eb7f26d

03 Jan, 2021 1 commit

Fix typos in README and bugs in RAG example code for end-to-end evaluation and finetuning (#9355) · d944966b

Yoshitomo Matsubara authored Jan 03, 2021

* fix a bug in eval_batch_retrieval

* should return parser as well as other staticmethod

* remove duplicate argument

* these kwargs are no longer accepted (cause TypeError in self.generator.generate of modeling_rag.py)

* fixed file paths in README

* moved an arg to add_ray_specific_args

d944966b

02 Jan, 2021 3 commits

file_utils.py: TF examples outputs.last_hidden_states -> state (#9382) · c4fd609a
Chris Kennedy authored Jan 02, 2021

c4fd609a

[Docs] `past_key_values` return a tuple of tuple as a default (#9381) · b01f451c

Patrick von Platen authored Jan 02, 2021

* push

* make style

b01f451c

use return dict for rag encoder (#9363) · 5f7a07c0
Derrick Blakely authored Jan 02, 2021

5f7a07c0

30 Dec, 2020 1 commit

torch.cuda.is_available() is redundant as apex handles that internally (#9350) · ae333d04
Stas Bekman authored Dec 30, 2020

ae333d04

29 Dec, 2020 3 commits

[prophetnet] wrong import (#9349) · 8217d4e3

Stas Bekman authored Dec 29, 2020

```
python -c "from apex.normalization import FusedProphetNetLayerNorm"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
ImportError: cannot import name 'FusedProphetNetLayerNorm' from 'apex.normalization' (/home/stas/anaconda3/envs/main-38/lib/python3.8/site-packages/apex/normalization/__init__.py)
```
It looks like this code has never been tested, so it silently fails inside try/except.

Discovered this by accident in https://github.com/huggingface/transformers/issues/9338#issuecomment-752217708

8217d4e3
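
The failure mode described in the commit above, a wrong import name hidden by try/except, can be reproduced with the standard library alone. In this sketch `fused_sqrt` is a deliberately nonexistent name standing in for the misspelled class, and `pick_sqrt` is a hypothetical helper.

```python
def pick_sqrt():
    """Try an optional fast path and fall back if the import fails."""
    try:
        # Deliberately wrong name: `math` provides `sqrt`, not `fused_sqrt`.
        from math import fused_sqrt as impl

        source = "optional fast path"
    except ImportError:
        # A wrong name raises ImportError just like a missing package does,
        # so this branch always runs and the fast path is never exercised.
        from math import sqrt as impl

        source = "fallback"
    return impl, source


impl, source = pick_sqrt()
print(source)
```

Nothing fails visibly, which is why such a typo can go unnoticed until someone runs the import on its own, as in the traceback above.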

add import math (#9346) · 912f6881
Patrick von Platen authored Dec 29, 2020

912f6881
improve templates (#9342) · 785e52cd
Patrick von Platen authored Dec 29, 2020

785e52cd

28 Dec, 2020 3 commits

Fix TransfoXL (#9302) · 64103fb6
Julien Plu authored Dec 28, 2020

64103fb6
Fix TF T5 (#9301) · d97d06d0
Julien Plu authored Dec 28, 2020

* Fix T5

* Fix test

* Fix test

d97d06d0

[Seq2Seq Templates] Correct some TF-serving errors and add gradient checkpointing to PT by default. (#9334) · 83fdd252

Patrick von Platen authored Dec 28, 2020

* correct tests

* correct shape and get_tf_activation

* more correction tf

* add gradient checkpointing to templates

* correct typo

83fdd252

27 Dec, 2020 1 commit

push (#9320) · 8e74eca7
Patrick von Platen authored Dec 27, 2020

8e74eca7

25 Dec, 2020 2 commits

[GPT2] Correct gradient checkpointing (#9308) · 61443cd7

Patrick von Platen authored Dec 25, 2020

* correct gpt2

* fix gpt2

* fix use_cache ordering

* correct past tolerance

* fix for all cases

* style

61443cd7

add translation example (#9303) · 21fc6766

Vasudev Gupta authored Dec 25, 2020



* Created using Colaboratory

* mbart-training examples add

* link add

* Update description
Co-authored-by: Suraj Patil <surajp815@gmail.com>

21fc6766

24 Dec, 2020 8 commits

[Bart doc] Fix outdated statement (#9299) · 52b3a05e
Patrick von Platen authored Dec 24, 2020

* fix bart doc

* fix docs

52b3a05e
Update tokenization_utils_base.py (#9293) · 7777db15
Bram Vanroy authored Dec 24, 2020

Missing "s" typo

7777db15

fix typo in modeling_encoder_decoder.py (#9297) · 71963a66

Daniele Sartiano authored Dec 24, 2020



* Update modeling_encoder_decoder.py

Fixed typo.

* typo
Co-authored-by: Suraj Patil <surajp815@gmail.com>

71963a66

Proposed Fix : [RagSequenceForGeneration] generate "without" input_ids (#9220) · f3a3b91d

Ratthachat (Jung) authored Dec 24, 2020

* Create modeling_tf_dpr.py

* Add TFDPR

* Add back TFPegasus, TFMarian, TFMBart, TFBlenderBot

The last commit accidentally deleted these 4 lines, so I restored them

* Add TFDPR

* Add TFDPR

* clean up some comments, add TF input-style doc string

* Add TFDPR

* Make return_dict=False as default

* Fix return_dict bug (in .from_pretrained)

* Add get_input_embeddings()

* Create test_modeling_tf_dpr.py

The current version already passes all 27 tests!
Please see the test run at : 
https://colab.research.google.com/drive/1czS_m9zy5k-iSJbzA_DP1k1xAAC_sdkf?usp=sharing



* fix quality

* delete init weights

* run fix copies

* fix repo consis

* del config_class, load_tf_weights

They should be 'pytorch only'

* add config_class back

After removing it, the tests failed ... so in total only "use_tf_weights = None" is removed, on Lysandre's suggestion

* newline after .. note::

* import tf, np (Necessary for ModelIntegrationTest)

* slow_test from_pretrained with from_pt=True

At the moment we don't have TF weights (since we don't have an official TF model)
Previously, I did not run slow test, so I missed this bug

* Add simple TFDPRModelIntegrationTest

Note that this is just a test that TF and PyTorch give approximately the same output.
However, I could not test with the official DPR repo's output yet

* upload correct tf model

* remove position_ids as missing keys

* fix RagSeq generate with context_input_ids

* apply style

* delete unused lines

* Add test_rag_sequence_generate_batch_from_context_input_ids

* Readability improved

* styling

* Stylize

* typos

* add check_model_generate_from_context_input_ids

* make style

* Apply suggestions from code review

* make style2
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: patrickvonplaten <patrick@huggingface.co>

f3a3b91d

enable cache by default (#9296) · 2a18b709
Suraj Patil authored Dec 24, 2020

2a18b709
Fix typo in file_utils.py (#9289) · 6189ae99
Jungwhan authored Dec 24, 2020

6189ae99
allow integer device for BatchEncoding (#9271) · 222dbdb2
Jethro Kuan authored Dec 24, 2020

Fixes #9244
Co-authored-by: Jethro Kuan <jethro.kuan@bytedance.com>

222dbdb2
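
As the title of the commit above says, a device can be given as an integer as well as a string. A minimal sketch of accepting both forms, assuming PyTorch's convention that a bare integer is a CUDA device index; `normalize_device` is a hypothetical helper, not the library's implementation.

```python
from typing import Union


def normalize_device(device: Union[str, int]) -> str:
    """Map a device given as a string or a GPU index to a device string."""
    # bool is a subclass of int, so reject it before the int check.
    if isinstance(device, bool):
        raise TypeError("device must be a str or an int, not bool")
    if isinstance(device, int):
        if device < 0:
            raise ValueError("device index must be non-negative")
        # Assumption: an integer denotes a CUDA device index.
        return f"cuda:{device}"
    if isinstance(device, str):
        return device
    raise TypeError(f"unsupported device type: {type(device).__name__}")


print(normalize_device(0))
print(normalize_device("cpu"))
```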

[Templates] Adapt Bert (#9284) · 6c091abe

Patrick von Platen authored Dec 24, 2020

* adapt templates

* adapt config

* add test as well

* fix output type

* fix cache false naming

* finish tests

* last fix

6c091abe

23 Dec, 2020 5 commits

Add caching mechanism to BERT, RoBERTa (#9183) · 88ef8893

Suraj Patil authored Dec 23, 2020

* add past_key_values

* add use_cache option

* make mask before cutting ids

* adjust position_ids according to past_key_values

* flatten past_key_values

* fix positional embeds

* fix _reorder_cache

* set use_cache to false when not decoder, fix attention mask init

* add test for caching

* add past_key_values for Roberta

* fix position embeds

* add caching test for roberta

* add doc

* make style

* doc, fix attention mask, test

* small fixes

* address Patrick's comments

* input_ids shouldn't start with pad token

* use_cache only when decoder

* make consistent with bert

* make copies consistent

* add use_cache to encoder

* add past_key_values to tapas attention

* apply suggestions from code review

* make copies consistent

* add attn mask in tests

* remove copied from longformer

* apply suggestions from code review

* fix bart test

* nit

* simplify model outputs

* fix doc

* fix output ordering

88ef8893
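
One of the steps listed in the commit above, adjusting `position_ids` according to `past_key_values`, comes down to offsetting positions by the number of tokens already cached. A toy sketch in plain Python; `position_ids_with_cache` is a hypothetical helper and not the model code.

```python
from typing import List


def position_ids_with_cache(new_tokens: int, past_length: int = 0) -> List[int]:
    """Positions for the tokens fed in this step, offset by the cached length."""
    return list(range(past_length, past_length + new_tokens))


# Without a cache, all five tokens are processed in one pass.
full = position_ids_with_cache(5)

# With a cache, a 4-token prompt is processed first, then one new token
# is fed on its own while the previous 4 positions are already cached.
prompt = position_ids_with_cache(4)
step = position_ids_with_cache(1, past_length=4)

assert prompt + step == full
```

Without the offset, the single new token would be assigned position 0 and would receive the wrong positional embedding.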

Adapt to new name of `label_smoothing_factor` training arg (#9282) · a1cb6e98
Sylvain Gugger authored Dec 23, 2020

a1cb6e98

Minor documentation revisions from copyediting (#9266) · bcc87c63

Connor Brinton authored Dec 23, 2020

* typo: Revise "checkout" to "check out"

* typo: Change "seemlessly" to "seamlessly"

* typo: Close parentheses in "Using the tokenizer"

* typo: Add closing parenthesis to supported models aside

* docs: Treat ``position_ids`` as plural

Alternatively, the word "argument" could be added to make the subject singular.

* docs: Remove comma, making subordinate clause

* docs: Remove comma separating verb and direct object

* docs: Fix typo ("next" -> "text")

* docs: Reverse phrase order to simplify sentence

* docs: "quicktour" -> "quick tour"

* docs: "to throw" -> "from throwing"

* docs: Remove disruptive newline in padding/truncation section

* docs: "show exemplary" -> "show examples of"

* docs: "much harder as" -> "much harder than"

* docs: Fix typo "seach" -> "search"

* docs: Fix subject-verb disagreement in WordPiece description

* docs: Fix style in preprocessing.rst

bcc87c63

[Seq2Seq Templates] Fix check_repo.py templates file (#9277) · d5db6c37
Patrick von Platen authored Dec 23, 2020

* add enc dec pt model to check repo

* fix indent

d5db6c37

Fix param error (#9273) · 4bafc43b

Xu Song authored Dec 23, 2020

TypeError: forward() got an unexpected keyword argument 'token_type_ids'

4bafc43b
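
The error quoted in the commit above appears whenever a batch carries a key that `forward()` does not declare. The sketch below reproduces it with a stand-in function and shows one defensive option, filtering keyword arguments by signature. Both `forward` and `call_with_supported_kwargs` are hypothetical; this is not necessarily how the commit fixed the problem.

```python
import inspect
from typing import Any, Callable


def call_with_supported_kwargs(fn: Callable[..., Any], **kwargs: Any) -> Any:
    """Call fn, dropping keyword arguments that its signature does not declare."""
    params = inspect.signature(fn).parameters
    # If fn takes **kwargs itself, it can accept everything as is.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return fn(**kwargs)
    by_keyword = (
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.KEYWORD_ONLY,
    )
    supported = {
        name: value
        for name, value in kwargs.items()
        if name in params and params[name].kind in by_keyword
    }
    return fn(**supported)


def forward(input_ids, attention_mask=None):
    """Stand-in for a model forward() that has no token_type_ids parameter."""
    return len(input_ids)


batch = {"input_ids": [1, 2, 3], "attention_mask": [1, 1, 1], "token_type_ids": [0, 0, 0]}

try:
    forward(**batch)
except TypeError as err:
    print(err)  # the message names the unexpected keyword argument

print(call_with_supported_kwargs(forward, **batch))
```

Silently dropping arguments can hide real mistakes, so removing the unsupported key at the call site is often the clearer fix.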