- 05 Jan, 2021 5 commits
-
-
Stas Bekman authored
* `--model_parallel` hasn't been implemented for most models
* make the help clear as well
* implement `is_parallelizable`; use it
* oops
* remove property
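A hedged sketch of how the `is_parallelizable` flag from this commit can guard a model-parallel code path; the checkpoint name and the `parallelize()` call are illustrative assumptions, not part of the commit itself.

```
# Minimal sketch (assumptions: a GPT-2 checkpoint and the parallelize() API
# available for GPT-2/T5 at the time). Only attempt model parallelism when
# the architecture declares support via is_parallelizable.
from transformers import GPT2LMHeadModel

model = GPT2LMHeadModel.from_pretrained("gpt2")

if getattr(model, "is_parallelizable", False):
    model.parallelize()  # spreads the layers across the visible GPUs
else:
    print(f"{model.__class__.__name__} does not support naive model parallelism")
```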
-
Julien Plu authored
-
Stas Bekman authored
This PR proposes to auto-flush `transformers` logging.

When logging is used to trace signals from different parts of the code and its output gets mixed with `print` debugging, auto-flushing keeps all of the logging events synchronized. I don't think this change will introduce any performance impact.

If it helps someone, here is the code I used to sync `transformers` logging with various other debug prints. I was porting Bart to MP and needed to trace that the device switching happens correctly, so I added a bunch of `logger.info` calls inside `modeling_bart.py` and also had some other helpers that `print` debug messages and aren't logger based:

```
# auto flush std streams
from sys import stdout, stderr

def stdout_write_flush(args, w=stdout.write): w(args); stdout.flush()
def stderr_write_flush(args, w=stderr.write): w(args); stderr.flush()

stdout.write = stdout_write_flush
stderr.write = stderr_write_flush

from transformers import BartTokenizer, BartForConditionalGeneration, BartConfig

import logging
import transformers.utils.logging
import transformers.models.bart.modeling_bart

# I wanted a shorter, simpler format
handlers = transformers.utils.logging._get_library_root_logger().handlers
for handler in handlers:
    formatter = logging.Formatter("[%(funcName)s] %(message)s")
    handler.setFormatter(formatter)

transformers.models.bart.modeling_bart.logger.setLevel(transformers.logging.INFO)
```

@LysandreJik, @sgugger, @patrickvonplaten
-
Julien Plu authored
* Fix longformer
* Apply style
* Remove serving content
* Forgot a condition
* Apply style
* Address Patrick's comments
* Fix dtype
-
Boris Dayma authored
* feat(wandb): log artifacts
* fix: typo
* feat(wandb): ensure name is allowed
* feat(wandb): log artifact
* feat(wandb): saving logic
* style: improve formatting
* fix: unrelated typo
* feat: use a fake trainer
* fix: simplify
* feat(wandb): log model files as artifact
* style: fix style
* docs(wandb): correct description
* feat: unpack model + allow env Truthy values
* feat: TrainerCallback can access tokenizer
* style: fix style
* feat(wandb): log more interesting metadata
* feat: unpack tokenizer
* feat(wandb): metadata with load_best_model_at_end
* feat(wandb): more robust metadata
* style(wandb): fix formatting
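A hedged usage sketch for the artifact logging added above. It assumes the `WANDB_LOG_MODEL` environment variable mentioned in the commit stream ("allow env Truthy values") is what enables uploading the trained model as a W&B artifact; the output directory and the commented-out Trainer call are placeholders.

```
import os

from transformers import TrainingArguments

# Assumption: a truthy WANDB_LOG_MODEL asks the WandbCallback to upload the
# trained model files as a W&B artifact once training finishes.
os.environ["WANDB_LOG_MODEL"] = "true"

args = TrainingArguments(output_dir="out", report_to=["wandb"])

# trainer = Trainer(model=model, args=args, train_dataset=train_ds)
# trainer.train()  # after training, the model directory is logged as an artifact
```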
-
- 04 Jan, 2021 13 commits
-
-
Stas Bekman authored
-
Qbiwan authored
* bertweet docs coverage
* style doc max len 119
* maxlen style rst
* run main() from style_doc
* changed according to comments
-
Stas Bekman authored
-
Patrick von Platen authored
-
Julien Plu authored
* Fix DPR * Keep usual models * Apply style * Address Sylvain's comments
-
Julien Plu authored
-
Stas Bekman authored
This PR fixes the trainer so that the logger agrees with the actual default `output_dir`, by setting it in one place and passing it as an argument to both places. @sgugger
-
Julien Plu authored
-
dependabot[bot] authored
Bumps [notebook](https://github.com/jupyter/jupyterhub) from 6.1.4 to 6.1.5.
- [Release notes](https://github.com/jupyter/jupyterhub/releases)
- [Changelog](https://github.com/jupyterhub/jupyterhub/blob/master/CHECKLIST-Release.md)
- [Commits](https://github.com/jupyter/jupyterhub/commits)

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
-
Sylvain Gugger authored
-
Julien Plu authored
-
Charles authored
* add get_cached_models function
* add List type to import
* fix code quality
* Update src/transformers/file_utils.py
* Update src/transformers/file_utils.py
* Update src/transformers/file_utils.py
* Update src/transformers/file_utils.py
* Update src/transformers/file_utils.py
* Fix style

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
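A hedged sketch of calling the new helper. The exact return format is an assumption based on the commit description (a list of cached model entries), so the loop simply prints whatever each entry contains.

```
# Illustrative only: list which model weights are sitting in the local cache.
# Assumption: get_cached_models() returns one entry per cached checkpoint
# (e.g. url/etag/size); inspect the entries rather than relying on a shape.
from transformers.file_utils import get_cached_models

for entry in get_cached_models():
    print(entry)
```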
-
Sam Shleifer authored
-
- 03 Jan, 2021 1 commit
-
-
Yoshitomo Matsubara authored
* fix a bug in eval_batch_retrieval
* should return parser as well, like the other staticmethods
* remove duplicate argument
* these kwargs are no longer accepted (they cause a TypeError in self.generator.generate of modeling_rag.py)
* fixed file paths in README
* moved an arg to add_ray_specific_args
-
- 02 Jan, 2021 3 commits
-
-
Chris Kennedy authored
-
Patrick von Platen authored
* push * make style
-
Derrick Blakely authored
-
- 30 Dec, 2020 1 commit
-
-
Stas Bekman authored
-
- 29 Dec, 2020 3 commits
-
-
Stas Bekman authored
```
python -c "from apex.normalization import FusedProphetNetLayerNorm"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
ImportError: cannot import name 'FusedProphetNetLayerNorm' from 'apex.normalization' (/home/stas/anaconda3/envs/main-38/lib/python3.8/site-packages/apex/normalization/__init__.py)
```

It looks like this code has never been tested, so it silently fails inside try/except. Discovered this by accident in https://github.com/huggingface/transformers/issues/9338#issuecomment-752217708
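For context, a hedged reconstruction of the kind of silent fallback that hides this error; the names below are illustrative, not the actual `modeling_prophetnet.py` code.

```
import torch.nn as nn

# Illustrative pattern only: a bare try/except around an optional apex import
# falls back silently, so an import of a name that never existed in
# apex.normalization goes unnoticed and the fused path is never exercised.
try:
    from apex.normalization import FusedProphetNetLayerNorm  # raises ImportError

    LayerNorm = FusedProphetNetLayerNorm
except ImportError:
    LayerNorm = nn.LayerNorm  # always taken in practice
```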
-
Patrick von Platen authored
-
Patrick von Platen authored
-
- 28 Dec, 2020 3 commits
-
-
Julien Plu authored
-
Julien Plu authored
* Fix T5 * Fix test * Fix test
-
Patrick von Platen authored
[Seq2Seq Templates] Correct some TF-serving errors and add gradient checkpointing to PT by default. (#9334)

* correct tests
* correct shape and get_tf_activation
* more TF corrections
* add gradient checkpointing to templates
* correct typo
-
- 27 Dec, 2020 1 commit
-
-
Patrick von Platen authored
-
- 25 Dec, 2020 2 commits
-
-
Patrick von Platen authored
* correct gpt2
* fix gpt2
* fix use_cache ordering
* correct past tolerance
* fix for all cases
* style
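Since several of these fixes revolve around `use_cache` and the returned past state, here is a hedged PyTorch usage sketch of the caching flow; the checkpoint and prompt are arbitrary and this is not the commit's own test code.

```
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

# Hedged usage sketch: with use_cache=True the model returns past key/values,
# so the next forward pass only needs the newest token instead of the prefix.
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

inputs = tokenizer("Hello, my dog", return_tensors="pt")
with torch.no_grad():
    out = model(**inputs, use_cache=True)
    next_token = out.logits[:, -1, :].argmax(dim=-1, keepdim=True)
    # reuse the cached past instead of re-encoding the whole prefix
    out2 = model(input_ids=next_token, past_key_values=out.past_key_values, use_cache=True)
```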
-
Vasudev Gupta authored
* Created using Colaboratory * add mbart training examples * add link * Update description Co-authored-by: Suraj Patil <surajp815@gmail.com>
-
- 24 Dec, 2020 8 commits
-
-
Patrick von Platen authored
* fix bart doc * fix docs
-
Bram Vanroy authored
Missing "s" typo
-
Daniele Sartiano authored
* Update modeling_encoder_decoder.py Fixed typo. * typo Co-authored-by: Suraj Patil <surajp815@gmail.com>
-
Ratthachat (Jung) authored
* Create modeling_tf_dpr.py
* Add TFDPR
* Add back TFPegasus, TFMarian, TFMBart, TFBlenderBot (the last commit accidentally deleted these 4 lines, so I recovered them)
* Add TFDPR
* Add TFDPR
* clean up some comments, add TF input-style doc string
* Add TFDPR
* Make return_dict=False the default
* Fix return_dict bug (in .from_pretrained)
* Add get_input_embeddings()
* Create test_modeling_tf_dpr.py (the current version already passes all 27 tests; see the test run at https://colab.research.google.com/drive/1czS_m9zy5k-iSJbzA_DP1k1xAAC_sdkf?usp=sharing)
* fix quality
* delete init weights
* run fix copies
* fix repo consistency
* del config_class, load_tf_weights (they should be 'pytorch only')
* add config_class back after removing it (a test failed), so on Lysandre's suggestion only "use_tf_weights = None" was removed in the end
* newline after .. note::
* import tf, np (necessary for ModelIntegrationTest)
* slow test of from_pretrained with from_pt=True (at the moment we don't have TF weights, since there is no official TF model; previously I did not run the slow tests, so I missed this bug)
* Add a simple TFDPRModelIntegrationTest (note that this only tests that TF and PyTorch give approximately the same output; I could not test against the official DPR repo's output yet)
* upload correct tf model
* remove position_ids from missing keys
* fix RagSeq generate with context_input_ids
* apply style
* delete unused lines
* Add test_rag_sequence_generate_batch_from_context_input_ids
* Readability improved
* styling
* Stylize
* typos
* add check_model_generate_from_context_input_ids
* make style
* Apply suggestions from code review
* make style2

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: patrickvonplaten <patrick@huggingface.co>
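A hedged loading sketch for the TF DPR classes introduced here; the checkpoint name and the `from_pt=True` fallback mirror what the commit message describes, but are assumptions about the exact published weights.

```
from transformers import DPRQuestionEncoderTokenizer, TFDPRQuestionEncoder

# Sketch only: encode a question with the new TF question encoder.
# Assumptions: this checkpoint name, and from_pt=True to convert PyTorch
# weights if no native TF weights are available (as the commit message notes).
tokenizer = DPRQuestionEncoderTokenizer.from_pretrained(
    "facebook/dpr-question_encoder-single-nq-base"
)
model = TFDPRQuestionEncoder.from_pretrained(
    "facebook/dpr-question_encoder-single-nq-base", from_pt=True
)

inputs = tokenizer("What is extractive question answering?", return_tensors="tf")
embeddings = model(**inputs).pooler_output  # shape: (1, hidden_size)
```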
-
Suraj Patil authored
-
Jungwhan authored
-
Jethro Kuan authored
Fixes #9244
Co-authored-by: Jethro Kuan <jethro.kuan@bytedance.com>
-
Patrick von Platen authored
* adapt templates
* adapt config
* add test as well
* fix output type
* fix cache false naming
* finish tests
* last fix
-