1. 06 Jan, 2021 2 commits
  2. 05 Jan, 2021 13 commits
    • [PyTorch Bart] Split Bart into different models (#9343) · eef66035
      Patrick von Platen authored
      * first try
      
      * remove old template
      
      * finish bart
      
      * finish mbart
      
      * delete unnecessary line
      
      * init pegasus
      
      * save intermediate
      
      * correct pegasus
      
      * finish pegasus
      
      * remove cookie cutter leftover
      
      * add marian
      
      * finish blenderbot
      
      * replace in file
      
      * correctly split blenderbot
      
      * delete "old" folder
      
      * correct "add statement"
      
      * adapt config for tf comp
      
      * correct configs for tf
      
      * remove ipdb
      
      * fix more stuff
      
      * fix mbart
      
      * push pegasus fix
      
      * fix mbart
      
      * more fixes
      
      * fix research projects code
      
      * finish docs for bart, mbart, and marian
      
      * delete unnecessary file
      
      * correct attn typo
      
      * correct configs
      
      * remove pegasus for seq class
      
      * correct peg docs
      
      * correct peg docs
      
      * finish configs
      
      * further improve docs
      
      * add copied from statements to mbart
      
      * fix copied from in mbart
      
      * add copy statements to marian
      
      * add copied from to marian
      
      * add pegasus copied from
      
      * finish pegasus
      
      * finish copied from
      
      * Apply suggestions from code review
      
      * make style
      
      * backward comp blenderbot
      
      * apply lysandres and sylvains suggestions
      
      * apply suggestions
      
      * push last fixes
      
      * fix docs
      
      * fix tok tests
      
      * fix imports code style
      
      * fix doc
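      After this split, each of these architectures ships as a standalone model with its own configuration class rather than subclassing Bart. A minimal sketch of what that looks like from user code (the checkpoint names are the usual public ones, assumed here for illustration):
      
      ```
      from transformers import (
          BartForConditionalGeneration,
          MBartForConditionalGeneration,
          PegasusForConditionalGeneration,
          MarianMTModel,
          BlenderbotForConditionalGeneration,
      )
      
      # each model now lives in its own module with its own config
      bart = BartForConditionalGeneration.from_pretrained("facebook/bart-large-cnn")
      pegasus = PegasusForConditionalGeneration.from_pretrained("google/pegasus-xsum")
      ```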
    • Clement · 4eec5d0c
    • add experimental warning (#9412) · d9e848c1
      Stas Bekman authored
    • [trainer] group fp16 args together (#9409) · 29acabd8
      Stas Bekman authored
      * [t5 doc] typos
      
      a few runaway backticks
      
      @sgugger
      
      * style
      
      * [trainer] put fp16 args together
      
      this PR proposes a purely cosmetic change that puts all the fp16 args together, so they are easier to manage/read
      
      @sgugger
      
      * style
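      For reference, a minimal sketch of the fp16 arguments being grouped here (assuming the long-standing `fp16` and `fp16_opt_level` options; other fp16 flags may exist depending on version):
      
      ```
      from transformers import TrainingArguments
      
      # the fp16-related options now sit next to each other in TrainingArguments
      args = TrainingArguments(
          output_dir="out",
          fp16=True,            # enable mixed precision training
          fp16_opt_level="O1",  # Apex AMP optimization level
      )
      ```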
    • Yusuke Mori
    • LED (#9278) · 189387e9
      Patrick von Platen authored
      * create model
      
      * add integration
      
      * save current state
      
      * make integration tests pass
      
      * add one more test
      
      * add explanation to tests
      
      * remove from bart
      
      * add padding
      
      * remove unnecessary test
      
      * make all tests pass
      
      * re-add cookie cutter tests
      
      * finish PyTorch
      
      * fix attention test
      
      * Update tests/test_modeling_common.py
      
      * revert change
      
      * remove unused file
      
      * add string to doc
      
      * save intermediate
      
      * make tf integration tests pass
      
      * finish tf
      
      * fix doc
      
      * fix docs again
      
      * add led to doctree
      
      * add to auto tokenizer
      
      * added tips for led
      
      * make style
      
      * apply jplus statements
      
      * correct tf longformer
      
      * apply lysandres suggestions
      
      * apply sylvains suggestions
      
      * Apply suggestions from code review
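      The model added here can be driven like any other seq2seq model, with the extra global-attention input that Longformer-style attention needs. A hedged sketch (the checkpoint name and the first-token global-attention convention come from the LED docs):
      
      ```
      from transformers import LEDTokenizer, LEDForConditionalGeneration
      
      tokenizer = LEDTokenizer.from_pretrained("allenai/led-base-16384")
      model = LEDForConditionalGeneration.from_pretrained("allenai/led-base-16384")
      
      inputs = tokenizer("a very long document ...", return_tensors="pt")
      # give the first token global attention, as suggested for summarization
      global_attention_mask = inputs.input_ids.new_zeros(inputs.input_ids.shape)
      global_attention_mask[:, 0] = 1
      summary_ids = model.generate(inputs.input_ids, global_attention_mask=global_attention_mask)
      print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
      ```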
    • Fix documentation links always pointing to master. (#9217) · 314cca28
      Sugeeth authored
      
      
      * Use extlinks to point hyperlink with the version of code
      
      * Point to version on release and master until then
      
      * Apply style
      
      * Correct links
      
      * Add missing backtick
      
      * Simple missing backtick after all.
      Co-authored-by: Raghavendra Sugeeth P S <raghav-5305@raghav-5305.csez.zohocorpin.com>
      Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
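      The mechanism behind this fix is Sphinx's `extlinks` extension, which maps a role name to a URL template. A minimal sketch of the kind of `conf.py` entry involved (the role name and URL here are illustrative assumptions, not the repo's exact values):
      
      ```
      # conf.py (illustrative)
      extensions = ["sphinx.ext.extlinks"]
      extlinks = {
          # %s is replaced by the path given in the role;
          # point at a release tag on release, master until then
          "prefix_link": ("https://github.com/huggingface/transformers/blob/master/%s", None),
      }
      ```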
    • Fix TF Funnel (#9300) · 52d62e68
      Julien Plu authored
      * Fix Funnel
      
      * Apply Patrick's comment
      
      * Remove comment
      
      * Fix dummy value
      
      * Apply style
    • [trainer] --model_parallel hasn't been implemented for most models (#9347) · 748006c0
      Stas Bekman authored
      * --model_parallel hasn't been implemented for most models
      
      * make the help clear as well
      
      * implement is_parallelizable; use it (see the usage sketch after this entry)
      
      * oops
      
      * remove property
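      A minimal usage sketch for the `is_parallelizable` flag introduced here (assumes a GPT-2 checkpoint, one of the few architectures with a `parallelize()` implementation at this point):
      
      ```
      from transformers import GPT2LMHeadModel
      
      model = GPT2LMHeadModel.from_pretrained("gpt2")
      
      # guard naive model parallelism behind the new class attribute
      if model.is_parallelizable:
          model.parallelize()  # spread layers across the available GPUs
      ```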
    • Use stable functions (#9369) · 4225740a
      Julien Plu authored
    • [logging] autoflush (#9385) · 4aa8f6ad
      Stas Bekman authored
      This PR proposes to:
      
      * auto-flush `transformers` logging 
      
      When using logging to trace signals from different parts of the code, which may be mixed with `print` debug output, auto-flushing helps keep all the logging events synchronized.
      
      I don't think this change will introduce any noticeable performance impact.
      
      If it helps someone, here is the code I used to sync `transformers` logging with various other debug prints.
      
      I was porting Bart to MP and needed to verify that device switching happens correctly, so I added a bunch of `logger.info` calls inside `modeling_bart.py` and also had some other helper `print` debug messages that weren't logger-based:
      
      ```
      # auto-flush std streams so logging and print output interleave in order
      from sys import stdout, stderr
      def stdout_write_flush(args, w=stdout.write): w(args); stdout.flush()
      def stderr_write_flush(args, w=stderr.write): w(args); stderr.flush()
      stdout.write = stdout_write_flush
      stderr.write = stderr_write_flush
      
      from transformers import BartTokenizer, BartForConditionalGeneration, BartConfig
      
      import logging
      import transformers.utils.logging
      import transformers.models.bart.modeling_bart
      
      # I wanted a shorter, simpler format
      handlers = transformers.utils.logging._get_library_root_logger().handlers
      for handler in handlers:
          formatter = logging.Formatter("[%(funcName)s] %(message)s")
          handler.setFormatter(formatter)
      
      transformers.models.bart.modeling_bart.logger.setLevel(transformers.logging.INFO)
      ```
      
      @LysandreJik, @sgugger, @patrickvonplaten
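      If all you need is to make `logger.info` calls visible library-wide, the public verbosity helpers are enough; the snippet above is only needed for the custom formatting. A minimal sketch:
      
      ```
      import transformers
      
      # raise the library-wide log level so logger.info output shows up
      transformers.logging.set_verbosity_info()
      logger = transformers.logging.get_logger("transformers.models.bart.modeling_bart")
      logger.info("now visible, and auto-flushed with this change")
      ```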
    • Fix TF Longformer (#9348) · 83eec97e
      Julien Plu authored
      * Fix longformer
      
      * Apply style
      
      * Remove serving content
      
      * Forgot a condition
      
      * Apply style
      
      * Address Patrick's comments
      
      * Fix dtype
    • feat(wandb): save model as artifact (#8119) · 30fa0b78
      Boris Dayma authored
      * feat(wandb): log artifacts
      
      * fix: typo
      
      * feat(wandb): ensure name is allowed
      
      * feat(wandb): log artifact
      
      * feat(wandb): saving logic
      
      * style: improve formatting
      
      * fix: unrelated typo
      
      * feat: use a fake trainer
      
      * fix: simplify
      
      * feat(wandb): log model files as artifact
      
      * style: fix style
      
      * docs(wandb): correct description
      
      * feat: unpack model + allow env truthy values
      
      * feat: TrainerCallback can access tokenizer
      
      * style: fix style
      
      * feat(wandb): log more interesting metadata
      
      * feat: unpack tokenizer
      
      * feat(wandb): metadata with load_best_model_at_end
      
      * feat(wandb): more robust metadata
      
      * style(wandb): fix formatting
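      Per the commits above, the artifact upload is opt-in via an environment variable read with truthy semantics. A hedged sketch (the `WANDB_LOG_MODEL` name is taken from this PR's integration; check the docs for your version):
      
      ```
      import os
      
      # opt in before training starts; truthy values are accepted
      os.environ["WANDB_LOG_MODEL"] = "true"
      
      # with wandb installed, Trainer attaches WandbCallback automatically and
      # uploads the saved model files as a W&B artifact during training
      ```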
  3. 04 Jan, 2021 13 commits
  4. 03 Jan, 2021 1 commit
  5. 02 Jan, 2021 3 commits
  6. 30 Dec, 2020 1 commit
  7. 29 Dec, 2020 3 commits
  8. 28 Dec, 2020 3 commits
  9. 27 Dec, 2020 1 commit