Commits · 514486739cc732ad05549d81bd48c0aa9e03a0f3 · chenpangpang / transformers

10 Sep, 2020 2 commits
- Fix CI with change of name of nlp (#7054) · 51448673
  Sylvain Gugger authored Sep 10, 2020
```
* nlp -> datasets

* More nlp -> datasets

* Woopsie

* More nlp -> datasets

* One last
```
  51448673
- [s2s] --eval_max_generate_length (#7018) · e9a2f772
  Sam Shleifer authored Sep 10, 2020
  
  e9a2f772
08 Sep, 2020 1 commit
- Fix typo (#6994) · 1b76936d
  Manuel Romero authored Sep 08, 2020
  
  1b76936d
07 Sep, 2020 3 commits

- Remove misleading docstring · 1650130b
  Lysandre authored Sep 07, 2020
  
  1650130b

- feat: allow prefix for any generative model (#5885) · 995a958d
  Boris Dayma authored Sep 07, 2020

  * feat: allow padding_text for any generative model

  * docs(pipelines.py): correct typo

  * Update src/transformers/pipelines.py
  Co-authored-by: Sam Shleifer <sshleifer@gmail.com>

  * feat: rename padding_text to prefix

  * fix: cannot tokenize empty text

  * fix: pass prefix arg to pipeline

  * test: add prefix to text-generation pipeline

  * style: fix style

  * style: clean code and variable name more explicit

  * set arg docstring to optional
  Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
  Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
  Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

  995a958d

- [s2s] warn if --fp16 for torch 1.6 (#6977) · ce37be9d
  Sam Shleifer authored Sep 06, 2020
  
  ce37be9d

04 Sep, 2020 3 commits
- [doc] remove the implied defaults to :obj:`None`, s/True/ :obj:`True/, etc. (#6956) · 48ff6d51
  Stas Bekman authored Sep 04, 2020
```
* remove the implied defaults to :obj:`None`

* fix bug in the original

* replace to :obj:`True`, :obj:`False`
```
  48ff6d51
- [s2s] run_eval.py parses generate_kwargs (#6948) · a4fc0c80
  Sam Shleifer authored Sep 04, 2020
  
  a4fc0c80
- [s2s] distill: --normalize_hidden --supervise_forward (#6834) · 6078b120
  Sam Shleifer authored Sep 04, 2020
  
  6078b120
03 Sep, 2020 5 commits
- [s2s] support early stopping based on loss, rather than rouge (#6927) · e95d262f
  Sam Shleifer authored Sep 03, 2020
  
  e95d262f
- [s2s] use --eval_beams command line arg (#6926) · 207ed8cb
  Sam Shleifer authored Sep 03, 2020
  
  207ed8cb
- [s2s] allow task_specific_params=summarization_xsum (#6923) · 39ed68d5
  Sam Shleifer authored Sep 03, 2020
  
  39ed68d5
- [s2s]: script to convert pl checkpoints to hf checkpoints (#6911) · 5a318f07
  Sam Shleifer authored Sep 03, 2020
```
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
```
  5a318f07
- tweak tar command in readme (#6919) · b8e4906c
  brett koonce authored Sep 03, 2020
  
  b8e4906c
01 Sep, 2020 1 commit

- Add cache_dir to save features TextDataset (#6879) · 21d71923
  Jin Young (Daniel) Sohn authored Sep 01, 2020

  * Add cache_dir to save features TextDataset

  This is for the case where the dataset is on a read-only filesystem, as happens
  in tests (GKE TPU tests).

  * style

  21d71923

31 Aug, 2020 3 commits
- [fix] typo in available in helper function (#6859) · 431ab19d
  Sam Shleifer authored Aug 31, 2020
  
  431ab19d
- [s2s] command line args for faster val steps (#6833) · b9772897
  Sam Shleifer authored Aug 31, 2020
  
  b9772897
- Marian distill scripts + integration test (#6799) · 61b7ba93
  Sam Shleifer authored Aug 31, 2020
  
  61b7ba93
30 Aug, 2020 2 commits
- [s2s README] Add more dataset download instructions (#6737) · dfa10a41
  Sam Shleifer authored Aug 30, 2020
  
  dfa10a41
- clearly indicate shuffle=False (#6312) · 32fe4408
  xujiaze13 authored Aug 30, 2020
```
* Clarify shuffle

* clarify shuffle
Co-authored-by: Kevin Canwen Xu <canwenxu@126.com>
```
  32fe4408
29 Aug, 2020 2 commits
- Pegasus finetune script: add --adafactor (#6811) · 0f58903b
  Sam Shleifer authored Aug 29, 2020
  
  0f58903b
- [s2s] round runtime in run_eval (#6798) · ac47458a
  Sam Shleifer authored Aug 29, 2020
  
  ac47458a
28 Aug, 2020 3 commits

- [s2s] Test hub configs in self-scheduled CI (#6809) · 5ab21b07
  Sam Shleifer authored Aug 28, 2020
  
  5ab21b07

- prepare_seq2seq_batch makes labels/ decoder_input_ids made later. (#6654) · 9336086a
  Sam Shleifer authored Aug 28, 2020

  * broken test

  * batch parity

  * tests pass

  * boom boom

  * boom boom

  * split out bart tokenizer tests

  * fix tests

  * boom boom

  * Fixed dataset bug

  * Fix marian

  * Undo extra

  * Get marian working

  * Fix t5 tok tests

  * Test passing

  * Cleanup

  * better assert msg

  * require torch

  * Fix mbart tests

  * undo extra decoder_attn_mask change

  * Fix import

  * pegasus tokenizer can ignore src_lang kwargs

  * unused kwarg test cov

  * boom boom

  * add todo for pegasus issue

  * cover one word translation edge case

  * Cleanup

  * doc

  9336086a

- PL: --adafactor option (#6776) · fb78a90d
  Sam Shleifer authored Aug 27, 2020
  
  fb78a90d

27 Aug, 2020 3 commits
- Fix it to work with BART (#6756) · c225e872
  Tom Grek authored Aug 27, 2020
  
  c225e872
- Fix the TF Trainer gradient accumulation and the TF NER example (#6713) · 6f289dc9
  Julien Plu authored Aug 27, 2020
```
* Align TF NER example over the PT one

* Fix Dataset call

* Fix gradient accumulation training

* Apply style

* Address Sylvain's comments

* Address Sylvain's comments

* Apply style
```
  6f289dc9
- s2s distillation uses AutoModelForSeqToSeqLM (#6761) · 4bd7be9a
  Sam Shleifer authored Aug 26, 2020
  
  4bd7be9a
26 Aug, 2020 2 commits
- [s2s] run_eval.py QOL improvements and cleanup (#6746) · 61518e2d
  Sam Shleifer authored Aug 26, 2020
  
  61518e2d
- Black 20 release · a75c64d8
  Lysandre authored Aug 26, 2020
  
  a75c64d8
25 Aug, 2020 2 commits

- Allow tests in examples to use cuda or fp16, if they are available (#5512) · 4db2fa77
  Joel Hanson authored Aug 25, 2020

  * Allow tests in examples to use cuda or fp16, if they are available

  The tests in examples didn't use cuda or fp16 even if they were available.
  - The text classification example (`run_glue.py`) didn't use fp16 even if it was available, but
    the device was chosen based on availability (cuda/cpu).
  - The language-modeling example (`run_language_modeling.py`) had a `--no_cuda` argument,
    which made the test run without cuda. This example has an issue when running with fp16,
    so fp16 is not enabled (it raised an assertion error for perplexity because of its higher value).
  - cuda and fp16 are not enabled for the question-answering example (`run_squad.py`) because they
    cause a difference in the f1 score.
  - The text-generation example (`run_generation.py`) will use cuda or fp16 whenever they are available.

  Resolves some of: #5057

  * Unwanted import of is_apex_available was removed

  * Made changes to the test examples file to pass --fp16 only if cuda and apex are available
  - run_glue.py: Removed the check for cuda and fp16.
  - run_generation.py: Removed the check for cuda and fp16; also removed unwanted flag creation.

  * Incorrectly sorted imports fixed

  * The model needs to be converted to half precision

  * Formatted single line if condition statement to multiline

  * The torch_device also needed to be checked before running the test on examples
  - The tests in examples which use cuda should also depend on the USE_CUDA flag,
    similarly to the rest of the test suite. Even if we decide to set USE_CUDA to
    True by default, setting USE_CUDA to False should result in the examples not using CUDA

  * Format some of the code in test_examples file

  * The improper import of is_apex_available was sorted

  * Formatted the code to keep the style standards

  * The comma at the end of list giving a flake8 issue was fixed

  * Import sort was fixed

  * Removed the clean_test_dir function as it's not used right now

  4db2fa77

- [s2s] round bleu, rouge to 4 digits (#6704) · 0344428f
  Sam Shleifer authored Aug 25, 2020
  
  0344428f

24 Aug, 2020 2 commits
- Fix PL token classification examples (#6682) · dd522da0
  vblagoje authored Aug 24, 2020
  
  dd522da0
- Update repo to isort v5 (#6686) · a5737779
  Sylvain Gugger authored Aug 24, 2020
```
* Run new isort

* More changes

* Update CI, CONTRIBUTING and benchmarks
```
  a5737779
18 Aug, 2020 1 commit
- update xnli-mt url (#6580) · 6f972e14
  Suraj Patil authored Aug 18, 2020
  
  6f972e14
17 Aug, 2020 4 commits
- allow spaces in bash args with "$@" (#6521) · d2da2cb2
  Sam Shleifer authored Aug 17, 2020
  
  d2da2cb2
- [testing] a new TestCasePlus subclass + get_auto_remove_tmp_dir() (#6494) · 9dbe4094
  Stas Bekman authored Aug 17, 2020
```
* [testing] switch to a new TestCasePlus + get_auto_remove_tmp_dir() for auto-removal of tmp dirs

* respect after=True for tempfile, simplify code

* comments

* comment fix

* put `before` last in args, so can make debug even faster
```
  9dbe4094
- [lightning_base] fix s2s logging, only make train_loader once (#6404) · 84c265ff
  Sam Shleifer authored Aug 16, 2020
  
  84c265ff
- [s2s] docs, document desired filenames nicely (#6525) · 72add6c9
  Sam Shleifer authored Aug 16, 2020
  
  72add6c9
16 Aug, 2020 1 commit
- Fixes paths with spaces in seq2seq example (#6493) · 20601811
  Kyle Piira authored Aug 16, 2020
  
  20601811