Commits · 097049b81bdd91e981ddbae68a5243e00d0f7c6e · chenpangpang / transformers

30 Sep, 2020 4 commits
- [s2s] fix kwargs style (#7488) · 03e46c1d
  Sam Shleifer authored Sep 30, 2020
  
  03e46c1d
- [s2s] Fix t5 warning for distributed eval (#7487) · 6fe8a693
  Sam Shleifer authored Sep 30, 2020
  
  6fe8a693
- Seq2SeqDataset: avoid passing src_lang everywhere (#7470) · c031d010
  Amanpreet Singh authored Sep 30, 2020
```
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
```
  c031d010
- [s2strainer] fix eval dataset loading (#7477) · 08939cfd
  Suraj Patil authored Sep 30, 2020
  
  08939cfd
29 Sep, 2020 1 commit
- [s2s] consistent output format across eval scripts (#7435) · 74d8d69b
  Sam Shleifer authored Sep 28, 2020
  
  74d8d69b
28 Sep, 2020 1 commit
- [T5] allow config.decoder_layers to control decoder size (#7409) · 748425d4
  Sam Shleifer authored Sep 28, 2020
  
  * Working asymmetrical T5
  * rename decoder_layers -> num_decoder_layers
  * Fix docstring
  * Allow creation of asymmetric t5 students
  
  748425d4
27 Sep, 2020 2 commits
- [s2s] rougeLSum expects \n between sentences (#7410) · 7296fea1
  Sam Shleifer authored Sep 27, 2020
```
Co-authored-by: Swetha Mandava <smandava@nvidia.com>
```
  7296fea1
- [s2s] add create student script (#7290) · eab5f596
  Suraj Patil authored Sep 28, 2020
```
Co-authored-by: Suraj Patil <surajp815@gmail.com>
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
```
  eab5f596
25 Sep, 2020 1 commit
- doc changes (#7385) · 415071b4
  Suraj Patil authored Sep 25, 2020
  
  415071b4
24 Sep, 2020 3 commits
- Seq2SeqTrainer (#6769) · 9e68d075
  Suraj Patil authored Sep 25, 2020
```
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
```
  9e68d075
- [s2s] distributed eval allows num_return_sequences > 1 (#7254) · d9d0f114
  Sam Shleifer authored Sep 24, 2020
  
  d9d0f114
- [seq2seq] make it easier to run the scripts (#7274) · eadd870b
  Stas Bekman authored Sep 24, 2020
  
  eadd870b
22 Sep, 2020 3 commits
- [s2s] only save metrics.json from rank zero (#7331) · 78387cc6
  Sam Shleifer authored Sep 22, 2020
  
  78387cc6
- [s2s] add src_lang kwarg for distributed eval (#7300) · e53138a1
  Sam Shleifer authored Sep 22, 2020
  
  e53138a1
- [s2s] add supported architectures to MD (#7252) · 25b0463d
  Sam Shleifer authored Sep 22, 2020
  
  25b0463d
21 Sep, 2020 4 commits
- [s2s] save hostname with repo info (#7301) · 656c27c3
  Sam Shleifer authored Sep 21, 2020
```
* save hostname
```
  656c27c3
- [s2s] adjust finetune + test to work with fsmt (#7263) · af4b98ed
  Stas Bekman authored Sep 21, 2020
  
  af4b98ed
- [s2s] s/alpha_loss_encoder/alpha_encoder_loss/ (#7298) · 8d562a2d
  Stas Bekman authored Sep 21, 2020
```
fix to match `distillation.py:        self.alpha_encoder_loss`
```
  8d562a2d
- [s2s tests] fix test_run_eval_search (#7297) · cbb2f75a
  Stas Bekman authored Sep 21, 2020
  
  cbb2f75a
20 Sep, 2020 1 commit
- examples/seq2seq/__init__.py mutates sys.path (#7194) · 7cbf0f72
  Stas Bekman authored Sep 20, 2020
  
  7cbf0f72
18 Sep, 2020 1 commit
- [s2s] distributed_eval.py saves better speed info (#7242) · 83dba10b
  Sam Shleifer authored Sep 18, 2020
  
  83dba10b
17 Sep, 2020 5 commits
- [s2s] remove double assert (#7223) · 67d9fc50
  Sam Shleifer authored Sep 17, 2020
  
  67d9fc50
- [s2s] dynamic batch size with --max_tokens_per_batch (#7030) · a5638b2b
  Sam Shleifer authored Sep 17, 2020
  
  a5638b2b
- [s2s] run_eval/run_eval_search tweaks (#7192) · efeab6a3
  Stas Bekman authored Sep 17, 2020
```
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
```
  efeab6a3

- [ported model] FSMT (FairSeq MachineTranslation) (#6940) · 1eeb206b
  Stas Bekman authored Sep 17, 2020

* ready for PR

* cleanup

* correct FSMT_PRETRAINED_MODEL_ARCHIVE_LIST

* fix

* perfectionism

* revert change from another PR

* odd, already committed this one

* non-interactive upload workaround

* backup the failed experiment

* store langs in config

* workaround for localizing model path

* doc clean up as in https://github.com/huggingface/transformers/pull/6956

* style

* back out debug mode

* document: run_eval.py --num_beams 10

* remove unneeded constant

* typo

* re-use bart's Attention

* re-use EncoderLayer, DecoderLayer from bart

* refactor

* send to cuda and fp16

* cleanup

* revert (moved to another PR)

* better error message

* document run_eval --num_beams

* solve the problem of tokenizer finding the right files when model is local

* polish, remove hardcoded config

* add a note that the file is autogenerated to avoid losing changes

* prep for org change, remove unneeded code

* switch to model4.pt, update scores

* s/python/bash/

* missing init (but doesn't impact the finetuned model)

* cleanup

* major refactor (reuse-bart)

* new model, new expected weights

* cleanup

* cleanup

* full link

* fix model type

* merge porting notes

* style

* cleanup

* have to create a DecoderConfig object to handle vocab_size properly

* doc fix

* add note (not a public class)

* parametrize

* add bleu scores integration tests

* skip test if sacrebleu is not installed

* cache heavy models/tokenizers

* some tweaks

* remove tokens that aren't used

* more purging

* simplify code

* switch to using decoder_start_token_id

* add doc

* Revert "major refactor (reuse-bart)"

This reverts commit 226dad15ca6a9ef4e26178526e878e8fc5c85874.

* decouple from bart

* remove unused code #1

* remove unused code #2

* remove unused code #3

* update instructions

* clean up

* move bleu eval to examples

* check import only once

* move data+gen script into files

* reuse via import

* take less space

* add prepare_seq2seq_batch (auto-tested)

* cleanup

* recode test to use json instead of yaml

* ignore keys not needed

* use the new -y in transformers-cli upload -y

* [xlm tok] config dict: fix str into int to match definition (#7034)

* [s2s] --eval_max_generate_length (#7018)

* Fix CI with change of name of nlp (#7054)

* nlp -> datasets

* More nlp -> datasets

* Woopsie

* More nlp -> datasets

* One last

* extending to support allen_nlp wmt models

- allow a specific checkpoint file to be passed
- more arg settings
- scripts for allen_nlp models

* sync with changes

* s/fsmt-wmt/wmt/ in model names

* s/fsmt-wmt/wmt/ in model names (p2)

* s/fsmt-wmt/wmt/ in model names (p3)

* switch to a better checkpoint

* typo

* make non-optional args such - adjust tests where possible or skip when there is no other choice

* consistency

* style

* adjust header

* cards moved (model rename)

* use best custom hparams

* update info

* remove old cards

* cleanup

* s/stas/facebook/

* update scores

* s/allen_nlp/allenai/

* url maps aren't needed

* typo

* move all the doc / build /eval generators to their own scripts

* cleanup

* Apply suggestions from code review
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Apply suggestions from code review
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* fix indent

* duplicated line

* style

* use the correct add_start_docstrings

* oops

* resizing can't be done with the core approach, due to 2 dicts

* check that the arg is a list

* style

* style
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

1eeb206b

- [s2s] fix kwarg typo (#7196) · 45b0b1ff
  Sam Shleifer authored Sep 16, 2020
  
  45b0b1ff
16 Sep, 2020 3 commits
- [s2s] distributed eval cleanup (#7186) · 0203ad43
  Sam Shleifer authored Sep 16, 2020
  
  0203ad43
- Formatting · 3babef81
  sgugger authored Sep 16, 2020
  
  3babef81
- [s2s run_eval] new features (#7109) · fdaf8ab3
  Stas Bekman authored Sep 16, 2020
```
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
```
  fdaf8ab3
14 Sep, 2020 3 commits
- [s2s] distributed eval in one command (#7124) · 33d479d2
  Sam Shleifer authored Sep 14, 2020
  
  33d479d2
- [s2s distill] allow pegasus-12-12 (#7104) · 0fab3969
  Sam Shleifer authored Sep 14, 2020
  
  0fab3969
- [s2s] distributed eval cleanup (#7110) · de9e2979
  Sam Shleifer authored Sep 13, 2020
  
  de9e2979
13 Sep, 2020 1 commit
- [s2s] two stage run_distributed_eval.py (#7105) · e7f8d2ab
  Sam Shleifer authored Sep 13, 2020
  
  e7f8d2ab
12 Sep, 2020 1 commit
- [s2s] run_eval supports --prefix clarg. (#6953) · b76cb1c3
  Sam Shleifer authored Sep 12, 2020
  
  b76cb1c3
10 Sep, 2020 3 commits
- [wip/s2s] DistributedSortishSampler (#7056) · 77950c48
  Sam Shleifer authored Sep 10, 2020
  
  77950c48
- Fix CI with change of name of nlp (#7054) · 51448673
  Sylvain Gugger authored Sep 10, 2020
```
* nlp -> datasets

* More nlp -> datasets

* Woopsie

* More nlp -> datasets

* One last
```
  51448673
- [s2s] --eval_max_generate_length (#7018) · e9a2f772
  Sam Shleifer authored Sep 10, 2020
  
  e9a2f772
07 Sep, 2020 1 commit
- [s2s] warn if --fp16 for torch 1.6 (#6977) · ce37be9d
  Sam Shleifer authored Sep 06, 2020
  
  ce37be9d
04 Sep, 2020 2 commits
- [s2s] run_eval.py parses generate_kwargs (#6948) · a4fc0c80
  Sam Shleifer authored Sep 04, 2020
  
  a4fc0c80
- [s2s] distill: --normalize_hidden --supervise_forward (#6834) · 6078b120
  Sam Shleifer authored Sep 04, 2020
  
  6078b120