1. 17 Sep, 2020 1 commit
    • [ported model] FSMT (FairSeq MachineTranslation) (#6940) · 1eeb206b
      Stas Bekman authored
      * ready for PR
      
      * cleanup
      
      * correct FSMT_PRETRAINED_MODEL_ARCHIVE_LIST
      
      * fix
      
      * perfectionism
      
      * revert change from another PR
      
      * odd, already committed this one
      
      * non-interactive upload workaround
      
      * backup the failed experiment
      
      * store langs in config
      
      * workaround for localizing model path
      
      * doc clean up as in https://github.com/huggingface/transformers/pull/6956
      
      * style
      
      * back out debug mode
      
      * document: run_eval.py --num_beams 10
      
      * remove unneeded constant
      
      * typo
      
      * re-use bart's Attention
      
      * re-use EncoderLayer, DecoderLayer from bart
      
      * refactor
      
      * send to cuda and fp16
      
      * cleanup
      
      * revert (moved to another PR)
      
      * better error message
      
      * document run_eval --num_beams
      
      * solve the problem of tokenizer finding the right files when model is local
      
      * polish, remove hardcoded config
      
      * add a note that the file is autogenerated to avoid losing changes
      
      * prep for org change, remove unneeded code
      
      * switch to model4.pt, update scores
      
      * s/python/bash/
      
      * missing init (but doesn't impact the finetuned model)
      
      * cleanup
      
      * major refactor (reuse-bart)
      
      * new model, new expected weights
      
      * cleanup
      
      * cleanup
      
      * full link
      
      * fix model type
      
      * merge porting notes
      
      * style
      
      * cleanup
      
      * have to create a DecoderConfig object to handle vocab_size properly
      
      * doc fix
      
      * add note (not a public class)
      
      * parametrize
      
      * - add bleu scores integration tests
      
      * skip test if sacrebleu is not installed
      
      * cache heavy models/tokenizers
      
      * some tweaks
      
      * remove tokens that aren't used
      
      * more purging
      
      * simplify code
      
      * switch to using decoder_start_token_id
      
      * add doc
      
      * Revert "major refactor (reuse-bart)"
      
      This reverts commit 226dad15ca6a9ef4e26178526e878e8fc5c85874.
      
      * decouple from bart
      
      * remove unused code #1
      
      * remove unused code #2
      
      * remove unused code #3
      
      * update instructions
      
      * clean up
      
      * move bleu eval to examples
      
      * check import only once
      
      * move data+gen script into files
      
      * reuse via import
      
      * take less space
      
      * add prepare_seq2seq_batch (auto-tested); see the usage sketch after this commit message
      
      * cleanup
      
      * recode test to use json instead of yaml
      
      * ignore keys not needed
      
      * use the new -y flag in transformers-cli upload
      
      * [xlm tok] config dict: fix str into int to match definition (#7034)
      
      * [s2s] --eval_max_generate_length (#7018)
      
      * Fix CI with change of name of nlp (#7054)
      
      * nlp -> datasets
      
      * More nlp -> datasets
      
      * Woopsie
      
      * More nlp -> datasets
      
      * One last
      
      * extending to support allen_nlp wmt models
      
      - allow a specific checkpoint file to be passed
      - more arg settings
      - scripts for allen_nlp models
      
      * sync with changes
      
      * s/fsmt-wmt/wmt/ in model names
      
      * s/fsmt-wmt/wmt/ in model names (p2)
      
      * s/fsmt-wmt/wmt/ in model names (p3)
      
      * switch to a better checkpoint
      
      * typo
      
      * make non-optional args truly non-optional - adjust tests where possible or skip when there is no other choice
      
      * consistency
      
      * style
      
      * adjust header
      
      * cards moved (model rename)
      
      * use best custom hparams
      
      * update info
      
      * remove old cards
      
      * cleanup
      
      * s/stas/facebook/
      
      * update scores
      
      * s/allen_nlp/allenai/
      
      * url maps aren't needed
      
      * typo
      
      * move all the doc / build /eval generators to their own scripts
      
      * cleanup
      
      * Apply suggestions from code review
      Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
      
      * Apply suggestions from code review
      Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
      
      * fix indent
      
      * duplicated line
      
      * style
      
      * use the correct add_start_docstrings
      
      * oops
      
      * resizing can't be done with the core approach, due to 2 dicts
      
      * check that the arg is a list
      
      * style
      
      * style
      Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
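      A minimal usage sketch of the ported model, assuming the facebook/wmt19-en-ru checkpoint renamed above. The beam size mirrors the documented run_eval.py --num_beams 10, the prepare_seq2seq_batch helper is the one this PR adds (later transformers versions call the tokenizer directly instead), and the reference translation is hypothetical:

      import sacrebleu  # the integration tests skip when sacrebleu is not installed
      from transformers import FSMTForConditionalGeneration, FSMTTokenizer

      mname = "facebook/wmt19-en-ru"  # one of the facebook/wmt19-* checkpoints
      tokenizer = FSMTTokenizer.from_pretrained(mname)
      model = FSMTForConditionalGeneration.from_pretrained(mname)

      # prepare_seq2seq_batch is the auto-tested helper added in this PR
      batch = tokenizer.prepare_seq2seq_batch(["Machine learning is great."], return_tensors="pt")
      generated = model.generate(batch["input_ids"], num_beams=10)  # --num_beams 10, as documented
      hypothesis = tokenizer.decode(generated[0], skip_special_tokens=True)

      reference = "Машинное обучение - это здорово."  # hypothetical reference translation
      print(sacrebleu.corpus_bleu([hypothesis], [[reference]]).score)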
  2. 14 Sep, 2020 3 commits
  3. 10 Sep, 2020 4 commits
  4. 08 Sep, 2020 2 commits
  5. 03 Sep, 2020 1 commit
  6. 01 Sep, 2020 2 commits
  7. 24 Aug, 2020 2 commits
  8. 21 Aug, 2020 1 commit
  9. 18 Aug, 2020 2 commits
  10. 17 Aug, 2020 2 commits
  11. 14 Aug, 2020 1 commit
    • MBartForConditionalGeneration (#6441) · 680f1337
      Suraj Patil authored
      * add MBartForConditionalGeneration
      
      * style
      
      * rebase and fixes
      
      * add mbart test in TEST_FILES_WITH_NO_COMMON_TESTS
      
      * fix docs
      
      * don't ignore mbart
      
      * doc
      
      * fix mbart fairseq link
      
      * put mbart before bart
      
      * apply doc suggestions
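      A short sketch of the new head in use; the facebook/mbart-large-en-ro checkpoint and the prepare_seq2seq_batch call are assumptions for illustration (any MBart translation checkpoint follows the same pattern):

      from transformers import MBartForConditionalGeneration, MBartTokenizer

      mname = "facebook/mbart-large-en-ro"  # assumed checkpoint
      tokenizer = MBartTokenizer.from_pretrained(mname)
      model = MBartForConditionalGeneration.from_pretrained(mname)

      # the tokenizer handles MBart's language-code tokens (en_XX source here)
      batch = tokenizer.prepare_seq2seq_batch(["UN Chief Says There Is No Military Solution in Syria"], return_tensors="pt")
      translated = model.generate(batch["input_ids"])
      print(tokenizer.decode(translated[0], skip_special_tokens=True))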
  12. 12 Aug, 2020 3 commits
  13. 11 Aug, 2020 2 commits
  14. 10 Aug, 2020 1 commit
  15. 07 Aug, 2020 1 commit
  16. 05 Aug, 2020 1 commit
    • Tf model outputs (#6247) · c67d1a02
      Sylvain Gugger authored
      * TF outputs and test on BERT
      
      * Albert to DistilBert
      
      * All remaining TF models except T5
      
      * Documentation
      
      * One file forgotten
      
      * TF outputs and test on BERT
      
      * Albert to DistilBert
      
      * All remaining TF models except T5
      
      * Documentation
      
      * One file forgotten
      
      * Add new models and fix issues
      
      * Quality improvements
      
      * Add T5
      
      * A bit of cleanup
      
      * Fix for slow tests
      
      * Style
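      From the user side the change looks like the sketch below (a stock BERT checkpoint is assumed): TF models now return typed output objects with named fields instead of bare tuples.

      from transformers import BertTokenizer, TFBertModel

      tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
      model = TFBertModel.from_pretrained("bert-base-uncased")

      inputs = tokenizer("Hello world", return_tensors="tf")
      outputs = model(inputs, return_dict=True)  # structured output instead of a plain tuple

      # named attribute access replaces positional tuple indexing
      print(outputs.last_hidden_state.shape)  # (1, sequence_length, hidden_size)
      print(outputs.pooler_output.shape)      # (1, hidden_size)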
  17. 03 Aug, 2020 1 commit
  18. 01 Aug, 2020 1 commit
  19. 30 Jul, 2020 1 commit
    • Actually the extra_id tokens are from 0-99 and not from 1-100 (#5967) · d24ea708
      Oren Amsalem authored
      from transformers import T5Tokenizer  # the sentinel tokens below are T5's; the t5-small checkpoint is assumed
      tokenizer = T5Tokenizer.from_pretrained("t5-small")

      a = tokenizer.encode("we got a <extra_id_99>", return_tensors='pt', add_special_tokens=True)
      print(a)
      >tensor([[   62,   530,     3,     9, 32000]])  # a valid sentinel maps to a single id
      a = tokenizer.encode("we got a <extra_id_100>", return_tensors='pt', add_special_tokens=True)
      print(a)
      >tensor([[   62,   530,     3,     9,     3,     2, 25666,   834,    23,    26,
                 834,  2915,  3155]])  # <extra_id_100> does not exist, so it is split into ordinary pieces
  20. 20 Jul, 2020 1 commit
  21. 13 Jul, 2020 1 commit
  22. 10 Jul, 2020 1 commit
  23. 07 Jul, 2020 2 commits
    • Add mbart-large-cc25, support translation finetuning (#5129) · 353b8f1e
      Sam Shleifer authored
      improve unit tests for finetuning, especially w.r.t. testing frozen parameters
      fix freeze_embeds for T5
      add streamlit setup.cfg
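      A sketch of what the freeze_embeds fix amounts to for T5, whose encoder and decoder share one embedding matrix; the helper names follow the examples/seq2seq finetuning script but are assumptions here:

      def freeze_params(module):
          # disable gradient updates for every parameter of a module
          for p in module.parameters():
              p.requires_grad = False

      def freeze_embeds(model):
          # T5 keeps a single shared matrix plus per-stack references to it,
          # so all three modules must be frozen together
          freeze_params(model.shared)
          for stack in (model.encoder, model.decoder):
              freeze_params(stack.embed_tokens)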
    • Add DPR model (#5279) · fbd87921
      Quentin Lhoest authored
      * beginning of dpr modeling
      
      * wip
      
      * implement forward
      
      * remove biencoder + better init weights
      
      * export dpr model to embed model for nlp lib
      
      * add new api
      
      * remove old code
      
      * make style
      
      * fix dumb typo
      
      * don't load bert weights
      
      * docs
      
      * docs
      
      * style
      
      * move the `k` parameter
      
      * fix init_weights
      
      * add pretrained configs
      
      * minor
      
      * update config names
      
      * style
      
      * better config
      
      * style
      
      * clean code based on PR comments
      
      * change Dpr to DPR
      
      * fix config
      
      * switch encoder config to a dict
      
      * style
      
      * inheritance -> composition
      
      * add messages in assert statements
      
      * add dpr reader tokenizer
      
      * one tokenizer per model
      
      * fix base_model_prefix
      
      * fix imports
      
      * typo
      
      * add convert script
      
      * docs
      
      * change tokenizers conf names
      
      * style
      
      * change tokenizers conf names
      
      * minor
      
      * minor
      
      * fix wrong names
      
      * minor
      
      * remove unused convert functions
      
      * rename convert script
      
      * use return_tensors in tokenizers
      
      * remove n_questions dim
      
      * move generate logic to tokenizer
      
      * style
      
      * add docs
      
      * docs
      
      * quality
      
      * docs
      
      * add tests
      
      * style
      
      * add tokenization tests
      
      * DPR full tests
      
      * Stay true to the attention mask building
      
      * update docs
      
      * missing param in bert input docs
      
      * docs
      
      * style
      Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
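      A compact sketch of the two-tower usage this PR lands, assuming the single-nq-base checkpoints; note the "one tokenizer per model" decision from the commit log (attribute-style outputs assume a recent transformers version):

      import torch
      from transformers import (
          DPRContextEncoder, DPRContextEncoderTokenizer,
          DPRQuestionEncoder, DPRQuestionEncoderTokenizer,
      )

      q_name = "facebook/dpr-question_encoder-single-nq-base"
      c_name = "facebook/dpr-ctx_encoder-single-nq-base"
      q_tokenizer = DPRQuestionEncoderTokenizer.from_pretrained(q_name)
      q_encoder = DPRQuestionEncoder.from_pretrained(q_name)
      c_tokenizer = DPRContextEncoderTokenizer.from_pretrained(c_name)
      c_encoder = DPRContextEncoder.from_pretrained(c_name)

      # each tower embeds its side of the pair into the same vector space
      q_emb = q_encoder(**q_tokenizer("what does DPR stand for?", return_tensors="pt")).pooler_output
      c_emb = c_encoder(**c_tokenizer("DPR stands for Dense Passage Retrieval.", return_tensors="pt")).pooler_output

      # retrieval score is the dot product between question and passage embeddings
      print(torch.matmul(q_emb, c_emb.T))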
  24. 01 Jul, 2020 2 commits
  25. 22 Jun, 2020 1 commit