- 08 Sep, 2020 2 commits
-
-
Sam Shleifer authored
-
Sylvain Gugger authored
* Initial model
* Fix upsampling
* Add special cls token id and test
* Formatting
* Test and first FunnelTokenizerFast
* Common tests
* Fix the check_repo script and document Funnel
* Doc fixes
* Add all models
* Write doc
* Fix test
* Fix copyright
* Forgot some layers can be repeated
* Apply suggestions from code review
* Update src/transformers/modeling_funnel.py
* Address review comments
* Update src/transformers/modeling_funnel.py
* Address review comments
* Update src/transformers/modeling_funnel.py
* Slow integration test
* Make small integration test
* Formatting
* Add checkpoint and separate classification head
* Formatting
* Expand list, fix link and add in pretrained models
* Styling
* Add the model in all summaries
* Typo fixes

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
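A minimal usage sketch for the Funnel Transformer classes added here; the `funnel-transformer/small` checkpoint name is an assumption based on the pretrained models the PR lists.

```python
import torch
from transformers import FunnelTokenizerFast, FunnelModel

# Assumed checkpoint name -- adjust to one of the published Funnel checkpoints.
tokenizer = FunnelTokenizerFast.from_pretrained("funnel-transformer/small")
model = FunnelModel.from_pretrained("funnel-transformer/small")

inputs = tokenizer("Hello, Funnel Transformer!", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (batch size, sequence length, hidden size)
```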
-
- 03 Sep, 2020 1 commit
-
-
Antonio V Mendoza authored
Adding the LXMERT pretraining model (MultiModal languageXvision) to HuggingFace's suite of models (#5793)

* added template files for LXMERT and completed the configuration_lxmert.py
* added modeling, tokenization, testing, and finishing touches for lxmert [yet to be tested]
* added model card for lxmert
* cleaning up lxmert code
* Update src/transformers/modeling_lxmert.py
* Update src/transformers/modeling_tf_lxmert.py
* Update src/transformers/modeling_tf_lxmert.py
* Update src/transformers/modeling_lxmert.py
* tested torch lxmert, changed documentation, updated outputs, and other small fixes
* Update src/transformers/convert_pytorch_checkpoint_to_tf2.py
* Update src/transformers/convert_pytorch_checkpoint_to_tf2.py
* Update src/transformers/convert_pytorch_checkpoint_to_tf2.py
* renaming, other small issues, did not change TF code in this commit
* added lxmert question answering model in pytorch
* added capability to edit number of qa labels for lxmert
* made answer optional for lxmert question answering
* add option to return hidden_states for lxmert
* changed default qa labels for lxmert
* changed config archive path
* squashing 3 commits: merged UI + testing improvements + more UI and testing
* changed some variable names for lxmert
* TF LXMERT
* Various fixes to LXMERT
* Final touches to LXMERT
* AutoTokenizer order
* Add LXMERT to index.rst and README.md
* Merge commit test fixes + Style update
* TensorFlow 2.3.0 sequential model changes variable names; remove inherited test
* Update src/transformers/modeling_tf_pytorch_utils.py
* Update docs/source/model_doc/lxmert.rst
* Update docs/source/model_doc/lxmert.rst
* Update src/transformers/modeling_tf_lxmert.py
* added suggestions
* Fixes
* Final fixes for TF model
* Fix docs

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
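A minimal sketch of running the new LXMERT model; the `unc-nlp/lxmert-base-uncased` checkpoint and the visual-input shapes (36 regions, 2048-d features, 4-d normalized boxes) are assumptions, with random tensors standing in for real Faster R-CNN features.

```python
import torch
from transformers import LxmertTokenizer, LxmertModel

tokenizer = LxmertTokenizer.from_pretrained("unc-nlp/lxmert-base-uncased")
model = LxmertModel.from_pretrained("unc-nlp/lxmert-base-uncased")

inputs = tokenizer("What is on the table?", return_tensors="pt")
visual_feats = torch.rand(1, 36, 2048)  # placeholder region features (normally from Faster R-CNN)
visual_pos = torch.rand(1, 36, 4)       # placeholder normalized bounding boxes

outputs = model(**inputs, visual_feats=visual_feats, visual_pos=visual_pos)
print(outputs.language_output.shape, outputs.vision_output.shape)
```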
-
- 02 Sep, 2020 2 commits
-
-
Suraj Patil authored
* add Text2TextGenerationPipeline * remove max length warning * remove comments * remove input_length * fix typo * add tests * use TFAutoModelForSeq2SeqLM * doc * typo * add the doc below TextGenerationPipeline * doc nit * style * delete comment
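A short sketch of the new pipeline; using `t5-small` here is an assumption, any model loadable with `AutoModelForSeq2SeqLM` should work.

```python
from transformers import pipeline

# "text2text-generation" is the task name registered for Text2TextGenerationPipeline.
text2text = pipeline("text2text-generation", model="t5-small")
print(text2text("translate English to German: The house is wonderful."))
# e.g. [{'generated_text': 'Das Haus ist wunderbar.'}]
```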
-
Harry Wang authored
-
- 01 Sep, 2020 6 commits
-
-
Patrick von Platen authored
* finish xlm-roberta * finish docs * expose XLMRobertaForCausalLM
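A sketch of the newly exposed XLMRobertaForCausalLM; flipping `is_decoder` on a bidirectional checkpoint is the usual pattern for the `*ForCausalLM` heads and is shown here as an assumption, not as part of this commit.

```python
from transformers import XLMRobertaConfig, XLMRobertaForCausalLM, XLMRobertaTokenizer

tokenizer = XLMRobertaTokenizer.from_pretrained("xlm-roberta-base")
config = XLMRobertaConfig.from_pretrained("xlm-roberta-base")
config.is_decoder = True  # use a causal (left-to-right) attention mask
model = XLMRobertaForCausalLM.from_pretrained("xlm-roberta-base", config=config)

inputs = tokenizer("Hello, my dog is cute", return_tensors="pt")
outputs = model(**inputs)
print(outputs.logits.shape)  # (batch size, sequence length, vocab size)
```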
-
Lysandre Debut authored
-
Lysandre authored
-
Lysandre authored
-
Patrick von Platen authored
* fix generate for GPT2 Double Head * fix gpt2 double head model * fix bart / t5 * also add for no beam search * fix no beam search * fix encoder decoder * simplify t5 * simplify t5 * fix t5 tests * fix BART * fix transfo-xl * fix conflict * integrating sylvains and sams comments * fix tf past_decoder_key_values * fix enc dec test
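Not part of the commit itself: a small sketch of the two `generate` paths it touches (greedy/no-beam-search and beam search with cached past states), using `t5-small` as an assumed example model.

```python
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

input_ids = tokenizer("summarize: The cat sat on the mat all day.", return_tensors="pt").input_ids
greedy = model.generate(input_ids, max_length=20)              # no beam search
beams = model.generate(input_ids, max_length=20, num_beams=4)  # beam search
print(tokenizer.decode(greedy[0], skip_special_tokens=True))
print(tokenizer.decode(beams[0], skip_special_tokens=True))
```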
-
Sylvain Gugger authored
* Add logging doc * Formatting * Update docs/source/main_classes/logging.rst * Update src/transformers/utils/logging.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
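A short sketch of the centralized logging utility being documented here; the function names come from `transformers.utils.logging`, the message text is made up.

```python
from transformers.utils import logging

logging.set_verbosity_info()                 # show INFO-level messages from the library
logger = logging.get_logger("transformers")
logger.info("This message is now visible")
logging.set_verbosity_error()                # silence everything below ERROR again
```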
-
- 27 Aug, 2020 1 commit
-
-
Lysandre Debut authored
-
- 26 Aug, 2020 1 commit
-
-
Patrick von Platen authored
-
- 25 Aug, 2020 1 commit
-
-
Quentin Lhoest authored
* add dpr to models summary * minor * minor * Update docs/source/model_summary.rst (qa -> question answering) * Update docs/source/model_summary.rst (qa -> question answering, cont'd) Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
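A sketch of the DPR question encoder added to the summary; the `facebook/dpr-question_encoder-single-nq-base` checkpoint name and the sample question are assumptions used for illustration.

```python
from transformers import DPRQuestionEncoder, DPRQuestionEncoderTokenizer

name = "facebook/dpr-question_encoder-single-nq-base"
tokenizer = DPRQuestionEncoderTokenizer.from_pretrained(name)
model = DPRQuestionEncoder.from_pretrained(name)

inputs = tokenizer("Who wrote the DPR paper?", return_tensors="pt")
embedding = model(**inputs).pooler_output  # dense vector used for passage retrieval
print(embedding.shape)  # (1, 768)
```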
-
- 24 Aug, 2020 2 commits
-
-
Sam Shleifer authored
-
Stas Bekman authored
As suggested in https://github.com/huggingface/transformers/issues/6651#issuecomment-678594233, this removes the generic `generate` doc with examples that are not relevant to Bart.
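In the spirit of the Bart-specific examples the doc should point to instead, a hedged summarization sketch; the `facebook/bart-large-cnn` checkpoint and the sample article are assumptions.

```python
from transformers import BartForConditionalGeneration, BartTokenizer

tokenizer = BartTokenizer.from_pretrained("facebook/bart-large-cnn")
model = BartForConditionalGeneration.from_pretrained("facebook/bart-large-cnn")

article = "PG&E stated it scheduled the blackouts in response to forecasts for high winds."
inputs = tokenizer(article, return_tensors="pt", truncation=True)
summary_ids = model.generate(inputs["input_ids"], num_beams=4, max_length=60, early_stopping=True)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```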
-
- 21 Aug, 2020 4 commits
-
-
Suraj Patil authored
-
Patrick von Platen authored
* add pegasus to docs * Update docs/source/model_summary.rst
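A summarization sketch to go with the Pegasus docs entry; `google/pegasus-xsum` is one of the released checkpoints, and the input sentence is made up.

```python
from transformers import PegasusForConditionalGeneration, PegasusTokenizer

tokenizer = PegasusTokenizer.from_pretrained("google/pegasus-xsum")
model = PegasusForConditionalGeneration.from_pretrained("google/pegasus-xsum")

text = "The tower is 324 metres tall, about the same height as an 81-storey building."
batch = tokenizer(text, return_tensors="pt", truncation=True)
summary_ids = model.generate(**batch, num_beams=4)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```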
-
Suraj Patil authored
* added CamembertForCausalLM * add in __init__ and auto model * style * doc
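A sketch for the new CamembertForCausalLM; as with the other `*ForCausalLM` heads, setting `is_decoder=True` on the config is assumed to be the way to get a causal mask on the `camembert-base` checkpoint.

```python
from transformers import CamembertConfig, CamembertForCausalLM, CamembertTokenizer

tokenizer = CamembertTokenizer.from_pretrained("camembert-base")
config = CamembertConfig.from_pretrained("camembert-base")
config.is_decoder = True  # causal attention mask for left-to-right generation
model = CamembertForCausalLM.from_pretrained("camembert-base", config=config)

outputs = model(**tokenizer("Le camembert est", return_tensors="pt"))
print(outputs.logits.shape)
```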
-
Morgan Funtowicz authored
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
-
- 20 Aug, 2020 2 commits
-
-
Joe Davison authored
* add intro to nlp lib + links * unique links...
-
Romain Rigaux authored
Tested in a local build of the docs, e.g. just above https://huggingface.co/transformers/task_summary.html#causal-language-modeling. With the fix, the copy button copies the runnable code, e.g.

    for token in top_5_tokens:
        print(sequence.replace(tokenizer.mask_token, tokenizer.decode([token])))

instead of the doctest text with its prompts:

    >>> for token in top_5_tokens:
    ...     print(sequence.replace(tokenizer.mask_token, tokenizer.decode([token])))

Docs for the option fix: https://sphinx-copybutton.readthedocs.io/en/latest/
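A `conf.py` sketch of the sphinx-copybutton option the commit refers to, so that copying a code block strips the `>>>` and `...` doctest prompts; the option names are from the sphinx-copybutton documentation, and the exact values used in the repository are an assumption.

```python
# docs/source/conf.py (sketch)
copybutton_prompt_text = r">>> |\.\.\. "  # regex matching the doctest prompts to strip
copybutton_prompt_is_regexp = True
```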
-
- 19 Aug, 2020 1 commit
-
-
Sylvain Gugger authored
-
- 18 Aug, 2020 4 commits
-
-
Suraj Patil authored
Minor typo correction @sshleifer
-
Romain Rigaux authored
-
Romain Rigaux authored
-
Sam Shleifer authored
-
- 17 Aug, 2020 9 commits
-
-
Suraj Patil authored
* add mbart example * add Pegasus and MBart in readme * typo * add MBart in Pretrained models * add pre-proc doc * add DPR in readme * fix indent * doc fix
-
Stas Bekman authored
-
Stas Bekman authored
* [doc] multiple corrections to "Summary of the tasks" * fix indentation * correction * fix links, add links to examples/seq2seq/README.md instead of non-existing script
-
Stas Bekman authored
* [doc] make the text more readable, fix some typos, add some disambiguation * Update docs/source/glossary.rst Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
Joe Davison authored
* add custom datasets tutorial * python -> bash code blocks * Apply suggestions from code review * minor review feedback changes * add working native QA snippet Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
Patrick von Platen authored
-
Stas Bekman authored
* [doc] Summary of the models fixes * correction
-
Stas Bekman authored
- remove invalid `ENV_` prefix
- add a few ':' while at it
-
Stas Bekman authored
-
- 14 Aug, 2020 3 commits
-
-
Sylvain Gugger authored
* Generation doc
* MBartForConditionalGeneration (#6441)
* add MBartForConditionalGeneration
* style
* rebase and fixes
* add mbart test in TEST_FILES_WITH_NO_COMMON_TESTS
* fix docs
* don't ignore mbart
* doc
* fix mbart fairseq link
* put mbart before bart
* apply doc suggestions
* Use hash to clean the test dirs (#6475)
* Use hash to clean the test dirs
* Use hash to clean the test dirs
* Use hash to clean the test dirs
* fix
* [EncoderDecoder] Add Cross Attention for GPT2 (#6415)
* add cross attention layers for gpt2
* make gpt2 cross attention work
* finish bert2gpt2
* add explicit comments
* remove attention mask since not yet supported
* revert attn mask in pipeline
* Update src/transformers/modeling_gpt2.py
* Update src/transformers/modeling_encoder_decoder.py
* Sort unique_no_split_tokens to make it deterministic (#6461)
* change unique_no_split_tokens's type to set
* use sorted list instead of set
* style
* Import accuracy_score (#6480)
* Apply suggestions from code review
* Address comments
* Styling
* Generation doc
* Apply suggestions from code review
* Address comments
* Styling

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Suraj Patil <surajp815@gmail.com>
Co-authored-by: Kevin Canwen Xu <canwenxu@126.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Quentin Lhoest <42851186+lhoestq@users.noreply.github.com>
Co-authored-by: gijswijnholds <gijswijnholds@gmail.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
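A sketch of the GPT2-as-decoder setup enabled by the cross-attention work folded into this PR; the `bert-base-uncased`/`gpt2` pairing and the sample inputs are illustrative assumptions, not the PR's own test setup.

```python
from transformers import EncoderDecoderModel, BertTokenizer, GPT2Tokenizer

# Warm-start a bert2gpt2 model: BERT as encoder, GPT2 as decoder with cross-attention.
model = EncoderDecoderModel.from_encoder_decoder_pretrained("bert-base-uncased", "gpt2")
enc_tok = BertTokenizer.from_pretrained("bert-base-uncased")
dec_tok = GPT2Tokenizer.from_pretrained("gpt2")

input_ids = enc_tok("This is a long article that should be summarized.", return_tensors="pt").input_ids
decoder_input_ids = dec_tok("Summary:", return_tensors="pt").input_ids
outputs = model(input_ids=input_ids, decoder_input_ids=decoder_input_ids)
print(outputs.logits.shape)  # (batch size, decoder sequence length, GPT2 vocab size)
```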
-
gijswijnholds authored
-
Suraj Patil authored
* add MBartForConditionalGeneration * style * rebase and fixes * add mbart test in TEST_FILES_WITH_NO_COMMON_TESTS * fix docs * don't ignore mbart * doc * fix mbart fairseq link * put mbart before bart * apply doc suggestions
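A generation sketch for the new MBartForConditionalGeneration; `facebook/mbart-large-en-ro` is the English-to-Romanian fine-tuned checkpoint, and forcing the `ro_RO` language code as the decoder start token follows the usual docs pattern and is an assumption here.

```python
from transformers import MBartForConditionalGeneration, MBartTokenizer

tokenizer = MBartTokenizer.from_pretrained("facebook/mbart-large-en-ro")
model = MBartForConditionalGeneration.from_pretrained("facebook/mbart-large-en-ro")

batch = tokenizer("UN Chief Says There Is No Military Solution in Syria", return_tensors="pt")
translated = model.generate(
    **batch,
    decoder_start_token_id=tokenizer.lang_code_to_id["ro_RO"],  # assumed target-language code
    num_beams=4,
)
print(tokenizer.decode(translated[0], skip_special_tokens=True))
```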
-
- 12 Aug, 2020 1 commit
-
-
Patrick von Platen authored
* add encoder-decoder for roberta * fix headmask * apply Sylvain's suggestions * fix typo * Apply suggestions from code review
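A minimal sketch of the RoBERTa encoder-decoder support added here, tying two `roberta-base` checkpoints together; this is an illustrative warm-start, not the PR's test configuration.

```python
from transformers import EncoderDecoderModel, RobertaTokenizer

tokenizer = RobertaTokenizer.from_pretrained("roberta-base")
model = EncoderDecoderModel.from_encoder_decoder_pretrained("roberta-base", "roberta-base")

input_ids = tokenizer("A roberta2roberta model to fine-tune on a seq2seq task.", return_tensors="pt").input_ids
outputs = model(input_ids=input_ids, decoder_input_ids=input_ids)
print(outputs.logits.shape)  # (batch size, sequence length, vocab size)
```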
-