Commits · 25afb4ea502202d082b50b2a96d1f8d5347a798e · chenpangpang / transformers

03 Sep, 2020 1 commit

Adding the LXMERT pretraining model (MultiModal languageXvision) to... · ea2c6f1a

Antonio V Mendoza authored Sep 03, 2020


Adding the LXMERT pretraining model (MultiModal languageXvision) to HuggingFace's suite of models (#5793)

* added template files for LXMERT and completed the configuration_lxmert.py

* added modeling, tokenization, testing, and finishing touches for lxmert [yet to be tested]

* added model card for lxmert

* cleaning up lxmert code

* Update src/transformers/modeling_lxmert.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Update src/transformers/modeling_tf_lxmert.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Update src/transformers/modeling_tf_lxmert.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Update src/transformers/modeling_lxmert.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* tested torch lxmert, changed documentation, updated outputs, and other small fixes

* Update src/transformers/convert_pytorch_checkpoint_to_tf2.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Update src/transformers/convert_pytorch_checkpoint_to_tf2.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Update src/transformers/convert_pytorch_checkpoint_to_tf2.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* renaming, other small issues, did not change TF code in this commit

* added lxmert question answering model in pytorch

* added capability to edit number of qa labels for lxmert

* made answer optional for lxmert question answering

* add option to return hidden_states for lxmert

* changed default qa labels for lxmert

* changed config archive path

* squashing 3 commits: merged UI + testing improvements + more UI and testing

* changed some variable names for lxmert

* TF LXMERT

* Various fixes to LXMERT

* Final touches to LXMERT

* AutoTokenizer order

* Add LXMERT to index.rst and README.md

* Merge commit test fixes + Style update

* TensorFlow 2.3.0 sequential model changes variable names

Remove inherited test

* Update src/transformers/modeling_tf_pytorch_utils.py

* Update docs/source/model_doc/lxmert.rst
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update docs/source/model_doc/lxmert.rst
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/modeling_tf_lxmert.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* added suggestions

* Fixes

* Final fixes for TF model

* Fix docs
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

ea2c6f1a

17 Aug, 2020 1 commit

[Doc] add more MBart and other doc (#6490) · c9564f53

Suraj Patil authored Aug 17, 2020

* add mbart example

* add Pegasus and MBart in readme

* typo

* add MBart in Pretrained models

* add pre-proc doc

* add DPR in readme

* fix indent

* doc fix

c9564f53

11 Aug, 2020 1 commit
- PegasusForConditionalGeneration (torch version) (#6340) · 66fa8cea
  Sam Shleifer authored Aug 11, 2020
```
Co-authored-by: Jingqing Zhang <jingqing.zhang15@imperial.ac.uk>
```
  66fa8cea
31 Jul, 2020 1 commit

Replace mecab-python3 with fugashi for Japanese tokenization (#6086) · cf3cf304

Paul O'Leary McCann authored Jul 31, 2020



* Replace mecab-python3 with fugashi

This replaces mecab-python3 with fugashi for Japanese tokenization. I am
the maintainer of both projects.

Both projects are MeCab wrappers, so the underlying C++ code is the
same. fugashi is the newer wrapper and doesn't use SWIG, so for basic
use of the MeCab API it's easier to use.

This code ensures the use of a version of ipadic installed via pip,
which should make versioning and tracking down issues easier.

fugashi has wheels for Windows, OSX, and Linux, which will help with
issues with installing old versions of mecab-python3 on Windows.
Compared to mecab-python3, because fugashi doesn't use SWIG, it doesn't
require a C++ runtime to be installed on Windows.

In adding this change I removed some code dealing with `cursor`,
`token_start`, and `token_end` variables. These variables didn't seem to
be used for anything; it is unclear to me why they were there.

I ran the tests and they passed, though I couldn't figure out how to run
the slow tests (`--runslow` gave an error) and didn't try testing with
TensorFlow.

* Style fix

* Remove unused variable

Forgot to delete this...

* Adapt doc with install instructions

* Fix typo
Co-authored-by: sgugger <sylvain.gugger@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

cf3cf304
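The commit message above describes reading tokens straight from the MeCab wrapper without any `cursor`, `token_start`, or `token_end` bookkeeping. A minimal sketch of that idea follows. The names `mecab_tokenize`, `StubWord`, `stub_tagger`, and `make_fugashi_tagger` are hypothetical and not taken from the commit. The stub tagger only splits on whitespace so the example runs without MeCab installed. The real tagger construction assumes `fugashi` and `ipadic` have been installed via pip.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class StubWord:
    """Hypothetical stand-in for a tagger node; only `.surface` is read."""

    surface: str


def stub_tagger(text: str) -> List[StubWord]:
    """Hypothetical tagger for illustration only: splits on whitespace."""
    return [StubWord(piece) for piece in text.split()]


def mecab_tokenize(text: str, tagger: Callable[[str], Iterable]) -> List[str]:
    """Collect the surface form of every node the tagger yields.

    No cursor or start/end offsets are tracked; the surface strings
    are all that the tokenizer needs.
    """
    return [word.surface for word in tagger(text)]


def make_fugashi_tagger():
    """Build a real tagger (assumes `pip install fugashi ipadic`)."""
    import fugashi
    import ipadic

    # ipadic.MECAB_ARGS points MeCab at the pip-installed dictionary.
    return fugashi.GenericTagger(ipadic.MECAB_ARGS)


# Runs with the stub; swap in make_fugashi_tagger() for real Japanese text.
tokens = mecab_tokenize("図書館 に いた", stub_tagger)
print(tokens)
```

Passing the tagger in as an argument keeps the tokenizing function independent of which dictionary or wrapper is installed, which is one way to make the versioning concern mentioned in the commit easier to isolate.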

16 Jun, 2020 1 commit
- Fix all Sphinx warnings (#5068) · 011cc0be
  Sylvain Gugger authored Jun 16, 2020
  
  011cc0be
15 Jun, 2020 1 commit
- Add bart-base (#5014) · a9f1fc6c
  Sam Shleifer authored Jun 15, 2020
  
  a9f1fc6c
02 Jun, 2020 1 commit

Kill model archive maps (#4636) · d4c2cb40

Julien Chaumond authored Jun 02, 2020

* Kill model archive maps

* Fixup

* Also kill model_archive_map for MaskedBertPreTrainedModel

* Unhook config_archive_map

* Tokenizers: align with model id changes

* make style && make quality

* Fix CI

d4c2cb40

19 May, 2020 2 commits

[Longformer] Docs and clean API (#4464) · 48c3a70b
Patrick von Platen authored May 19, 2020
```
* add longformer docs

* improve docs
```
48c3a70b

Longformer (#4352) · 8f1d0471

Iz Beltagy authored May 19, 2020

* first commit

* bug fixes

* better examples

* undo padding

* remove wrong VOCAB_FILES_NAMES

* License

* make style

* make isort happy

* unit tests

* integration test

* make `black` happy by undoing `isort` changes!!

* lint

* no need for the padding value

* batch_size not bsz

* remove unused type casting

* seqlen not seq_len

* staticmethod

* `bert` selfattention instead of `n2`

* uint8 instead of bool + lints

* pad inputs_embeds using embeddings not a constant

* black

* unit test with padding

* fix unit tests

* remove redundant unit test

* upload model weights

* resolve todo

* simpler _mask_invalid_locations without lru_cache + backward compatible masked_fill_

* increase unittest coverage

8f1d0471

11 May, 2020 1 commit
- [Reformer] Add Enwiki8 Reformer Model - Adapt convert script (#4282) · ac7d5f67
  Patrick von Platen authored May 11, 2020
```
* adapt convert script

* update convert script

* finish

* fix marian pretrained docs
```
  ac7d5f67
10 May, 2020 1 commit

[Marian] documentation and AutoModel support (#4152) · 3487be75

Sam Shleifer authored May 10, 2020

- MarianSentencepieceTokenizer -> MarianTokenizer
- Start using unk token.
- add docs page
- add better generation params to MarianConfig
- more conversion utilities

3487be75

07 May, 2020 1 commit

Reformer (#3351) · dca34695

Patrick von Platen authored May 07, 2020

* first copy & paste commit from Bert and morgans LSH code

* add easy way to compare to trax original code

* translate most of function

* make trax lsh self attention deterministic with numpy seed + copy paste code

* add same config

* add same config

* make layer init work

* implemented hash_vectors function for lsh attention

* continue reformer translation

* hf LSHSelfAttentionLayer gives same output as trax layer

* refactor code

* refactor code

* refactor code

* refactor

* refactor + add reformer config

* delete bogus file

* split reformer attention layer into two layers

* save intermediate step

* save intermediate step

* make test work

* add complete reformer block layer

* finish reformer layer

* implement causal and self mask

* clean reformer test and refactor code

* fix merge conflicts

* fix merge conflicts

* update init

* fix device for GPU

* fix chunk length init for tests

* include morgans optimization

* improve memory a bit

* improve comment

* factorize num_buckets

* better testing parameters

* make whole model work

* make lm model work

* add t5 copy paste tokenizer

* add chunking feed forward

* clean config

* add improved assert statements

* make tokenizer work

* improve test

* correct typo

* extend config

* add more complex test

* add new axial position embeddings

* add local block attention layer

* clean tests

* refactor

* better testing

* save intermediate progress

* clean test file

* make shorter input length work for model

* allow variable input length

* refactor

* make forward pass for pretrained model work

* add generation possibility

* finish dropout and init

* make style

* refactor

* add first version of RevNet Layers

* make forward pass work and add convert file

* make uploaded model forward pass work

* make uploaded model forward pass work

* refactor code

* add namedtuples and cache buckets

* correct head masks

* refactor

* made reformer more flexible

* make style

* remove set max length

* add attention masks

* fix up tests

* fix lsh attention mask

* make random seed optional for the moment

* improve memory in reformer

* add tests

* make style

* make sure masks work correctly

* detach gradients

* save intermediate

* correct backprop through gather

* make style

* change back num hashes

* rename to labels

* fix rotation shape

* fix detach

* update

* fix trainer

* fix backward dropout

* make reformer more flexible

* fix conflict

* fix

* fix

* add tests for fixed seed in reformer layer

* fix trainer typo

* fix typo in activations

* add fp16 tests

* add fp16 training

* support fp16

* correct gradient bug in reformer

* add fast gelu

* re-add dropout for embedding dropout

* better naming

* better naming

* renaming

* finalize test branch

* finalize tests

* add more tests

* finish tests

* fix

* fix type trainer

* fix fp16 tests

* fix tests

* fix tests

* fix tests

* fix issue with dropout

* fix dropout seeds

* correct random seed on gpu

* finalize random seed for dropout

* finalize random seed for dropout

* remove duplicate line

* correct half precision bug

* make style

* refactor

* refactor

* docstring

* remove sinusoidal position encodings for reformer

* move chunking to modeling_utils

* make style

* clean config

* make style

* fix tests

* fix auto tests

* pretrained models

* fix docstring

* update conversion file

* Update pretrained_models.rst

* fix rst

* fix rst

* update copyright

* fix test path

* fix test path

* fix small issue in test

* include reformer in generation tests

* add docs for axial position encoding

* finish docs

* Update convert_reformer_trax_checkpoint_to_pytorch.py

* remove isort

* include sams comments

* remove wrong comment in utils

* correct typos

* fix typo

* Update reformer.rst

* applied morgans optimization

* make style

* make gpu compatible

* remove bogus file

* big test refactor

* add example for chunking

* fix typo

* add to README

dca34695
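Several bullets in the Reformer commit mention chunking the feed-forward layer. As a rough illustration of the general technique (not the code from this commit), the sketch below applies a position-wise function to a sequence one chunk at a time and concatenates the results. The names `feed_forward` and `apply_chunked` and the toy layer itself are assumptions made for this example. Because the layer treats each position independently, the chunked result matches the unchunked one, while only one chunk's intermediate values need to exist at a time.

```python
from typing import Callable, List

Vector = List[float]
Sequence2D = List[Vector]


def feed_forward(chunk: Sequence2D) -> Sequence2D:
    """Toy position-wise layer (hypothetical): scale by 2, then ReLU."""
    return [[max(0.0, 2.0 * x) for x in position] for position in chunk]


def apply_chunked(
    fn: Callable[[Sequence2D], Sequence2D],
    sequence: Sequence2D,
    chunk_size: int,
) -> Sequence2D:
    """Apply `fn` to slices of `sequence` and concatenate the outputs.

    A chunk_size of 0 or less disables chunking. The last chunk may be
    shorter than the others when the length is not a multiple.
    """
    if chunk_size <= 0:
        return fn(sequence)
    output: Sequence2D = []
    for start in range(0, len(sequence), chunk_size):
        output.extend(fn(sequence[start:start + chunk_size]))
    return output


hidden_states = [[1.0, -1.0], [0.5, 2.0], [-3.0, 0.0], [4.0, 1.0]]
chunked = apply_chunked(feed_forward, hidden_states, chunk_size=2)
print(chunked == feed_forward(hidden_states))
```

The trade-off is more sequential calls in exchange for a smaller peak memory footprint, which matters most when the intermediate feed-forward dimension is much larger than the hidden size.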

16 Apr, 2020 1 commit

[Docs] Add DialoGPT (#3755) · d22894df

Patrick von Platen authored Apr 16, 2020



* add dialoGPT

* update README.md

* fix conflict

* update readme

* add code links to docs

* Update README.md

* Update dialo_gpt2.rst

* Update pretrained_models.rst

* Update docs/source/model_doc/dialo_gpt2.rst
Co-Authored-By: Julien Chaumond <chaumond@gmail.com>

* change filename of dialogpt
Co-authored-by: Julien Chaumond <chaumond@gmail.com>

d22894df

10 Apr, 2020 1 commit
- Multilingual BART - (#3602) · 7a7fdf71
  Sam Shleifer authored Apr 10, 2020
```
- support mbart-en-ro weights
- add MBartTokenizer
```
  7a7fdf71
27 Mar, 2020 1 commit

Add T5 to docs (#3461) · fa9af246

Patrick von Platen authored Mar 27, 2020

* add t5 docs basis

* improve docs

* add t5 docs

* improve t5 docstring

* add t5 tokenizer docstring

* finish docstring

* make style

* add pretrained models

* correct typo

* make examples work

* finalize docs

fa9af246

02 Mar, 2020 1 commit

Bart-CNN (#3059) · b54ef78d

Sam Shleifer authored Mar 02, 2020

`generate` code that produces 99% identical summarizations to fairseq on CNN test data, with caching.

b54ef78d

20 Feb, 2020 1 commit

New BartModel (#2745) · 53ce3854

Sam Shleifer authored Feb 20, 2020

* Results same as fairseq
* Wrote a ton of tests
* Struggled with api signatures
* added some docs

53ce3854

07 Feb, 2020 1 commit
- distilbert-base-cased weights + Readmes + omissions · ee5a6856
  VictorSanh authored Feb 07, 2020
  
  ee5a6856
30 Jan, 2020 1 commit
- Pretrained models · 93dccf52
  Lysandre authored Jan 30, 2020
  
  93dccf52
28 Jan, 2020 1 commit
- Add Dutch pre-trained BERT model · f5a236c3
  Wietse de Vries authored Dec 19, 2019
  
  f5a236c3
06 Jan, 2020 2 commits
- GPU text generation: Moved the encoded_prompt to correct device · 81d6841b
  alberduris authored Dec 31, 2019
  
  81d6841b
- Moved the encoded_prompts to correct device · dd4df80f
  alberduris authored Dec 31, 2019
  
  dd4df80f
21 Dec, 2019 1 commit
- [doc] move distilroberta to more appropriate place · ac1b449c
  Julien Chaumond authored Dec 21, 2019
```
cc @lysandrejik
```
  ac1b449c
18 Dec, 2019 2 commits
- docs: add XLM-RoBERTa to pretrained model list (incl. all parameters) · dd7a958f
  Stefan Schweter authored Dec 18, 2019
  
  dd7a958f
- Add pretrained model documentation for FinBERT. · abc43ffb
  Antti Virtanen authored Dec 16, 2019
  
  abc43ffb
13 Dec, 2019 1 commit
- update model doc - switch 3B/11B to 3b/11b · 5c00e344
  thomwolf authored Dec 13, 2019
  
  5c00e344
11 Dec, 2019 3 commits
- [doc] Fix rst table · 1748fdf6
  Julien Chaumond authored Dec 11, 2019
  
  1748fdf6
- Add support for Japanese BERT models by cl-tohoku · c03c0dfd
  Masatoshi Suzuki authored Nov 15, 2019
  
  c03c0dfd
- doc: fix pretrained models table · 030faccb
  Stefan Schweter authored Dec 11, 2019
  
  030faccb
09 Dec, 2019 1 commit
- fix albert links · 5c877fe9
  Pierric Cistac authored Dec 09, 2019
  
  5c877fe9
05 Dec, 2019 1 commit
- release distilm-bert · 552c44a9
  VictorSanh authored Dec 05, 2019
  
  552c44a9
26 Nov, 2019 3 commits
- Fix pretrained models table · ce02550d
  Lysandre authored Nov 26, 2019
  
  ce02550d
- Fix pretrained models table · cf26a0c8
  Lysandre authored Nov 26, 2019
  
  cf26a0c8
- Pretrained models · 668aac45
  Lysandre authored Nov 26, 2019
  
  668aac45
19 Nov, 2019 1 commit
- docs: add new German distilbert model to pretrained models · e631383d
  Stefan Schweter authored Nov 19, 2019
  
  e631383d
16 Nov, 2019 1 commit
- Add CamemBERT to auto files and docs · 035fea53
  Louis MARTIN authored Nov 12, 2019
  
  035fea53
08 Nov, 2019 1 commit
- adding models in readme and auto classes · f03c0c14
  thomwolf authored Nov 08, 2019
  
  f03c0c14
06 Nov, 2019 1 commit

Add RoBERTa-based GPT-2 Output Detector from OpenAI · 1c542df7

Julien Chaumond authored Nov 06, 2019

converted from https://github.com/openai/gpt-2-output-dataset/tree/master/detector

Co-Authored-By: Lysandre Debut <lysandre.debut@reseau.eseo.fr>
Co-Authored-By: Jong Wook Kim <jongwook@nyu.edu>
Co-Authored-By: Jeff Wu <wuthefwasthat@gmail.com>

1c542df7

05 Nov, 2019 1 commit
- GPT-2 XL · d7d36181
  Lysandre authored Nov 05, 2019
  
  d7d36181
23 Oct, 2019 1 commit
- [RELEASE] DistilRoBERTa · 8ad5c591
  VictorSanh authored Oct 23, 2019
  
  8ad5c591