1. 16 Jun, 2020 1 commit
  2. 15 Jun, 2020 1 commit
  3. 02 Jun, 2020 1 commit
    • Kill model archive maps (#4636) · d4c2cb40
      Julien Chaumond authored
      * Kill model archive maps
      
      * Fixup
      
      * Also kill model_archive_map for MaskedBertPreTrainedModel
      
      * Unhook config_archive_map
      
      * Tokenizers: align with model id changes
      
      * make style && make quality
      
      * Fix CI
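With the archive maps removed, checkpoints are looked up by model id on the model hub instead of through hard-coded URL maps. A minimal sketch of the resulting loading pattern, using `bert-base-uncased` purely as an illustrative model id:

```python
# Weights are resolved by model id (hub lookup) rather than via a
# *_PRETRAINED_MODEL_ARCHIVE_MAP dict of hard-coded URLs.
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")
```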
  4. 19 May, 2020 2 commits
    • [Longformer] Docs and clean API (#4464) · 48c3a70b
      Patrick von Platen authored
      * add longformer docs
      
      * improve docs
    • Longformer (#4352) · 8f1d0471
      Iz Beltagy authored
      * first commit
      
      * bug fixes
      
      * better examples
      
      * undo padding
      
      * remove wrong VOCAB_FILES_NAMES
      
      * License
      
      * make style
      
      * make isort happy
      
      * unit tests
      
      * integration test
      
      * make `black` happy by undoing `isort` changes!!
      
      * lint
      
      * no need for the padding value
      
      * batch_size not bsz
      
      * remove unused type casting
      
      * seqlen not seq_len
      
      * staticmethod
      
      * `bert` selfattention instead of `n2`
      
      * uint8 instead of bool + lints
      
      * pad inputs_embeds using embeddings not a constant
      
      * black
      
      * unit test with padding
      
      * fix unit tests
      
      * remove redundant unit test
      
      * upload model weights
      
      * resolve todo
      
      * simpler _mask_invalid_locations without lru_cache + backward compatible masked_fill_
      
      * increase unittest coverage
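A minimal usage sketch of the Longformer added here, assuming the uploaded weights correspond to the `allenai/longformer-base-4096` checkpoint:

```python
import torch
from transformers import LongformerModel, LongformerTokenizer

tokenizer = LongformerTokenizer.from_pretrained("allenai/longformer-base-4096")
model = LongformerModel.from_pretrained("allenai/longformer-base-4096")

# Long inputs are processed with windowed (local) self-attention rather than
# full n^2 attention; the model pads inputs to a multiple of the attention
# window internally.
inputs = tokenizer("Hello world. " * 500, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)
print(outputs[0].shape)  # (batch_size, sequence_length, hidden_size)
```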
  5. 11 May, 2020 1 commit
  6. 10 May, 2020 1 commit
  7. 07 May, 2020 1 commit
    • Reformer (#3351) · dca34695
      Patrick von Platen authored
      * first copy & paste commit from Bert and Morgan's LSH code
      
      * add easy way to compare to trax original code
      
      * translate most of the functions
      
      * make trax lsh self attention deterministic with numpy seed + copy paste code
      
      * add same config
      
      * add same config
      
      * make layer init work
      
      * implemented hash_vectors function for lsh attention
      
      * continue reformer translation
      
      * hf LSHSelfAttentionLayer gives same output as trax layer
      
      * refactor code
      
      * refactor code
      
      * refactor code
      
      * refactor
      
      * refactor + add reformer config
      
      * delete bogus file
      
      * split reformer attention layer into two layers
      
      * save intermediate step
      
      * save intermediate step
      
      * make test work
      
      * add complete reformer block layer
      
      * finish reformer layer
      
      * implement causal and self mask
      
      * clean reformer test and refactor code
      
      * fix merge conflicts
      
      * fix merge conflicts
      
      * update init
      
      * fix device for GPU
      
      * fix chunk length init for tests
      
      * include Morgan's optimization
      
      * improve memory a bit
      
      * improve comment
      
      * factorize num_buckets
      
      * better testing parameters
      
      * make whole model work
      
      * make lm model work
      
      * add t5 copy paste tokenizer
      
      * add chunking feed forward
      
      * clean config
      
      * add improved assert statements
      
      * make tokenizer work
      
      * improve test
      
      * correct typo
      
      * extend config
      
      * add more complex test
      
      * add new axial position embeddings
      
      * add local block attention layer
      
      * clean tests
      
      * refactor
      
      * better testing
      
      * save intermediate progress
      
      * clean test file
      
      * make shorter input length work for model
      
      * allow variable input length
      
      * refactor
      
      * make forward pass for pretrained model work
      
      * add generation possibility
      
      * finish dropout and init
      
      * make style
      
      * refactor
      
      * add first version of RevNet Layers
      
      * make forward pass work and add convert file
      
      * make uploaded model forward pass work
      
      * make uploaded model forward pass work
      
      * refactor code
      
      * add namedtuples and cache buckets
      
      * correct head masks
      
      * refactor
      
      * made reformer more flexible
      
      * make style
      
      * remove set max length
      
      * add attention masks
      
      * fix up tests
      
      * fix lsh attention mask
      
      * make random seed optional for the moment
      
      * improve memory in reformer
      
      * add tests
      
      * make style
      
      * make sure masks work correctly
      
      * detach gradients
      
      * save intermediate
      
      * correct backprop through gather
      
      * make style
      
      * change back num hashes
      
      * rename to labels
      
      * fix rotation shape
      
      * fix detach
      
      * update
      
      * fix trainer
      
      * fix backward dropout
      
      * make reformer more flexible
      
      * fix conflict
      
      * fix
      
      * fix
      
      * add tests for fixed seed in reformer layer
      
      * fix trainer typo
      
      * fix typo in activations
      
      * add fp16 tests
      
      * add fp16 training
      
      * support fp16
      
      * correct gradient bug in reformer
      
      * add fast gelu
      
      * re-add dropout for embedding dropout
      
      * better naming
      
      * better naming
      
      * renaming
      
      * finalize test branch
      
      * finalize tests
      
      * add more tests
      
      * finish tests
      
      * fix
      
      * fix type trainer
      
      * fix fp16 tests
      
      * fix tests
      
      * fix tests
      
      * fix tests
      
      * fix issue with dropout
      
      * fix dropout seeds
      
      * correct random seed on gpu
      
      * finalize random seed for dropout
      
      * finalize random seed for dropout
      
      * remove duplicate line
      
      * correct half precision bug
      
      * make style
      
      * refactor
      
      * refactor
      
      * docstring
      
      * remove sinusoidal position encodings for reformer
      
      * move chunking to modeling_utils
      
      * make style
      
      * clean config
      
      * make style
      
      * fix tests
      
      * fix auto tests
      
      * pretrained models
      
      * fix docstring
      
      * update conversion file
      
      * Update pretrained_models.rst
      
      * fix rst
      
      * fix rst
      
      * update copyright
      
      * fix test path
      
      * fix test path
      
      * fix small issue in test
      
      * include reformer in generation tests
      
      * add docs for axial position encoding
      
      * finish docs
      
      * Update convert_reformer_trax_checkpoint_to_pytorch.py
      
      * remove isort
      
      * include Sam's comments
      
      * remove wrong comment in utils
      
      * correct typos
      
      * fix typo
      
      * Update reformer.rst
      
      * applied Morgan's optimization
      
      * make style
      
      * make gpu compatible
      
      * remove bogus file
      
      * big test refactor
      
      * add example for chunking
      
      * fix typo
      
      * add to README
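A minimal generation sketch for the Reformer landed here; the `google/reformer-crime-and-punishment` checkpoint name is an assumption, and the sampling settings are illustrative:

```python
from transformers import ReformerModelWithLMHead, ReformerTokenizer

tokenizer = ReformerTokenizer.from_pretrained("google/reformer-crime-and-punishment")
model = ReformerModelWithLMHead.from_pretrained("google/reformer-crime-and-punishment")

# LSH self-attention, reversible residual (RevNet) layers, chunked feed-forward
# and axial position embeddings (all configured via the checkpoint's
# ReformerConfig) keep the memory footprint manageable for long sequences.
input_ids = tokenizer("A few months later", return_tensors="pt").input_ids
generated = model.generate(input_ids, do_sample=True, temperature=0.7, max_length=100)
print(tokenizer.decode(generated[0]))
```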
  8. 16 Apr, 2020 1 commit
  9. 10 Apr, 2020 1 commit
  10. 27 Mar, 2020 1 commit
    • Add T5 to docs (#3461) · fa9af246
      Patrick von Platen authored
      * add t5 docs basis
      
      * improve docs
      
      * add t5 docs
      
      * improve t5 docstring
      
      * add t5 tokenizer docstring
      
      * finish docstring
      
      * make style
      
      * add pretrained models
      
      * correct typo
      
      * make examples work
      
      * finalize docs
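A minimal text-to-text sketch along the lines of the examples these docs cover; the `t5-small` model id and the task prefix are illustrative assumptions, not taken from the commit itself:

```python
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

# T5 frames every task as text-to-text, selected via a task prefix in the input.
input_ids = tokenizer(
    "translate English to German: The house is wonderful.", return_tensors="pt"
).input_ids
outputs = model.generate(input_ids, max_length=40)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```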
  11. 02 Mar, 2020 1 commit
    • Bart-CNN (#3059) · b54ef78d
      Sam Shleifer authored
      `generate` code that produces 99% identical summarizations to fairseq on CNN test data, with caching.
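A sketch of the summarization path this enables, assuming the CNN/DailyMail fine-tuned checkpoint is available as `facebook/bart-large-cnn`; the beam-search settings are illustrative rather than the exact fairseq-matching defaults:

```python
from transformers import BartForConditionalGeneration, BartTokenizer

tokenizer = BartTokenizer.from_pretrained("facebook/bart-large-cnn")
model = BartForConditionalGeneration.from_pretrained("facebook/bart-large-cnn")

article = "..."  # a long news article goes here
inputs = tokenizer(article, max_length=1024, truncation=True, return_tensors="pt")
# generate() reuses cached decoder key/value states across steps ("with caching").
summary_ids = model.generate(
    inputs["input_ids"], num_beams=4, max_length=142, early_stopping=True
)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```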
  12. 20 Feb, 2020 1 commit
    • New BartModel (#2745) · 53ce3854
      Sam Shleifer authored
      * Results same as fairseq
      * Wrote a ton of tests
      * Struggled with api signatures
      * added some docs
      
  13. 07 Feb, 2020 1 commit
  14. 30 Jan, 2020 1 commit
  15. 28 Jan, 2020 1 commit
  16. 06 Jan, 2020 2 commits
  17. 21 Dec, 2019 1 commit
  18. 18 Dec, 2019 2 commits
  19. 13 Dec, 2019 1 commit
  20. 11 Dec, 2019 3 commits
  21. 09 Dec, 2019 1 commit
  22. 05 Dec, 2019 1 commit
  23. 26 Nov, 2019 3 commits
  24. 19 Nov, 2019 1 commit
  25. 16 Nov, 2019 1 commit
  26. 08 Nov, 2019 1 commit
  27. 06 Nov, 2019 1 commit
  28. 05 Nov, 2019 1 commit
  29. 23 Oct, 2019 1 commit
  30. 11 Oct, 2019 1 commit
  31. 09 Oct, 2019 1 commit
  32. 03 Oct, 2019 2 commits