- 25 Sep, 2020 (2 commits)
  - Patrick von Platen authored
  - Patrick von Platen authored
- 22 Sep, 2020 (2 commits)
  - blinovpd authored
  - Pavel Soriano authored
    The F1 score was reported as the Exact Match score, and vice versa.
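For context, a minimal, illustrative sketch of the two SQuAD-style metrics that were swapped (simplified: official SQuAD scoring also normalizes casing, punctuation, and articles, and takes the max over reference answers):

```python
from collections import Counter

def exact_match(prediction: str, reference: str) -> float:
    # 1.0 only when the answer strings match exactly (after trimming whitespace)
    return float(prediction.strip() == reference.strip())

def token_f1(prediction: str, reference: str) -> float:
    # Harmonic mean of token-level precision and recall
    pred_tokens, ref_tokens = prediction.split(), reference.split()
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)

# e.g. prediction "the model" vs. reference "the model weights":
# exact_match -> 0.0, token_f1 -> 0.8, so the two numbers are easy to tell apart.
```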
- 21 Sep, 2020 (5 commits)
  - Thomas Winters authored
    * Added RobBERT-v2 model card
    * Minor tweaks
    Co-authored-by: Julien Chaumond <chaumond@gmail.com>
  - jjacampos authored
    This PR adds the model card for the IXAmBERT model, which was recently uploaded to the Hugging Face repository.
  - Suraj Patil authored
  - Dat Quoc Nguyen authored
    Two new pre-trained models, "vinai/bertweet-covid19-base-cased" and "vinai/bertweet-covid19-base-uncased", were produced by further pre-training "vinai/bertweet-base" on a corpus of 23M COVID-19 English tweets for 40 epochs.
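A minimal usage sketch for the new checkpoints (not from the commit; the checkpoint names are quoted above, and since BERTweet is RoBERTa-based the generic Auto classes should load it on a recent transformers release):

```python
import torch
from transformers import AutoModel, AutoTokenizer

# Checkpoint name as given in the commit; the uncased variant loads the same way.
model_name = "vinai/bertweet-covid19-base-cased"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name)

inputs = tokenizer("Stay home and stay safe!", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (1, sequence_length, hidden_size)
```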
  - Patrick von Platen authored
- 19 Sep, 2020 (4 commits)
  - Stas Bekman authored
  - Stas Bekman authored
  - Manuel Romero authored
  - Manuel Romero authored
- 18 Sep, 2020 (22 commits)
  - Dat Quoc Nguyen authored
    * Add BERTweet and PhoBERT models
    * Update modeling_auto.py: re-add `bart` to LM_MAPPING
    * Update tokenization_auto.py: re-add `from .configuration_mobilebert import MobileBertConfig` (not sure why it was replaced by `from transformers.configuration_mobilebert import MobileBertConfig`)
    * Add BERTweet and PhoBERT to pretrained_models.rst
    * Update tokenization_auto.py: remove BertweetTokenizer and PhobertTokenizer from tokenization_auto.py (they are currently not supported by AutoTokenizer)
    * Update BertweetTokenizer - without nltk
    * Update model card for BERTweet
    * PhoBERT - with Auto mode - without import fastBPE
    * PhoBERT - with Auto mode - without import fastBPE
    * BERTweet - with Auto mode - without import fastBPE
    * Add PhoBERT and BERTweet to TF modeling auto
    * Improve docstrings for PhobertTokenizer and BertweetTokenizer
    * Update PhoBERT and BERTweet model cards
    * Fixed a merge conflict in tokenization_auto
    * Used black to reformat BERTweet- and PhoBERT-related files
    * Used isort to reformat BERTweet- and PhoBERT-related files
    * Reformatted BERTweet- and PhoBERT-related files based on flake8
    * Updated test files
    * Updated test files
    * Updated tf test files
    * Updated tf test files
    * Updated tf test files
    * Updated tf test files
    * Update commits from huggingface
    * Delete unnecessary files
    * Add tokenizers to auto and init files
    * Add test files for tokenizers
    * Revised model cards
    * Update save_vocabulary function in BertweetTokenizer and PhobertTokenizer and test files
    * Revised test files
    * Update orders of Phobert and Bertweet tokenizers in auto tokenization file
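Since the commit notes that AutoTokenizer did not yet support these tokenizers, a minimal sketch loading them through their concrete classes instead ("vinai/bertweet-base" appears elsewhere in this log; "vinai/phobert-base" is an assumed checkpoint name):

```python
from transformers import BertweetTokenizer, PhobertTokenizer

# Concrete tokenizer classes, since AutoTokenizer did not cover them at the time.
bertweet_tok = BertweetTokenizer.from_pretrained("vinai/bertweet-base")
phobert_tok = PhobertTokenizer.from_pretrained("vinai/phobert-base")  # assumed checkpoint name

print(bertweet_tok.tokenize("SC has first two presumptive cases of coronavirus"))
# PhoBERT expects word-segmented Vietnamese input (underscores join multi-syllable words).
print(phobert_tok.tokenize("Tôi là sinh_viên trường đại_học"))
```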
  - Patrick von Platen authored
  - Patrick von Platen authored
  - Patrick von Platen authored
  - Patrick von Platen authored
  - Julien Chaumond authored
    We use ISO 639-1 language codes. cc @gentaiscool
  - Patrick von Platen authored
  - Patrick von Platen authored
  - Manuel Romero authored
  - Manuel Romero authored
  - Manuel Romero authored
  - Manuel Romero authored
  - Manuel Romero authored
  - Genta Indra Winata authored
  - Genta Indra Winata authored
    * Create README.md
    * Update README.md
  - Genta Indra Winata authored
  - Genta Indra Winata authored
  - Genta Indra Winata authored
  - Genta Indra Winata authored
  - Genta Indra Winata authored
  - Genta Indra Winata authored
  - tuner007 authored
    Create model card for Pegasus QA
- 17 Sep, 2020 (5 commits)
  - Stas Bekman authored
  - Stas Bekman authored
  - Stas Bekman authored
  - Stas Bekman authored
    * [model cards] ported allenai Deep Encoder, Shallow Decoder models
    * typo
    * fix references
    * add allenai/wmt19-de-en-6-6 model cards
    * fill in the missing info for the build script as provided by the searcher
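A minimal translation sketch for one of the ported checkpoints (illustrative; the FSMT classes come from the porting PR below, and the "-base" suffix on the hub name is an assumption on top of the allenai/wmt19-de-en-6-6 name quoted above):

```python
from transformers import FSMTForConditionalGeneration, FSMTTokenizer

# One of the "Deep Encoder, Shallow Decoder" checkpoints; exact hub id assumed.
mname = "allenai/wmt19-de-en-6-6-base"
tokenizer = FSMTTokenizer.from_pretrained(mname)
model = FSMTForConditionalGeneration.from_pretrained(mname)

input_ids = tokenizer.encode("Maschinelles Lernen ist großartig!", return_tensors="pt")
outputs = model.generate(input_ids, num_beams=5)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))  # an English translation
```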
  - Stas Bekman authored
    * ready for PR
    * cleanup
    * correct FSMT_PRETRAINED_MODEL_ARCHIVE_LIST
    * fix
    * perfectionism
    * revert change from another PR
    * odd, already committed this one
    * non-interactive upload workaround
    * backup the failed experiment
    * store langs in config
    * workaround for localizing model path
    * doc clean-up as in https://github.com/huggingface/transformers/pull/6956
    * style
    * back out debug mode
    * document: run_eval.py --num_beams 10
    * remove unneeded constant
    * typo
    * re-use bart's Attention
    * re-use EncoderLayer, DecoderLayer from bart
    * refactor
    * send to cuda and fp16
    * cleanup
    * revert (moved to another PR)
    * better error message
    * document run_eval --num_beams
    * solve the problem of the tokenizer finding the right files when the model is local
    * polish, remove hardcoded config
    * add a note that the file is autogenerated to avoid losing changes
    * prep for org change, remove unneeded code
    * switch to model4.pt, update scores
    * s/python/bash/
    * missing init (but doesn't impact the finetuned model)
    * cleanup
    * major refactor (reuse-bart)
    * new model, new expected weights
    * cleanup
    * cleanup
    * full link
    * fix model type
    * merge porting notes
    * style
    * cleanup
    * have to create a DecoderConfig object to handle vocab_size properly
    * doc fix
    * add note (not a public class)
    * parametrize
    * add bleu scores integration tests
    * skip test if sacrebleu is not installed
    * cache heavy models/tokenizers
    * some tweaks
    * remove tokens that aren't used
    * more purging
    * simplify code
    * switch to using decoder_start_token_id
    * add doc
    * Revert "major refactor (reuse-bart)" (this reverts commit 226dad15ca6a9ef4e26178526e878e8fc5c85874)
    * decouple from bart
    * remove unused code #1
    * remove unused code #2
    * remove unused code #3
    * update instructions
    * clean up
    * move bleu eval to examples
    * check import only once
    * move data+gen script into files
    * reuse via import
    * take less space
    * add prepare_seq2seq_batch (auto-tested)
    * cleanup
    * recode test to use json instead of yaml
    * ignore keys not needed
    * use the new -y in transformers-cli upload
    * [xlm tok] config dict: fix str into int to match definition (#7034)
    * [s2s] --eval_max_generate_length (#7018)
    * Fix CI with change of name of nlp (#7054)
    * nlp -> datasets
    * More nlp -> datasets
    * Woopsie
    * More nlp -> datasets
    * One last
    * extending to support allen_nlp wmt models: allow a specific checkpoint file to be passed; more arg settings; scripts for allen_nlp models
    * sync with changes
    * s/fsmt-wmt/wmt/ in model names
    * s/fsmt-wmt/wmt/ in model names (p2)
    * s/fsmt-wmt/wmt/ in model names (p3)
    * switch to a better checkpoint
    * typo
    * make non-optional args such; adjust tests where possible or skip when there is no other choice
    * consistency
    * style
    * adjust header
    * cards moved (model rename)
    * use best custom hparams
    * update info
    * remove old cards
    * cleanup
    * s/stas/facebook/
    * update scores
    * s/allen_nlp/allenai/
    * url maps aren't needed
    * typo
    * move all the doc/build/eval generators to their own scripts
    * cleanup
    * Apply suggestions from code review (Co-authored-by: Lysandre Debut <lysandre@huggingface.co>)
    * Apply suggestions from code review (Co-authored-by: Lysandre Debut <lysandre@huggingface.co>)
    * fix indent
    * duplicated line
    * style
    * use the correct add_start_docstrings
    * oops
    * resizing can't be done with the core approach, due to 2 dicts
    * check that the arg is a list
    * style
    * style
    Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
    Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
    Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
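Two details from the list above merit a concrete sketch: the prepare_seq2seq_batch helper this PR adds, and the --num_beams 10 setting it documents for run_eval.py. A minimal, era-specific sketch (prepare_seq2seq_batch was later deprecated in favor of calling the tokenizer directly; the facebook/wmt19-en-de checkpoint name is inferred from the s/stas/facebook/ rename above):

```python
from transformers import FSMTForConditionalGeneration, FSMTTokenizer

mname = "facebook/wmt19-en-de"  # inferred from the "s/stas/facebook/" rename
tokenizer = FSMTTokenizer.from_pretrained(mname)
model = FSMTForConditionalGeneration.from_pretrained(mname)

# Era-specific batching helper; newer releases call tokenizer(...) directly instead.
batch = tokenizer.prepare_seq2seq_batch(
    ["Machine learning is great, isn't it?"], return_tensors="pt"
)
generated = model.generate(**batch, num_beams=10)  # matching run_eval.py --num_beams 10
print(tokenizer.batch_decode(generated, skip_special_tokens=True))
```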