Commits · ee1bff06f883bb2f993959885c8ded00b5479cea · chenpangpang / transformers

02 Sep, 2020 5 commits
- minor docs grammar fixes (#6889) · ee1bff06
  Harry Wang authored Sep 02, 2020
  
  ee1bff06
- fix warning for position ids (#6884) · 8abd7f69
  Patrick von Platen authored Sep 02, 2020
  
  8abd7f69
- Update modeling_bert.py (#6897) · 7cb0572c
  Parthe Pandit authored Sep 02, 2020
```
outptus -> outputs in example of BertForPreTraining
```
  7cb0572c
- Model card for huBERT (#6893) · e3c55ceb
  David Mark Nemeskey authored Sep 02, 2020
```
* Create README.md

Model card for huBERT.

* Update README.md

lowercase h

* Update model_cards/SZTAKI-HLT/hubert-base-cc/README.md
Co-authored-by: Julien Chaumond <chaumond@gmail.com>
```
  e3c55ceb
- fix QA example for PT (#6890) · 1889e96c
  Patrick von Platen authored Sep 02, 2020
  
  1889e96c
01 Sep, 2020 22 commits
- [model_cards] Fix file path for flexudy/t5-base-multi-sentence-doctor · d822ab63
  Julien Chaumond authored Sep 02, 2020
  
  d822ab63
- Create README.md (#6598) · ad5fb33c
  Rohan Rajpal authored Sep 02, 2020
  
  ad5fb33c
- Create README.md (#6602) · f9dadcd8
  Rohan Rajpal authored Sep 02, 2020
  
  f9dadcd8
- Update multilingual passage reranking model card (#6788) · f5d69c75
  Igli Manaj authored Sep 01, 2020
```
Fix range of possible scores, add inference.
```
  f5d69c75
- Model card for primer/BART-Squad2 (#6801) · 5d820f3c
  Tom Grek authored Sep 01, 2020
  
  5d820f3c
- added model card for flexudys t5 model (#6759) · 8b884dad
  zolekode authored Sep 01, 2020
```
Co-authored-by: zolekode <pascal.zoleko@fau.de>
```
  8b884dad
- loodos turkish model cards added (#6840) · bff6d517
  hakan authored Sep 02, 2020
  
  bff6d517
- Create README.md (#6887) · 502d194b
  Manuel Romero authored Sep 01, 2020
```
Add language meta attribute
```
  502d194b
- Create README.md (#6888) · d082edf2
  Manuel Romero authored Sep 01, 2020
```
Add language meta attribute
```
  d082edf2
- Create README.md (#6886) · dacbee9a
  Abed khooli authored Sep 02, 2020
```
* Create README.md

model card for  akhooli/xlm-r-large-arabic-sent

* Update model_cards/akhooli/xlm-r-large-arabic-sent/README.md
Co-authored-by: Julien Chaumond <chaumond@gmail.com>
```
  dacbee9a
- Create README.md (#6885) · e2971e61
  Abed khooli authored Sep 01, 2020
  
  e2971e61
- [EncoderDecoder] Add xlm-roberta to encoder decoder (#6878) · 4d1a3ffd
  Patrick von Platen authored Sep 01, 2020
```
* finish xlm-roberta

* finish docs

* expose XLMRobertaForCausalLM
```
  4d1a3ffd
- Create README.md (#6883) · 31199263
  Patrick von Platen authored Sep 01, 2020
```
* Create README.md

* Update README.md
```
  31199263
- Add cache_dir to save features TextDataset (#6879) · 21d71923
  Jin Young (Daniel) Sohn authored Sep 01, 2020
```
* Add cache_dir to save features TextDataset

This is in case the dataset is on a RO filesystem, which is the case
in tests (GKE TPU tests).

* style
```
  21d71923
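
The idea behind 21d71923 can be sketched roughly as follows. This is not the `TextDataset` code from the commit: `load_or_build_features`, its parameters, and the `cached_` file prefix are hypothetical names chosen for illustration. The sketch assumes features are cached as a pickle, and that an optional `cache_dir` redirects the cache away from a dataset directory that may be read-only.

```python
import os
import pickle
import tempfile


def load_or_build_features(data_file, build_fn, cache_dir=None):
    """Return features for `data_file`, caching them as a pickle.

    Hypothetical helper: the cache is written to `cache_dir` when given,
    otherwise next to the data file (which fails on a read-only filesystem).
    """
    directory, filename = os.path.split(data_file)
    target_dir = cache_dir if cache_dir is not None else directory
    cache_path = os.path.join(target_dir, "cached_" + filename)
    if os.path.exists(cache_path):
        with open(cache_path, "rb") as handle:
            return pickle.load(handle), cache_path
    features = build_fn(data_file)
    with open(cache_path, "wb") as handle:
        pickle.dump(features, handle)
    return features, cache_path


def tokenize(path):
    # Toy feature builder: whitespace tokenization of the whole file.
    with open(path) as handle:
        return handle.read().split()


with tempfile.TemporaryDirectory() as data_dir, tempfile.TemporaryDirectory() as cache_dir:
    data_file = os.path.join(data_dir, "train.txt")
    with open(data_file, "w") as handle:
        handle.write("hello world\n")
    features, cache_path = load_or_build_features(data_file, tokenize, cache_dir=cache_dir)
    # The cache lands in cache_dir; the dataset directory is left untouched.
    cached_in_cache_dir = os.path.dirname(cache_path) == cache_dir
    data_dir_untouched = os.listdir(data_dir) == ["train.txt"]
```

With `cache_dir=None` the same helper would try to write next to `train.txt`, which is the situation the commit message says breaks on a read-only filesystem.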
- Update docs stable version · 1461aac8
  Lysandre Debut authored Sep 01, 2020
  
  1461aac8
- v3.1.0 documentation · 3726754a
  Lysandre authored Sep 01, 2020
  
  3726754a
- Release: v3.1.0 · 4b3ee9cb
  Lysandre authored Sep 01, 2020
  
  4b3ee9cb
- [Generate] Facilitate PyTorch generate using `ModelOutputs` (#6735) · afc4ece4
  Patrick von Platen authored Sep 01, 2020
```
* fix generate for GPT2 Double Head

* fix gpt2 double head model

* fix  bart / t5

* also add for no beam search

* fix no beam search

* fix encoder decoder

* simplify t5

* simplify t5

* fix t5 tests

* fix BART

* fix transfo-xl

* fix conflict

* integrating sylvains and sams comments

* fix tf past_decoder_key_values

* fix enc dec test
```
  afc4ece4
- Restore PaddingStrategy.MAX_LENGTH on QAPipeline while no v2. (#6875) · 397f8196
  Funtowicz Morgan authored Sep 01, 2020
```
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
```
  397f8196
- delete reinit (#6862) · a32d85f0
  Sam Shleifer authored Sep 01, 2020
  
  a32d85f0
- Logging doc (#6852) · d5f1ffa0
  Sylvain Gugger authored Sep 01, 2020
```
* Add logging doc

* Formatting

* Update docs/source/main_classes/logging.rst

* Update src/transformers/utils/logging.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
```
  d5f1ffa0
- add a final report to all pytest jobs (#6861) · 59a6a32a
  Stas Bekman authored Aug 31, 2020
```
we had it added for one job, please add it to all pytest jobs - we need the output of what tests were run to debug the codecov issue. thank you!
```
  59a6a32a
31 Aug, 2020 13 commits

- [fix] typo in available in helper function (#6859) · 431ab19d
  Sam Shleifer authored Aug 31, 2020
  
  431ab19d
- Bart can make decoder_input_ids from labels (#6758) · 367235ee
  Sam Shleifer authored Aug 31, 2020
  
  367235ee
- [s2s] command line args for faster val steps (#6833) · b9772897
  Sam Shleifer authored Aug 31, 2020
  
  b9772897
- Fix marian slow test (#6854) · 8af1970e
  Sam Shleifer authored Aug 31, 2020
  
  8af1970e
- Update ONNX notebook to include section on quantization. (#6831) · bbdba0a7
  Funtowicz Morgan authored Aug 31, 2020
```
* Update ONNX notebook to include section on quantization.
Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Addressing ONNX team comments
```
  bbdba0a7
- Split hp search methods (#6857) · a59bcefb
  Sylvain Gugger authored Aug 31, 2020
```
* Split the run_hp_search by backend

* Unused import
```
  a59bcefb
- Add checkpointing to Ray Tune HPO (#6747) · 23f9611c
  krfricke authored Aug 31, 2020
```
* Introduce HPO checkpointing for PBT

* Moved checkpoint saving

* Fixed checkpoint subdir pass

* Fixed style

* Enable/disable checkpointing, check conditions for various tune schedulers incl. PBT

* Adjust number of GPUs to number of jobs

* Avoid mode pickling in ray

* Move hp search to integrations
```
  23f9611c
- Marian distill scripts + integration test (#6799) · 61b7ba93
  Sam Shleifer authored Aug 31, 2020
  
  61b7ba93
- Only access loss tensor every logging_steps (#6802) · 02d09c8f
  Jin Young (Daniel) Sohn authored Aug 31, 2020
```
* Only access loss tensor every logging_steps

* tensor.item() was being called every step. This must not be done
for XLA:TPU tensors as it's terrible for performance causing TPU<>CPU
communication at each step. On RoBERTa MLM for example, it reduces step
time by 30%, should be larger for smaller step time models/tasks.
* Train batch size was not correct in case a user uses the
`per_gpu_train_batch_size` flag
* Avg reduce loss across eval shards

* Fix style (#6803)

* t5 model should make decoder_attention_mask (#6800)

* [s2s] Test hub configs in self-scheduled CI (#6809)

* [s2s] round runtime in run_eval (#6798)

* Pegasus finetune script: add --adafactor (#6811)

* [bart] rename self-attention -> attention (#6708)

* [tests] fix typos in inputs (#6818)

* Fixed open in colab link (#6825)

* Add model card for singbert lite. Update widget for singbert and singbert-large. (#6827)

* BR_BERTo model card (#6793)

* clearly indicate shuffle=False (#6312)

* Clarify shuffle

* clarify shuffle
Co-authored-by: Kevin Canwen Xu <canwenxu@126.com>

* [s2s README] Add more dataset download instructions (#6737)

* Style

* Patch logging issue

* Set default logging level to `WARNING` instead of `INFO`

* TF Flaubert w/ pre-norm (#6841)

* Dataset and DataCollator for BERT Next Sentence Prediction (NSP) task (#6644)

* add datacollator and dataset for next sentence prediction task

* bug fix (numbers of special tokens & truncate sequences)

* bug fix (+ dict inputs support for data collator)

* add padding for nsp data collator; renamed cached files to avoid conflict.

* add test for nsp data collator

* Style
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>

* Fix in Adafactor docstrings (#6845)

* Fix resuming training for Windows (#6847)

* Only access loss tensor every logging_steps

* tensor.item() was being called every step. This must not be done
for XLA:TPU tensors as it's terrible for performance causing TPU<>CPU
communication at each step. On RoBERTa MLM for example, it reduces step
time by 30%, should be larger for smaller step time models/tasks.
* Train batch size was not correct in case a user uses the
`per_gpu_train_batch_size` flag
* Avg reduce loss across eval shards

* comments
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Thomas Ashish Cherian <6967017+PandaWhoCodes@users.noreply.github.com>
Co-authored-by: Zane Lim <zyuanlim@gmail.com>
Co-authored-by: Rodolfo De Nadai <rdenadai@gmail.com>
Co-authored-by: xujiaze13 <37360975+xujiaze13@users.noreply.github.com>
Co-authored-by: Kevin Canwen Xu <canwenxu@126.com>
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Huang Lianzhe <hlz@pku.edu.cn>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
```
  02d09c8f
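
The logging change described in 02d09c8f can be illustrated with a small sketch. It is not the Trainer code from the commit: `CountingScalar` is a hypothetical stand-in for a device tensor whose `.item()` call forces a device-to-host transfer, and `train` is an illustrative loop. The point is only that the loss is accumulated without reading it back, and `.item()` runs once per `logging_steps` rather than once per step.

```python
class CountingScalar:
    """Hypothetical stand-in for a device tensor; counts simulated host reads."""

    syncs = 0  # number of simulated device-to-host transfers

    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        # Addition stays "on device": no host read is recorded.
        return CountingScalar(self.value + other.value)

    def item(self):
        # Reading the value back is the expensive operation.
        CountingScalar.syncs += 1
        return self.value


def train(step_losses, logging_steps):
    """Accumulate the loss lazily; read it back only every `logging_steps`."""
    running = CountingScalar(0.0)
    last_logged_total = 0.0
    logged = []
    for step, loss in enumerate(step_losses, start=1):
        running = running + loss  # no .item() call here
        if step % logging_steps == 0:
            total = running.item()  # the only host read in this window
            logged.append((total - last_logged_total) / logging_steps)
            last_logged_total = total
    return logged


losses = [CountingScalar(float(v)) for v in (4, 2, 3, 1, 5, 6)]
CountingScalar.syncs = 0
averages = train(losses, logging_steps=3)
host_reads = CountingScalar.syncs
```

Six steps with `logging_steps=3` produce two host reads instead of six. With real XLA tensors the saving would depend on the model and step time, as the commit message notes.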

- Fix resuming training for Windows (#6847) · c48546c7
  Sylvain Gugger authored Aug 31, 2020
  
  c48546c7
- Fix in Adafactor docstrings (#6845) · d2f9cb83
  Sylvain Gugger authored Aug 31, 2020
  
  d2f9cb83
- Dataset and DataCollator for BERT Next Sentence Prediction (NSP) task (#6644) · 2de7ee03
  Huang Lianzhe authored Aug 31, 2020
```
* add datacollator and dataset for next sentence prediction task

* bug fix (numbers of special tokens & truncate sequences)

* bug fix (+ dict inputs support for data collator)

* add padding for nsp data collator; renamed cached files to avoid conflict.

* add test for nsp data collator

* Style
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
```
  2de7ee03
- TF Flaubert w/ pre-norm (#6841) · 895d3946
  Lysandre Debut authored Aug 31, 2020
  
  895d3946