- 08 Jun, 2020 (7 commits)
  - Sylvain Gugger authored: Clean documentation
  - Lysandre authored
  - Julien Plu authored:
    - Align the checkpoint directory with the PT trainer
    - Use args for the maximum number of checkpoints to keep
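The keep-N behaviour described above maps naturally onto `tf.train.CheckpointManager`; a minimal sketch, assuming a `save_total_limit`-style argument and a `checkpoint` directory prefix shared with the PyTorch trainer (the model, optimizer, and paths below are illustrative stand-ins, not the trainer's exact code):

```python
import tensorflow as tf

# Hypothetical stand-ins for the trainer's model, optimizer, and args.
model = tf.keras.Sequential([tf.keras.layers.Dense(2)])
optimizer = tf.keras.optimizers.Adam()
output_dir = "./output"               # assumed training-args value
save_total_limit = 5                  # assumed training-args value
PREFIX_CHECKPOINT_DIR = "checkpoint"  # prefix aligned with the PT trainer

ckpt = tf.train.Checkpoint(optimizer=optimizer, model=model)
manager = tf.train.CheckpointManager(
    ckpt,
    directory=f"{output_dir}/{PREFIX_CHECKPOINT_DIR}",
    max_to_keep=save_total_limit,  # the cap comes from args, not a constant
)
manager.save()  # checkpoints beyond max_to_keep are pruned automatically
```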
  - Patrick von Platen authored:
    - Fix flaky beam search
    - Fix typo
  - Patrick von Platen authored
  - Sylvain Gugger authored:
    - Expose classes used in documentation
    - Format code
  - daniel-shan authored (co-authored by Daniel Shan <daniel.shan@workday.com>)
- 07 Jun, 2020 (1 commit)
  - Bram Vanroy authored
- 06 Jun, 2020 (3 commits)
  - Sam Shleifer authored
  - Sylvain Gugger authored
- 05 Jun, 2020 (13 commits)
  - Sylvain Gugger authored: Add badges for models and docs
  - Sam Shleifer authored
  - Sam Shleifer authored
  - Patrick von Platen authored:
    - Automatically mark the decoder config as a decoder
    - Add more tests
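An illustrative sketch of the "set the decoder config automatically" idea (the helper and the config objects below are stand-ins, not the library's exact encoder-decoder code):

```python
from types import SimpleNamespace

def build_encoder_decoder_config(encoder_config, decoder_config):
    # Flip the flag for the user instead of trusting the caller to do it.
    decoder_config.is_decoder = True
    return SimpleNamespace(encoder=encoder_config, decoder=decoder_config)

enc = SimpleNamespace(is_decoder=False)
dec = SimpleNamespace(is_decoder=False)
cfg = build_encoder_decoder_config(enc, dec)
assert cfg.decoder.is_decoder  # no silently misconfigured decoder
```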
  - Sylvain Gugger authored
  - Sylvain Gugger authored
  - Sylvain Gugger authored:
    - Fix argument label
    - Fix test
  - Sam Shleifer authored
  - Sylvain Gugger authored
  - Sylvain Gugger authored:
    - Add model summary
    - Add link to pretrained models
  - Lysandre Debut authored (co-authored by Julien Chaumond <chaumond@gmail.com>):
    - No silent error when d_head is already in the configuration
    - Update src/transformers/configuration_xlnet.py
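A simplified sketch of the kind of validation this commit points at (not the full XLNetConfig, and the helper name is hypothetical): a user-supplied `d_head` that disagrees with `d_model // n_head` should raise instead of being silently overwritten.

```python
def resolve_d_head(d_model, n_head, **kwargs):
    """Hypothetical helper mirroring the config check described above."""
    if "d_head" in kwargs and kwargs["d_head"] != d_model // n_head:
        raise ValueError(
            f"`d_head` ({kwargs['d_head']}) must equal "
            f"`d_model // n_head` ({d_model // n_head})"
        )
    return d_model // n_head

resolve_d_head(1024, 16)               # fine: resolves to 64
# resolve_d_head(1024, 16, d_head=32)  # now raises instead of passing silently
```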
  - Julien Chaumond authored
  - Sylvain Gugger authored
- 04 Jun, 2020 (14 commits)
  - Julien Plu authored (co-authored by Julien Chaumond <chaumond@gmail.com>):
    - Better handling of None gradients
    - Apply style
    - Apply style
    - Create a loss class per task to compute its respective loss
    - Add loss classes to the ALBERT TF models
    - Add loss classes to the BERT TF models
    - Add question answering and multiple choice to TF Camembert
    - Remove prints
    - Add multiple choice model to TF DistilBERT + loss computation
    - Add question answering model to TF Electra + loss computation
    - Add token classification, question answering and multiple choice models to TF Flaubert
    - Add multiple choice model to TF Roberta + loss computation
    - Add multiple choice model to TF XLM + loss computation
    - Add multiple choice and question answering models to TF XLM-Roberta
    - Add multiple choice model to TF XLNet + loss computation
    - Remove unused parameters
    - Add task loss classes
    - Reorder TF imports + add new model classes
    - Add new model classes
    - Bugfix in TF T5 model
    - Bugfix for TF T5 tests
    - Bugfix in TF T5 model
    - Fix TF T5 model tests
    - Fix T5 tests + some renaming
    - Fix inheritance issue in the AutoX tests
    - Add tests for TF Flaubert and TF XLM-Roberta
    - Add tests for TF Flaubert and TF XLM-Roberta
    - Remove unused piece of code in the TF trainer
    - Bugfix and remove unused code
    - Bugfix for TF 2.2
    - Apply style
    - Divide TFSequenceClassificationAndMultipleChoiceLoss into its two respective names
    - Apply style
    - Mirror the PT Trainer in the TF one: fp16, optimizers and tb_writer as class parameters, and better dataset handling
    - Fix TF optimization tests and apply style
    - Remove useless parameter
    - Bugfix and apply style
    - Fix TF Trainer prediction
    - The TF models now return the loss like their PyTorch counterparts
    - Apply style
    - Ignore some test outputs
    - Take into account the SQuAD cls_index, p_mask and is_impossible parameters for the QuestionAnswering task models
    - Fix names for SQuAD data
    - Apply style
    - Fix conflicts with the 2.11 release
    - Fix conflicts with 2.11
    - Fix wrong name
    - Add better documentation on the new create_optimizer function
    - Fix isort
    - logging_dir: use the same default as PyTorch
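A condensed sketch of the per-task-loss idea behind this commit (illustrative, not the exact library mixins): each task class knows how to turn (labels, logits) into a loss, so every TF model can return a loss like its PyTorch counterpart. The -100 ignore label mirrors PyTorch's `ignore_index` convention; the class name is a stand-in.

```python
import tensorflow as tf

class TokenClassificationLossSketch:
    """Hypothetical task-loss mixin in the spirit of the commit above."""

    def compute_loss(self, labels, logits):
        loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(
            from_logits=True, reduction=tf.keras.losses.Reduction.NONE
        )
        # Drop positions labelled -100 (padding / ignored tokens).
        flat_labels = tf.reshape(labels, (-1,))
        active = flat_labels != -100
        flat_logits = tf.reshape(logits, (-1, tf.shape(logits)[-1]))
        return loss_fn(
            tf.boolean_mask(flat_labels, active),
            tf.boolean_mask(flat_logits, active),
        )

labels = tf.constant([[1, 2, -100]])
logits = tf.random.normal((1, 3, 5))
loss = TokenClassificationLossSketch().compute_loss(labels, logits)
```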
  - Stefan Schweter authored:
    - ner: add a preprocessing script for the examples that splits longer sentences
    - ner: example shell scripts now use local preprocessing
    - ner: add a new example section for the WNUT’17 NER task; remove the old English CoNLL-03 results
    - ner: satisfy black and isort
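A sketch of the splitting idea (simplified; the tokenizer checkpoint, budget, and function name are assumptions, not the example script verbatim): insert a sentence break before any token that would push the running subtoken count past the model's limit, so no training example exceeds the maximum sequence length.

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
MAX_SUBTOKENS = 128  # assumed budget; a real script also reserves special tokens

def split_long_sentences(conll_lines):
    """Yield CoNLL lines, adding a blank line (sentence break) when needed."""
    subword_count = 0
    for line in conll_lines:
        if not line.strip():           # existing sentence boundary
            subword_count = 0
            yield line
            continue
        token = line.split()[0]
        n = len(tokenizer.tokenize(token))
        if subword_count + n > MAX_SUBTOKENS:
            yield ""                   # force a new sentence here
            subword_count = 0
        subword_count += n
        yield line
```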
  - Setu Shah authored
  - prajjwal1 authored
  - Sylvain Gugger authored
  - Manuel Romero authored
  - Oren Amsalem authored
  - Suraj Parmar authored: Model card for SanBERTa (RoBERTa trained on Sanskrit)
  - Sylvain Gugger authored
  - Jason Phang authored
  - Lysandre Debut authored:
    - Codecov setup
    - Understanding codecov
  - Sam Shleifer authored
  - Funtowicz Morgan authored:
    - Refactor tensor creation in tokenizers
    - Make sure to convert string to TensorType
    - Refactor convert_to_tensors_
    - Introduce numpy tensor creation
    - Format
    - Add a unit test for TensorType creation from str
    - Sort imports
    - Add unit tests for numpy tensor conversion
    - Do not use the in-place version of squeeze, as numpy doesn't provide such a feature
    - Add an extra parameter prepend_batch_axis: bool on prepare_for_model
    - Ensure test_np_encode_plus_sent_to_model is not executed for encoder/decoder models
    - Style
    - numpy tests require_torch for now, while flax is not merged
    - Hopefully this will make flake8 happy
    - One more time 🎶
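A reduced sketch of the shape of this refactor (the names follow the commit messages; the function body is illustrative, not the tokenizer's exact code): the target framework becomes an enum constructible from a string, and conversion can optionally prepend a batch axis.

```python
from enum import Enum

import numpy as np

class TensorType(Enum):
    PYTORCH = "pt"
    TENSORFLOW = "tf"
    NUMPY = "np"

def convert_to_tensors(ids, tensor_type, prepend_batch_axis=False):
    if not isinstance(tensor_type, TensorType):
        tensor_type = TensorType(tensor_type)  # str -> TensorType
    if prepend_batch_axis:
        ids = [ids]                            # add a leading batch dimension
    if tensor_type is TensorType.NUMPY:
        return np.asarray(ids)
    raise NotImplementedError("pt/tf branches elided in this sketch")

batch = convert_to_tensors([101, 2003, 102], "np", prepend_batch_axis=True)
assert batch.shape == (1, 3)
```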
- 03 Jun, 2020 (2 commits)
  - Funtowicz Morgan authored:
    - Ensure tokens in never_split are not split when using the basic tokenizer before wordpiece
    - never_split membership now uses a set(), which is 10x faster for this operation
    - Use union to concatenate two sets
    - Update the docstring for the never_split parameter
    - Avoid set.union() if never_split is None
    - Add comments
    - Correct docstring format
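A small sketch of the two ideas above (the class and helper names are illustrative, not the tokenizer's exact code): hold never_split as a set so the per-token membership test is O(1), and take a union with per-call additions only when the caller actually passes some.

```python
class BasicTokenizerSketch:
    """Hypothetical basic tokenizer keeping never_split as a set."""

    def __init__(self, never_split=None):
        self.never_split = set(never_split) if never_split else set()

    def tokenize(self, text, never_split=None):
        # Union only when needed, avoiding a copy for the common None case.
        never_split = (
            self.never_split.union(set(never_split))
            if never_split
            else self.never_split
        )
        out = []
        for token in text.split():
            if token in never_split:   # O(1) membership test on a set
                out.append(token)      # keep special tokens intact
            else:
                out.extend(self._split_on_punc(token))
        return out

    @staticmethod
    def _split_on_punc(token):
        return [token]  # placeholder for the real punctuation splitting
```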
  - Lysandre Debut authored