- 23 Dec, 2020 1 commit
-
-
Suraj Patil authored
* add past_key_values
* add use_cache option
* make mask before cutting ids
* adjust position_ids according to past_key_values
* flatten past_key_values
* fix positional embeds
* fix _reorder_cache
* set use_cache to False when not a decoder, fix attention mask init
* add test for caching
* add past_key_values for RoBERTa
* fix position embeds
* add caching test for RoBERTa
* add doc
* make style
* doc, fix attention mask, test
* small fixes
* address Patrick's comments
* input_ids shouldn't start with pad token
* use_cache only when decoder
* make consistent with BERT
* make copies consistent
* add use_cache to encoder
* add past_key_values to TAPAS attention
* apply suggestions from code review
* make copies consistent
* add attention mask in tests
* remove "copied from" Longformer
* apply suggestions from code review
* fix BART test
* nit
* simplify model outputs
* fix doc
* fix output ordering
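A minimal usage sketch of the caching flow this change enables, assuming a BERT checkpoint re-configured as a decoder (`is_decoder=True`); the prompt is illustrative and the checkpoint is not trained for generation, so this only demonstrates the cache mechanics:

```python
from transformers import BertConfig, BertLMHeadModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
config = BertConfig.from_pretrained("bert-base-uncased", is_decoder=True)
model = BertLMHeadModel.from_pretrained("bert-base-uncased", config=config)

inputs = tokenizer("Hello, my dog is", return_tensors="pt")
outputs = model(**inputs, use_cache=True)
past = outputs.past_key_values  # cached key/value states for every layer

# On the next step, feed only the newly generated token together with the cache.
next_token = outputs.logits[:, -1:].argmax(-1)
outputs = model(input_ids=next_token, past_key_values=past, use_cache=True)
```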
-
- 22 Dec, 2020 3 commits
-
-
Patrick von Platen authored
* adapt cookie cutter * fix copy-paste statement * delete copy statements for now * remove unused import from template * make doc rst * correct config docstring * correct training * correct input processing for TF enc-dec * make style * adapt templates * clean tabs * correct tensor -> Tensor naming * correct indent * correct templates * fix the test * break lines to avoid > 119 * Apply suggestions from code review
-
Sylvain Gugger authored
* Add label smoothing in Trainer * Add options for scheduler and Adafactor in Trainer * Put Seq2SeqTrainer in the main lib * Apply suggestions from code review Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Address review comments and adapt scripts * Documentation * Move test not using script to tests folder Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
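A short sketch of the Trainer options named above; argument values are illustrative, and `Seq2SeqTrainer` can be swapped in for `Trainer` when generation-based evaluation is needed:

```python
from transformers import Trainer, TrainingArguments

args = TrainingArguments(
    output_dir="out",
    label_smoothing_factor=0.1,   # label smoothing on the cross-entropy loss
    adafactor=True,               # use Adafactor instead of AdamW
    lr_scheduler_type="cosine",   # choose the learning-rate schedule
)
# trainer = Trainer(model=model, args=args, train_dataset=train_dataset)
```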
-
Patrick von Platen authored
* add tests * make style and fix bart bug * fix bart past key value edge case * correct tf bart test * fix gpt2 tf * fix t5 test
-
- 21 Dec, 2020 3 commits
-
-
Patrick von Platen authored
* add converter * delete unnecessary comments
-
Suraj Patil authored
* add base model classes to bart subclassed models * add doc
-
TobiasNorlund authored
-
- 19 Dec, 2020 1 commit
-
-
sandip authored
* TF TransfoXL seq classification * Update test_modeling_tf_transfo_xl.py: added num_labels at the config level * TF TransfoXL seq classification * Update test_modeling_tf_transfo_xl.py: added num_labels at the config level * code refactor * code refactor * code refactor
-
- 18 Dec, 2020 2 commits
-
-
Sylvain Gugger authored
* Add timing inside Trainer * Fix tests * Add n_objs for train * Sort logs
-
Sylvain Gugger authored
* Add new run_swag example * Add check * Add sample * Apply suggestions from code review Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * Very important change to make Lysandre happy Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
-
- 17 Dec, 2020 1 commit
-
-
sandip authored
* Added TF CTRL Sequence Classification * code refactor
-
- 16 Dec, 2020 3 commits
-
-
Lysandre Debut authored
* AutoModelForTableQuestionAnswering * TableQuestionAnsweringPipeline * Apply suggestions from Patrick's code review Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Sylvain and Patrick comments * Better PyTorch/TF error message * Add integration tests * Argument Handler naming Co-authored-by: patrickvonplaten <patrick.v.platen@gmail.com> * Fix docs to appease the documentation gods Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
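A hedged sketch of the resulting pipeline; the WTQ checkpoint name is taken from the TAPAS work below and assumed here, and TAPAS additionally requires the torch-scatter package:

```python
from transformers import pipeline

table_qa = pipeline("table-question-answering", model="google/tapas-base-finetuned-wtq")

table = {
    "Repository": ["transformers", "datasets", "tokenizers"],
    "Stars": ["36542", "4512", "3934"],  # cells must be strings
}
print(table_qa(table=table, query="How many stars does the transformers repository have?"))
```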
-
Lysandre Debut authored
* AutoModelForTableQuestionAnswering * Update src/transformers/models/auto/modeling_auto.py * Style
-
Patrick von Platen authored
* save intermediate
* save intermediate
* save intermediate
* correct flax bert model file
* new module / model naming
* make style
* almost finish BERT
* finish RoBERTa
* make fix-copies
* delete keys file
* last refactor
* fixes in run_mlm_flax.py
* remove pooled from run_mlm_flax.py
* fix gelu | gelu_new
* remove Module from inits
* splits
* dirty print
* prevent warmup_steps == 0
* smaller splits
* make fix-copies
* dirty print
* dirty print
* initial_evaluation argument
* declaration order fix
* proper model initialization/loading
* proper initialization
* run_mlm_flax improvements: improper model inputs bugfix + automatic dataset splitting + tokenizers parallelism warning + avoiding warmup_steps=0 bug
* removed tokenizers warning hack, fixed model re-initialization
* reverted training_args.py changes
* fix flax from_pretrained
* improve test in flax
* apply Sylvain's tips
* update init
* make 0.3.0 compatible
* revert Teven's changes
* revert Teven's changes 2
* finalize revert
* fix bug
* add docs
* add pretrained to init
* Update src/transformers/modeling_flax_utils.py
* fix copies
* final improvements
Co-authored-by: TevenLeScao <teven.lescao@gmail.com>
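A minimal usage sketch for the refactored Flax BERT, assuming jax and flax are installed; the exact return type (tuple vs. model output object) may vary by version:

```python
from transformers import BertTokenizerFast, FlaxBertModel

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
model = FlaxBertModel.from_pretrained("bert-base-uncased")

inputs = tokenizer("Hello, Flax!", return_tensors="np")  # Flax models take numpy/JAX arrays
outputs = model(**inputs)
print(outputs[0].shape)  # last hidden state: (batch, sequence_length, hidden_size)
```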
-
- 15 Dec, 2020 7 commits
-
-
NielsRogge authored
* First commit: adding all files from tapas_v3
* Fix multiple bugs including soft dependency and new structure of the library
* Improve testing by adding torch_device to inputs and adding dependency on scatter
* Use Python 3 inheritance rather than Python 2
* First draft model cards of base sized models
* Remove model cards as they are already on the hub
* Fix multiple bugs with integration tests
* All model integration tests pass
* Remove print statement
* Add test for convert_logits_to_predictions method of TapasTokenizer
* Incorporate suggestions by Google authors
* Fix remaining tests
* Change position embeddings sizes to 512 instead of 1024
* Comment out positional embedding sizes
* Update PRETRAINED_VOCAB_FILES_MAP and PRETRAINED_POSITIONAL_EMBEDDINGS_SIZES
* Added more model names
* Fix truncation when no max length is specified
* Disable torchscript test
* Make style & make quality
* Quality
* Address CI needs
* Test the Masked LM model
* Fix the masked LM model
* Truncate when overflowing
* More much needed docs improvements
* Fix some URLs
* Some more docs improvements
* Test PyTorch scatter
* Set to slow + minify
* Calm flake8 down
* Add add_pooling_layer argument to TapasModel; fix comments by @sgugger and @patrickvonplaten
* Fix issue in docs + fix style and quality
* Clean up conversion script and add task parameter to TapasConfig
* Revert the task parameter of TapasConfig; some minor fixes
* Improve conversion script and add test for absolute position embeddings
* Fix bug with reset_position_index_per_cell arg of the conversion CLI
* Add notebooks to the examples directory and fix style and quality
* Apply suggestions from code review
* Move from `nielsr/` to `google/` namespace
* Apply Sylvain's comments
Co-authored-by: sgugger <sylvain.gugger@gmail.com>
Co-authored-by: Rogge Niels <niels.rogge@howest.be>
Co-authored-by: LysandreJik <lysandre.debut@reseau.eseo.fr>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
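A hedged end-to-end sketch of the new model, following the `google/` namespace above; TAPAS needs torch-scatter installed, and the table and query are illustrative:

```python
import pandas as pd
from transformers import TapasTokenizer, TapasForQuestionAnswering

tokenizer = TapasTokenizer.from_pretrained("google/tapas-base-finetuned-wtq")
model = TapasForQuestionAnswering.from_pretrained("google/tapas-base-finetuned-wtq")

table = pd.DataFrame({"City": ["Paris", "Berlin"], "Population": ["2148000", "3645000"]})
inputs = tokenizer(table=table, queries=["What is the population of Paris?"],
                   padding="max_length", return_tensors="pt")
outputs = model(**inputs)

# convert_logits_to_predictions is the TapasTokenizer helper mentioned in the tests above.
coords, agg_indices = tokenizer.convert_logits_to_predictions(
    inputs, outputs.logits.detach(), outputs.logits_aggregation.detach()
)
print(coords, agg_indices)
```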
-
Sylvain Gugger authored
* Add possibility to switch between APEX and AMP in Trainer * Update src/transformers/training_args.py Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * Address review comments * Update src/transformers/training_args.py Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
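A small sketch of the switch, assuming it is exposed through the `fp16_backend` training argument this change adds to training_args.py:

```python
from transformers import TrainingArguments

# "amp" selects native torch.cuda.amp, "apex" selects NVIDIA Apex, "auto" picks one for you.
args = TrainingArguments(output_dir="out", fp16=True, fp16_backend="apex")
```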
-
Lysandre Debut authored
-
Patrick von Platen authored
* reorder file * delete unnecessary function * make style * save intermediate * fix attention masks * correct TF BART past key values * solve merge conflict bug * correct tensor dims * save intermediate tf * change attn layer * fix typo re-order past * inputs_embeds * make fix-copies * finish tests * fix graph mode * apply Lysandre's suggestions
-
sandip authored
* TF OpenAI GPT Sequence Classification * Update src/transformers/models/openai/modeling_tf_openai.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
-
Julien Plu authored
* Fix tests for TF 2.4 * Remove <2.4 limitation * Add version condition * Update tests/test_optimization_tf.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update tests/test_optimization_tf.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update tests/test_optimization_tf.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
Lysandre Debut authored
k
-
- 14 Dec, 2020 4 commits
-
-
Julien Plu authored
* Fix T5 for graph compilation+execution * Fix BART * Fix import * Fix naming * fix attribute name * Oops * fix import * fix tests * fix tests * Update test * Add missing import * Address Patrick's comments * Style * Address Patrick's comment
-
Ahmed Elnaggar authored
* add model parallelism to T5EncoderModel * remove decoder from T5EncoderModel parallelize * update T5EncoderModel docs * Extend T5ModelTest for T5EncoderModel * fix T5Stack using range for get_device_map * fix style Co-authored-by: Ahmed Elnaggar <elnaggar@rostlab.informatik.tu-muenchen.de>
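A hedged multi-GPU sketch of the encoder-only parallelism; the device map below is an illustrative two-GPU split for t5-large (24 encoder layers), not a recommendation:

```python
from transformers import T5Tokenizer, T5EncoderModel

model = T5EncoderModel.from_pretrained("t5-large")
device_map = {0: list(range(0, 12)), 1: list(range(12, 24))}  # encoder layer indices per GPU
model.parallelize(device_map)  # or model.parallelize() for an even split

tokenizer = T5Tokenizer.from_pretrained("t5-large")
inputs = tokenizer("Studies have shown that owning a dog is good for you.",
                   return_tensors="pt").to("cuda:0")  # inputs go to the first device
hidden_states = model(**inputs).last_hidden_state
model.deparallelize()  # move everything back to CPU
```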
-
Patrick von Platen authored
* fix rag * fix slow test * fix past in bart
-
Julien Plu authored
* Resize the biases at the same time as the embeddings
* Trigger CI
* Biases are not reset anymore
* Remove get_output_embeddings + better LM model detection in generation utils
* Apply style
* First test on BERT
* Update docstring + new name
* Apply the new resizing logic to all the models
* fix tests
* Apply style
* Update the template
* Fix naming
* Fix naming
* Apply style
* Apply style
* Remove unused import
* Revert get_output_embeddings
* Trigger CI
* Update num parameters
* Restore get_output_embeddings in TFPretrainedModel and add comments
* Style
* Add decoder resizing
* Style
* Fix tests
* Separate bias and decoder resize
* Fix tests
* Fix tests
* Apply style
* Add bias resizing in MPNet
* Trigger CI
* Apply style
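A minimal sketch of the user-facing path this touches: after adding tokens, `resize_token_embeddings` keeps the TF embeddings, decoder and LM-head bias in sync; the model and token names are illustrative:

```python
from transformers import BertTokenizerFast, TFBertForMaskedLM

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
model = TFBertForMaskedLM.from_pretrained("bert-base-uncased")

tokenizer.add_tokens(["<new_tok_1>", "<new_tok_2>"])
model.resize_token_embeddings(len(tokenizer))  # resizes input embeddings, decoder and bias together
```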
-
- 11 Dec, 2020 1 commit
-
-
Patrick von Platen authored
* improve * finish * upload model * fix lm head * fix test
-
- 10 Dec, 2020 1 commit
-
-
Sylvain Gugger authored
-
- 09 Dec, 2020 4 commits
-
-
Patrick von Platen authored
* remove make-on-the-fly linear embedding
* start refactor
* big first refactor
* save intermediate
* save intermediate
* correct mask issue
* save tests
* refactor padding masks
* make all tests pass
* further refactor
* make pegasus test pass
* fix bool if
* fix leftover tests
* continue
* bart renaming
* delete torchscript test hack
* fix imports in tests
* correct shift
* fix docs and repo consistency
* re-add fix for FSMT
* typo in test
* fix typo
* fix another typo
* continue
* hot fix 2 for tf
* small fixes
* refactor types linting
* continue
* finish refactor
* fix import in tests
* better bart names
* further refactor and add test
* delete hack
* apply Sylvain's and Lysandre's comments
* small perf improvement
* further perf improvement
* improve perf
* fix typo
* make style
* small perf improvement
-
Funtowicz Morgan authored
* Remove "Model" suffix from Flax models to look more :hugs: Signed-off-by:
Morgan Funtowicz <morgan@huggingface.co> * Initial working (forward + backward) for Flax MLM training example. Signed-off-by:
Morgan Funtowicz <morgan@huggingface.co> * Simply code Signed-off-by:
Morgan Funtowicz <morgan@huggingface.co> * Addressing comments, using module and moving to LM task. Signed-off-by:
Morgan Funtowicz <morgan@huggingface.co> * Restore parameter name "module" wrongly renamed model. Signed-off-by:
Morgan Funtowicz <morgan@huggingface.co> * Restore correct output ordering... Signed-off-by:
Morgan Funtowicz <morgan@huggingface.co> * Actually commit the example
馃槄 Signed-off-by:Morgan Funtowicz <morgan@huggingface.co> * Add FlaxBertModelForMaskedLM after rebasing. Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Make it possible to initialize the training from scratch Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Reuse flax linen example of cross entropy loss Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Added specific data collator for flax Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Remove todo for data collator Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Added evaluation step Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Added ability to provide dtype to support bfloat16 on TPU Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Enable flax tensorboard output Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Enable jax.pmap support. Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Ensure batches are correctly sized to be dispatched with jax.pmap Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Enable bfloat16 with --fp16 cmdline args Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Correctly export metrics to tensorboard Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Added dropout and ability to use it. Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Effectively enable & disable during training and evaluation steps. Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Oops. Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Enable specifying kernel initializer scale Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Style. Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Added warmup step to the learning rate scheduler. Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Fix typo. Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Print training loss Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Make style Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * fix linter issue (flake8) Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Fix model matching Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Fix dummies Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Fix non default dtype on Flax models Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Use the same create_position_ids_from_input_ids for FlaxRoberta Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Make Roberta attention as Bert Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * fix copy Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Wording. Co-authored-by:
Marc van Zee <marcvanzee@gmail.com> Co-authored-by:
Marc van Zee <marcvanzee@gmail.com>
-
StillKeepTry authored
-
Patrick von Platen authored
* diverse beam search * bug fixes * bug fixes * bug fix * separate out diverse_beam_search function * separate out diverse_beam_search function * bug fix * improve code quality * bug fix * bug fix * separate out diverse beam search scorer * code format * code format * code format * code format * add test * code format * documentation changes * code quality * add slow integration tests * more general name * refactor into logits processor * add test * avoid too much copy paste * refactor * add to docs * fix-copies * bug fix * Revert "bug fix" This reverts commit c99eb5a8dc57a7b0d33a8ac06d8c6a32a7812ad4. * improve comment * implement Sylvain's feedback Co-authored-by: Ayush Jain <a.jain@sprinklr.com> Co-authored-by: ayushtiku5 <40797286+ayushtiku5@users.noreply.github.com>
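A hedged sketch of the resulting diverse (group) beam search options in `generate`; the checkpoint and parameter values are illustrative:

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("facebook/bart-large-cnn")
model = AutoModelForSeq2SeqLM.from_pretrained("facebook/bart-large-cnn")

inputs = tokenizer("The tower is 324 metres tall, about the same height as an 81-storey building.",
                   return_tensors="pt")
outputs = model.generate(
    **inputs,
    num_beams=6,
    num_beam_groups=3,      # beams are split into groups...
    diversity_penalty=1.0,  # ...and penalized for repeating tokens chosen by other groups
    num_return_sequences=3,
)
print(tokenizer.batch_decode(outputs, skip_special_tokens=True))
```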
-
- 08 Dec, 2020 3 commits
-
-
Sylvain Gugger authored
* Add new SQUAD example * Same with a task-specific Trainer * Address review comment. * Small fixes * Initial work for XLNet * Apply suggestions from code review Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Final clean up and working XLNet script * Test and debug * Final working version * Add tick * Update README * Address review comments Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
-
guillaume-be authored
* Removed unused `encoder_hidden_states` and `encoder_attention_mask` from MobileBert * Removed decoder tests for MobileBert * Removed now unnecessary import
-
Julien Plu authored
* Apply on BERT and ALBERT
* Update TF Bart
* Add input processing to TF BART
* Add input processing for TF CTRL
* Add input processing to TF Distilbert
* Add input processing to TF DPR
* Add input processing to TF Electra
* Add deprecated arguments
* Add input processing to TF XLM
* remove unused imports
* Add input processing to TF Funnel
* Add input processing to TF GPT2
* Add input processing to TF Longformer
* Add input processing to TF Lxmert
* Apply style
* Add input processing to TF Mobilebert
* Add input processing to TF GPT
* Add input processing to TF Roberta
* Add input processing to TF T5
* Add input processing to TF TransfoXL
* Apply style
* Rebase on master
* Fix wrong model name
* Fix BART
* Apply style
* Put the deprecated warnings in the input processing function
* Remove the unused imports
* Raise an error when len(kwargs)>0
* test ModelOutput instead of TFBaseModelOutput
* Address Patrick's comments
* Address Patrick's comments
* Add boolean processing for the inputs
* Take into account the optional layers
* Add missing/unexpected weights in the other models
* Apply style
* rename parameters
* Apply style
* Remove useless
* Remove useless
* Remove useless
* Update num parameters
* Fix tests
* Address Patrick's comment
* Remove useless attribute
-
- 07 Dec, 2020 3 commits
-
-
Sylvain Gugger authored
* Add copyright everywhere missing * Style
-
Julien Chaumond authored
* initial commit
* [cli] lfs commands
* Fix FileSlice
* Tweak to FileSlice
* [hf_api] Backport filetype arg from `datasets` cc @lhoestq
* Slim down the CI while I'm working
* Ok let's try this in CI
* Update config.yml
* Do not try this at home
* one more try
* Update lfs.py
* Revert "Tweak to FileSlice" This reverts commit d7e32c4b3500400486411e85a2b74e57fb6b52f5.
* Update test_hf_api.py
* Update test_hf_api.py
* Update test_hf_api.py
* CI still green?
* make CI green again?
* Update test_hf_api.py
* make CI red again?
* Update test_hf_api.py
* add CI style back
* Fix CI?
* oh my
* doc + switch back to real staging endpoint
* Apply suggestions from code review Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Pierric Cistac <Pierrci@users.noreply.github.com>
* Fix docblock + f-strings
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Pierric Cistac <Pierrci@users.noreply.github.com>
-
sandip authored
* Add TFGPT2ForSequenceClassification based on DialogRPT * Add TFGPT2ForSequenceClassification based on DialogRPT * TFGPT2ForSequenceClassification based on DialogRPT-refactored code, implemented review comments and added input processing * Add TFGPT2ForSequenceClassification based on DialogRPT * TFGPT2ForSequenceClassification based on DialogRPT-refactored code, implemented review comments and added input processing * code refactor for latest other TF PR * code refactor * code refactor * Update modeling_tf_gpt2.py
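A hedged usage sketch of the new head; DialogRPT-updown is the checkpoint this work is modeled on, and `from_pt=True` is assumed here in case only PyTorch weights are published for it:

```python
from transformers import GPT2Tokenizer, TFGPT2ForSequenceClassification

tokenizer = GPT2Tokenizer.from_pretrained("microsoft/DialogRPT-updown")
model = TFGPT2ForSequenceClassification.from_pretrained("microsoft/DialogRPT-updown", from_pt=True)

inputs = tokenizer("I love this movie!", return_tensors="tf")
logits = model(inputs).logits  # classification logits taken from the last (non-pad) token
```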
-
- 03 Dec, 2020 1 commit
-
-
Lysandre Debut authored
* Patch model parallel test * Remove line * Remove `ci_*` from scheduled branches
-
- 02 Dec, 2020 2 commits
-
-
Patrick von Platen authored
* fix resize tokens * correct mobile_bert * move embedding fix into modeling_utils.py * refactor * fix lm head resize * refactor * break lines to make Sylvain happy * add new tests * fix typo * improve test * skip bart-like for now * check if base_model = get(...) is necessary * clean files * improve test * fix tests * revert style templates * Update templates/adding_a_new_model/cookiecutter-template-{{cookiecutter.modelname}}/modeling_{{cookiecutter.lowercase_modelname}}.py
-
Nicolas Patry authored
* Warning about too long input for fast tokenizers too. If truncation is not set in tokenizers, but the tokenization is too long for the model (`model_max_length`), we used to trigger a warning that the input would probably fail (which it most likely will). This PR re-enables the warning for fast tokenizers too and uses common code for the trigger to make sure it's consistent across tokenizers. * Checking for pair of inputs too. * Making the function private and adding its doc. * Remove formatting ?? in odd place. * Missed uppercase.
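A short sketch of the behaviour this restores for fast tokenizers; the model name is illustrative and its `model_max_length` is 512:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")  # fast tokenizer by default

long_text = "word " * 1000                        # well over 512 tokens
encoding = tokenizer(long_text)                   # no truncation set -> too-long warning is emitted
encoding = tokenizer(long_text, truncation=True)  # explicit truncation -> no warning
```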
-