- 08 Mar, 2021 9 commits
-
-
Ratthachat (Jung) authored
* Create modeling_tf_dpr.py * Add TFDPR * Add back TFPegasus, TFMarian, TFMBart, TFBlenderBot (the last commit accidentally deleted these 4 lines, so I recovered them) * Add TFDPR * Add TFDPR * clean up some comments, add TF input-style doc string * Add TFDPR * Make return_dict=False the default * Fix return_dict bug (in .from_pretrained) * Add get_input_embeddings() * Create test_modeling_tf_dpr.py The current version has already passed all 27 tests! Please see the test run at: https://colab.research.google.com/drive/1czS_m9zy5k-iSJbzA_DP1k1xAAC_sdkf?usp=sharing * fix quality * delete init weights * run fix copies * fix repo consistency * del config_class, load_tf_weights They should be 'pytorch only' * add config_class back after removing it, test failed ... so only removing "use_tf_weights = None" on Lysandre's suggestion * newline after .. note:: * import tf, np (necessary for ModelIntegrationTest) * slow_test from_pretrained with from_pt=True At the moment we don't have TF weights (since we don't have an official TF model) Previously, I did not run the slow test, so I missed this bug * Add simple TFDPRModelIntegrationTest Note that this is just a test that TF and PyTorch give approx. the same output. However, I could not test against the official DPR repo's output yet * upload correct tf model * remove position_ids as missing keys * create modeling_tf_rag * add tests for tf * add tf tests * revert wrong pt commit * further refactor * further refactor * refactor * Update modeling_tf_rag.py - input_processing - fix prepare_input_for_generation (mostly fix generate bug) - bring back from_pretrained hack in order to test generate * delete colab pieces of code * Showcase greedy "generate" Temporarily change from beam_search test to greedy_search test to showcase that TF and PT do get equivalent output. * cosmetic update * correct typos * update * push some progress * make easy check * fix rag save from pretrained * Update src/transformers/modeling_tf_utils.py * remove commented out lines * delete unnecessary lines * add simple test case for nq_checkpoint Add nq_checkpoint test to show that the current version without the hack still fails * temporarily put ugly hack back again * Add TFRagSequenceForGeneration!! * __init__.py, import TFRagSequenceForGeneration * Add TFRagSequence tests! * rag init.py - add TFRagSequenceForGeneration * fix from_pretrained * fix prepare_inputs_for_generation * Beam search for RagToken! * minor clean up * add tf.cast in TFRagModel * More tf.cast * Add all remaining tests (still have issues) * delete all T5 related * make style * fix load weight prefix * fix bart * fix return_dict for tf_rag make all tests pass .. Hooray * fix some tests * fix code quality * fix quality check * finish tests tf rag * add tf rag to docs * remove TFT5 from docstring Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * remove TFT5 from docstring Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Delete outdated comments Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * improve doc strings * add generative model classes * fix adjust token logic * refactor generate for TFRag * using shape_list, not _get_shape Co-authored-by: Julien Plu <plu.julien@gmail.com> * axis=[1]->axis=1 * delete NEED_HELP comment * improve readability Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * improve readability Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * improve readability Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Indicate the model is in a developing state in docstrings, as suggested by Julien * small last changes * apply Sylvain's suggestions * finish tf rag
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: patrickvonplaten <patrick@huggingface.co>
Co-authored-by: Julien Plu <plu.julien@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
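For context, a minimal usage sketch (not taken from the PR) of the new TF DPR question encoder; the checkpoint name below is the standard public DPR one and is only an example, and from_pt=True mirrors the slow test mentioned above by converting the PyTorch weights on the fly. The TFRagTokenForGeneration / TFRagSequenceForGeneration classes added in the same PR follow the same from_pretrained pattern.

```python
from transformers import DPRQuestionEncoderTokenizer, TFDPRQuestionEncoder

# Example checkpoint; from_pt=True loads the PyTorch weights into the TF model.
tokenizer = DPRQuestionEncoderTokenizer.from_pretrained(
    "facebook/dpr-question_encoder-single-nq-base"
)
model = TFDPRQuestionEncoder.from_pretrained(
    "facebook/dpr-question_encoder-single-nq-base", from_pt=True
)

inputs = tokenizer("What is the capital of France?", return_tensors="tf")
question_embedding = model(**inputs).pooler_output  # shape (1, 768)
```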
-
Sylvain Gugger authored
* Check layer types for Optimizer construction * Duplicate class
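A rough sketch of the general idea behind checking layer types when building the optimizer, assuming the goal is to exclude parameters of certain module types (such as LayerNorm) from weight decay by inspecting module classes instead of matching parameter names; the helper below is hypothetical and not the actual Trainer code.

```python
import torch.nn as nn

def decay_parameter_names(model: nn.Module, forbidden_types=(nn.LayerNorm,)):
    """Names of parameters that should receive weight decay (hypothetical helper)."""
    names = []
    for module_name, module in model.named_modules():
        if isinstance(module, forbidden_types):
            continue  # skip every parameter owned by a forbidden layer type
        for param_name, _ in module.named_parameters(recurse=False):
            if param_name.endswith("bias"):
                continue  # biases are conventionally excluded from weight decay
            names.append(f"{module_name}.{param_name}" if module_name else param_name)
    return names
```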
-
Sylvain Gugger authored
This reverts commit b35e7b68.
-
Sylvain Gugger authored
This reverts commit a8ec52ef.
-
Sylvain Gugger authored
-
Sylvain Gugger authored
-
Stas Bekman authored
* more readable test * add all the missing places * one more nltk * better exception check * revert
-
Stas Bekman authored
-
Suraj Patil authored
* fix tests * emb should be a parameter * fix positional embeddings * fix make_weights * don't save pos embeds * add comment to describe the clamping
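For reference, a small sketch of the standard sinusoidal position-embedding table this commit is fixing up (the "Attention Is All You Need" formula as used in fairseq-style models); the function below is only an illustration, not the model's make_weights helper.

```python
import math
import torch

def sinusoidal_embeddings(num_positions: int, dim: int) -> torch.Tensor:
    # Geometric progression of frequencies, then sin/cos halves concatenated.
    half = dim // 2
    freqs = torch.exp(torch.arange(half, dtype=torch.float32) * -(math.log(10000.0) / (half - 1)))
    angles = torch.arange(num_positions, dtype=torch.float32).unsqueeze(1) * freqs.unsqueeze(0)
    return torch.cat([torch.sin(angles), torch.cos(angles)], dim=1)  # (num_positions, dim)
```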
-
- 06 Mar, 2021 2 commits
-
-
Suraj Patil authored
* m2m_100 * no layernorm_embedding * sinusoidal positional embeddings * update pos embeddings * add default config values * tokenizer * add conversion script * fix config * fix pos embed * remove _float_tensor * update tokenizer * update lang codes * handle lang codes * fix pos embeds * fix spm key * put embedding weights on device * remove qa and seq classification heads * fix convert script * lang codes on one line * fix embeds * fix tokenizer * fix tokenizer * add fast tokenizer * style * M2M100MT => M2M100 * fix copyright, style * tokenizer converter * vocab file * remove fast tokenizer * fix embeds * fix tokenizer * fix tests * add tokenizer tests * add integration test * quality * fix model name * fix test * doc * doc * fix doc * add copied from statements * fix tokenizer tests * apply review suggestions * fix urls * fix shift_tokens_right * apply review suggestions * fix * fix doc * add lang code to id * remove unused function * update checkpoint names * fix copy * fix tokenizer * fix checkpoint names * fix merge issue * style
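A hedged usage sketch of the new many-to-many translation model; the checkpoint name and language codes below follow the released facebook/m2m100_418M weights and are only an example.

```python
from transformers import M2M100ForConditionalGeneration, M2M100Tokenizer

model = M2M100ForConditionalGeneration.from_pretrained("facebook/m2m100_418M")
tokenizer = M2M100Tokenizer.from_pretrained("facebook/m2m100_418M", src_lang="en")

inputs = tokenizer("Life is like a box of chocolates.", return_tensors="pt")
# Force the decoder to start with the target-language code token (French here).
generated = model.generate(**inputs, forced_bos_token_id=tokenizer.get_lang_id("fr"))
print(tokenizer.batch_decode(generated, skip_special_tokens=True))
```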
-
Stas Bekman authored
* offline mode start * add specific values * fix fallback * add test * better values check and range * test that actually works * document the offline mode * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * more strict check * cleaner test * pt-only test * style
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
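A minimal sketch of the offline mode documented in this PR, assuming the relevant files were already cached during an earlier online run; the same switch can also be set from the shell as TRANSFORMERS_OFFLINE=1.

```python
import os

# Must be set before transformers is imported; afterwards from_pretrained()
# resolves everything from the local cache and never calls out to the network.
os.environ["TRANSFORMERS_OFFLINE"] = "1"

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")  # cached copy only
```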
-
- 05 Mar, 2021 2 commits
-
-
Lysandre Debut authored
* Only run one test * Patch segfault * Fix summarization pipeline * Ready for merge
-
Nicolas Patry authored
-
- 04 Mar, 2021 2 commits
-
-
Patrick von Platen authored
* first step to refactor * make all fast tests pass * make all slow tests pass * save intermediate * correct cache * finish PR * make fp16 work
-
Sylvain Gugger authored
* Rework TPU checkpointing in Trainer * Wraps the barrier in a dist test * Address review comments * Remove line
-
- 03 Mar, 2021 1 commit
-
-
Mehrad Moradshahi authored
-
- 01 Mar, 2021 1 commit
-
-
Patrick von Platen authored
* add encode labels function to tokenizer * start adding finetuning * init dropout * upload * correct convert script * apply changes * fix second typo * make first dummy training run * adapt convert script * push config for comparison * remove conf * finish training * adapt data collator * add research folder * update according to fairseq feedback * some minor corrections * refactor masking indices a bit * some minor changes * clean tokenizer * finish clean-up * remove previous logic * update run script * correct training * finish changes * finish model * correct bug * fix training a bit more * add some tests * finish gradient checkpointing * finish example * correct gradient checkpointing * improve tokenization method * revert changes in tokenizer * revert general change * adapt fine-tuning * update * save intermediate test * Update README.md * finish finetuning * delete conversion script * Update src/transformers/models/wav2vec2/configuration_wav2vec2.py * Update src/transformers/models/wav2vec2/processing_wav2vec2.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * finish wav2vec2 script * finish wav2vec2 fine-tuning * finalize test * correct test * adapt tests * finish * remove test file
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
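A hedged sketch of the CTC fine-tuning path this PR enables: the processor turns raw speech into input values and a transcription into label ids, and Wav2Vec2ForCTC returns a CTC loss when labels are passed. The audio is random noise and the wav2vec2-base-960h checkpoint is only an example.

```python
import torch
from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

processor = Wav2Vec2Processor.from_pretrained("facebook/wav2vec2-base-960h")
model = Wav2Vec2ForCTC.from_pretrained("facebook/wav2vec2-base-960h")

speech = torch.randn(16000).numpy()  # one second of fake 16 kHz audio
inputs = processor(speech, sampling_rate=16000, return_tensors="pt")
labels = processor.tokenizer("A DUMMY TRANSCRIPTION", return_tensors="pt").input_ids

loss = model(**inputs, labels=labels).loss  # CTC loss
loss.backward()
```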
-
- 28 Feb, 2021 1 commit
-
-
Tanmay Garg authored
* Introduce save_strategy training argument * deprecate EvaluationStrategy * collapse EvaluationStrategy and LoggingStrategy into a single IntervalStrategy enum * modify tests to use modified enum
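A small sketch of the new save_strategy argument and the unified IntervalStrategy enum; both plain strings and enum members are accepted by TrainingArguments.

```python
from transformers import TrainingArguments
from transformers.trainer_utils import IntervalStrategy

args = TrainingArguments(
    output_dir="out",
    evaluation_strategy="steps",           # string form
    eval_steps=500,
    save_strategy=IntervalStrategy.EPOCH,  # enum form
    logging_strategy="steps",
)
```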
-
- 26 Feb, 2021 2 commits
-
-
Kai Fricke authored
-
Julien Chaumond authored
😂
-
- 25 Feb, 2021 3 commits
-
-
Sylvain Gugger authored
* Make Barthez tokenizer tests a bit faster * Quality
-
Sehoon Kim authored
* IBertConfig, IBertTokenizer added * IBert model names modified * tokenizer bugfix * embedding -> QuantEmbedding * quant utils added * quant_mode added to configuration * QuantAct added, Embedding layer + QuantAct addition * QuantAct added * unused path removed, QKV quantized * self attention layer all quantized, except softmax * temporary commit * all linear layers quantized * quant_utils bugfix * bugfix: requantization missing * IntGELU added * IntSoftmax added * LayerNorm implemented * LayerNorm implemented all * names changed: roberta -> ibert * config no longer inherits from Roberta * No support for CausalLM * static quantization added, quantize_model.py removed * import modules uncommented * copyrights fixed * minor bugfix * quant_modules, quant_utils merged as one file * import * fixed * unused runfile removed * make style run * configuration.py docstring fixed * refactoring: comments removed, function name fixed * unused dependency removed * typo fixed * comments (Copied from), assertion string added * refactoring: super(..) -> super(), etc. * refactoring * refactoring * make style * refactoring * cuda -> to(x.device) * weight initialization removed * QuantLinear set_param removed * QuantEmbedding set_param removed * IntLayerNorm set_param removed * assert string added * assertion error message fixed * is_decoder removed * enc-dec arguments/functions removed * Converter removed * quant_modules docstring fixed * convert_slow_tokenizer rolled back * quant_utils docstring fixed * unused arguments e.g. use_cache removed from config * weight initialization condition fixed * x_min, x_max initialized with small values to avoid div-zero exceptions * testing code for ibert * test emb, linear, gelu, softmax added * test ln and act added * style reformatted * force_dequant added * error tests overridden * make style * Style + Docs * force dequant tests added * Fix fast tokenizer in init * Fix doc * Remove space * docstring, IBertConfig, chunk_size * test_modeling_ibert refactoring * quant_modules.py refactoring * e2e integration test added * tokenizers removed * IBertConfig added to tokenizer_auto.py * bugfix * fix docs & test * fix style num 2 * final fixes
Co-authored-by: Sehoon Kim <sehoonkim@berkeley.edu>
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
Co-authored-by: Sylvain Gugger <sylvain.gugger@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
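A hedged usage sketch for the new integer-only I-BERT model; the kssteven/ibert-roberta-base checkpoint name is an assumption based on the released weights, and quant_mode is the configuration flag this PR adds to switch the integer kernels on.

```python
from transformers import AutoTokenizer, IBertModel

tokenizer = AutoTokenizer.from_pretrained("kssteven/ibert-roberta-base")
model = IBertModel.from_pretrained("kssteven/ibert-roberta-base", quant_mode=True)

inputs = tokenizer("I-BERT replaces floating-point kernels with integer ones.", return_tensors="pt")
hidden_states = model(**inputs).last_hidden_state
```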
-
Patrick von Platen authored
[PretrainedFeatureExtractor] + Wav2Vec2FeatureExtractor, Wav2Vec2Processor, Wav2Vec2Tokenizer (#10324) * push to show * small improvement * small improvement * Update src/transformers/feature_extraction_utils.py * Update src/transformers/feature_extraction_utils.py * implement base * add common tests * make all tests pass for wav2vec2 * make padding work & add more tests * finalize feature extractor utils * add call method to feature extraction * finalize feature processor * finish tokenizer * finish general processor design * finish tests * typo * remove bogus file * finish docstring * add docs * finish docs * small fix * correct docs * save intermediate * load changes * apply changes * apply changes to doc * change tests * apply Suraj's recommendation * final changes * Apply suggestions from code review * fix typo * fix import * correct docstring
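A minimal sketch of the processor design this PR finishes: Wav2Vec2Processor bundles a Wav2Vec2FeatureExtractor (raw audio to input values) and the CTC tokenizer (ids back to text), so one object handles both directions. The audio below is silence and the checkpoint is only an example.

```python
import numpy as np
from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

processor = Wav2Vec2Processor.from_pretrained("facebook/wav2vec2-base-960h")
model = Wav2Vec2ForCTC.from_pretrained("facebook/wav2vec2-base-960h")

audio = np.zeros(16000, dtype=np.float32)  # one second of silence at 16 kHz
inputs = processor(audio, sampling_rate=16000, return_tensors="pt")
predicted_ids = model(**inputs).logits.argmax(dim=-1)
print(processor.batch_decode(predicted_ids))
```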
-
- 24 Feb, 2021 1 commit
-
-
abhishek thakur authored
* convbert conversion test * fin * fin * fin * clean up tf<->pt conversion * remove from_pt Co-authored-by: patrickvonplaten <patrick.v.platen@gmail.com>
-
- 22 Feb, 2021 2 commits
-
-
Sylvain Gugger authored
* Deprecate prepare_seq2seq_batch * Fix last tests * Apply suggestions from code review Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Suraj Patil <surajp815@gmail.com> * More review comments
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Suraj Patil <surajp815@gmail.com>
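A hedged sketch of the pattern that replaces the deprecated prepare_seq2seq_batch: call the tokenizer directly for the source text and switch it into target mode for the labels; the Marian checkpoint is only an example.

```python
from transformers import MarianTokenizer

tokenizer = MarianTokenizer.from_pretrained("Helsinki-NLP/opus-mt-en-de")

inputs = tokenizer(["I love Paris."], padding=True, return_tensors="pt")
with tokenizer.as_target_tokenizer():  # tokenize with the target-language vocab
    labels = tokenizer(["Ich liebe Paris."], padding=True, return_tensors="pt").input_ids
inputs["labels"] = labels
```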
-
Julien Plu authored
* AMP * Add LED * Apply style * Fix longformer
-
- 19 Feb, 2021 9 commits
-
-
Pengcheng He authored
* Integrate DeBERTa v2 (the 1.5B model surpassed human performance on SuperGLUE); add DeBERTa v2 900M and 1.5B models * DeBERTa-v2 * Fix v2 model loading issue (#10129) * Doc members * Update src/transformers/models/deberta/modeling_deberta.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Address Sylvain's comments * Address Patrick's comments Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Style
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
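A minimal usage sketch for the newly integrated DeBERTa-v2 checkpoints; the microsoft/deberta-v2-xlarge name is one example of the released models and its tokenizer requires sentencepiece.

```python
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("microsoft/deberta-v2-xlarge")
model = AutoModel.from_pretrained("microsoft/deberta-v2-xlarge")

inputs = tokenizer("DeBERTa-v2 scales up to 900M and 1.5B parameters.", return_tensors="pt")
hidden_states = model(**inputs).last_hidden_state
```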
-
Julien Plu authored
* Fix AMP and XLA * Remove useless var
-
Julien Plu authored
* Fix AMP * Apply style * Remove unused import
-
Julien Plu authored
-
Julien Plu authored
* Fix XLA * Rework cast * Apply style
-
Julien Plu authored
* Fix AMP * Trigger CI * Rework cast
-
Julien Plu authored
* Fix AMP * Rework cast * Apply style
-
Stas Bekman authored
* implement --fp16_full_eval * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * style * add test
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
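A small sketch of the new flag: run evaluation and prediction fully in fp16, which roughly halves the memory needed for the model weights during eval at a small risk to metric precision.

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="out",
    per_device_eval_batch_size=16,
    fp16_full_eval=True,  # cast the model to fp16 for evaluate()/predict()
)
```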
-
Stas Bekman authored
-
- 18 Feb, 2021 4 commits
-
-
Stas Bekman authored
* memory tracker metrics * go back to eval for somewhat consistency * handle no-gpu case * deal with stackable eval calls * restore callback order * style * simplify the API * add test * docs * consistently use eval_ prefix * improve docs * Update src/transformers/trainer_utils.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * rename method * style
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
Julien Plu authored
* rework savedmodel slow test * Improve savedmodel tests * Remove useless content
-
Julien Plu authored
-
Julien Plu authored
* Fix XLA and AMP * Fix AMP and XLA * Apply style * Apply Patrick's comment
-
- 17 Feb, 2021 1 commit
-
-
Julien Plu authored
* Fix XLA and AMP * Apply style * Remove useless cast
-