- 08 Mar, 2021 11 commits
-
-
Sylvain Gugger authored
This reverts commit b35e7b68.
-
Sylvain Gugger authored
-
Stas Bekman authored
* batch 1 * this is tpu * deebert attempt * the rest
-
Stas Bekman authored
* fix sharded ddp enum * test fixes * stronger validation + apex breaks other tests
-
Stas Bekman authored
* more readable test * add all the missing places * one more nltk * better exception check * revert
-
Stas Bekman authored
-
Mehrad Moradshahi authored
* Fix Marian decoding: Tokenizer's decode and batch_decode now accept a new argument (use_source_tokenizer) which indicates whether the source spm should be used to decode ids. This is useful for Marian models specifically when decoding source input ids. * Adapt docstrings Co-authored-by: Sylvain Gugger <sylvain.gugger@gmail.com>
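A minimal sketch of the argument described above, assuming the Helsinki-NLP/opus-mt-en-de checkpoint; use_source_tokenizer=True decodes ids with the source sentencepiece model instead of the target one.

```python
from transformers import MarianTokenizer

tokenizer = MarianTokenizer.from_pretrained("Helsinki-NLP/opus-mt-en-de")
inputs = tokenizer("Hello, how are you?", return_tensors="pt")

# Decoding the *source* input ids: without use_source_tokenizer=True the target spm
# would be used and the round-trip would not match the original text.
decoded = tokenizer.batch_decode(inputs["input_ids"], use_source_tokenizer=True)
print(decoded)
```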
-
Suraj Patil authored
* fix tests * emb should be a parameter * fix positional embeddings * fix make_weights * don't save pos embeds * add comment to describe the clamping
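The positional-embedding fixes above touch fixed sinusoidal weights; below is a generic, self-contained sketch (not the Transformers implementation) of how such a weight matrix is typically built.

```python
import math
import torch

def make_sinusoidal_weights(num_positions: int, dim: int) -> torch.Tensor:
    # Standard "Attention Is All You Need" sinusoids: sin on even dims, cos on odd dims.
    position = torch.arange(num_positions, dtype=torch.float).unsqueeze(1)
    div_term = torch.exp(torch.arange(0, dim, 2, dtype=torch.float) * (-math.log(10000.0) / dim))
    weights = torch.zeros(num_positions, dim)
    weights[:, 0::2] = torch.sin(position * div_term)
    weights[:, 1::2] = torch.cos(position * div_term)
    return weights

print(make_sinusoidal_weights(128, 16).shape)  # torch.Size([128, 16])
```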
-
Oren Amsalem authored
-
Eunhyuk Shin authored
-
Yu authored
-
- 06 Mar, 2021 2 commits
-
-
Suraj Patil authored
* m2m_100 * no layernorm_embedding * sinusoidal positional embeddings * update pos embeddings * add default config values * tokenizer * add conversion script * fix config * fix pos embed * remove _float_tensor * update tokenizer * update lang codes * handle lang codes * fix pos embeds * fix spm key * put embedding weights on device * remove qa and seq classification heads * fix convert script * lang codes on one line * fix embeds * fix tokenizer * fix tokenizer * add fast tokenizer * style * M2M100MT => M2M100 * fix copyright, style * tokenizer converter * vocab file * remove fast tokenizer * fix embeds * fix tokenizer * fix tests * add tokenizer tests * add integration test * quality * fix model name * fix test * doc * doc * fix doc * add copied from statements * fix tokenizer tests * apply review suggestions * fix urls * fix shift_tokens_right * apply review suggestions * fix * fix doc * add lang code to id * remove unused function * update checkpoint names * fix copy * fix tokenizer * fix checkpoint names * fix merge issue * style
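A rough usage sketch of the new M2M100 model and tokenizer; the facebook/m2m100_418M checkpoint name and the exact generation call are assumptions, not taken from the commit itself.

```python
from transformers import M2M100ForConditionalGeneration, M2M100Tokenizer

tokenizer = M2M100Tokenizer.from_pretrained("facebook/m2m100_418M", src_lang="en")
model = M2M100ForConditionalGeneration.from_pretrained("facebook/m2m100_418M")

inputs = tokenizer("Life is like a box of chocolates.", return_tensors="pt")
# The target language is selected by forcing its lang code as the first generated token.
generated = model.generate(**inputs, forced_bos_token_id=tokenizer.get_lang_id("fr"))
print(tokenizer.batch_decode(generated, skip_special_tokens=True))
```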
-
Stas Bekman authored
* offline mode start * add specific values * fix fallback * add test * better values check and range * test that actually works * document the offline mode * Apply suggestions from code review Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * more strict check * cleaner test * pt-only test * style Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
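A minimal sketch of the offline mode documented in this change, assuming the TRANSFORMERS_OFFLINE environment variable and a model already present in the local cache.

```python
import os

# Must be set before transformers is imported; the flag is read at import time.
os.environ["TRANSFORMERS_OFFLINE"] = "1"

from transformers import AutoModel, AutoTokenizer

# Succeeds only if the files are already cached locally; no network calls are attempted.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")
```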
-
- 05 Mar, 2021 4 commits
-
-
Daniel Hug authored
* Refactor checkpoint name in ALBERT and ALBERT_tf * Refactor checkpoint name in BART and BART_tf * Refactor checkpoint name in BERT generation * Refactor checkpoint name in Blenderbot_tf * Refactor checkpoint name in Blenderbot_small_tf * Refactor checkpoint name in ConvBERT AND CONVBERT_TF * Refactor checkpoint name in CTRL AND CTRL_TF * Refactor checkpoint name in DistilBERT AND DistilBERT_TF * Refactor checkpoint name in DistilBERT redo * Refactor checkpoint name in Electra and Electra_tf * Refactor checkpoint name in FlauBERT and FlauBERT_tf * Refactor checkpoint name in FSMT * Refactor checkpoint name in GPT2 and GPT2_tf * Refactor checkpoint name in IBERT * Refactor checkpoint name in LED and LED_tf * Refactor checkpoint name in Longformer and Longformer_tf * Refactor checkpoint name in Lxmert and Lxmert_tf * Refactor checkpoint name in Marian_tf * Refactor checkpoint name in MBART and MBART_tf * Refactor checkpoint name in MobileBERT and MobileBERT_tf * Refactor checkpoint name in mpnet and mpnet_tf * Refactor checkpoint name in openai and openai_tf * Refactor checkpoint name in pegasus_tf * Refactor checkpoint name in reformer * Refactor checkpoint name in Roberta and Roberta_tf * Refactor checkpoint name in SqueezeBert * Refactor checkpoint name in Transformer_xl and Transformer_xl_tf * Refactor checkpoint name in XLM and XLM_tf * Refactor checkpoint name in XLNET and XLNET_tf * Refactor checkpoint name in BERT_tf * run make tests, style, quality, fixup
-
Sylvain Gugger authored
* Fix embeddings for PyTorch 1.8 * Try with PyTorch 1.8.0 * Fix embeddings init * Fix copies * Typo * More typos
-
Chen Liang authored
DEBERTA_PRETRAINED_MODEL_ARCHIVE_LIST => DEBERTA_V2_PRETRAINED_MODEL_ARCHIVE_LIST in line 31.
-
Joakim Warholm authored
-
- 04 Mar, 2021 3 commits
-
-
Patrick von Platen authored
* first step to refactor * make all fast tests pass * make all slow tests pass * save intermediate * correct cache * finish PR * make fp16 work
-
Sylvain Gugger authored
* Rework TPU checkpointing in Trainer * Wraps the barrier in a dist test * Address review comments * Remove line
-
Philipp Schmid authored
* removed overwrites * remove default value for output_dir * adjusted typing
-
- 03 Mar, 2021 7 commits
-
-
Sylvain Gugger authored
* Fix gradient accumulation for SM Model Parallelism * Style and divide loss by grad accum steps
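The "divide loss by grad accum steps" fix follows the usual gradient-accumulation recipe; here is a generic, self-contained sketch (not the SageMaker Model Parallelism code) of why that scaling is needed.

```python
import torch

model = torch.nn.Linear(10, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
batches = [(torch.randn(8, 10), torch.randn(8, 1)) for _ in range(8)]

accumulation_steps = 4
optimizer.zero_grad()
for step, (x, y) in enumerate(batches):
    loss = torch.nn.functional.mse_loss(model(x), y)
    # Scale each micro-batch loss so the accumulated gradient equals the gradient
    # of the average loss over one large batch.
    (loss / accumulation_steps).backward()
    if (step + 1) % accumulation_steps == 0:
        optimizer.step()
        optimizer.zero_grad()
```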
-
felixgwu authored
* fix all_hidden_states * use output_states instead of next_kv
-
Stas Bekman authored
* remap classes to strings * missing new util * style * doc * move the autogenerated file * Trigger CI
-
Sylvain Gugger authored
* Refactor checkpoint name in BERT and MobileBERT * Add option to check copies * Add QuestionAnswering * Add last models * Make black happy
-
Patrick von Platen authored
* fix speed degradation bug t5 * fix for all models * fix code quality
-
WybeKoper authored
Co-authored-by: WybeKoper <WybeKoper@users.noreply.github.com>
-
Mehrad Moradshahi authored
-
- 02 Mar, 2021 1 commit
-
-
Martin Schmitt authored
Changed `num_beams` to `num_beams // num_beam_groups` when initialising `PrefixConstrainedLogitsProcessor` in `_get_logits_processor` to fix compatibility issue when constrained decoding is used together with grouped beam search (#10475)
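A hedged sketch of the combination this change fixes: grouped beam search together with prefix_allowed_tokens_fn, which is what creates the PrefixConstrainedLogitsProcessor internally; the t5-small checkpoint and the toy constraint function are assumptions.

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")
inputs = tokenizer("translate English to German: The house is wonderful.", return_tensors="pt")

def allowed_tokens(batch_id, input_ids):
    # Toy constraint: allow every token; a real use case would restrict the vocabulary.
    return list(range(tokenizer.vocab_size))

outputs = model.generate(
    **inputs,
    num_beams=4,
    num_beam_groups=2,       # grouped (diverse) beam search
    diversity_penalty=0.5,
    prefix_allowed_tokens_fn=allowed_tokens,  # constrained decoding
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```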
-
- 01 Mar, 2021 3 commits
-
-
Suraj Patil authored
* small fixes * don't check for None
-
Patrick von Platen authored
-
Patrick von Platen authored
* add encode labels function to tokenizer * start adding finetuning * init dropout * upload * correct convert script * apply changes * fix second typo * make first dummy training run * adapt convert script * push config for comparison * remove conf * finish training * adapt data collator * add research folder * update according to fairseq feedback * some minor corrections * refactor masking indices a bit * some minor changes * clean tokenizer * finish clean-up * remove previous logic * update run script * correct training * finish changes * finish model * correct bug * fix training a bit more * add some tests * finish gradient checkpointing * finish example * correct gradient checkpointing * improve tokenization method * revert changes in tokenizer * revert general change * adapt fine-tuning * update * save intermediate test * Update README.md * finish finetuning * delete conversion script * Update src/transformers/models/wav2vec2/configuration_wav2vec2.py * Update src/transformers/models/wav2vec2/processing_wav2vec2.py Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> * finish wav2vec2 script * finish wav2vec2 fine-tuning * finalize test * correct test * adapt tests * finish * remove test file Co-authored-by:
Lysandre Debut <lysandre@huggingface.co>
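A minimal inference sketch of the Wav2Vec2 CTC pipeline these commits build on; the facebook/wav2vec2-base-960h checkpoint and the dummy audio are assumptions.

```python
import torch
from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

processor = Wav2Vec2Processor.from_pretrained("facebook/wav2vec2-base-960h")
model = Wav2Vec2ForCTC.from_pretrained("facebook/wav2vec2-base-960h")

speech = torch.zeros(16000).numpy()  # stand-in for one second of 16 kHz audio
inputs = processor(speech, sampling_rate=16000, return_tensors="pt")

with torch.no_grad():
    logits = model(inputs.input_values).logits

# CTC decoding: greedy argmax over the vocabulary, then collapse repeats and blanks.
predicted_ids = torch.argmax(logits, dim=-1)
print(processor.batch_decode(predicted_ids))
```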
-
- 28 Feb, 2021 1 commit
-
-
Tanmay Garg authored
* Introduce save_strategy training argument * deprecate EvaluationStrategy * collapse EvaluationStrategy and LoggingStrategy into a single IntervalStrategy enum * modify tests to use modified enum
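A hedged sketch of the arguments after this change: evaluation, logging, and saving are all driven by the same IntervalStrategy values ("no", "steps", "epoch"); the step counts below are arbitrary.

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="out",
    evaluation_strategy="steps",  # previously typed as EvaluationStrategy, now IntervalStrategy
    logging_strategy="steps",
    save_strategy="steps",        # the new argument introduced here
    eval_steps=500,
    logging_steps=500,
    save_steps=500,
)
```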
-
- 27 Feb, 2021 2 commits
-
-
Stas Bekman authored
* refactors * typo
-
Amog Kamsetty authored
* fixes * update resources * formatting * remove import * add log statement * use fstring * add period * Update src/transformers/integrations.py
-
- 26 Feb, 2021 2 commits
-
-
Patrick von Platen authored
* correct docs * correct tf model docs as well
-
Mansi Mane authored
* Added tb fix * Removed local rank condition * Updated reference to args
-
- 25 Feb, 2021 4 commits
-
-
Sylvain Gugger authored
-
Sylvain Gugger authored
* Add support for ZeRO-2/3 and ZeRO-offload in fairscale * Quality * Rework from review comments * Add doc * Apply suggestions from code review Co-authored-by:
Stas Bekman <stas00@users.noreply.github.com> * Address review comments Co-authored-by:
Stas Bekman <stas00@users.noreply.github.com>
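A hedged sketch of how the new options might be selected through TrainingArguments; the option strings ("zero_dp_2", "zero_dp_3", "offload") are assumptions from the fairscale integration, and fairscale itself must be installed for training to run.

```python
from transformers import TrainingArguments

# Assumed option strings; sharded_ddp takes a space-separated list of modes.
args = TrainingArguments(
    output_dir="out",
    per_device_train_batch_size=8,
    fp16=True,
    sharded_ddp="zero_dp_2",  # or "zero_dp_3", optionally combined with "offload"
)
```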
-
Lysandre Debut authored
-
Sehoon Kim authored
* IBertConfig, IBertTokenizer added * IBert Model names modified * tokenizer bugfix * embedding -> QuantEmbedding * quant utils added * quant_mode added to configuration * QuantAct added, Embedding layer + QuantAct addition * QuantAct added * unused path removed, QKV quantized * self attention layer all quantized, except softmax * temporary commit * all linear layers quantized * quant_utils bugfix * bugfix: requantization missing * IntGELU added * IntSoftmax added * LayerNorm implemented * LayerNorm implemented all * names changed: roberta->ibert * config does not inherit from Roberta * No support for CausalLM * static quantization added, quantize_model.py removed * import modules uncommented * copyrights fixed * minor bugfix * quant_modules, quant_utils merged as one file * import * fixed * unused runfile removed * make style run * configuration.py docstring fixed * refactoring: comments removed, function name fixed * unused dependency removed * typo fixed * comments (Copied from), assertion string added * refactoring: super(..) -> super(), etc. * refactoring * refactoring * make style * refactoring * cuda -> to(x.device) * weight initialization removed * QuantLinear set_param removed * QuantEmbedding set_param removed * IntLayerNorm set_param removed * assert string added * assertion error message fixed * is_decoder removed * enc-dec arguments/functions removed * Converter removed * quant_modules docstring fixed * convert_slow_tokenizer rolled back * quant_utils docstring fixed * unused arguments e.g. use_cache removed from config * weight initialization condition fixed * x_min, x_max initialized with small values to avoid div-zero exceptions * testing code for ibert * test emb, linear, gelu, softmax added * test ln and act added * style reformatted * force_dequant added * error tests overridden * make style * Style + Docs * force dequant tests added * Fix fast tokenizer in init * Fix doc * Remove space * docstring, IBertConfig, chunk_size * test_modeling_ibert refactoring * quant_modules.py refactoring * e2e integration test added * tokenizers removed * IBertConfig added to tokenizer_auto.py * bugfix * fix docs & test * fix style num 2 * final fixes Co-authored-by:
Sehoon Kim <sehoonkim@berkeley.edu> Co-authored-by:
Lysandre <lysandre.debut@reseau.eseo.fr> Co-authored-by:
Sylvain Gugger <sylvain.gugger@gmail.com> Co-authored-by:
Patrick von Platen <patrick.v.platen@gmail.com>
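A hedged usage sketch of the new I-BERT model; the kssteven/ibert-roberta-base checkpoint name is an assumption, and quant_mode / force_dequant are the configuration switches mentioned in the commit messages above.

```python
from transformers import IBertConfig, IBertModel, RobertaTokenizer

tokenizer = RobertaTokenizer.from_pretrained("roberta-base")  # I-BERT reuses the RoBERTa tokenizer
config = IBertConfig.from_pretrained("kssteven/ibert-roberta-base", quant_mode=True)
model = IBertModel.from_pretrained("kssteven/ibert-roberta-base", config=config)

inputs = tokenizer("Integer-only BERT", return_tensors="pt")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)
```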
-