- 12 Mar, 2021 1 commit
-
Stas Bekman authored
-
- 11 Mar, 2021 2 commits
-
Lysandre Debut authored
-
ArvidYin authored
Correct spelling error: 'nether'
-
- 10 Mar, 2021 2 commits
-
Sylvain Gugger authored
* Add new GLUE example with no Trainer.
* Style
* Address review comments
-
Allen Wang authored
Fixes an issue in `text-classification` where MNLI eval/test datasets are not being preprocessed. (#10621)
* Fix MNLI tests
* Linter fix
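For context, MNLI is the GLUE task with two separate evaluation splits, which is what let the preprocessing step miss them. A minimal sketch of loading both splits with the datasets library:

```python
from datasets import load_dataset

# MNLI ships matched and mismatched evaluation splits; the bug fixed here was
# that these splits reached the model without the train-time preprocessing.
mnli = load_dataset("glue", "mnli")
eval_matched = mnli["validation_matched"]
eval_mismatched = mnli["validation_mismatched"]
```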
-
- 09 Mar, 2021 1 commit
-
Sylvain Gugger authored
* Hotfix fairscale FSDP
* Evaluation works
* Save on process zero
-
- 08 Mar, 2021 4 commits
-
Stas Bekman authored
* batch 1
* this is tpu
* deebert attempt
* the rest
-
Bhadresh Savani authored
* reverted changes of logging and saving metrics
* added max_sample arguments
* fixed code
* white space diff
* reformatting code
* reformatted code
-
Stas Bekman authored
* fix sharded ddp enum
* test fixes
* stronger validation + apex breaks other tests
-
Stas Bekman authored
-
- 06 Mar, 2021 1 commit
-
Stas Bekman authored
* offline mode start
* add specific values
* fix fallback
* add test
* better values check and range
* test that actually works
* document the offline mode
* Apply suggestions from code review
* more strict check
* cleaner test
* pt-only test
* style

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
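The offline mode documented in this commit is driven by the TRANSFORMERS_OFFLINE environment variable. A minimal sketch, assuming the model was cached beforehand:

```python
import os

# With TRANSFORMERS_OFFLINE=1, transformers only reads from the local cache
# and never attempts a network call; a cache miss becomes an error.
os.environ["TRANSFORMERS_OFFLINE"] = "1"

from transformers import AutoModel, AutoTokenizer

# Assumes "bert-base-uncased" was downloaded into the cache earlier.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")
```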
-
- 05 Mar, 2021 1 commit
-
Patrick von Platen authored
-
- 04 Mar, 2021 3 commits
-
Sylvain Gugger authored
-
Sylvain Gugger authored
This reverts commit f3660613.
-
Sylvain Gugger authored
-
- 01 Mar, 2021 1 commit
-
Patrick von Platen authored
* add encode labels function to tokenizer
* start adding finetuning
* init dropout
* upload
* correct convert script
* apply changes
* fix second typo
* make first dummy training run
* adapt convert script
* push config for comparison
* remove conf
* finish training
* adapt data collator
* add research folder
* update according to fairseq feedback
* some minor corrections
* refactor masking indices a bit
* some minor changes
* clean tokenizer
* finish clean-up
* remove previous logic
* update run script
* correct training
* finish changes
* finish model
* correct bug
* fix training a bit more
* add some tests
* finish gradient checkpointing
* finish example
* correct gradient checkpointing
* improve tokenization method
* revert changes in tokenizer
* revert general change
* adapt fine-tuning
* update
* save intermediate test
* Update README.md
* finish finetuning
* delete conversion script
* Update src/transformers/models/wav2vec2/configuration_wav2vec2.py
* Update src/transformers/models/wav2vec2/processing_wav2vec2.py
* finish wav2vec2 script
* finish wav2vec2 fine-tuning
* finalize test
* correct test
* adapt tests
* finish
* remove test file

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
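The fine-tuning work above builds on the Wav2Vec2ForCTC and Wav2Vec2Processor classes. A minimal inference sketch (the checkpoint name is illustrative, and `speech` stands in for real 16 kHz audio):

```python
import torch
from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

processor = Wav2Vec2Processor.from_pretrained("facebook/wav2vec2-base-960h")
model = Wav2Vec2ForCTC.from_pretrained("facebook/wav2vec2-base-960h")

# One second of silence as a placeholder for a real 16 kHz waveform.
speech = torch.zeros(16000).numpy()
inputs = processor(speech, sampling_rate=16000, return_tensors="pt")

with torch.no_grad():
    logits = model(inputs.input_values).logits

# Greedy CTC decoding back to text.
predicted_ids = torch.argmax(logits, dim=-1)
transcription = processor.batch_decode(predicted_ids)
```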
-
- 27 Feb, 2021 3 commits
-
Bhadresh Savani authored
* updated logging and saving metrics
* space removal
-
Stas Bekman authored
This PR restores the original functionality that for some reason was modified. Fixes: https://github.com/huggingface/transformers/issues/10381 @sgugger
-
Stas Bekman authored
* refactors
* typo
-
- 25 Feb, 2021 3 commits
-
Sylvain Gugger authored
-
Sylvain Gugger authored
* Add support for ZeRO-2/3 and ZeRO-offload in fairscale
* Quality
* Rework from review comments
* Add doc
* Apply suggestions from code review
* Address review comments

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
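A sketch of how this surfaces in TrainingArguments; the space-separated option string (assumed values: zero_dp_2, zero_dp_3, offload) is my reading of the integration, and it requires fairscale plus a multi-GPU distributed launch:

```python
from transformers import TrainingArguments

# Assumed option syntax: sharded_ddp takes ZeRO-2 or ZeRO-3 data parallelism,
# optionally combined with CPU offload. Needs fairscale installed.
args = TrainingArguments(
    output_dir="output",
    per_device_train_batch_size=8,
    sharded_ddp="zero_dp_2 offload",
)
```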
-
Patrick von Platen authored
[PretrainedFeatureExtractor] + Wav2Vec2FeatureExtractor, Wav2Vec2Processor, Wav2Vec2Tokenizer (#10324)
* push to show
* small improvement
* small improvement
* Update src/transformers/feature_extraction_utils.py
* Update src/transformers/feature_extraction_utils.py
* implement base
* add common tests
* make all tests pass for wav2vec2
* make padding work & add more tests
* finalize feature extractor utils
* add call method to feature extraction
* finalize feature processor
* finish tokenizer
* finish general processor design
* finish tests
* typo
* remove bogus file
* finish docstring
* add docs
* finish docs
* small fix
* correct docs
* save intermediate
* load changes
* apply changes
* apply changes to doc
* change tests
* apply Suraj's recommendations
* final changes
* Apply suggestions from code review
* fix typo
* fix import
* correct docstring
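The design introduced here composes a feature extractor (audio in) with a tokenizer (text out) behind one processor object. A sketch of that composition, with an illustrative checkpoint name:

```python
from transformers import Wav2Vec2FeatureExtractor, Wav2Vec2Processor, Wav2Vec2Tokenizer

feature_extractor = Wav2Vec2FeatureExtractor.from_pretrained("facebook/wav2vec2-base-960h")
tokenizer = Wav2Vec2Tokenizer.from_pretrained("facebook/wav2vec2-base-960h")

# The processor is a thin wrapper holding both components.
processor = Wav2Vec2Processor(feature_extractor=feature_extractor, tokenizer=tokenizer)

# Saving serializes both components into one directory for round-tripping.
processor.save_pretrained("./wav2vec2-processor")
processor = Wav2Vec2Processor.from_pretrained("./wav2vec2-processor")
```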
-
- 24 Feb, 2021 1 commit
-
Stas Bekman authored
* handle get_last_lr() before first step()
* abstract away the lr getting logic
* cleanup
* add test
* move to utils
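The fix guards against PyTorch schedulers being queried before their first step(). A simplified sketch of the idea (not the Trainer's exact code):

```python
def current_learning_rate(optimizer, lr_scheduler, has_stepped: bool) -> float:
    # scheduler.get_last_lr() is only meaningful after scheduler.step() has
    # run at least once; before that, read the LR straight off the optimizer.
    if has_stepped:
        return lr_scheduler.get_last_lr()[0]
    return optimizer.param_groups[0]["lr"]
```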
-
- 23 Feb, 2021 1 commit
-
Akmal authored
-
- 22 Feb, 2021 3 commits
-
Stas Bekman authored
* make logging and saving trainer built-in
* Update src/transformers/trainer.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
Stas Bekman authored
* implement gradient_accumulation_steps support in DeepSpeed integration
* typo
* cleanup
* cleanup
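With this change the Trainer's gradient_accumulation_steps and the DeepSpeed config stay in sync. A sketch of the relevant config fragment (values illustrative):

```python
# gradient_accumulation_steps is a standard DeepSpeed config key; this
# integration keeps it consistent with the Trainer's own setting.
ds_config = {
    "train_micro_batch_size_per_gpu": 8,
    "gradient_accumulation_steps": 4,
    "fp16": {"enabled": True},
}
```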
-
Stas Bekman authored
-
- 19 Feb, 2021 2 commits
-
Julien Plu authored
-
Joe Davison authored
-
- 18 Feb, 2021 2 commits
-
Joe Davison authored
* add zero-shot distillation script
* readme wordsmithing
* clean up code
* add multi-gpu teacher inference plus tidying up more code
* add use_fast_tokenizer arg
* update results in readme
* more readme wordsmithing
* style
* Add handle to readme
* fix code block
* add error+docs about distributed & tpu
* add @sgugger format requests
* xla -> tpu
* support fp16 for teacher preds
* no checkpoint by default
* add demo colab link
* add model sharing prompt + model link
* correct resulting acc of example

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
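The teacher task being distilled here is standard zero-shot classification; for reference, a minimal pipeline call (model choice illustrative):

```python
from transformers import pipeline

classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")
result = classifier(
    "The new GPU driver crashes on launch.",
    candidate_labels=["software bug", "sports", "cooking"],
)
print(result["labels"][0])  # highest-scoring label first
```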
-
Stas Bekman authored
* memory tracker metrics
* go back to eval for somewhat consistency
* handle no-gpu case
* deal with stackable eval calls
* restore callback order
* style
* simplify the API
* add test
* docs
* consistently use eval_ prefix
* improve docs
* Update src/transformers/trainer_utils.py
* rename method
* style

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
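A simplified sketch of what the tracker measures: allocation deltas around a stage, reported as metrics. The key names below are illustrative; per the commit, the real ones carry stage prefixes such as eval_:

```python
import torch

def measure_gpu_memory(stage_fn):
    # Record allocated-memory and peak-memory deltas around one stage.
    torch.cuda.reset_peak_memory_stats()
    before = torch.cuda.memory_allocated()
    result = stage_fn()
    metrics = {
        "mem_gpu_alloc_delta": torch.cuda.memory_allocated() - before,
        "mem_gpu_peaked_delta": torch.cuda.max_memory_allocated() - before,
    }
    return result, metrics
```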
-
- 17 Feb, 2021 1 commit
-
Stas Bekman authored
* fix invalid port
* missing requirements
-
- 16 Feb, 2021 1 commit
-
Zhang Cheng authored
-
- 15 Feb, 2021 2 commits
-
Suraj Patil authored
* move old s2s scripts to legacy
* add the tests back
* proper rename
* restore
* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Stas Bekman <stas@stason.org>
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
-
Stas Bekman authored
* fix run_seq2seq.py; porting DeepSpeed tests to it
* unrefactor
* defensive programming
* defensive programming 2
* port the rest of the trainer tests
* style
* a cleaner scripts dir finder
* cleanup
-
- 12 Feb, 2021 1 commit
-
Suraj Patil authored
* fix rouge metrics and task specific params
* fix typo
* round metrics
* typo
* remove task_specific_params
-
- 11 Feb, 2021 2 commits
-
Stas Bekman authored
* init devices/setup explicitly
* docs + test
* simplify
* cleanup
* cleanup
* cleanup
* correct the required dist setup
* derive local_rank from env LOCAL_RANK
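A sketch of the last bullet above: deriving the device from the LOCAL_RANK environment variable that torch.distributed.launch sets for each worker:

```python
import os
import torch

local_rank = int(os.environ.get("LOCAL_RANK", "-1"))
if local_rank != -1:
    # Distributed launch: pin this process to its assigned GPU.
    torch.cuda.set_device(local_rank)
    device = torch.device("cuda", local_rank)
else:
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
```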
-
Qbiwan authored
* remove xnli_compute_metrics, add load_dataset, load_metric, set_seed, metric.compute, load_metric
* fix
* fix
* fix
* push
* fix
* everything works
* fix init
* fix
* special treatment for sepconv1d
* style
* 🙏🏽
* add doc and cleanup
* fix doc
* fix doc again
* fix doc again
* Apply suggestions from code review
* make style
* Proposal that should work
* Remove needless code
* Fix test
* Apply suggestions from code review
* remove xnli_compute_metrics, add load_dataset, load_metric, set_seed, metric.compute, load_metric
* amend README
* removed data_args.task_name and replaced with task_name = "xnli"; use split function to load train and validation dataset separately; remove __post_init__; remove flag --task_name from README.
* removed dict task_to_keys, use str "xnli" instead of variable task_name, change preprocess_function to use examples["premise"] and examples["hypothesis"] directly, remove sentence1_key and sentence2_key, change compute_metrics function to cater only to the accuracy metric, add condition for train_language is None when using dataset.load_dataset()
* removed `torch.distributed.barrier()` and `import torch` as `from_pretrained` is able to do the work; amend README
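The pattern this commit moves run_xnli.py to, sketched with the datasets library (the language config "en" is illustrative):

```python
from datasets import load_dataset, load_metric

# Load XNLI splits directly instead of going through GLUE-style helpers.
train_dataset = load_dataset("xnli", "en", split="train")
eval_dataset = load_dataset("xnli", "en", split="validation")
metric = load_metric("xnli")

example = train_dataset[0]
print(example["premise"], example["hypothesis"], example["label"])
```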
-
- 10 Feb, 2021 2 commits
-
Stas Bekman authored
* free up memory at the end of train
* rework tests
* consistent formatting
* correction
-
Lysandre Debut authored
-