- 12 Mar, 2021 1 commit
-
Stas Bekman authored
-
- 11 Mar, 2021 2 commits
-
Lysandre Debut authored
-
ArvidYin authored
Correct spelling error: 'nether'
-
- 10 Mar, 2021 2 commits
-
Sylvain Gugger authored
* Add new GLUE example with no Trainer.
* Style
* Address review comments
-
Allen Wang authored
Fixes an issue in `text-classification` where MNLI eval/test datasets are not being preprocessed. (#10621)
* Fix MNLI tests
* Linter fix
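For context, MNLI is the GLUE task with two separate evaluation splits, which is what let the preprocessing step miss them. A minimal sketch of loading both splits with the datasets library:

```python
from datasets import load_dataset

# MNLI ships matched and mismatched evaluation splits; the bug fixed here was
# that these splits reached the model without the train-time preprocessing.
mnli = load_dataset("glue", "mnli")
eval_matched = mnli["validation_matched"]
eval_mismatched = mnli["validation_mismatched"]
```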
-
- 09 Mar, 2021 1 commit
-
Sylvain Gugger authored
* Hotfix fairscale FSDP
* Evaluation works
* Save on process zero
-
- 08 Mar, 2021 4 commits
-
Stas Bekman authored
* batch 1
* this is tpu
* deebert attempt
* the rest
-
Bhadresh Savani authored
* reverted changes of logging and saving metrics
* added max_sample arguments
* fixed code
* white space diff
* reformatting code
* reformatted code
-
Stas Bekman authored
* fix sharded ddp enum
* test fixes
* stronger validation + apex breaks other tests
-
Stas Bekman authored
-
- 06 Mar, 2021 1 commit
-
Stas Bekman authored
* offline mode start
* add specific values
* fix fallback
* add test
* better values check and range
* test that actually works
* document the offline mode
* Apply suggestions from code review
* more strict check
* cleaner test
* pt-only test
* style

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
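The offline mode documented in this commit is driven by the TRANSFORMERS_OFFLINE environment variable. A minimal sketch, assuming the model was cached beforehand:

```python
import os

# With TRANSFORMERS_OFFLINE=1, transformers only reads from the local cache
# and never attempts a network call; a cache miss becomes an error.
os.environ["TRANSFORMERS_OFFLINE"] = "1"

from transformers import AutoModel, AutoTokenizer

# Assumes "bert-base-uncased" was downloaded into the cache earlier.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")
```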
-
- 05 Mar, 2021 1 commit
-
Patrick von Platen authored
-
- 04 Mar, 2021 3 commits
-
Sylvain Gugger authored
-
Sylvain Gugger authored
This reverts commit f3660613.
-
Sylvain Gugger authored
-
- 01 Mar, 2021 1 commit
-
Patrick von Platen authored
* add encode labels function to tokenizer
* start adding finetuning
* init dropout
* upload
* correct convert script
* apply changes
* fix second typo
* make first dummy training run
* adapt convert script
* push config for comparison
* remove conf
* finish training
* adapt data collator
* add research folder
* update according to fairseq feedback
* some minor corrections
* refactor masking indices a bit
* some minor changes
* clean tokenizer
* finish clean-up
* remove previous logic
* update run script
* correct training
* finish changes
* finish model
* correct bug
* fix training a bit more
* add some tests
* finish gradient checkpointing
* finish example
* correct gradient checkpointing
* improve tokenization method
* revert changes in tokenizer
* revert general change
* adapt fine-tuning
* update
* save intermediate test
* Update README.md
* finish finetuning
* delete conversion script
* Update src/transformers/models/wav2vec2/configuration_wav2vec2.py
* Update src/transformers/models/wav2vec2/processing_wav2vec2.py
* finish wav2vec2 script
* finish wav2vec2 fine-tuning
* finalize test
* correct test
* adapt tests
* finish
* remove test file

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
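The fine-tuning work above builds on the Wav2Vec2ForCTC and Wav2Vec2Processor classes. A minimal inference sketch (the checkpoint name is illustrative, and `speech` stands in for real 16 kHz audio):

```python
import torch
from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

processor = Wav2Vec2Processor.from_pretrained("facebook/wav2vec2-base-960h")
model = Wav2Vec2ForCTC.from_pretrained("facebook/wav2vec2-base-960h")

# One second of silence as a placeholder for a real 16 kHz waveform.
speech = torch.zeros(16000).numpy()
inputs = processor(speech, sampling_rate=16000, return_tensors="pt")

with torch.no_grad():
    logits = model(inputs.input_values).logits

# Greedy CTC decoding back to text.
predicted_ids = torch.argmax(logits, dim=-1)
transcription = processor.batch_decode(predicted_ids)
```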
-
- 27 Feb, 2021 3 commits
-
Bhadresh Savani authored
* updated logging and saving metrics
* space removal
-
Stas Bekman authored
This PR restores the original functionality that for some reason was modified. Fixes: https://github.com/huggingface/transformers/issues/10381 @sgugger
-
Stas Bekman authored
* refactors
* typo
-
- 25 Feb, 2021 3 commits
-
Sylvain Gugger authored
-
Sylvain Gugger authored
* Add support for ZeRO-2/3 and ZeRO-offload in fairscale
* Quality
* Rework from review comments
* Add doc
* Apply suggestions from code review
* Address review comments

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
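A sketch of how this surfaces in TrainingArguments; the space-separated option string (assumed values: zero_dp_2, zero_dp_3, offload) is my reading of the integration, and it requires fairscale plus a multi-GPU distributed launch:

```python
from transformers import TrainingArguments

# Assumed option syntax: sharded_ddp takes ZeRO-2 or ZeRO-3 data parallelism,
# optionally combined with CPU offload. Needs fairscale installed.
args = TrainingArguments(
    output_dir="output",
    per_device_train_batch_size=8,
    sharded_ddp="zero_dp_2 offload",
)
```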
-
Patrick von Platen authored
[PretrainedFeatureExtractor] + Wav2Vec2FeatureExtractor, Wav2Vec2Processor, Wav2Vec2Tokenizer (#10324)
* push to show
* small improvement
* small improvement
* Update src/transformers/feature_extraction_utils.py
* Update src/transformers/feature_extraction_utils.py
* implement base
* add common tests
* make all tests pass for wav2vec2
* make padding work & add more tests
* finalize feature extractor utils
* add call method to feature extraction
* finalize feature processor
* finish tokenizer
* finish general processor design
* finish tests
* typo
* remove bogus file
* finish docstring
* add docs
* finish docs
* small fix
* correct docs
* save intermediate
* load changes
* apply changes
* apply changes to doc
* change tests
* apply Suraj's recommendations
* final changes
* Apply suggestions from code review
* fix typo
* fix import
* correct docstring
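The design introduced here composes a feature extractor (audio in) with a tokenizer (text out) behind one processor object. A sketch of that composition, with an illustrative checkpoint name:

```python
from transformers import Wav2Vec2FeatureExtractor, Wav2Vec2Processor, Wav2Vec2Tokenizer

feature_extractor = Wav2Vec2FeatureExtractor.from_pretrained("facebook/wav2vec2-base-960h")
tokenizer = Wav2Vec2Tokenizer.from_pretrained("facebook/wav2vec2-base-960h")

# The processor is a thin wrapper holding both components.
processor = Wav2Vec2Processor(feature_extractor=feature_extractor, tokenizer=tokenizer)

# Saving serializes both components into one directory for round-tripping.
processor.save_pretrained("./wav2vec2-processor")
processor = Wav2Vec2Processor.from_pretrained("./wav2vec2-processor")
```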
-
- 24 Feb, 2021 1 commit
-
Stas Bekman authored
* handle get_last_lr() before first step()
* abstract away the lr getting logic
* cleanup
* add test
* move to utils
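The fix guards against PyTorch schedulers being queried before their first step(). A simplified sketch of the idea (not the Trainer's exact code):

```python
def current_learning_rate(optimizer, lr_scheduler, has_stepped: bool) -> float:
    # scheduler.get_last_lr() is only meaningful after scheduler.step() has
    # run at least once; before that, read the LR straight off the optimizer.
    if has_stepped:
        return lr_scheduler.get_last_lr()[0]
    return optimizer.param_groups[0]["lr"]
```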
-
- 23 Feb, 2021 1 commit
-
Akmal authored
-
- 22 Feb, 2021 3 commits
-
Stas Bekman authored
* make logging and saving trainer built-in
* Update src/transformers/trainer.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
Stas Bekman authored
* implement gradient_accumulation_steps support in DeepSpeed integration
* typo
* cleanup
* cleanup
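With this change the Trainer's gradient_accumulation_steps and the DeepSpeed config stay in sync. A sketch of the relevant config fragment (values illustrative):

```python
# gradient_accumulation_steps is a standard DeepSpeed config key; this
# integration keeps it consistent with the Trainer's own setting.
ds_config = {
    "train_micro_batch_size_per_gpu": 8,
    "gradient_accumulation_steps": 4,
    "fp16": {"enabled": True},
}
```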
-
Stas Bekman authored
-
- 19 Feb, 2021 2 commits
-
Julien Plu authored
-
Joe Davison authored
-
- 18 Feb, 2021 2 commits
-
Joe Davison authored
* add zero-shot distillation script
* readme wordsmithing
* clean up code
* add multi-gpu teacher inference plus tidying up more code
* add use_fast_tokenizer arg
* update results in readme
* more readme wordsmithing
* style
* Add handle to readme
* fix code block
* add error+docs about distributed & tpu
* add @sgugger format requests
* xla -> tpu
* support fp16 for teacher preds
* no checkpoint by default
* add demo colab link
* add model sharing prompt + model link
* correct resulting acc of example

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
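The teacher task being distilled here is standard zero-shot classification; for reference, a minimal pipeline call (model choice illustrative):

```python
from transformers import pipeline

classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")
result = classifier(
    "The new GPU driver crashes on launch.",
    candidate_labels=["software bug", "sports", "cooking"],
)
print(result["labels"][0])  # highest-scoring label first
```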
-
Stas Bekman authored
* memory tracker metrics
* go back to eval for somewhat consistency
* handle no-gpu case
* deal with stackable eval calls
* restore callback order
* style
* simplify the API
* add test
* docs
* consistently use eval_ prefix
* improve docs
* Update src/transformers/trainer_utils.py
* rename method
* style

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
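A simplified sketch of what the tracker measures: allocation deltas around a stage, reported as metrics. The key names below are illustrative; per the commit, the real ones carry stage prefixes such as eval_:

```python
import torch

def measure_gpu_memory(stage_fn):
    # Record allocated-memory and peak-memory deltas around one stage.
    torch.cuda.reset_peak_memory_stats()
    before = torch.cuda.memory_allocated()
    result = stage_fn()
    metrics = {
        "mem_gpu_alloc_delta": torch.cuda.memory_allocated() - before,
        "mem_gpu_peaked_delta": torch.cuda.max_memory_allocated() - before,
    }
    return result, metrics
```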
-
- 17 Feb, 2021 1 commit
-
Stas Bekman authored
* fix invalid port
* missing requirements
-
- 16 Feb, 2021 1 commit
-
Zhang Cheng authored
-
- 15 Feb, 2021 2 commits
-
Suraj Patil authored
* move old s2s scripts to legacy
* add the tests back
* proper rename
* restore
* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Stas Bekman <stas@stason.org>
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
-
Stas Bekman authored
* fix run_seq2seq.py; porting DeepSpeed tests to it
* unrefactor
* defensive programming
* defensive programming 2
* port the rest of the trainer tests
* style
* a cleaner scripts dir finder
* cleanup
-
- 12 Feb, 2021 1 commit
-
Suraj Patil authored
* fix rouge metrics and task specific params
* fix typo
* round metrics
* typo
* remove task_specific_params
-
- 11 Feb, 2021 2 commits
-
Stas Bekman authored
* init devices/setup explicitly
* docs + test
* simplify
* cleanup
* cleanup
* cleanup
* correct the required dist setup
* derive local_rank from env LOCAL_RANK
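A sketch of the last bullet above: deriving the device from the LOCAL_RANK environment variable that torch.distributed.launch sets for each worker:

```python
import os
import torch

local_rank = int(os.environ.get("LOCAL_RANK", "-1"))
if local_rank != -1:
    # Distributed launch: pin this process to its assigned GPU.
    torch.cuda.set_device(local_rank)
    device = torch.device("cuda", local_rank)
else:
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
```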
-
Qbiwan authored
* remove xnli_compute_metrics, add load_dataset, load_metric, set_seed, metric.compute, load_metric
* fix
* fix
* fix
* push
* fix
* everything works
* fix init
* fix
* special treatment for sepconv1d
* style
* 🙏🏽
* add doc and cleanup
* fix doc
* fix doc again
* fix doc again
* Apply suggestions from code review
* make style
* Proposal that should work
* Remove needless code
* Fix test
* Apply suggestions from code review
* remove xnli_compute_metrics, add load_dataset, load_metric, set_seed, metric.compute, load_metric
* amend README
* removed data_args.task_name and replaced with task_name = "xnli"; use split function to load train and validation dataset separately; remove __post_init__; remove flag --task_name from README.
* removed dict task_to_keys, use str "xnli" instead of variable task_name, change preprocess_function to use examples["premise"] and examples["hypothesis"] directly, remove sentence1_key and sentence2_key, change compute_metrics function to cater only to the accuracy metric, add condition for train_language is None when using dataset.load_dataset()
* removed `torch.distributed.barrier()` and `import torch` as `from_pretrained` is able to do the work; amend README
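The pattern this commit moves run_xnli.py to, sketched with the datasets library (the language config "en" is illustrative):

```python
from datasets import load_dataset, load_metric

# Load XNLI splits directly instead of going through GLUE-style helpers.
train_dataset = load_dataset("xnli", "en", split="train")
eval_dataset = load_dataset("xnli", "en", split="validation")
metric = load_metric("xnli")

example = train_dataset[0]
print(example["premise"], example["hypothesis"], example["label"])
```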
-
- 10 Feb, 2021 2 commits
-
Stas Bekman authored
* free up memory at the end of train
* rework tests
* consistent formatting
* correction
-
Lysandre Debut authored
-