- 18 Dec, 2020 (5 commits)

Sylvain Gugger authored
  * Add new run_swag example
  * Add check
  * Add sample
  * Apply suggestions from code review
  * Very important change to make Lysandre happy
  Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

Manuel Romero authored

Wissam Antoun authored

Manuel Romero authored

Stas Bekman authored

- 17 Dec, 2020 (1 commit)

Stas Bekman authored

- 16 Dec, 2020 (3 commits)

Sylvain Gugger authored
  * Experimental support for fairscale ShardedDDP
  * Add import error if fairscale is not available
  * Address review comments
  * Fix seq2seq trainer
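
A minimal sketch of how this experimental support would be switched on from a training script. The boolean `sharded_ddp` field on `TrainingArguments` is an assumption based on this era of the library, not copied from the diff:

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="out",
    per_device_train_batch_size=8,
    sharded_ddp=True,  # assumed flag: shard optimizer state and gradients across ranks
)
# With the flag set, Trainer wraps the model in fairscale's ShardedDataParallel
# when fairscale is installed, and raises an ImportError otherwise.
```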

Sylvain Gugger authored

Patrick von Platen authored
  * save intermediate
  * save intermediate
  * save intermediate
  * correct flax bert model file
  * new module / model naming
  * make style
  * almost finish BERT
  * finish roberta
  * make fix-copies
  * delete keys file
  * last refactor
  * fixes in run_mlm_flax.py
  * remove pooled from run_mlm_flax.py
  * fix gelu | gelu_new
  * remove Module from inits
  * splits
  * dirty print
  * preventing warmup_steps == 0
  * smaller splits
  * make fix-copies
  * dirty print
  * dirty print
  * initial_evaluation argument
  * declaration order fix
  * proper model initialization/loading
  * proper initialization
  * run_mlm_flax improvements: improper model inputs bugfix + automatic dataset splitting + tokenizers parallelism warning + avoiding warmup_steps=0 bug
  * removed tokenizers warning hack, fixed model re-initialization
  * reverted training_args.py changes
  * fix flax from pretrained
  * improve test in flax
  * apply Sylvain's tips
  * update init
  * make 0.3.0 compatible
  * revert Teven's changes
  * revert Teven's changes 2
  * finalize revert
  * fix bug
  * add docs
  * add pretrained to init
  * Update src/transformers/modeling_flax_utils.py
  * fix copies
  * final improvements
  Co-authored-by: TevenLeScao <teven.lescao@gmail.com>
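
A minimal sketch of what the "fix flax from pretrained" / "add pretrained to init" items above enable: loading a checkpoint straight into the Flax model class (the model name is illustrative; assumes jax and flax are installed alongside transformers):

```python
from transformers import BertTokenizerFast, FlaxBertModel

tokenizer = BertTokenizerFast.from_pretrained("bert-base-cased")
model = FlaxBertModel.from_pretrained("bert-base-cased")

# "np" tensors are plain numpy arrays, which the Flax model accepts directly.
inputs = tokenizer("Flax BERT now loads pretrained weights.", return_tensors="np")
outputs = model(**inputs)  # hidden states; exact return structure varied in early Flax support
```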

- 15 Dec, 2020 (4 commits)

Teven authored
  * replaced jnp.split + removing textual model inputs + ensuring warmup_steps > 0
  * Add automatic dataset splitting in language-modeling examples
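
Why `warmup_steps > 0` matters, as a sketch: a linear warmup schedule divides the current step by `warmup_steps`, so zero would crash with a division error. This is the assumed shape of the schedule, not the script's exact code:

```python
def linear_warmup_lr(step: int, base_lr: float, warmup_steps: int) -> float:
    warmup_steps = max(1, warmup_steps)  # guard: warmup_steps == 0 would divide by zero
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    return base_lr

assert linear_warmup_lr(0, 1e-3, 0) == 0.0  # no ZeroDivisionError with the guard
```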

Stas Bekman authored
  Update the README with the good news that the leak fix has been applied to pytorch-1.7.1.

Yoshitomo Matsubara authored

Stas Bekman authored
  * trainer and finetune_trainer enhancements and fixes
  * add fallback default
  * move the fixing of incorrect keys back into finetune_trainer
  * s/eval/val/ to match the split
  * trainer can now use a different prefix than eval_ for metrics
  * document new arg
  * Apply suggestions from code review
  * use 'eval' as the default for metric_key_prefix
  * complete adjust var names + disambiguate
  * fix logger
  * add clarifying comment
  * add clarifying comment
  * style
  * Apply suggestions from code review
  * Update src/transformers/trainer.py
  * complete removal of Optional for metric_key_prefix
  Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
  Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
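
A minimal usage sketch of the `metric_key_prefix` argument added here; `trainer` is assumed to be an already-configured `transformers.Trainer`:

```python
# By default, evaluate() reports metrics as eval_loss, eval_runtime, ...
metrics = trainer.evaluate(metric_key_prefix="val")
print(metrics["val_loss"])  # would have been metrics["eval_loss"] by default
```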

- 11 Dec, 2020 (3 commits)

Sylvain Gugger authored

dependabot[bot] authored
  Bumps [notebook](https://github.com/jupyter/jupyterhub) from 6.1.4 to 6.1.5.
  - [Release notes](https://github.com/jupyter/jupyterhub/releases)
  - [Changelog](https://github.com/jupyterhub/jupyterhub/blob/master/CHECKLIST-Release.md)
  - [Commits](https://github.com/jupyter/jupyterhub/commits)
  Signed-off-by: dependabot[bot] <support@github.com>
  Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

Sylvain Gugger authored
  * Reorganize example folder
  * Continue reorganization
  * Change requirements for tests
  * Final cleanup
  * Finish regroup with tests all passing
  * Copyright
  * Requirements and readme
  * Make a full link for the documentation
  * Address review comments
  * Apply suggestions from code review
  * Add symlink
  * Reorg again
  * Apply suggestions from code review
  * Adapt title
  * Update to new structure
  * Remove test
  * Update READMEs
  Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
  Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>

- 10 Dec, 2020 (1 commit)

NatLun137 authored
  There is a tiny typo in `transformers/examples/language-modeling/run_mlm_wwm.py` at line 284. [Details.](https://github.com/huggingface/transformers/issues/9012)

- 09 Dec, 2020 (1 commit)

Funtowicz Morgan authored
* Remove "Model" suffix from Flax models to look more :hugs: Signed-off-by:
Morgan Funtowicz <morgan@huggingface.co> * Initial working (forward + backward) for Flax MLM training example. Signed-off-by:
Morgan Funtowicz <morgan@huggingface.co> * Simply code Signed-off-by:
Morgan Funtowicz <morgan@huggingface.co> * Addressing comments, using module and moving to LM task. Signed-off-by:
Morgan Funtowicz <morgan@huggingface.co> * Restore parameter name "module" wrongly renamed model. Signed-off-by:
Morgan Funtowicz <morgan@huggingface.co> * Restore correct output ordering... Signed-off-by:
Morgan Funtowicz <morgan@huggingface.co> * Actually commit the example
馃槄 Signed-off-by:Morgan Funtowicz <morgan@huggingface.co> * Add FlaxBertModelForMaskedLM after rebasing. Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Make it possible to initialize the training from scratch Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Reuse flax linen example of cross entropy loss Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Added specific data collator for flax Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Remove todo for data collator Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Added evaluation step Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Added ability to provide dtype to support bfloat16 on TPU Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Enable flax tensorboard output Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Enable jax.pmap support. Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Ensure batches are correctly sized to be dispatched with jax.pmap Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Enable bfloat16 with --fp16 cmdline args Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Correctly export metrics to tensorboard Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Added dropout and ability to use it. Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Effectively enable & disable during training and evaluation steps. Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Oops. Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Enable specifying kernel initializer scale Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Style. Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Added warmup step to the learning rate scheduler. Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Fix typo. Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Print training loss Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Make style Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * fix linter issue (flake8) Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Fix model matching Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Fix dummies Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Fix non default dtype on Flax models Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Use the same create_position_ids_from_input_ids for FlaxRoberta Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Make Roberta attention as Bert Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * fix copy Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Wording. Co-authored-by:
Marc van Zee <marcvanzee@gmail.com> Co-authored-by:
Marc van Zee <marcvanzee@gmail.com>
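
A minimal sketch of the `jax.pmap` batch-dispatch pattern the items above refer to (illustrative code, not the example script): the global batch must be reshaped so its leading axis equals the local device count:

```python
import jax
import jax.numpy as jnp

def train_step(params, batch):
    # Stand-in for the real forward/backward pass.
    return params, jnp.mean(batch)

p_train_step = jax.pmap(train_step, axis_name="batch")

n_devices = jax.local_device_count()
params = {"w": jnp.ones((n_devices, 3))}       # parameters replicated per device
global_batch = jnp.ones((n_devices * 8, 128))  # (global_batch_size, seq_len)

# Shard: (global_batch_size, ...) -> (n_devices, per_device_batch_size, ...)
sharded = global_batch.reshape((n_devices, -1) + global_batch.shape[1:])
params, loss = p_train_step(params, sharded)   # loss has shape (n_devices,)
```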

- 08 Dec, 2020 (1 commit)

Sylvain Gugger authored
  * Add new SQUAD example
  * Same with a task-specific Trainer
  * Address review comment.
  * Small fixes
  * Initial work for XLNet
  * Apply suggestions from code review
  * Final clean up and working XLNet script
  * Test and debug
  * Final working version
  * Add tick
  * Update README
  * Address review comments
  Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

- 07 Dec, 2020 (3 commits)

Sylvain Gugger authored
  * Add copyright everywhere it was missing
  * Style

Sylvain Gugger authored

Sylvain Gugger authored
  * Use word_ids to get labels in run_ner
  * Add sanity check
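
A minimal sketch of the `word_ids` label-alignment pattern referenced above (labels are illustrative; requires a fast tokenizer): the tokenizer reports which word each sub-token came from, so word-level labels can be propagated to sub-tokens and special tokens masked out:

```python
from transformers import BertTokenizerFast

tokenizer = BertTokenizerFast.from_pretrained("bert-base-cased")
words = ["Hugging", "Face", "rocks"]
word_labels = [3, 3, 0]  # one label id per word, e.g. B-ORG / I-ORG / O

encoding = tokenizer(words, is_split_into_words=True)
labels = [
    -100 if word_id is None else word_labels[word_id]  # -100 is ignored by the loss
    for word_id in encoding.word_ids()
]
print(list(zip(encoding.tokens(), labels)))
```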

- 05 Dec, 2020 (1 commit)

Ethan Perez authored
  Without this fix, training a `BARTForSequenceClassification` model with `run_pl_glue.py` gives `TypeError: forward() got an unexpected keyword argument 'token_type_ids'`, because BART does not have token_type_ids. The issue is solved in the same way as for the "distilbert" model, and BART models now train on SNLI without errors.
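
A minimal sketch of the described fix (names assumed, not copied from `run_pl_glue.py`): only forward `token_type_ids` to model types whose `forward()` accepts them:

```python
# Illustrative set; in the script this would be keyed off the model type string.
MODEL_TYPES_WITHOUT_TOKEN_TYPE_IDS = {"distilbert", "bart"}

def build_inputs(batch: dict, model_type: str) -> dict:
    inputs = {"input_ids": batch["input_ids"], "attention_mask": batch["attention_mask"]}
    if model_type not in MODEL_TYPES_WITHOUT_TOKEN_TYPE_IDS:
        inputs["token_type_ids"] = batch["token_type_ids"]
    return inputs
```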

- 04 Dec, 2020 (2 commits)

Stas Bekman authored
  * document the caveat of leaky native amp
  * Update examples/seq2seq/README.md
  Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Stas Bekman authored

- 01 Dec, 2020 (1 commit)

Stas Bekman authored

- 30 Nov, 2020 (3 commits)

Stas Bekman authored
  * fix DP case on multi-gpu
  * make executable
  * test all 3 modes
  * use the correct check for distributed
  * dp doesn't need a special case
  * restore original name
  * cleanup
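
A sketch of telling the "all 3 modes" above apart in a PyTorch script: single-GPU, DataParallel (DP), and DistributedDataParallel (DDP). This is the commonly used check, assumed rather than copied from the commit:

```python
import torch

def launch_mode(local_rank: int) -> str:
    if local_rank != -1:               # set by the distributed launcher (DDP)
        return "ddp"
    if torch.cuda.device_count() > 1:  # plain multi-GPU falls back to DataParallel
        return "dp"
    return "single"
```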

Sylvain Gugger authored
  * Remove deprecated `evaluate_during_training`
  * Update src/transformers/training_args_tf.py
  Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

Stefan Schweter authored

- 26 Nov, 2020 (4 commits)

Stas Bekman authored

chutaklee authored
  * Fix pplm
  * fix style
  * make style
  Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

Patrick von Platen authored
  This reverts commit 5aa361f3.

Daniel Khashabi authored

- 24 Nov, 2020 (3 commits)

Stas Bekman authored
  * implement support for run-time dependency version checking
  * try not escaping !
  * use findall that works on py36
  * small tweaks
  * autoformatter worship
  * simplify
  * shorter names
  * add support for non-versioned checks
  * add deps
  * revert
  * tokenizers not required, check version only if installed
  * make a proper distutils cmd and add make target
  * tqdm must be checked before tokenizers
  * workaround the DistributionNotFound peculiar setup
  * handle the rest of packages in setup.py
  * fully sync setup.py's install_requires - to check them all
  * nit
  * make install_requires more readable
  * typo
  * Update setup.py
  * restyle
  * add types
  * simplify
  * simplify2
  Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
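
A generic sketch of run-time dependency version checking in the spirit of this change (not the exact helper it introduces): read the installed version via `importlib.metadata` and compare against a minimum:

```python
from importlib.metadata import PackageNotFoundError, version

def require_min_version(package: str, minimum: str) -> None:
    """Fail fast if `package` is missing or older than `minimum`."""
    try:
        installed = version(package)
    except PackageNotFoundError:
        raise ImportError(f"{package} is required but not installed")

    def to_key(v: str) -> tuple:
        return tuple(int(p) for p in v.split(".") if p.isdigit())

    # Naive comparison; a production checker would use packaging.version (PEP 440).
    if to_key(installed) < to_key(minimum):
        raise ImportError(f"{package}>={minimum} is required, found {installed}")

require_min_version("tqdm", "4.27")
```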

Quentin Lhoest authored

zhiheng-huang authored
  * Support BERT relative position embeddings
  * Fix typo in README.md
  * Address review comment
  * Fix failing tests
  * [tiny] Fix style_doc.py check by adding an empty line to configuration_bert.py
  * make fix copies
  * fix configs of electra and albert and fix longformer
  * remove copy statement from longformer
  * fix albert
  * fix electra
  * Add bert variants forward tests for various position embeddings
  * [tiny] Fix style for test_modeling_bert.py
  * improve docstring
  * [tiny] improve docstring and remove unnecessary dependency
  * [tiny] Remove unused import
  * re-add to ALBERT
  * make embeddings work for ALBERT
  * add test for albert
  Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
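
A minimal sketch of what this adds at the config level. The `position_embedding_type` values ("absolute", "relative_key", "relative_key_query") are taken to be the ones introduced here; treat the exact names as assumptions:

```python
from transformers import BertConfig, BertModel

config = BertConfig(position_embedding_type="relative_key")  # default: "absolute"
model = BertModel(config)
# Self-attention now adds learned embeddings of the key/query relative
# distances (Shaw et al. style) instead of relying only on absolute positions.
```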

- 23 Nov, 2020 (2 commits)

Sylvain Gugger authored

Stas Bekman authored
  * make generate work with multigpu
  * better fix - thanks @sgugger
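
A sketch of the underlying issue (assumed, not the commit's exact fix): `torch.nn.DataParallel` hides the wrapped model's extra methods, so `generate()` has to be called on the unwrapped module:

```python
import torch

def unwrap_model(model: torch.nn.Module) -> torch.nn.Module:
    # DataParallel/DistributedDataParallel store the real model in `.module`.
    return model.module if hasattr(model, "module") else model

# generated = unwrap_model(model).generate(input_ids, max_length=64)
```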

- 22 Nov, 2020 (1 commit)

Santiago Castro authored

- 20 Nov, 2020 (1 commit)

Quentin Lhoest authored
  * replace init_ddp_connection for index init
  * style
  * add finetune test
  * add test data
  * move generate tensors to device
  * add test on EM metric
  * style
  * allow multi process test
  * keep gloo process group for retrieval
  * add multi-gpu test
  * use custom accelerator
  * clean test finetune
  * minor
  * style
  * style
  * typo
  * use python call instead of imported main function
  * return_dict fix in modeling_rag
  * use float32 in retrieval
  * store as float32 as well in the custom knowledge dataset example
  * style
  * rename to finetune_rag
  * style
  * update readme
  * rename utils and callbacks to utils_rag and callbacks_rag
  * fix test
  * patrick's comments
  * generate dummy data in the finetune test script
  * remove dummy data files
  * style