- 24 Aug, 2020 10 commits
-
-
Stas Bekman authored
* Create PULL_REQUEST_TEMPLATE.md Proposing to copy this neat feature from pytorch. This is a small template that lets a PR submitter state which issue the PR closes. * Update .github/PULL_REQUEST_TEMPLATE.md Co-authored-by: Kevin Canwen Xu <canwenxu@126.com>
Co-authored-by: Kevin Canwen Xu <canwenxu@126.com>
-
Sylvain Gugger authored
-
Sylvain Gugger authored
* Add optuna hyperparameter search to Trainer * @julien-c suggestions Co-authored-by: Julien Chaumond <chaumond@gmail.com> * Make compute_objective an arg function * Formatting * Rework to make it easier to add ray * Formatting * Initial support for Ray * Formatting * Polish and finalize * Add trial id to checkpoint with Ray * Smaller default * Use GPU in ray if available * Formatting * Fix test * Update install instruction Co-authored-by: Richard Liaw <rliaw@berkeley.edu> * Address review comments * Formatting post-merge Co-authored-by: Julien Chaumond <chaumond@gmail.com>
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
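For readers of this log, a minimal sketch of how the new hyperparameter search API can be called with the Optuna backend (Ray is analogous). The checkpoint name, datasets, and trial count below are illustrative assumptions, not part of the commit:

    # Sketch of the Trainer.hyperparameter_search API added in this commit.
    # Checkpoint name, datasets and n_trials are assumptions for illustration.
    from transformers import (AutoModelForSequenceClassification, Trainer,
                              TrainingArguments)

    def model_init():
        # A fresh model is built for every trial instead of passing `model=`.
        return AutoModelForSequenceClassification.from_pretrained("bert-base-uncased")

    trainer = Trainer(
        args=TrainingArguments(output_dir="hp_search"),
        model_init=model_init,
        train_dataset=train_dataset,  # assumed prepared elsewhere
        eval_dataset=eval_dataset,    # assumed prepared elsewhere
    )

    # backend="optuna" or backend="ray"; by default the eval loss is minimized.
    best_run = trainer.hyperparameter_search(direction="minimize", n_trials=10)
    print(best_run.hyperparameters)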
-
vblagoje authored
-
Sylvain Gugger authored
* Run new isort * More changes * Update CI, CONTRIBUTING and benchmarks
-
Teven authored
* Fixed DataCollatorForLanguageModeling + PermutationLanguageModeling not accepting lists of lists * Update data_collator.py * black was grumpy
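A short sketch of the behaviour this fix enables: the language-modeling collators now also accept plain Python lists of token ids rather than only tensors. The checkpoint name and token ids below are illustrative assumptions:

    # Sketch: after this fix the collator accepts lists of lists of token ids,
    # not just lists of torch tensors. The ids here are only approximate.
    from transformers import AutoTokenizer, DataCollatorForLanguageModeling

    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
    collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)

    examples = [
        [101, 7592, 2088, 102],        # roughly "[CLS] hello world [SEP]"
        [101, 2129, 2024, 2017, 102],  # roughly "[CLS] how are you [SEP]"
    ]
    batch = collator(examples)  # previously this call required torch tensors
    print(batch["input_ids"].shape, batch["labels"].shape)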
-
sgugger authored
-
Sylvain Gugger authored
* Don't reset the type of the dataset * Formatting * Update trainer.py Co-authored-by: Teven <teven.lescao@gmail.com>
-
Jared T Nielsen authored
-
Sagor Sarker authored
* Create README.md * Update README.md * Create README.md * Update README.md * added multiple codeswitch model
-
- 23 Aug, 2020 3 commits
-
-
Patrick von Platen authored
-
Sam Shleifer authored
-
Patrick von Platen authored
-
- 22 Aug, 2020 2 commits
-
-
Sagor Sarker authored
* Create README.md * Update README.md * Create README.md * Update README.md
-
Patrick von Platen authored
Model was trained with the wrong tokenizer. Retrained with the correct tokenizer - thanks for spotting it, @lhoestq!
-
- 21 Aug, 2020 8 commits
-
-
Manuel Romero authored
It works like a charm! Look at the output of the example code!
-
Suraj Patil authored
-
Patrick von Platen authored
@julien-c
-
Patrick von Platen authored
* add pegasus to docs * Update docs/source/model_summary.rst
-
Suraj Patil authored
* Added CamembertForCausalLM * add in __init__ and auto model * style * doc
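A minimal usage sketch of the class added here; the checkpoint name and the is_decoder flag follow the usual pattern for causal-LM heads on encoder models and are assumptions, not taken from the commit:

    # Sketch of loading the newly added CamembertForCausalLM.
    # "camembert-base" and is_decoder=True are assumptions for illustration.
    from transformers import CamembertConfig, CamembertForCausalLM, CamembertTokenizer

    config = CamembertConfig.from_pretrained("camembert-base", is_decoder=True)
    tokenizer = CamembertTokenizer.from_pretrained("camembert-base")
    model = CamembertForCausalLM.from_pretrained("camembert-base", config=config)

    inputs = tokenizer("Le camembert est", return_tensors="pt")
    outputs = model(**inputs)
    print(outputs[0].shape)  # prediction scores over the vocabulary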
-
josephrocca authored
-
Manuel Romero authored
-
Morgan Funtowicz authored
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
-
- 20 Aug, 2020 16 commits
-
-
Sylvain Gugger authored
* Add a classmethod to easily build a Trainer from nlp dataset and metric * Fix docstrings * Split train/eval * Formatting * Log dropped columns + docs * Authorize callable activations * Poc for auto activation * Be framework-agnostic * Formatting * Remove class method * Remove unnecessary code
-
Sam Shleifer authored
-
sgugger authored
-
Sylvain Gugger authored
* Move threshold up for flaky test with Electra * Update above as well
-
Ivan Dolgov authored
* xlnet fp16 bug fix * comment cast added * Update modeling_xlnet.py Co-authored-by: Kevin Canwen Xu <canwenxu@126.com>
-
Patrick von Platen authored
* fix distilbert * fix typo
-
Denisa Roberts authored
-
Joe Davison authored
* TFTrainer dataset doc & fix evaluation bug discussed in #6551 * add docstring to test/eval datasets
-
Sylvain Gugger authored
* Add tests to Trainer * Test if removing long breaks everything * Remove ugly hack * Fix distributed test * Use float for number of epochs
-
Joe Davison authored
* add intro to nlp lib + links * unique links...
-
sgugger authored
-
Prajjwal Bhargava authored
* removed redundant arg in prepare_inputs * made same change in prediction_loop
-
Romain Rigaux authored
Tested in a local build of the docs, e.g. just above https://huggingface.co/transformers/task_summary.html#causal-language-modeling

Copy will copy the full code, e.g.:

    for token in top_5_tokens:
        print(sequence.replace(tokenizer.mask_token, tokenizer.decode([token])))

instead of currently only:

    for token in top_5_tokens:
    >>> for token in top_5_tokens:
    ...     print(sequence.replace(tokenizer.mask_token, tokenizer.decode([token])))

Output of the example:

    Distilled models are smaller than the models they mimic. Using them instead of the large versions would help reduce our carbon footprint.
    Distilled models are smaller than the models they mimic. Using them instead of the large versions would help increase our carbon footprint.
    Distilled models are smaller than the models they mimic. Using them instead of the large versions would help decrease our carbon footprint.
    Distilled models are smaller than the models they mimic. Using them instead of the large versions would help offset our carbon footprint.
    Distilled models are smaller than the models they mimic. Using them instead of the large versions would help improve our carbon footprint.

Docs for the option fix: https://sphinx-copybutton.readthedocs.io/en/latest/
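Based on the linked sphinx-copybutton docs, the option fix presumably amounts to settings like the following in the Sphinx conf.py; the exact regexp used in the repository is an assumption:

    # Sketch of sphinx-copybutton options (docs/source/conf.py) that make the
    # copy button strip ">>> " and "... " prompts so only runnable code is
    # copied. The exact pattern in the repository may differ.
    copybutton_prompt_text = r">>> |\.\.\. "
    copybutton_prompt_is_regexp = True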
-
Stas Bekman authored
-
Siddharth Jain authored
-
Oren Amsalem authored
-
- 19 Aug, 2020 1 commit
-
-
Sylvain Gugger authored
-