- 29 Jun, 2021 1 commit
-
-
Stas Bekman authored
* [models] respect dtype of the model when instantiating it * cleanup * cleanup * rework to handle non-float dtype * fix * switch to fp32 tiny model * improve * use dtype.is_floating_point * Apply suggestions from code review Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * fix the doc * recode to use explicit torch_dtype_auto_detect, torch_dtype args * docs and tweaks * docs and tweaks * docs and tweaks * merge 2 args, add docs * fix * fix * better doc * better doc Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
- 28 Jun, 2021 13 commits
-
-
Patrick von Platen authored
* fix_torch_device_generate_test * remove @ * add length computation * finish masking * finish * upload * fix some bugs * finish * fix dependency table * correct tensorboard * Apply suggestions from code review * correct processing * slight change init * correct some more mistakes * apply suggestions * improve readme * fix indent * Apply suggestions from code review Co-authored-by:
SaulLu <55560583+SaulLu@users.noreply.github.com> * correct tokenizer * finish * finish * finish * finish Co-authored-by:
Patrick von Platen <patrick@huggingface.co> Co-authored-by:
SaulLu <55560583+SaulLu@users.noreply.github.com>
-
Stas Bekman authored
-
Matt authored
* Tensorflow MLM example * Add CLM example * Style fixes, adding missing checkpoint code from the CLM example * Fix TPU training, avoid massive dataset warnings * Fix incorrect training length calculation for multi-GPU training * Fix incorrect training length calculation for multi-GPU training * Refactors and nitpicks from the review * Style pass * Adding README
-
Patrick von Platen authored
* fix_torch_device_generate_test * remove @ * finish * correct summary writer * correct push to hub * fix indent * finish * finish * finish * finish * finish Co-authored-by: Patrick von Platen <patrick@huggingface.co>
-
Funtowicz Morgan authored
* debug albert einsum * Fix matmul computation * Let's use torch linear layer. * Style.
-
Sylvain Gugger authored
-
Patrick von Platen authored
-
Patrick von Platen authored
* fix_torch_device_generate_test * remove @ * boom boom * correct typos * Apply suggestions from code review Co-authored-by:
Suraj Patil <surajp815@gmail.com> * Apply suggestions from code review Co-authored-by:
Suzana Ilić <io.suzanai@gmail.com> * Apply suggestions from code review Co-authored-by:
Suraj Patil <surajp815@gmail.com> Co-authored-by:
Suzana Ilić <io.suzanai@gmail.com>
-
Bhadresh Savani authored
* added context manager to datasets map * fixed style and spaces * fixed warning of deprecation * changed desc
-
Stas Bekman authored
* add dependency table sync verification * improve the message * improve the message * revert * ready to merge
-
Sylvain Gugger authored
-
Taha ValizadehAslani authored
Previously, the code could not be used for validation only, because the line `extension = data_args.train_file.split(".")[-1]` assumed that the extension must be extracted from the training dataset. This line ran regardless of whether the user requested training or validation, which led to an error if the user wanted to run evaluation only (because the training file does not exist). I modified it to extract the extension from the training file if the user wants to train, and from the validation file if the user wants to run eval. This way the code can be used for training and validation separately.
-
Kilian Kluge authored
[Documentation] Warn that DataCollatorForWholeWordMask is limited to BertTokenizer-like tokenizers (#12371) * Notify users that DataCollatorForWholeWordMask is limited to BertTokenizer-like tokenizers * Fix code formatting
-
- 26 Jun, 2021 2 commits
-
-
Bhadresh Savani authored
-
Bhadresh Savani authored
-
- 25 Jun, 2021 10 commits
-
-
Bhadresh Savani authored
* added log_level * fix comment * fixed log_level * Trigger CI * Unified logging * simplified args for log_level
-
Stas Bekman authored
* main_process_first context manager * handle multi-node, add context description * sync desc
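The idea behind a "main process first" context manager can be sketched as follows. This is only an illustration under stated assumptions, not the library's implementation (which this log does not show): threads stand in for distributed processes, `threading.Barrier` stands in for a distributed barrier, and `run_ranks` is a hypothetical helper added for the demonstration. Multi-node handling is not covered.

```python
import threading
from contextlib import contextmanager


@contextmanager
def main_process_first(rank: int, barrier: threading.Barrier):
    """Let rank 0 run the wrapped block before every other rank."""
    if rank != 0:
        barrier.wait()  # replicas block here until rank 0 has finished
    try:
        yield
    finally:
        if rank == 0:
            barrier.wait()  # rank 0 is done: release the waiting replicas


def run_ranks(world_size: int = 3) -> list:
    """Hypothetical helper: simulate ranks with threads, record the order."""
    barrier = threading.Barrier(world_size)
    order = []

    def worker(rank: int) -> None:
        with main_process_first(rank, barrier):
            order.append(rank)  # stands in for e.g. dataset preprocessing

    threads = [
        threading.Thread(target=worker, args=(rank,))
        for rank in range(world_size)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return order
```

Calling `run_ranks(3)` returns a list whose first element is always `0`; the order of the remaining ranks is not fixed. The point of the pattern is that the main process can do expensive one-off work (such as building a cache) while the replicas wait and then reuse the result.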
-
cronoik authored
* fixed multiple-choice tokenization. The model would have seen two sequences: 1. [CLS]prompt[SEP]prompt[SEP] 2. [CLS]choice0[SEP]choice1[SEP]. That is not correct, as we want a contextualized embedding of the prompt and each choice * removed outer brackets for proper sequence generation
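The pairing problem described in this commit can be illustrated with a small sketch. It assumes a hypothetical `encode_pair` helper that mimics sentence-pair encoding with plain strings; it is not the tokenizer's real API, and the variable names are made up for the example.

```python
def encode_pair(first: str, second: str) -> str:
    """Hypothetical stand-in for sentence-pair encoding: [CLS]a[SEP]b[SEP]."""
    return f"[CLS]{first}[SEP]{second}[SEP]"


def encode_batch(pairs):
    """Encode each (first, second) pair in the batch."""
    return [encode_pair(first, second) for first, second in pairs]


prompt = "prompt"
choices = ["choice0", "choice1"]

# Wrong: with the extra outer brackets, the batch is read as the pairs
# (prompt, prompt) and (choice0, choice1).
wrong = encode_batch([[prompt] * len(choices), choices])

# Intended: one (prompt, choice) pair per candidate answer.
right = encode_batch(list(zip([prompt] * len(choices), choices)))
```

Here `wrong` reproduces the two sequences quoted in the commit message, while `right` pairs the prompt with each choice so that every candidate is encoded in the context of the prompt.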
-
Stas Bekman authored
-
Sylvain Gugger authored
-
Kai Fricke authored
* Replace NotebookProgressReporter by ProgressReporter in Ray Tune run * Move to local import
-
Vasudev Gupta authored
* port bigbird script * adapt script a bit * change location * adapt more * save progress * init commit * style * dataset script tested * readme add
-
jglaser authored
* fix distributed_concat for scalar outputs * Update README.md * fixed typo (#12356) * simplify fix with terser syntax Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Trigger CI Co-authored-by:
Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by:
michal pitr <21157924+MichalPitr@users.noreply.github.com> Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
michal pitr authored
-
Patrick von Platen authored
-
- 24 Jun, 2021 5 commits
-
-
Marc van Zee authored
-
Sylvain Gugger authored
-
Lysandre Debut authored
* Fix torchscript tests * Better test * Remove bogus print
-
Suraj Patil authored
-
Richard Liaw authored
Signed-off-by: Richard Liaw <rliaw@berkeley.edu>
-
- 23 Jun, 2021 9 commits
-
-
Sylvain Gugger authored
-
Stas Bekman authored
* document sub_group_size * style * install + issues reporting * style * style * Update docs/source/main_classes/deepspeed.rst Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * indent 4 * restore * style Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
Sam Havens authored
mention in `save_strategy` param description that `load_best_model_at_end` can override
-
Sylvain Gugger authored
-
Sylvain Gugger authored
-
Patrick von Platen authored
* finish t5 flax fixes * improve naming
-
Sylvain Gugger authored
-
Michael Benayoun authored
Co-authored-by: Michael Benayoun <michael@huggingface.co>
-
Lysandre authored
-