- 28 Jun, 2021 2 commits
-
-
Taha ValizadehAslani authored
Previously, the code could not be used for validation only, because the line `extension = data_args.train_file.split(".")[-1]` assumed that the extension must be extracted from the training dataset. This line ran regardless of the user's training or validation options, which led to an error when the user only wanted to run evaluation without training (because the training file does not exist). I modified it to extract the extension from the training file when the user wants to train, and from the validation file when the user wants to run evaluation only. This way the code can be used for training and validation separately. -
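A minimal sketch of the behaviour described above (the helper and argument names are illustrative, not the actual example-script code): the extension is taken from the training file only when training is requested, otherwise from the validation file.

```python
def resolve_extension(train_file, validation_file, do_train):
    """Pick the dataset file whose extension decides the loader.

    Illustrative helper: look at the training file only when training is
    requested; for an evaluation-only run, fall back to the validation file.
    """
    source = train_file if do_train and train_file is not None else validation_file
    return source.split(".")[-1]


# An evaluation-only run with no training file no longer crashes:
assert resolve_extension(None, "dev.json", do_train=False) == "json"
```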
Kilian Kluge authored
[Documentation] Warn that DataCollatorForWholeWordMask is limited to BertTokenizer-like tokenizers (#12371) * Notify users that DataCollatorForWholeWordMask is limited to BertTokenizer-like tokenizers * Fix code formatting
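A short, hedged usage sketch (model name and sample text are illustrative): the collator groups the WordPiece sub-tokens that BERT-style tokenizers mark with the "##" prefix, which is why it is documented as limited to BertTokenizer-like tokenizers.

```python
from transformers import BertTokenizerFast, DataCollatorForWholeWordMask

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
collator = DataCollatorForWholeWordMask(tokenizer=tokenizer, mlm_probability=0.15)

# Sub-word pieces of the same word (e.g. "mask", "##ing") are masked together.
features = [tokenizer("Whole word masking masks sub-word pieces together.")]
batch = collator(features)
print(batch["input_ids"].shape, batch["labels"].shape)
```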
-
- 26 Jun, 2021 2 commits
-
-
Bhadresh Savani authored
-
Bhadresh Savani authored
-
- 25 Jun, 2021 10 commits
-
-
Bhadresh Savani authored
* added log_level * fix comment * fixed log_level * Trigger CI * Unified logging * simplified args for log_level
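A hedged example of the new option (the output directory and level value are illustrative): `log_level` sets the logging verbosity used on the main process, while the default leaves the current level untouched.

```python
from transformers import TrainingArguments

# Illustrative values only: "info" raises verbosity on the main process;
# the default setting keeps whatever logging level is already configured.
args = TrainingArguments(output_dir="out", log_level="info")
```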
-
Stas Bekman authored
* main_process_first context manager * handle multi-node, add context description * sync desc
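A sketch of how the context manager is typically used; the preprocessing helper below is hypothetical and stands in for work (e.g. a datasets `map` call) whose cached result the other ranks can reuse.

```python
from transformers import TrainingArguments


def preprocess_dataset():
    # Hypothetical stand-in for a datasets.map(...) call whose cache
    # the non-main ranks can reuse once the main process has built it.
    return ["tokenized example"]


training_args = TrainingArguments(output_dir="out")

# The main process enters the block first; the other ranks wait, then run it
# and hit the cache. `desc` feeds the synchronization log message.
with training_args.main_process_first(desc="dataset map pre-processing"):
    processed = preprocess_dataset()
```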
-
cronoik authored
* fixed multiple-choice tokenization Previously the model would have seen two sequences: 1. [CLS]prompt[SEP]prompt[SEP] 2. [CLS]choice0[SEP]choice1[SEP] which is not correct, as we want a contextualized embedding of prompt and choice * removed outer brackets for proper sequence generation
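An illustrative sketch of the corrected pairing (model name and texts are made up): the prompt is paired with each choice so every encoded sequence is [CLS] prompt [SEP] choice [SEP], giving a contextualized embedding of prompt and choice together.

```python
from transformers import BertTokenizerFast

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")

prompt = "The capital of France is"
choices = ["Paris.", "Berlin."]

# Pair the prompt with each choice: [CLS] prompt [SEP] choice [SEP]
encoding = tokenizer([prompt] * len(choices), choices, padding=True, return_tensors="pt")
print(tokenizer.batch_decode(encoding["input_ids"]))
```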
-
Stas Bekman authored
-
Sylvain Gugger authored
-
Kai Fricke authored
* Replace NotebookProgressReporter by ProgressReporter in Ray Tune run * Move to local import
-
Vasudev Gupta authored
* port bigbird script * adapt script a bit * change location * adapt more * save progress * init commit * style * dataset script tested * readme add
-
jglaser authored
* fix distributed_concat for scalar outputs * Update README.md * fixed typo (#12356) * simplify fix with terser syntax Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Trigger CI Co-authored-by:
Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by:
michal pitr <21157924+MichalPitr@users.noreply.github.com> Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
michal pitr authored
-
Patrick von Platen authored
-
- 24 Jun, 2021 5 commits
-
-
Marc van Zee authored
-
Sylvain Gugger authored
-
Lysandre Debut authored
* Fix torchscript tests * Better test * Remove bogus print
-
Suraj Patil authored
-
Richard Liaw authored
Signed-off-by: Richard Liaw <rliaw@berkeley.edu>
-
- 23 Jun, 2021 21 commits
-
-
Sylvain Gugger authored
-
Stas Bekman authored
* document sub_group_size * style * install + issues reporting * style * style * Update docs/source/main_classes/deepspeed.rst Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * indent 4 * restore * style Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
Sam Havens authored
mention in the `save_strategy` param description that `load_best_model_at_end` can override it
-
Sylvain Gugger authored
-
Sylvain Gugger authored
-
Patrick von Platen authored
* finish t5 flax fixes * improve naming
-
Sylvain Gugger authored
-
Michael Benayoun authored
Co-authored-by: Michael Benayoun <michael@huggingface.co>
-
Lysandre authored
-
Lysandre Debut authored
-
Sylvain Gugger authored
* Add all XxxPreTrainedModel to the main init * Add to template * Add to template bis * Add FlaxT5
-
Sylvain Gugger authored
* Clean push to hub API * Create working dir if it does not exist * Different tweak * New API + all models + test Flax * Adds the Trainer clean up * Update src/transformers/file_utils.py Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> * Address review comments * (nit) output types * No need to set clone_from when folder exists * Update src/transformers/trainer.py Co-authored-by:
Julien Chaumond <julien@huggingface.co> * Add generated_from_trainer tag * Update to new version * Fixes Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> Co-authored-by:
Julien Chaumond <julien@huggingface.co> Co-authored-by:
Lysandre <lysandre.debut@reseau.eseo.fr>
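A hedged sketch of the cleaned-up API (the repository name is hypothetical, and the call performs a real upload requiring Hugging Face authentication, so treat it as illustrative only): models and tokenizers expose `push_to_hub` directly.

```python
from transformers import AutoModel, AutoTokenizer

model = AutoModel.from_pretrained("bert-base-uncased")
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

# Hypothetical repo name; requires a Hugging Face login/token and
# uploads the saved files to the Hub.
model.push_to_hub("my-bert-copy")
tokenizer.push_to_hub("my-bert-copy")
```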
-
chenht2010 authored
* fix error * make style check happy Co-authored-by: chenhaitao <chenhaitao@qiyi.com>
-
Patrick von Platen authored
* fix_torch_device_generate_test * remove @ * finish * make style
-
Lysandre Debut authored
-
Vasudev Gupta authored
* copy pytorch-t5 * init * boom boom * forward pass same * make generation work * add more tests * make test work * finish normal tests * make fix-copies * finish quality * correct slow example * correct slow test * version table * upload models * Update tests/test_modeling_flax_t5.py * correct incorrectly deleted line Co-authored-by:
Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by:
Patrick von Platen <patrick@huggingface.co>
-
David Fan authored
* Rewrite * [ONNX] rewrite
-
Suraj Patil authored
* add summarization script * fix arguments, preprocessing, metrics * add generation and metrics * auto model, prediction loop * prettify * label smoothing * address Sylvain's and Patrick's suggestions * dynamically import shift_tokens_right * fix shift_tokens_right_fn call
-
Daniel Stancl authored
* Add output args to greedy search * Fix critical typo + make style quality * Handle generate_beam_search * Add dict_specific tests and fix the placement of encoder outputs * Add specific outputs * Update doc * Fix typo * Adjust handling encoder_outputs + Fix generating for T5 * Fix generate for RAG * Fix handling output_attentions when target_mapping is not None Take care of situations when target_mapping is provided, as there is then a 2-tuple of attentions Change from: if inputs["output_attentions"]: attentions = tuple(tf.transpose(t, perm=(2, 3, 0, 1)) for t in attentions) to: if inputs["output_attentions"]: if inputs["target_mapping"] is not None: # when target_mapping is provided, there is a 2-tuple of attentions attentions = tuple( tuple(tf.transpose(attn_stream, perm=(2, 3, 0, 1)) for attn_stream in t) for t in attentions ) else: attentions = tuple(tf.transpose(t, perm=(2, 3, 0, 1)) for t in attentions) * Rename kwargs to model_kwargs * make style quality * Move imports in test_modeling_tf_common.py Move ModelOutput-related imports in test_modeling_tf_common.py into the `is_tf_available():` statement. * Rewrite nested if-statements * Fix added tests -
Nicolas Patry authored
* Optimizing away the `fill-mask` pipeline. - Don't send anything to the tokenizer unless needed. Vocab check is much faster - Keep BC by sending data to the tokenizer when needed. Users handling warning messages will see performance benefits again - Make `targets` and `top_k` work together better: `top_k` cannot be higher than `len(targets)` but can be smaller still. - Actually simplify the `target_ids` in case of duplicates (it can happen because we're parsing raw strings) - Removed useless code to fail on empty strings. It works only if the empty string is in first position, moved to ignoring them instead. - Changed the related tests as only the tests would fail correctly (having an incorrect value in first position) * Make tests compatible with 2 different vocabs... (at the price of a warning). Co-authored-by: @EtaoinWu * ValueError working globally * Update src/transformers/pipelines/fill_mask.py Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> * `tokenizer.vocab` -> `tokenizer.get_vocab()` for more compatibility + fallback. Co-authored-by:
Lysandre Debut <lysandre@huggingface.co>
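A hedged usage example (model name and sentence are illustrative): `targets` restricts scoring to the given candidate tokens and, as described above, `top_k` is effectively capped at `len(targets)`.

```python
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")

# Only the listed candidates are scored; asking for more results than
# there are targets simply returns len(targets) predictions.
print(fill_mask("Paris is the capital of [MASK].", targets=["france", "germany"], top_k=5))
```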
-
Kevin Canwen Xu authored
* Add optional dependency * Add CodeCarbon integration * Add CodeCarbon integration * Add CodeCarbon integration * typo
-