- 04 Dec, 2020 3 commits
-
-
Lysandre Debut authored
-
Julien Plu authored
* Apply on BERT and ALBERT * Update TF Bart * Add input processing to TF BART * Add input processing for TF CTRL * Add input processing to TF Distilbert * Add input processing to TF DPR * Add input processing to TF Electra * Add deprecated arguments * Add input processing to TF XLM * Add input processing to TF Funnel * Add input processing to TF GPT2 * Add input processing to TF Longformer * Add input processing to TF Lxmert * Apply style * Add input processing to TF Mobilebert * Add input processing to TF GPT * Add input processing to TF Roberta * Add input processing to TF T5 * Add input processing to TF TransfoXL * Apply style * Rebase on master * Bug fix * Retry to bugfix * Retry bug fix * Fix wrong model name * Try another fix * Fix BART * Fix input precessing * Apply style * Put the deprecated warnings in the input processing function * Remove the unused imports * Raise an error when len(kwargs)>0 * test ModelOutput instead of TFBaseModelOutput * Bug fix * Address Patrick's comments * Address Patrick's comments * Address Sylvain's comments * Add boolean processing for the inputs * Apply style * Missing optional * Fix missing some input proc * Update the template * Fix missing inputs * Missing input * Fix args parameter * Trigger CI * Trigger CI * Trigger CI * Address Patrick's and Sylvain's comments * Replace warn by warning * Trigger CI * Fix XLNET * Fix detection
-
Stas Bekman authored
-
- 03 Dec, 2020 7 commits
-
-
Lysandre Debut authored
* Patch model parallel test * Remove line * Remove `ci_*` from scheduled branches
-
Lysandre Debut authored
* conda
* Guide
* correct tag
* Update README.md Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update docs/source/installation.md Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Sylvain's comments Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
Julien Chaumond authored
* Add badge w/ number of models on the hub
* try to apease @sgugger 😇
* not sure what this `c` was about [ci skip]
* Fix script and move stuff around
* Fix doc styling error Co-authored-by: Sylvain Gugger <sylvain.gugger@gmail.com>
-
Sylvain Gugger authored
-
Sylvain Gugger authored
-
Skye Wanderman-Milne authored
-
Julien Chaumond authored
(and wasn't needed here anyways as it was added automatically)
-
- 02 Dec, 2020 7 commits
-
-
Patrick von Platen authored
* fix resize tokens * correct mobile_bert * move embedding fix into modeling_utils.py * refactor * fix lm head resize * refactor * break lines to make sylvain happy * add news tests * fix typo * improve test * skip bart-like for now * check if base_model = get(...) is necessary * clean files * improve test * fix tests * revert style templates * Update templates/adding_a_new_model/cookiecutter-template-{{cookiecutter.modelname}}/modeling_{{cookiecutter.lowercase_modelname}}.py
-
Devangi Purkayastha authored
-
ryota-mo authored
-
Stas Bekman authored
* [trainer] improve code
This PR:
- removes redundant code
```
self.model = model if model is not None else None
```
and
```
self.model = model
```
are the same.
* separate attribute assignment from code logic - which simplifies things further.
* whitespace
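The equivalence claimed in the `[trainer] improve code` commit can be checked with a minimal sketch. The two helper functions and `DummyModel` below are hypothetical stand-ins written for this illustration; they are not taken from the Trainer source.

```python
# Hypothetical stand-ins for the two assignment forms named in the commit.
def assign_verbose(model):
    # Mirrors: self.model = model if model is not None else None
    return model if model is not None else None


def assign_direct(model):
    # Mirrors: self.model = model
    return model


class DummyModel:
    """Placeholder object; any Python value would do."""


# Include None and falsy values, since those are where a conditional
# expression could plausibly behave differently.
candidates = [None, DummyModel(), 0, "", []]

# Both forms return the very same object for every candidate.
results = [assign_verbose(c) is assign_direct(c) for c in candidates]
print(all(results))
```

The conditional only ever substitutes `None` for `None`, so it cannot change the result, which is why the shorter form is safe.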
-
Nicolas Patry authored
* Warning about too long input for fast tokenizers too If truncation is not set in tokenizers, but the tokenization is too long for the model (`model_max_length`), we used to trigger a warning that The input would probably fail (which it most likely will). This PR re-enables the warning for fast tokenizers too and uses common code for the trigger to make sure it's consistent across. * Checking for pair of inputs too. * Making the function private and adding it's doc. * Remove formatting ?? in odd place. * Missed uppercase.
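The check described in this commit (warn when truncation is off and the tokenized input exceeds `model_max_length`, counting both sequences of a pair) can be sketched roughly as follows. The function name `warn_if_too_long`, its signature, and the warning text are assumptions made for this illustration, not the library's private helper; special tokens are ignored for brevity.

```python
import warnings


def warn_if_too_long(ids, pair_ids=None, model_max_length=512, truncation=False):
    """Hypothetical helper: warn when an untruncated input exceeds the model limit.

    Returns True if a warning was emitted, False otherwise.
    """
    # Count both sequences when a pair of inputs is given.
    total = len(ids) + (len(pair_ids) if pair_ids is not None else 0)
    if not truncation and total > model_max_length:
        warnings.warn(
            f"Sequence length {total} exceeds model_max_length "
            f"{model_max_length}; running it through the model will "
            "probably fail."
        )
        return True
    return False


# Usage: a pair that is too long only when both halves are counted together.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    single = warn_if_too_long(list(range(6)), model_max_length=8)
    pair = warn_if_too_long(list(range(6)), list(range(6)), model_max_length=8)

print(single, pair, len(caught))
```

Routing every tokenizer through one helper like this is what keeps the trigger condition consistent, which is the stated goal of the commit.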
-
sandip authored
* Transfoxl sequence classification * Transfoxl sequence classification
-
Stas Bekman authored
* check that we get any match first * docs only * 2 docs only * add code * restore
-
- 01 Dec, 2020 11 commits
-
-
Stas Bekman authored
-
Sylvain Gugger authored
* Add a `distributed_env` property to TrainingArguments * Change name * Address comment
-
Sylvain Gugger authored
-
Stas Bekman authored
* restore skip * Revert "Remove deprecated `evalutate_during_training` (#8852)" This reverts commit 55302990. * check that pipeline.git.base_revision is defined before proceeding * Revert "Revert "Remove deprecated `evalutate_during_training` (#8852)"" This reverts commit dfec84db3fdce1079f01f1bc8dfaf21db2ccaba1. * check that pipeline.git.base_revision is defined before proceeding * doc only * doc + code * restore * restore * typo
-
Lysandre Debut authored
-
Adam Pocock authored
Prevent BatchEncoding from blindly passing casts down to the tensors it contains. Fixes #6582. (#8860)
Update src/transformers/tokenization_utils_base.py with review fix
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
-
Sylvain Gugger authored
-
Ratthachat (Jung) authored
* 2 typos - from_question_encoder_generator_configs fix 2 typos from_encoder_generator_configs --> from_question_encoder_generator_configs * apply make style
-
Rodolfo Quispe authored
-
elk-cloner authored
* add CTRLForSequenceClassification * pass local test * merge with master * fix modeling test for sequence classification * fix deco * fix assert
- 30 Nov, 2020 12 commits
-
-
Stas Bekman authored
* fix DP case on multi-gpu * make executable * test all 3 modes * use the correct check for distributed * dp doesn't need a special case * restore original name * cleanup
-
Nicolas Patry authored
* NerPipeline (TokenClassification) now outputs offsets of words
- It happens that the offsets are missing, it forces the user to pattern match the "word" from his input, which is not always feasible. For instance if a sentence contains the same word twice, then there is no way to know which is which.
- This PR proposes to fix that by outputting 2 new keys for this pipelines outputs, "start" and "end", which correspond to the string offsets of the word. That means that we should always have the invariant:
```python
input[entity["start"]: entity["end"]] == entity["entity_group"]
# or entity["entity"] if not grouped
```
* Fixing doc style
-
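For the NerPipeline offsets commit by Nicolas Patry above, the repeated-word problem and the role of `"start"` and `"end"` can be illustrated with a small sketch. The entity dictionaries here are written by hand rather than produced by a pipeline, and the `"word"` key and the `entity_text` helper are assumptions made for this example.

```python
def entity_text(text, entity):
    """Recover an entity's surface string from its character offsets."""
    return text[entity["start"]:entity["end"]]


text = "Paris is lovely, and Paris is crowded."

# Locate both occurrences so the offsets are computed, not guessed.
first = text.find("Paris")
second = text.find("Paris", first + 1)

# Hand-written stand-ins for pipeline output (hypothetical shape).
entities = [
    {"word": "Paris", "start": first, "end": first + len("Paris")},
    {"word": "Paris", "start": second, "end": second + len("Paris")},
]

for entity in entities:
    # Slicing the input with the offsets yields the matched word, and the
    # differing offsets tell the two identical words apart.
    print(entity["start"], entity["end"], entity_text(text, entity))
```

Without the offsets, both entries would be indistinguishable, which is the situation the commit message describes.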
LysandreJik authored
-
Funtowicz Morgan authored
* Slightly increase tolerance between pytorch and flax output Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* test_multiple_sentences doesn't require torch Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Simplify parameterization on "jit" to use boolean rather than str Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Use `require_torch` on `test_multiple_sentences` because we pull the weight from the hub. Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Rename "jit" parameter to "use_jit" for (hopefully) making it self-documenting. Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Remove pytest.mark.parametrize which seems to fail in some circumstances Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Fix unused imports. Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Fix style. Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Give default parameters values for traced model. Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Review comment: Change sentences to sequences Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
* Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
LysandreJik authored
-
LysandreJik authored
-
Sylvain Gugger authored
* Remove deprecated `evalutate_during_training`
* Update src/transformers/training_args_tf.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
-
Shai Erera authored
* Use model.from_pretrained for DataParallel also When training on multiple GPUs, the code wraps a model with torch.nn.DataParallel. However if the model has custom from_pretrained logic, it does not get applied during load_best_model_at_end. This commit uses the underlying model during load_best_model_at_end, and re-wraps the loaded model with DataParallel. If you choose to reject this change, then could you please move the this logic to a function, e.g. def load_best_model_checkpoint(best_model_checkpoint) or something, so that it can be overridden? * Fix silly bug * Address review comments Thanks for the feedback. I made the change that you proposed, but I also think we should update L811 to check if `self.mode` is an instance of `PreTrained`, otherwise we would still not get into that `if` section, right?
-
Sylvain Gugger authored
-
Sylvain Gugger authored
-
Sylvain Gugger authored
-
Fraser Greenlee authored
Related issue: https://github.com/huggingface/transformers/issues/8837
-