- 11 Dec, 2020 2 commits
-
-
Sylvain Gugger authored
* Fix PreTrainedTokenizer.pad when first inputs are empty * Handle empty inputs case
-
Cola authored
-
- 10 Dec, 2020 5 commits
-
-
Julien Plu authored
* Remove value error * Try a fix for parameter ordering * Restore previous behavior * Add documentation * Review the comment
-
Sylvain Gugger authored
-
Sylvain Gugger authored
-
Sylvain Gugger authored
-
Cola authored
-
- 09 Dec, 2020 6 commits
-
-
Patrick von Platen authored
* remove make on the fly linear embedding * start refactor * big first refactor * save intermediate * save intermediat * correct mask issue * save tests * refactor padding masks * make all tests pass * further refactor * make pegasus test pass * fix bool if * fix leftover tests * continue * bart renaming * delete torchscript test hack * fix imports in tests * correct shift * fix docs and repo cons * re-add fix for FSTM * typo in test * fix typo * fix another typo * continue * hot fix 2 for tf * small fixes * refactor types linting * continue * finish refactor * fix import in tests * better bart names * further refactor and add test * delete hack * apply sylvains and lysandres commens * small perf improv * further perf improv * improv perf * fix typo * make style * small perf improv
-
Funtowicz Morgan authored
* Remove "Model" suffix from Flax models to look more :hugs: Signed-off-by:
Morgan Funtowicz <morgan@huggingface.co> * Initial working (forward + backward) for Flax MLM training example. Signed-off-by:
Morgan Funtowicz <morgan@huggingface.co> * Simply code Signed-off-by:
Morgan Funtowicz <morgan@huggingface.co> * Addressing comments, using module and moving to LM task. Signed-off-by:
Morgan Funtowicz <morgan@huggingface.co> * Restore parameter name "module" wrongly renamed model. Signed-off-by:
Morgan Funtowicz <morgan@huggingface.co> * Restore correct output ordering... Signed-off-by:
Morgan Funtowicz <morgan@huggingface.co> * Actually commit the example
😅 Signed-off-by:Morgan Funtowicz <morgan@huggingface.co> * Add FlaxBertModelForMaskedLM after rebasing. Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Make it possible to initialize the training from scratch Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Reuse flax linen example of cross entropy loss Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Added specific data collator for flax Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Remove todo for data collator Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Added evaluation step Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Added ability to provide dtype to support bfloat16 on TPU Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Enable flax tensorboard output Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Enable jax.pmap support. Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Ensure batches are correctly sized to be dispatched with jax.pmap Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Enable bfloat16 with --fp16 cmdline args Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Correctly export metrics to tensorboard Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Added dropout and ability to use it. Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Effectively enable & disable during training and evaluation steps. Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Oops. Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Enable specifying kernel initializer scale Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Style. Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Added warmup step to the learning rate scheduler. Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Fix typo. Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Print training loss Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Make style Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * fix linter issue (flake8) Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Fix model matching Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Fix dummies Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Fix non default dtype on Flax models Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Use the same create_position_ids_from_input_ids for FlaxRoberta Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Make Roberta attention as Bert Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * fix copy Signed-off-by:
Morgan Funtowicz <funtowiczmo@gmail.com> * Wording. Co-authored-by:
Marc van Zee <marcvanzee@gmail.com> Co-authored-by:
Marc van Zee <marcvanzee@gmail.com>
-
StillKeepTry authored
-
Patrick von Platen authored
-
Sylvain Gugger authored
-
Patrick von Platen authored
* diverse beam search * bug fixes * bug fixes * bug fix * separate out diverse_beam_search function * separate out diverse_beam_search function * bug fix * improve code quality * bug fix * bug fix * separate out diverse beam search scorer * code format * code format * code format * code format * add test * code format * documentation changes * code quality * add slow integration tests * more general name * refactor into logits processor * add test * avoid too much copy paste * refactor * add to docs * fix-copies * bug fix * Revert "bug fix" This reverts commit c99eb5a8dc57a7b0d33a8ac06d8c6a32a7812ad4. * improve comment * implement sylvains feedback Co-authored-by:
Ayush Jain <a.jain@sprinklr.com> Co-authored-by:
ayushtiku5 <40797286+ayushtiku5@users.noreply.github.com>
-
- 08 Dec, 2020 6 commits
-
-
Lysandre Debut authored
-
guillaume-be authored
* Removed unused `encoder_hidden_states` and `encoder_attention_mask` from MobileBert * Removed decoder tests for MobileBert * Removed now unnecessary import
-
Lysandre Debut authored
-
Sylvain Gugger authored
-
Julien Plu authored
* Apply on BERT and ALBERT * Update TF Bart * Add input processing to TF BART * Add input processing for TF CTRL * Add input processing to TF Distilbert * Add input processing to TF DPR * Add input processing to TF Electra * Add deprecated arguments * Add input processing to TF XLM * remove unused imports * Add input processing to TF Funnel * Add input processing to TF GPT2 * Add input processing to TF Longformer * Add input processing to TF Lxmert * Apply style * Add input processing to TF Mobilebert * Add input processing to TF GPT * Add input processing to TF Roberta * Add input processing to TF T5 * Add input processing to TF TransfoXL * Apply style * Rebase on master * Fix wrong model name * Fix BART * Apply style * Put the deprecated warnings in the input processing function * Remove the unused imports * Raise an error when len(kwargs)>0 * test ModelOutput instead of TFBaseModelOutput * Address Patrick's comments * Address Patrick's comments * Add boolean processing for the inputs * Take into account the optional layers * Add missing/unexpected weights in the other models * Apply style * rename parameters * Apply style * Remove useless * Remove useless * Remove useless * Update num parameters * Fix tests * Address Patrick's comment * Remove useless attribute
-
Stas Bekman authored
* [training] SAVE_STATE_WARNING was removed in pytorch FYI `SAVE_STATE_WARNING` has been removed 3 days ago: pytorch/pytorch#46813 Fixes: #8232 @sgugger * style, but add () to prevent autoformatters from botching it * switch to try/except * cleanup
-
- 07 Dec, 2020 4 commits
-
-
Sylvain Gugger authored
* Add copyright everywhere missing * Style
-
Julien Chaumond authored
* initial commit * [cli] lfs commands * Fix FileSlice * Tweak to FileSlice * [hf_api] Backport filetype arg from `datasets` cc @lhoestq * Silm down the CI while i'm working * Ok let's try this in CI * Update config.yml * Do not try this at home * one more try * Update lfs.py * Revert "Tweak to FileSlice" This reverts commit d7e32c4b3500400486411e85a2b74e57fb6b52f5. * Update test_hf_api.py * Update test_hf_api.py * Update test_hf_api.py * CI still green? * make CI green again? * Update test_hf_api.py * make CI red again? * Update test_hf_api.py * add CI style back * Fix CI? * oh my * doc + switch back to real staging endpoint * Apply suggestions from code review Co-authored-by:
Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by:
Pierric Cistac <Pierrci@users.noreply.github.com> * Fix docblock + f-strings Co-authored-by:
Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by:
Pierric Cistac <Pierrci@users.noreply.github.com>
-
sandip authored
* Add TFGPT2ForSequenceClassification based on DialogRPT * Add TFGPT2ForSequenceClassification based on DialogRPT * TFGPT2ForSequenceClassification based on DialogRPT-refactored code, implemented review comments and added input processing * Add TFGPT2ForSequenceClassification based on DialogRPT * TFGPT2ForSequenceClassification based on DialogRPT-refactored code, implemented review comments and added input processing * code refactor for latest other TF PR * code refactor * code refactor * Update modeling_tf_gpt2.py
-
Sylvain Gugger authored
-
- 05 Dec, 2020 1 commit
-
-
Machel Reid authored
Self-explanatory ;) - Hope it helps!
-
- 04 Dec, 2020 2 commits
-
-
Lysandre Debut authored
-
Julien Plu authored
* Apply on BERT and ALBERT * Update TF Bart * Add input processing to TF BART * Add input processing for TF CTRL * Add input processing to TF Distilbert * Add input processing to TF DPR * Add input processing to TF Electra * Add deprecated arguments * Add input processing to TF XLM * Add input processing to TF Funnel * Add input processing to TF GPT2 * Add input processing to TF Longformer * Add input processing to TF Lxmert * Apply style * Add input processing to TF Mobilebert * Add input processing to TF GPT * Add input processing to TF Roberta * Add input processing to TF T5 * Add input processing to TF TransfoXL * Apply style * Rebase on master * Bug fix * Retry to bugfix * Retry bug fix * Fix wrong model name * Try another fix * Fix BART * Fix input precessing * Apply style * Put the deprecated warnings in the input processing function * Remove the unused imports * Raise an error when len(kwargs)>0 * test ModelOutput instead of TFBaseModelOutput * Bug fix * Address Patrick's comments * Address Patrick's comments * Address Sylvain's comments * Add boolean processing for the inputs * Apply style * Missing optional * Fix missing some input proc * Update the template * Fix missing inputs * Missing input * Fix args parameter * Trigger CI * Trigger CI * Trigger CI * Address Patrick's and Sylvain's comments * Replace warn by warning * Trigger CI * Fix XLNET * Fix detection
-
- 03 Dec, 2020 3 commits
-
-
Sylvain Gugger authored
-
Sylvain Gugger authored
-
Skye Wanderman-Milne authored
-
- 02 Dec, 2020 5 commits
-
-
Patrick von Platen authored
* fix resize tokens * correct mobile_bert * move embedding fix into modeling_utils.py * refactor * fix lm head resize * refactor * break lines to make sylvain happy * add news tests * fix typo * improve test * skip bart-like for now * check if base_model = get(...) is necessary * clean files * improve test * fix tests * revert style templates * Update templates/adding_a_new_model/cookiecutter-template-{{cookiecutter.modelname}}/modeling_{{cookiecutter.lowercase_modelname}}.py -
ryota-mo authored
-
Stas Bekman authored
* [trainer] improve code This PR: - removes redundant code ``` self.model = model if model is not None else None ``` and ``` self.model = model ``` are the same. * separate attribute assignment from code logic - which simplifies things further. * whitespace
-
Nicolas Patry authored
* Warning about too long input for fast tokenizers too If truncation is not set in tokenizers, but the tokenization is too long for the model (`model_max_length`), we used to trigger a warning that The input would probably fail (which it most likely will). This PR re-enables the warning for fast tokenizers too and uses common code for the trigger to make sure it's consistent across. * Checking for pair of inputs too. * Making the function private and adding it's doc. * Remove formatting ?? in odd place. * Missed uppercase.
-
sandip authored
* Transfoxl sequence classification * Transfoxl sequence classification
-
- 01 Dec, 2020 6 commits
-
-
Sylvain Gugger authored
* Add a `distributed_env` property to TrainingArguments * Change name * Address comment
-
Sylvain Gugger authored
-
Lysandre Debut authored
-
Adam Pocock authored
Prevent BatchEncoding from blindly passing casts down to the tensors it contains. Fixes #6582. (#8860) Update src/transformers/tokenization_utils_base.py with review fix Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> Co-authored-by:
Lysandre Debut <lysandre@huggingface.co>
-
Sylvain Gugger authored
-
Ratthachat (Jung) authored
* 2 typos - from_question_encoder_generator_configs fix 2 typos from_encoder_generator_configs --> from_question_encoder_generator_configs * apply make style
-