- 05 Oct, 2021 6 commits
-
-
Michael Benayoun authored
* Symbolic trace dynamic axes support for BERT-like models (albert, bert, distilbert, mobilebert, electra, megatron-bert)
* Sanity checks before tracing that make sure the model to trace is supported
* Adapted to PyTorch 1.9

Co-authored-by: Michael Benayoun <michael@huggingface.co>
-
Alex Hedges authored
* Improve error message when loading models from Hub
* Adjust error message wording
-
Nicolas Patry authored
* Fixing empty prompts for text-generation when BOS exists.
* Fixing odd case with Pegasus.
* Fixing Bert is Assertion Error.
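The empty-prompt behavior described in this entry can be sketched in a few lines (illustrative Python only; `encode_prompt`, the token ids, and the stand-in tokenizer are hypothetical, not the pipeline's actual code):

```python
def encode_prompt(prompt, bos_token_id=0, tokenize=lambda s: [ord(c) for c in s]):
    # When the prompt is empty and the model defines a BOS token,
    # start generation from BOS instead of an empty input sequence.
    # `tokenize` is a stand-in for a real tokenizer.
    if prompt:
        return tokenize(prompt)
    return [bos_token_id] if bos_token_id is not None else []
```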
-
Yih-Dar authored
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Nicolas Patry authored
-
Sam Hardwick authored
* Update Tatoeba conversion
-
- 04 Oct, 2021 6 commits
-
-
Bram Vanroy authored
* Update no_* argument: change the order so that the no_* argument is created after the original argument, and set the default for this no_* argument to False
* import copy
* update test
* make style
* Use kwargs to set default=False
* make style
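The argument ordering described above can be illustrated with plain argparse (a minimal sketch; `add_bool_arg` is a hypothetical helper, not HfArgumentParser's implementation):

```python
import argparse

def add_bool_arg(parser, name, default=True):
    # Create the original flag first, then the paired no_* flag,
    # which always defaults to False regardless of the original default.
    parser.add_argument(f"--{name}", action="store_true", default=default)
    parser.add_argument(f"--no_{name}", action="store_true", default=False)

parser = argparse.ArgumentParser()
add_bool_arg(parser, "cache")
args = parser.parse_args(["--no_cache"])  # cache keeps its default; no_cache is set
```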
-
Nathan Raw authored
* ✨ update image classification example
* 📌 update reqs
-
Evgeniy Zheltonozhskiy authored
* Fix broken link to distill models
* Missing symbol
* Fix spaces
-
Sidd Karamcheti authored
* Add layer-wise scaling
* Add reorder & upcasting argument
* Add OpenAI GPT-2 weight initialization scheme
* start `layer_idx` count at zero for consistency
* disentangle attn and reordered and upscaled attn function
* rename `scale_attn_by_layer` to `scale_attn_by_layer_id`
* make autocast from amp compatible with pytorch<1.6
* fix docstring
* style fixes
* Add fixes from PR feedback, style tweaks
* Fix doc whitespace
* Reformat
* First pass scale_attn_by_layer_idx and reorder_and_upcast_attn tests
* Rename scale_attn_by_layer_idx, add tip
* Remove extra newline
* add test for weight initialization
* update code format
* add assert check weights are fp32
* remove assert
* Fix incorrect merge
* Fix shape mismatch in baddbmm
* Add generation test for Mistral flags

Co-authored-by: leandro <leandro.vonwerra@spoud.io>
Co-authored-by: Keshav Santhanam <keshav2@stanford.edu>
Co-authored-by: J38 <jebolton@stanford.edu>
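The per-layer attention scaling mentioned above (the `scale_attn_by_layer_idx` flag) can be sketched as follows, assuming the extra factor is `layer_idx + 1` on top of the usual `1/sqrt(head_dim)` scaling (an illustration of the idea, not the GPT-2 source):

```python
import math

def scaled_attn_weights(scores, head_dim, layer_idx, scale_attn_by_layer_idx=True):
    # Standard 1/sqrt(d) attention scaling, optionally divided further
    # by (layer_idx + 1) for numerical stability in deeper layers.
    scale = math.sqrt(head_dim)
    if scale_attn_by_layer_idx:
        scale *= float(layer_idx + 1)
    return [s / scale for s in scores]
```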
-
Yaser Abdelaziz authored
-
Gunjan Chhablani authored
-
- 01 Oct, 2021 5 commits
-
-
Stas Bekman authored
-
Silviu Oprea authored
In BartForConditionalGeneration.forward, if labels are provided, decoder_input_ids are set to the labels shifted to the right. This is problematic: if decoder_inputs_embeds is also set, the call to self.model, which eventually reaches BartDecoder.forward, raises an error. The fix is simple and mirrors what BartModel.forward already does: do not compute decoder_input_ids when decoder_inputs_embeds is provided.

Co-authored-by: Silviu Vlad Oprea <silviuvo@amazon.co.uk>
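The guard described in this entry can be sketched like so (a hypothetical helper with made-up token ids, not the actual Bart source; real shifting also handles padding):

```python
def prepare_decoder_inputs(labels, decoder_input_ids, decoder_inputs_embeds,
                           decoder_start_token_id=2):
    # Only derive decoder_input_ids from labels when neither form of
    # decoder input was passed explicitly, mirroring BartModel.forward.
    if labels is not None and decoder_input_ids is None and decoder_inputs_embeds is None:
        # Shift labels one position to the right, prepending the start token.
        decoder_input_ids = [decoder_start_token_id] + list(labels[:-1])
    return decoder_input_ids, decoder_inputs_embeds
```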
-
Anton Lozhkov authored
* Restore broken merge
* Additional args, DDP, remove CommonLanguage
* Update examples for V100, add training results
* Style
* Apply suggestions from code review
* Remove custom datasets for simplicity, apply suggestions from code review
* Add the attention_mask flag, reorganize README

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
Arfon Smith authored
-
Yuta Hayashibe authored
* Removed wrong warning
* Raise a warning when `max_length` is given with wrong `truncation`
* Update the error message
* Update the warning message

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
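The check this entry adds can be approximated as follows (a sketch of the idea, not the tokenizer's actual code or wording):

```python
import warnings

def check_truncation_args(max_length=None, truncation=False):
    # Warn when max_length is passed but truncation is disabled,
    # since max_length would otherwise be silently ignored.
    if max_length is not None and not truncation:
        warnings.warn(
            "`max_length` has no effect when `truncation` is not enabled.",
            UserWarning,
        )
```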
-
- 30 Sep, 2021 8 commits
-
-
Suraj Patil authored
-
Patrick von Platen authored
* update
* add to docs and init
* make fix-copies
-
Patrick von Platen authored
-
Gunjan Chhablani authored
* Init multibert checkpoint conversion script
* Rename conversion script
* Fix MultiBerts Conversion Script
* Apply suggestions from code review

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
-
Stas Bekman authored
-
Sylvain Gugger authored
-
Sylvain Gugger authored
-
Suraj Patil authored
* use Repository for push_to_hub
* update readme
* update other flax scripts
* update readme
* update qa example
* fix push_to_hub call
* fix typo
* fix more typos
* update readme
* use absolute path to get repo name
* fix glue script
-
- 29 Sep, 2021 10 commits
-
-
Stas Bekman authored
* missing requirement
* list both
-
Suraj Patil authored
* add a note about the tokenizer
* add tips to load the model with less RAM
* fix link
* fix more links
-
Sylvain Gugger authored
-
Sylvain Gugger authored
-
Matt authored
-
Sylvain Gugger authored
* Fix length of IterableDatasetShard and add test
* Add comments
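The length fix mentioned above amounts to computing how many samples each shard yields; a plausible sketch under assumed semantics (not the actual IterableDatasetShard code):

```python
import math

def shard_length(total, num_shards, drop_last=False):
    # With drop_last, trailing samples that don't fill a full round
    # across shards are dropped; otherwise the last round is padded
    # so every shard yields the same (rounded-up) number of samples.
    if drop_last:
        return total // num_shards
    return math.ceil(total / num_shards)
```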
-
Li-Huai (Allan) Lin authored
* Enable readme link synchronization
* Style
* Reuse regex pattern
* Apply suggestions
* Update
-
Nishant Prabhu authored
Fix LayoutLM ONNX test error
-
Matt authored
* Keras callback to push to hub each epoch, or after N steps
* Reworked the callback to use Repository
* Use an Enum for save_strategy
* Style pass
* Correct type for tokenizer
* Update src/transformers/keras_callbacks.py
* Adding print message to the final upload
* Change how we wait for the last process to finish
* is_done is a property, not a method, derp
* Docstrings and documentation
* Style pass
* Style edit
* Docstring reformat
* Docstring rewrite
* Replacing print with internal logger

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
Patrick von Platen authored
-
- 28 Sep, 2021 2 commits
-
-
Sylvain Gugger authored
-
Sylvain Gugger authored
-
- 27 Sep, 2021 3 commits
-
-
Sylvain Gugger authored
-
Lysandre authored
-
Lysandre authored
-