- 22 Apr, 2022 7 commits
-
-
cavdard authored
* changes in create optimizer to support tensor parallelism with SMP * Update src/transformers/trainer.py Convert if check to one line. Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by:
Cavdar <dcavdar@a07817b12d7e.ant.amazon.com> Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
Joao Gante authored
-
Thomas Chaigneau authored
* add OnnxConfig for ConvBert Co-authored-by:ChainYo <t.chaigneau.tc@gmail.com>
-
Minh Chien Vu authored
* Add doctest BERT * make fixup * fix typo * change checkpoints * make fixup * define doctest output value, update doctest for mobilebert * solve fix-copies * update QA target start index and end index * change checkpoint for docs and reuse defined variable * Update src/transformers/models/bert/modeling_tf_bert.py Co-authored-by:
Yih-Dar <2521628+ydshieh@users.noreply.github.com> * Apply suggestions from code review Co-authored-by:
Yih-Dar <2521628+ydshieh@users.noreply.github.com> * Apply suggestions from code review Co-authored-by:
Yih-Dar <2521628+ydshieh@users.noreply.github.com> * make fixup * Add Doctest for Albert and Bigbird * make fixup * overwrite examples for Albert and Bigbird * Apply suggestions from code review Co-authored-by:
Patrick von Platen <patrick.v.platen@gmail.com> * update longer examples for Bigbird * using examples from squad_v2 * print out example text * change name token-classification-big-bird checkpoint to random Co-authored-by:
Yih-Dar <2521628+ydshieh@users.noreply.github.com> Co-authored-by:
Patrick von Platen <patrick.v.platen@gmail.com>
-
Mario Šaško authored
* Minor improvements to `convert_file_size_to_int` * Add <unit>bit version to kilos and megas * Minor fix
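A minimal sketch of the kind of conversion this commit touches, not the library's implementation. The function name `size_to_bytes` is hypothetical, and the convention that a lowercase trailing "b" means bits rather than bytes is an assumption made for illustration.

```python
def size_to_bytes(size):
    """Hypothetical helper: turn '5MB', '1GiB' or '8Mb' into a number of bytes."""
    if isinstance(size, int):
        return size
    # Binary units use powers of 1024, decimal units use powers of 1000.
    units = {
        "GIB": 2**30, "MIB": 2**20, "KIB": 2**10,
        "GB": 10**9, "MB": 10**6, "KB": 10**3,
    }
    for suffix, factor in units.items():
        if size.upper().endswith(suffix):
            value = int(size[: -len(suffix)]) * factor
            # Assumption: a lowercase final "b" denotes bits, so divide by 8.
            return value // 8 if size.endswith("b") else value
    raise ValueError(f"unrecognized size: {size!r}")
```

For example, `size_to_bytes("8Mb")` treats the value as eight megabits and returns the equivalent number of bytes.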
-
Joao Gante authored
-
Yih-Dar authored
* add missing entries in some mappings Co-authored-by:ydshieh <ydshieh@users.noreply.github.com>
-
- 21 Apr, 2022 9 commits
-
-
Loubna Ben Allal authored
* add tflops logging and fix grad accumulation * add accelerate tracking and checkpointing * scale loss of last batch correctly * fix typo * compress loss computation Co-authored-by:
Leandro von Werra <lvwerra@users.noreply.github.com> * add resume from checkpoint argument * add load_state accelerate from checkpoint, register lr scheduler and add tflops function * reformat code * reformat code * add condition on path for resume checkpoint * combine if conditions Co-authored-by:
Leandro von Werra <lvwerra@users.noreply.github.com> * add source for tflops formula Co-authored-by:
Leandro von Werra <lvwerra@users.noreply.github.com>
-
Yih-Dar authored
Co-authored-by:ydshieh <ydshieh@users.noreply.github.com>
-
Thomas Chaigneau authored
* add gptj to TOKENIZER_MAPPING_NAMES * fix int32 to float to avoid problem in onnx * Update src/transformers/models/gptj/modeling_gptj.py Co-authored-by:
ChainYo <t.chaigneau.tc@gmail.com> Co-authored-by:
lewtun <lewis.c.tunstall@gmail.com>
-
Eldar Kurtic authored
- all activations should be fetched through ACT2FN - it returns ReLU as `nn.Module`, which allows attaching hooks on the activation function and prints it to stdout when `print(model)`
-
Sylvain Gugger authored
-
Nicolas Patry authored
* Adding support for `array` key in raw dictionaries in ASR pipeline. * ES . * Update src/transformers/pipelines/automatic_speech_recognition.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Making it work by not popping `array` first. * Black 22.3 Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
ghlai9665 authored
* tweak to allow BatchEncoding.char_to_token(0) * update docstring * remove trailing whitespace * make fixup * make value checking for span_indices explicit Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
Stefan Schweter authored
* t5: add conversion script for T5X to FLAX * t5: make flake happy * t5: add copyright message to t5x conversion script * t5: fix lm head for v1.0 checkpoints
-
Nicolas Patry authored
* Temporary commit with the long QA fix. * Adding slow tests covering this fix. * Removing fast test as it doesn't fail anyway.
-
- 20 Apr, 2022 6 commits
-
-
Zachary Mueller authored
-
Sylvain Gugger authored
-
Stas Bekman authored
-
Stas Bekman authored
* less cpu memory with sharded checkpoint loading * Trigger CI * Trigger CI
-
Nicolas Patry authored
* Fixing return type tensor with `num_return_sequences>1`. * Nit.
-
Yang Ming authored
Co-authored-by:
alcinos <carion.nicolas@gmail.com> Co-authored-by:
SaulLu <55560583+SaulLu@users.noreply.github.com> Co-authored-by:
Nicolas Carion <carion.nicolas@gmail.com> Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
- 19 Apr, 2022 18 commits
-
-
Patrick von Platen authored
-
Manuel R. Ciosici authored
* Add initial BNB integration * fixup! Add initial BNB integration * Add bnb test decorator * Update Adamw8bit option name * Use the full bnb package name * Override bnb for all embedding layers * Fix package name * Formatting * Remove unnecessary import * Update src/transformers/trainer.py Co-authored-by:
Stas Bekman <stas00@users.noreply.github.com> * Rename AdamwBNB optimizer option * Add training test checking that bnb memory utilization is lower * fix merge * fix merge; fix + extend new test * cleanup * expand bnb * move all require_* candidates to testing_utils.py Co-authored-by:
Stas Bekman <stas00@users.noreply.github.com> Co-authored-by:
Stas Bekman <stas@stason.org>
-
Yih-Dar authored
* Update test_pt_tf_model_equivalence on PT side Co-authored-by:ydshieh <ydshieh@users.noreply.github.com>
-
Dahlbomii authored
* Type hints added * return hints added * Update src/transformers/models/speech_to_text/modeling_tf_speech_to_text.py Co-authored-by:Matt <Rocketknight1@users.noreply.github.com>
-
SaulLu authored
* replace `Speech2TextTokenizer` by `Speech2TextFeatureExtractor` in docstring * quality
-
Jeevesh Juneja authored
* Correct Logging of Eval metric to Tensorboard An empty dictionary ``eval_metrics`` was being logged, is replaced by ``eval_metric`` which is the output dictionary of ``metric.compute()``. * Remove unused variable
-
Joao Gante authored
-
wiio12 authored
* Add doc about `attention_mask` on gpt2 Add a simple sentence describing how `attention_mask` needs to be constructed when `past_key_values` is used. * Add doc about attention_mask on gpt2_tf * clean up style * remove empty line white spaces * remove whitespace in empty line
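The commit message does not spell out the rule itself. As a sketch under the assumption that the mask must cover the cached positions as well as the new tokens (so its length is the cached length plus the new input length), the construction could look like this. Plain lists stand in for tensors and `extend_attention_mask` is a hypothetical name.

```python
def extend_attention_mask(past_mask, new_tokens):
    """Return a mask covering the cached positions plus the new tokens."""
    # Keep the masking that was used for the cached positions and
    # attend to every newly supplied token.
    return past_mask + [1] * len(new_tokens)


past_mask = [0, 1, 1, 1]  # one padded position, three real cached tokens
new_tokens = [50256]      # a single new token id (arbitrary value)
mask = extend_attention_mask(past_mask, new_tokens)
```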
-
NielsRogge authored
* Add first draft * Improve README and run fixup * Make script aligned with other scripts, improve README * Improve script and add test * Remove print statement * Apply suggestions from code review * Add num_labels to make test pass * Improve README
-
Patrick von Platen authored
* correct * up
-
Ella Charlaix authored
* Add export of models with a multiple-choice classification head
-
Wonjae Kim authored
-
Dahlbomii authored
* Type hints added * make style * Return type hints added * fixed typo Co-authored-by:matt <rocketknight1@gmail.com>
-
code-review-doctor authored
* Fix issue avoid-misusing-assert-true found at https://codereview.doctor * fix tests * fix tf Co-authored-by:
Patrick von Platen <patrick.v.platen@gmail.com>
-
Suraj Patil authored
* begin do_init * add params_shape_tree * raise error if params are accessed when do_init is False * don't allow do_init=False when keys are missing * make shape tree a property * assign self._params at the end * add test for do_init * add do_init arg to all flax models * fix param setting * disable do_init for composite models * update test * add do_init in FlaxBigBirdForMultipleChoice * better names and errors * improve test * style * add a warning when do_init=False * remove extra if * set params after _required_params * add test for from_pretrained * do_init => _do_init * change warning to info * fix typo * add params in init_weights * add params to gpt neo init * add params to init_weights * update do_init test * Trigger CI * Apply suggestions from code review Co-authored-by:
Patrick von Platen <patrick.v.platen@gmail.com> * update template * trigger CI * style * style * fix template Co-authored-by:
Patrick von Platen <patrick.v.platen@gmail.com>
-
Arthur authored
* Solved href rendering issue in heading Markdown references in headings such as '####' don't render well. Replaced it with <h4>...<a></a></h> banners. * PhonemeTokenizer optimization using phonemizer lib The backend should only be initialized once, otherwise it is reloaded. Added `init_backend` function, initializes a backend attribute. Phonemize re-uses self.backend. Should give ~10 times faster phonemization. * formatted file with make style * Documentation suggestion Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update /tokenization_wav2vec2_phoneme.py based on PR suggestion Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update CONTRIBUTING.md Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
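The "initialize the backend once and reuse it" idea from the PhonemeTokenizer change above can be sketched as follows. This is an illustration of the pattern only: the class name is hypothetical, and a trivial lambda stands in for the real phonemizer backend, whose construction is the expensive step.

```python
class PhonemeTokenizerSketch:
    """Hypothetical class showing a backend built once and reused."""

    def __init__(self, language="en-us"):
        self.init_calls = 0  # counts how often the backend is built
        self.backend = None
        self.init_backend(language)

    def init_backend(self, language):
        # Stand-in for an expensive backend construction; a real backend
        # would load language resources here.
        self.init_calls += 1
        self.backend = lambda text: text.lower().split()

    def phonemize(self, text):
        # Reuse the stored backend instead of rebuilding it on every call.
        return self.backend(text)


tok = PhonemeTokenizerSketch()
tok.phonemize("A b")
tok.phonemize("c")
```

After any number of `phonemize` calls, `tok.init_calls` stays at 1, which is the saving the commit message describes.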
-
Li-Huai (Allan) Lin authored
* Fix docstrings * Fix up * Fix
-
NielsRogge authored
* Add first draft from previous PR * First draft * Improve README and remove num_labels * Make script more aligned with other scripts * Improve README and apply suggestion from code review
-