- 26 Apr, 2022 2 commits
-
-
Manuel authored
* apply torch int div * black linting fixup * update path to torch_int_div * clarify imports
-
Sylvain Gugger authored
* Limit the use of PreTrainedModel.device * Fix
-
- 25 Apr, 2022 11 commits
-
-
-
Sanchit Gandhi authored
-
Joao Gante authored
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
Rushi Chaudhari authored
* added deit onnx config
-
Joao Gante authored
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
-
Joao Gante authored
* XLA min len, forced eos, and forced bos Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
-
Yih-Dar authored
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Yih-Dar authored
* add torch.cuda.empty_cache in some PT RAG tests * torch.cuda.empty_cache in tearDownModule() * tearDown() * add gc.collect() Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Yih-Dar authored
* add missing ckpt in config docs * add more missing ckpt in config docs * fix wrong ckpts * fix realm ckpt * fix s2t2 * fix xlm_roberta ckpt * Fix for deberta v2 * Apply suggestions from code review Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * use only one checkpoint for DPR * Apply suggestions from code review Co-authored-by:
NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by:
ydshieh <ydshieh@users.noreply.github.com> Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by:
NielsRogge <48327001+NielsRogge@users.noreply.github.com>
-
Patrick von Platen authored
* fix doc test * fix doc test Co-authored-by: Patrick <patrick@pop-os.localdomain>
-
Thomas Chaigneau authored
Co-authored-by: ChainYo <t.chaigneau.tc@gmail.com>
-
- 23 Apr, 2022 1 commit
-
-
Patrick von Platen authored
* [DocTests] Fix some doc tests * hacky fix * correct
-
- 22 Apr, 2022 7 commits
-
-
cavdard authored
* changes in create optimizer to support tensor parallelism with SMP * Update src/transformers/trainer.py Convert if check to one line. Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by:
Cavdar <dcavdar@a07817b12d7e.ant.amazon.com> Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
Joao Gante authored
-
Thomas Chaigneau authored
* add OnnxConfig for ConvBert Co-authored-by: ChainYo <t.chaigneau.tc@gmail.com>
-
Minh Chien Vu authored
* Add doctest BERT * make fixup * fix typo * change checkpoints * make fixup * define doctest output value, update doctest for mobilebert * solve fix-copies * update QA target start index and end index * change checkpoint for docs and reuse defined variable * Update src/transformers/models/bert/modeling_tf_bert.py Co-authored-by:
Yih-Dar <2521628+ydshieh@users.noreply.github.com> * Apply suggestions from code review Co-authored-by:
Yih-Dar <2521628+ydshieh@users.noreply.github.com> * Apply suggestions from code review Co-authored-by:
Yih-Dar <2521628+ydshieh@users.noreply.github.com> * make fixup * Add Doctest for Albert and Bigbird * make fixup * overwrite examples for Albert and Bigbird * Apply suggestions from code review Co-authored-by:
Patrick von Platen <patrick.v.platen@gmail.com> * update longer examples for Bigbird * using examples from squad_v2 * print out example text * change name token-classification-big-bird checkpoint to random Co-authored-by:
Yih-Dar <2521628+ydshieh@users.noreply.github.com> Co-authored-by:
Patrick von Platen <patrick.v.platen@gmail.com>
-
Mario Šaško authored
* Minor improvements to `convert_file_size_to_int` * Add <unit>bit version to kilos and megas * Minor fix
-
Joao Gante authored
-
Yih-Dar authored
* add missing entries in some mappings Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
- 21 Apr, 2022 9 commits
-
-
Loubna Ben Allal authored
* add tflops logging and fix grad accumulation * add accelerate tracking and checkpointing * scale loss of last batch correctly * fix typo * compress loss computation Co-authored-by:
Leandro von Werra <lvwerra@users.noreply.github.com> * add resume from checkpoint argument * add load_state accelerate from checkpoint, register lr scheduler and add tflops function * reformat code * reformat code * add condition on path for resume checkpoint * combine if conditions Co-authored-by:
Leandro von Werra <lvwerra@users.noreply.github.com> * add source for tflops formula Co-authored-by:
Leandro von Werra <lvwerra@users.noreply.github.com>
-
Yih-Dar authored
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Thomas Chaigneau authored
* add gptj to TOKENIZER_MAPPING_NAMES * fix int32 to float to avoid problem in onnx * Update src/transformers/models/gptj/modeling_gptj.py Co-authored-by:
ChainYo <t.chaigneau.tc@gmail.com> Co-authored-by:
lewtun <lewis.c.tunstall@gmail.com>
-
Eldar Kurtic authored
- all activations should be fetched through ACT2FN - it returns ReLU as `nn.Module`, which allows attaching hooks on the activation function and prints it to stdout when `print(model)`
-
Sylvain Gugger authored
-
Nicolas Patry authored
* Adding support for `array` key in raw dictionaries in ASR pipeline. * ES . * Update src/transformers/pipelines/automatic_speech_recognition.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Making it work by not popping `array` first. * Black 22.3 Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
ghlai9665 authored
* tweak to allow BatchEncoding.char_to_token(0) * update docstring * remove trailing whitespace * make fixup * make value checking for span_indices explicit Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
Stefan Schweter authored
* t5: add conversion script for T5X to FLAX * t5: make flake happy * t5: add copyright message to t5x conversion script * t5: fix lm head for v1.0 checkpoints
-
Nicolas Patry authored
* Temporary commit with the long QA fix. * Adding slow tests covering this fix. * Removing fast test as it doesn't fail anyway.
-
- 20 Apr, 2022 6 commits
-
-
Zachary Mueller authored
-
Sylvain Gugger authored
-
Stas Bekman authored
-
Stas Bekman authored
* less cpu memory with sharded checkpoint loading * Trigger CI * Trigger CI
-
Nicolas Patry authored
* Fixing return type tensor with `num_return_sequences>1`. * Nit.
-
Yang Ming authored
Co-authored-by:
alcinos <carion.nicolas@gmail.com> Co-authored-by:
SaulLu <55560583+SaulLu@users.noreply.github.com> Co-authored-by:
Nicolas Carion <carion.nicolas@gmail.com> Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
- 19 Apr, 2022 4 commits
-
-
Patrick von Platen authored
-
Manuel R. Ciosici authored
* Add initial BNB integration * fixup! Add initial BNB integration * Add bnb test decorator * Update Adamw8bit option name * Use the full bnb package name * Override bnb for all embedding layers * Fix package name * Formatting * Remove unnecessary import * Update src/transformers/trainer.py Co-authored-by:
Stas Bekman <stas00@users.noreply.github.com> * Rename AdamwBNB optimizer option * Add training test checking that bnb memory utilization is lower * fix merge * fix merge; fix + extend new test * cleanup * expand bnb * move all require_* candidates to testing_utils.py Co-authored-by:
Stas Bekman <stas00@users.noreply.github.com> Co-authored-by:
Stas Bekman <stas@stason.org>
-
Yih-Dar authored
* Update test_pt_tf_model_equivalence on PT side Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Dahlbomii authored
* Type hints added * return hints added * Update src/transformers/models/speech_to_text/modeling_tf_speech_to_text.py Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
-