- 14 Oct, 2021 1 commit
Li-Huai (Allan) Lin authored
* Remove wrong model_args of config.from_pretrained
* Fix tf & flax
- 11 Oct, 2021 1 commit
Patrick von Platen authored
[Gradient checkpointing] Correctly disable `find_unused_parameters` in Trainer when gradient checkpointing is enabled (#13961)
* up
* correct test
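For context on this fix, a hedged sketch of the interaction (using the released `TrainingArguments` names, which the commit message itself does not spell out): with gradient checkpointing, activations are recomputed during the backward pass, so DDP's `find_unused_parameters=True` can mis-detect parameters as unused and fail.

```python
from transformers import TrainingArguments

# Sketch, not the Trainer's literal code: when gradient checkpointing is on,
# DistributedDataParallel must not be created with find_unused_parameters=True.
args = TrainingArguments(
    output_dir="out",
    gradient_checkpointing=True,       # recompute activations to save memory
    ddp_find_unused_parameters=False,  # must stay False when checkpointing
)
```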
- 08 Oct, 2021 1 commit
Stella Biderman authored
* Added `framework` attribute
* Update modeling_utils.py
* Update modeling_flax_utils.py
* Update modeling_tf_utils.py
* Update modeling_utils.py
* Update modeling_tf_utils.py
* Update modeling_tf_utils.py
* Update modeling_flax_utils.py
* Update modeling_tf_utils.py
* Update modeling_utils.py
* Update modeling_utils.py
* Update modeling_tf_utils.py
* Update modeling_flax_utils.py
* string -> str
* Update modeling_tf_utils.py
* string -> str
* fixup
* make flake happy
Co-authored-by: patil-suraj <surajp815@gmail.com>
- 07 Oct, 2021 2 commits
Mishig Davaadorj authored

Alex Hedges authored
- 05 Oct, 2021 1 commit
Alex Hedges authored
* Improve error message when loading models from Hub
* Adjust error message wording
- 24 Sep, 2021 1 commit
Josh Devins authored
Moves the assertion checking input dimensions into the block that is only executed when the function is actually going to chunk the forward pass. Chunking is often disabled at inference time, and PyTorch tracing a model with the assertion outside that block emits a tracing warning:

TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!

The expression that triggered it: `input_tensor.shape[chunk_dim] == tensor_shape for input_tensor in input_tensors`
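The control-flow pattern, sketched independently of the library's exact code: shape validation runs only on the chunking path, so a traced model that never chunks never evaluates tensor shapes as Python booleans.

```python
import torch

def chunked_forward(forward_fn, chunk_size: int, chunk_dim: int, *input_tensors):
    # Sketch of the fixed control flow (not transformers' literal implementation):
    # validate shapes only when we are actually going to chunk, so torch.jit.trace
    # of the no-chunking path never hits the assertion.
    if chunk_size > 0:
        tensor_shape = input_tensors[0].shape[chunk_dim]
        assert all(
            t.shape[chunk_dim] == tensor_shape for t in input_tensors
        ), "All input tensors must agree on the chunked dimension"
        chunks = [t.chunk(tensor_shape // chunk_size, dim=chunk_dim) for t in input_tensors]
        outputs = [forward_fn(*c) for c in zip(*chunks)]
        return torch.cat(outputs, dim=chunk_dim)
    return forward_fn(*input_tensors)
```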
- 23 Sep, 2021 1 commit
Stas Bekman authored
* one possible solution
* low mem from_pretrained
* edge cases
* solve the persistent buffers
* style
* parametrize
* for later
* proper solution
* cleanup
* refactor; rework based on suggestions
* revert splitting into 2 parts, move checks into main func
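This is the low-memory loading path; in the released API it is exposed as the `low_cpu_mem_usage` flag to `from_pretrained` (flag name taken from current transformers, not from the commit message):

```python
from transformers import AutoModel

# Keep peak CPU RAM close to one model size instead of two: the state dict is
# loaded into the model directly rather than alongside a fully
# random-initialized copy.
model = AutoModel.from_pretrained("bert-base-uncased", low_cpu_mem_usage=True)
```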
- 22 Sep, 2021 1 commit
Sylvain Gugger authored
* Make gradient_checkpointing a training argument
* Update src/transformers/modeling_utils.py
* Update src/transformers/configuration_utils.py
* Fix tests
* Style
* document Gradient Checkpointing as a performance feature
* Small rename
* PoC for not using the config
* Adapt BC to new PoC
* Forgot to save
* Rollout changes to all other models
* Fix typo
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Stas Bekman <stas@stason.org>
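After this change the feature is toggled per training run rather than baked into the model config; a minimal sketch using the released API names (`gradient_checkpointing_enable` and the `TrainingArguments` flag):

```python
from transformers import AutoModelForSequenceClassification, TrainingArguments

model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased")

# Either toggle it directly on the model...
model.gradient_checkpointing_enable()

# ...or let the Trainer handle it through the new training argument.
args = TrainingArguments(output_dir="out", gradient_checkpointing=True)
```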
- 17 Sep, 2021 1 commit
Alex Hedges authored
- 16 Sep, 2021 1 commit
Stas Bekman authored
* [deepspeed] replaced deprecated init arg
* Trigger CI
- 15 Sep, 2021 1 commit
Patrick von Platen authored
* finish
* delete bogus file
* correct some stuff
* finish
* finish
- 08 Sep, 2021 1 commit
Lysandre Debut authored
* Better error raised when cloned without lfs
* add from e
- 30 Aug, 2021 1 commit
arfy slowy authored
* fix: typo spelling grammar
* fix: make fixup
- 26 Aug, 2021 1 commit
Bram Vanroy authored
* add error message concerning revision
* Update src/transformers/configuration_utils.py
* re-add double line endings
* is not None instead of implicit bool casting
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
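For reference, `revision` is the `from_pretrained` argument the new error message points at; it pins a branch, tag, or commit hash on the Hub:

```python
from transformers import AutoConfig

# When a file lookup fails, the improved message tells users to check the
# requested revision (branch, tag, or commit) in addition to the model id.
config = AutoConfig.from_pretrained("bert-base-uncased", revision="main")
```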
- 06 Aug, 2021 1 commit
Sylvain Gugger authored
* Fix tied weights on TPU
* Manually tie weights in no trainer examples
* Fix for test
* One last missing
* Getting owned by my scripts
* Address review comments
* Fix test
* Fix tests
* Fix reformer tests
- 04 Aug, 2021 1 commit
Sylvain Gugger authored
* Fix from_pretrained with corrupted state_dict
* Adapt test
* Use better checkpoint
* Style
* Clean up
- 17 Jul, 2021 1 commit
Sylvain Gugger authored
- 13 Jul, 2021 3 commits
Stas Bekman authored
* zero_to_fp32 tests
* args change
* remove unnecessary work
* use transformers.trainer_utils.get_last_checkpoint
* document the new features
* cleanup
* wip
* fix fsmt
* add bert
* cleanup
* add xlm-roberta
* electra works
* cleanup
* sync
* split off the model zoo tests
* cleanup
* cleanup
* cleanup
* cleanup
* reformat
* cleanup
* casing
* deepspeed>=0.4.3
* adjust distilbert
* Update docs/source/main_classes/deepspeed.rst
* style
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
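For orientation on the entry above: `zero_to_fp32.py` is the helper DeepSpeed drops into each checkpoint folder to consolidate ZeRO-partitioned weights into a single fp32 state dict. A hedged sketch of locating a checkpoint and consolidating it (exact script arguments can vary between DeepSpeed versions):

```python
import subprocess
from transformers.trainer_utils import get_last_checkpoint

# Locate the newest checkpoint-* folder the Trainer wrote.
ckpt = get_last_checkpoint("output_dir")
if ckpt is not None:
    # zero_to_fp32.py merges the ZeRO-sharded parameter partitions back into
    # one fp32 state dict file.
    subprocess.run(
        ["python", f"{ckpt}/zero_to_fp32.py", ckpt, f"{ckpt}/pytorch_model_fp32.bin"],
        check=True,
    )
```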
qqaatw authored
Sylvain Gugger authored
* Add option to load a pretrained model with mismatched shapes
* Fail at loading when mismatched shapes in Flax
* Fix tests
* Update src/transformers/modeling_flax_utils.py
* Address review comments
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
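In the released API this option is the `ignore_mismatched_sizes` argument to `from_pretrained` (name taken from current transformers); it lets you reuse a checkpoint whose head has a different shape:

```python
from transformers import AutoModelForSequenceClassification

# "some-user/bert-finetuned-2-labels" is a placeholder for any checkpoint
# whose classification head was trained with a different number of labels.
model = AutoModelForSequenceClassification.from_pretrained(
    "some-user/bert-finetuned-2-labels",
    num_labels=10,
    ignore_mismatched_sizes=True,  # drop and re-init the mismatched head
)
```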
- 08 Jul, 2021 1 commit
Stas Bekman authored
* [model.from_pretrained] raise exception early on failed load
  Previously, if loading pretrained weights failed in `from_pretrained`, a series of success messages was printed before the failure; this PR raises the exception first to avoid the misleading output.
* style
Co-authored-by: Suraj Patil <surajp815@gmail.com>
- 01 Jul, 2021 2 commits
Stas Bekman authored
* fix lm_head.decoder.weight ignore_key handling
* fix the mutable class variable
* Update src/transformers/models/roberta/modeling_roberta.py
* replicate the comment
* make deterministic
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Teven authored
* fixing bug with param count without embeddings
* Update src/transformers/modeling_utils.py
* style
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
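The affected method is `PreTrainedModel.num_parameters`; a quick check of the embeddings-excluded count (flag name as in current transformers):

```python
from transformers import AutoModel

model = AutoModel.from_pretrained("bert-base-uncased")
total = model.num_parameters()
without_embeddings = model.num_parameters(exclude_embeddings=True)
print(total, without_embeddings)  # embeddings dominate for small models
```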
- 29 Jun, 2021 1 commit
Stas Bekman authored
* [models] respect dtype of the model when instantiating it
* cleanup
* cleanup
* rework to handle non-float dtype
* fix
* switch to fp32 tiny model
* improve
* use dtype.is_floating_point
* Apply suggestions from code review
* fix the doc
* recode to use explicit torch_dtype_auto_detect, torch_dtype args
* docs and tweaks
* docs and tweaks
* docs and tweaks
* merge 2 args, add docs
* fix
* fix
* better doc
* better doc
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
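The merged argument is `torch_dtype`; it takes either an explicit dtype or `"auto"` to use the dtype recorded in the checkpoint:

```python
import torch
from transformers import AutoModel

# Instantiate directly in half precision instead of loading fp32 and casting.
model_fp16 = AutoModel.from_pretrained("bert-base-uncased", torch_dtype=torch.float16)

# Or let the saved weights decide.
model_auto = AutoModel.from_pretrained("bert-base-uncased", torch_dtype="auto")
```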
- 23 Jun, 2021 1 commit
Sylvain Gugger authored
* Clean push to hub API
* Create working dir if it does not exist
* Different tweak
* New API + all models + test Flax
* Adds the Trainer clean up
* Update src/transformers/file_utils.py
* Address review comments
* (nit) output types
* No need to set clone_from when folder exists
* Update src/transformers/trainer.py
* Add generated_from_trainer tag
* Update to new version
* Fixes
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Julien Chaumond <julien@huggingface.co>
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
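The cleaned-up API is the `push_to_hub` method on models, tokenizers, and configs; a minimal sketch (repo name hypothetical, requires a logged-in Hub account):

```python
from transformers import AutoModel, AutoTokenizer

model = AutoModel.from_pretrained("bert-base-uncased")
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

# Uploads weights/config and tokenizer files to hf.co/<user>/my-fine-tuned-bert.
model.push_to_hub("my-fine-tuned-bert")
tokenizer.push_to_hub("my-fine-tuned-bert")
```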
- 14 Jun, 2021 1 commit
Stas Bekman authored
* consistent nn. and nn.functional
* fix glitch
* fix glitch #2
- 02 Jun, 2021 1 commit
Stas Bekman authored
* move code and docs
* style
* moved
* restore
- 12 May, 2021 2 commits
Patrick von Platen authored
* fix encoder-decoder & RAG
* finalize
* Update src/transformers/models/encoder_decoder/modeling_encoder_decoder.py
* Update src/transformers/models/rag/modeling_rag.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Patrick von Platen <patrick@huggingface.co>
Philip May authored
- 06 May, 2021 1 commit
Patrick von Platen authored
- 05 May, 2021 1 commit
Patrick von Platen authored
* lazy_init_weights
* remove ipdb
* save int
* add necessary code
* remove unnecessary utils
* Update src/transformers/models/t5/modeling_t5.py
* clean
* add tests
* correct
* finish tests
* finish tests
* fix some more tests
* fix xlnet & transfo-xl
* fix more tests
* make sure tests are independent
* fix tests more
* finish tests
* final touches
* Update src/transformers/modeling_utils.py
* Apply suggestions from code review
* Update src/transformers/modeling_utils.py
* clean tests
* give arg positive name
* add more mock weights to xlnet
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
- 03 May, 2021 1 commit
Stas Bekman authored
- 30 Apr, 2021 1 commit
Stas Bekman authored
* prep for deepspeed==0.3.16
* new version
* too soon
* support and test fp32 mode
* troubleshooting doc start
* workaround no longer needed
* add fp32 doc
* style
* cleanup, add tf32 note
* clarify
* release was made
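fp32 mode here means running DeepSpeed with mixed precision disabled; a hedged sketch of the relevant config fragment, passed as a dict (or a path to a JSON file) via `TrainingArguments(deepspeed=...)`, with the other keys illustrative only:

```python
from transformers import TrainingArguments

# Disabling the fp16 block makes DeepSpeed train in full fp32, which this
# commit adds support and tests for.
ds_config = {
    "fp16": {"enabled": False},
    "zero_optimization": {"stage": 2},
}
args = TrainingArguments(output_dir="out", deepspeed=ds_config)
```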
- 26 Apr, 2021 2 commits
Stas Bekman authored
* adding Z-inf
* revamp config process
* up version requirement
* wip
* massive rewrite
* cleanup
* cleanup
* Apply suggestions from code review
* consistent json commas
* act on suggestions
* leave this feature for 0.3.16
* style
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
LSinev authored
- 23 Apr, 2021 2 commits
Sylvain Gugger authored
* Initial support for upload to hub
* push -> upload
* Fixes + examples
* Fix torchhub test
* Torchhub test I hate you
* push_model_to_hub -> push_to_hub
* Apply mixin to other pretrained models
* Remove ABC inheritance
* Add tests
* Typo
* Run tests
* Install git-lfs
* Change approach
* Add push_to_hub to all
* Staging test suite
* Typo
* Maybe like this?
* More deps
* Cache
* Adapt name
* Quality
* MOAR tests
* Put it in testing_utils
* Docs + torchhub last hope
* Styling
* Wrong method
* Typos
* Update src/transformers/file_utils.py
* Address review comments
* Apply suggestions from code review
Co-authored-by: Julien Chaumond <julien@huggingface.co>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Patrick von Platen authored
* improve flax
* refactor
* typos
* Update src/transformers/modeling_flax_utils.py
* Apply suggestions from code review
* Update src/transformers/modeling_flax_utils.py
* fix typo
* improve error tolerance
* typo
* correct nasty saving bug
* fix from pretrained
* correct tree map
* add note
* correct weight tying
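A simple round-trip exercises the save/load paths this commit repairs (model name illustrative; requires Flax/JAX installed):

```python
from transformers import FlaxBertModel

# Save and reload a Flax model; the saving and from_pretrained fixes above
# are about exactly this path.
model = FlaxBertModel.from_pretrained("bert-base-uncased")
model.save_pretrained("local-flax-bert")
reloaded = FlaxBertModel.from_pretrained("local-flax-bert")
```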
- 14 Apr, 2021 1 commit
Stas Bekman authored
* add 2 points of reference to the offline mode
* link the new doc
* add error message
* Update src/transformers/modeling_utils.py
* style
* rename
* Trigger CI
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
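Offline mode refers to the `TRANSFORMERS_OFFLINE=1` environment variable and the `local_files_only` argument; a sketch of both entry points:

```python
import os

# Either set the environment variable before importing transformers...
os.environ["TRANSFORMERS_OFFLINE"] = "1"

from transformers import AutoModel

# ...or request cached files explicitly. Both skip network lookups and, when a
# file is missing, produce the clearer error message this commit adds.
model = AutoModel.from_pretrained("bert-base-uncased", local_files_only=True)
```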
- 08 Apr, 2021 1 commit
Stas Bekman authored
* synced gpus
* fix
* fix
* need to use t5-small for quality tests
* notes
* complete merge
* fix a disappearing std stream problem
* start zero3 tests
* wip
* tune params
* sorting out the pre-trained model loading
* reworking generate loop wip
* wip
* style
* fix tests
* split the tests
* refactor tests
* wip
* parameterized
* fix
* work out the resume from non-ds checkpoint pass + test
* cleanup
* remove no longer needed code
* split getter/setter functions
* complete the docs
* suggestions
* gpus and their compute capabilities link
* Apply suggestions from code review
* style
* remove invalid paramgd
* automatically configure zero3 params that rely on hidden size
* make _get_resized_embeddings zero3-aware
* add test exercising resize_token_embeddings()
* add docstring
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
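One user-visible piece here is making `resize_token_embeddings` work under ZeRO stage 3, where parameters are sharded across GPUs; the call itself is unchanged:

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("t5-small")
tokenizer.add_tokens(["<new_token>"])

model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")
# Grow the (possibly ZeRO-3-sharded) embedding matrix to match the tokenizer;
# this commit makes _get_resized_embeddings aware of sharded parameters.
model.resize_token_embeddings(len(tokenizer))
```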