- 22 Jun, 2022 2 commits
-
-
Sylvain Gugger authored
* Offload fixes * Add a test
-
Arthur authored
-
- 20 Jun, 2022 2 commits
-
-
Yih-Dar authored
* Use torch.finfo(self.dtype).min * for GPTNeoX * for Albert * For Splinter * Update src/transformers/models/data2vec/modeling_data2vec_audio.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* fix -inf used in Bart-like models * Fix a few remaining -inf * more fix * clean up * For CLIP * For FSMT * clean up * fix test * Add dtype argument and use it for LayoutLMv3 * update FlaxLongT5Attention
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
-
Sylvain Gugger authored
* Fix cache for GPT-Neo-X * Add more tests
-
- 13 Jun, 2022 1 commit
-
-
Sylvain Gugger authored
* Fix dtype getters * Proper fix for dtype getter * Style and comment * Always use last for consistency * Quality
-
- 10 Jun, 2022 1 commit
-
-
Sylvain Gugger authored
-
- 09 Jun, 2022 1 commit
-
-
Stas Bekman authored
* [modeling_utils] torch_dtype/auto fixes * add test * apply suggestions * add missing fallback * Renaming things * Use for else
Co-authored-by: Sylvain Gugger <Sylvain.gugger@gmail.com>
-
- 03 Jun, 2022 1 commit
-
-
Sylvain Gugger authored
-
- 02 Jun, 2022 1 commit
-
-
Sylvain Gugger authored
-
- 31 May, 2022 1 commit
-
-
Sylvain Gugger authored
* Fix offload to disk for big models * Add test * Fix test for other models
-
- 25 May, 2022 1 commit
-
-
Sylvain Gugger authored
-
- 23 May, 2022 1 commit
-
-
Sylvain Gugger authored
* Initial work * More or less finished with first draft * Update src/transformers/modeling_utils.py Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
* Update src/transformers/modeling_utils.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Fix randomly initialized weights * Update src/transformers/modeling_utils.py Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>
* Address review comments * Rename DeepSpeed folder to temporarily fix the test issue? * Revert to try if Accelerate fix works * Use latest Accelerate release * Quality and fixes * Style * Quality * Add doc * Test + fix * More blocks
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>
-
- 19 May, 2022 1 commit
-
-
Nathan Dahlberg authored
-
- 17 May, 2022 1 commit
-
-
regisss authored
- Add --ignore_mismatched_sizes argument to classification examples - Expand the error message when loading a model whose head dimensions are different from expected dimensions
-
- 12 May, 2022 1 commit
-
-
Sylvain Gugger authored
* Black preview * Fixup too! * Fix check copies * Use the same version as the CI * Bump black
-
- 03 May, 2022 3 commits
-
-
Pavel Belevich authored
-
Sylvain Gugger authored
* Fix RNG reload in resume training from epoch checkpoint * Fix test
-
Sylvain Gugger authored
* Make Trainer compatible with sharded checkpoints * Add doc
-
- 29 Apr, 2022 1 commit
-
-
Pavel Belevich authored
-
- 27 Apr, 2022 1 commit
-
-
Sylvain Gugger authored
* Fix multiple deletions of the same files in save_pretrained * Add is_main_process argument
-
- 26 Apr, 2022 2 commits
-
-
Yongliang Shen authored
-
Sylvain Gugger authored
* Limit the use of PreTrainedModel.device * Fix
-
- 22 Apr, 2022 1 commit
-
-
Mario Šaško authored
* Minor improvements to `convert_file_size_to_int` * Add <unit>bit version to kilos and megas * Minor fix
-
- 20 Apr, 2022 1 commit
-
-
Stas Bekman authored
* less cpu memory with sharded checkpoint loading * Trigger CI * Trigger CI
-
- 19 Apr, 2022 1 commit
-
-
Patrick von Platen authored
-
- 15 Apr, 2022 1 commit
-
-
Stas Bekman authored
* add low_cpu_mem_usage tests * wip: revamping * wip * install /usr/bin/time * wip * cleanup * cleanup * cleanup * cleanup * cleanup * fix assert * put the wrapper back * cleanup; switch to bert-base-cased * Trigger CI * Trigger CI
-
- 13 Apr, 2022 2 commits
-
-
Stas Bekman authored
-
Stas Bekman authored
-
- 12 Apr, 2022 2 commits
-
-
Anmol Joshi authored
* Moved functions to pytorch_utils.py * isort formatting * Reverted tf changes * isort, make fix-copies * documentation fix * Fixed Conv1D import * Reverted research examples file * backward compatibility for pytorch_utils * missing import * isort fix
-
smelm authored
This avoids an unnecessary call and avoids problems during initialization of class hierarchies.
Co-authored-by: Samuel Melm <samuel.melm@stud.uni-heidelberg.de>
-
- 08 Apr, 2022 1 commit
-
-
Laura Hanu authored
-
- 07 Apr, 2022 1 commit
-
-
Francesco Saverio Zuppichini authored
* Updated _load_pretrained_model_low_mem to check if keys are in the stored state_dict * update after conversions
-
- 06 Apr, 2022 3 commits
-
-
Stas Bekman authored
-
Stas Bekman authored
-
Suraj Patil authored
-
- 05 Apr, 2022 2 commits
-
-
Suraj Patil authored
-
Francesco Saverio Zuppichini authored
-
- 04 Apr, 2022 1 commit
-
-
Nicolas Patry authored
-
- 25 Mar, 2022 2 commits
-
-
Sylvain Gugger authored
* Sharded checkpoint support * Handle distant sharded checkpoints * Add tests * TODO is done * Apply suggestions from code review Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
* Fix docstring * Add example and format * Address review comments * More review comments * End of merge * Revert unintentional change * VsCode what did you do? * Style * Changes * Address final comments * Quality * Moar tests * Move import beneath is_pt_available
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
-
Sylvain Gugger authored
* Big file_utils cleanup * This one still needs to be treated separately
-