- 25 Aug, 2023 1 commit
-
-
Younes Belkada authored
* move deepspeed to `lib_integrations.deepspeed` * more refactor * oops * fix slow tests * Fix docs * fix docs * addess feedback * address feedback * final modifs for PEFT * fixup * ok now * trigger CI * trigger CI again * Update docs/source/en/main_classes/deepspeed.md Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * import from `integrations` * address feedback * revert removal of `deepspeed` module * revert removal of `deepspeed` module * fix conflicts * ooops * oops * add deprecation warning * place it on the top * put `FutureWarning` * fix conflicts with not_doctested.txt * add back `bitsandbytes` module with a depr warning * fix * fix * fixup * oops * fix doctests --------- Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
- 25 Apr, 2023 1 commit
-
-
Lingepumpe authored
* Avoid invalid escape sequences, use raw strings * Integrate PR feedback
-
- 22 Feb, 2023 1 commit
-
-
Aaron Gokaslan authored
-
- 06 Feb, 2023 1 commit
-
-
Sylvain Gugger authored
* Result of black 23.1 * Update target to Python 3.7 * Switch flake8 to ruff * Configure isort * Configure isort * Apply isort with line limit * Put the right black version * adapt black in check copies * Fix copies
-
- 27 Sep, 2022 1 commit
-
-
Arijit Mukherjee authored
* add wav2vec2_alignment * Update alignment.py * Update examples/research_projects/wav2vec2/alignment.py Co-authored-by:
Patrick von Platen <patrick.v.platen@gmail.com> * Update examples/research_projects/wav2vec2/alignment.py Co-authored-by:
Patrick von Platen <patrick.v.platen@gmail.com> * Update examples/research_projects/wav2vec2/alignment.py Co-authored-by:
Patrick von Platen <patrick.v.platen@gmail.com> * Update examples/research_projects/wav2vec2/alignment.py Co-authored-by:
Patrick von Platen <patrick.v.platen@gmail.com> * Update README.md * fix style * fix imports * fix multithread * fix bash script * [@anton-l] Style fixes and docstrings * [@anton-l] Style fixes and docstrings * Update alignment.py fix blank id in backtrack Co-authored-by:
Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by:
anton-l <aglozhkov@gmail.com>
-
- 03 Aug, 2022 1 commit
-
-
LSinev authored
Comparisons like version.parse(torch.__version__) > version.parse("1.6") are True for torch==1.6.0+cu101 or torch==1.6.0+cpu version.parse(version.parse(torch.__version__).base_version) are preferred (and available in pytorch_utils.py
-
- 29 Jul, 2022 1 commit
-
-
Sylvain Gugger authored
* Preliminary work on tokenizers * Quality + fix tests * Treat processors * Fix pad * Remove all uses of in tests, docs and examples * Replace all as_target_tokenizer * Fix tests * Fix quality * Update examples/flax/image-captioning/run_image_captioning_flax.py Co-authored-by:
amyeroberts <amy@huggingface.co> * Style Co-authored-by:
amyeroberts <amy@huggingface.co>
-
- 19 May, 2022 1 commit
-
-
ddobokki authored
-
- 12 May, 2022 1 commit
-
-
Sylvain Gugger authored
* Black preview * Fixup too! * Fix check copies * Use the same version as the CI * Bump black
-
- 30 Mar, 2022 1 commit
-
-
Stas Bekman authored
* [examples] max samples can't be bigger than then len of dataset * do tf and flax
-
- 23 Mar, 2022 1 commit
-
-
Lysandre Debut authored
* Updates the default branch from master to main * Links from `master` to `main` * Typo * Update examples/flax/README.md Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
- 12 Mar, 2022 1 commit
-
-
Stas Bekman authored
* [WIP] add support for bf16 mode * prep for bf16 * prep for bf16 * fix; zero2/bf16 is ok * check bf16 is available * test fixes * enable zero3_bf16 * config files * docs * split stage_dtype; merge back to non-dtype-specific config file * fix doc * cleanup * cleanup * bfloat16 => bf16 to match the PR changes * s/zero_gather_fp16_weights_on_model_save/zero_gather_16bit_weights_on_model_save/; s/save_fp16_model/save_16bit_model/ * test fixes/skipping * move * fix * Update docs/source/main_classes/deepspeed.mdx Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * backticks * cleanup * cleanup * cleanup * new version * add note about grad accum in bf16 Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
- 09 Feb, 2022 1 commit
-
-
Lysandre Debut authored
* Upgrade black to version ~=22.0 * Check copies * Fix code
-
- 10 Jan, 2022 1 commit
-
-
Patrick von Platen authored
* up * up * up * up * up * up * improve * up * up * Update src/transformers/trainer.py * up * up * up
-
- 11 Nov, 2021 1 commit
-
-
Stas Bekman authored
-
- 22 Oct, 2021 1 commit
-
-
Antonio Carlos Falc茫o Petri authored
-
- 14 Oct, 2021 1 commit
-
-
Patrick von Platen authored
-
- 08 Aug, 2021 1 commit
-
-
Patrick von Platen authored
-
- 30 Jul, 2021 1 commit
-
-
21jun authored
help for `ModelArguments.gradient_checkpointing` should be "If True, use gradient checkpointing to save memory at the expense of slower backward pass." not "Whether to freeze the feature extractor layers of the model." (which is duplicated from `freeze_feature_extractor` arg)
-
- 23 Jul, 2021 1 commit
-
-
Stas Bekman authored
-
- 15 Jul, 2021 1 commit
-
-
Patrick von Platen authored
* fix_torch_device_generate_test * remove @ * start adding tests * correct wav2vec2 pretraining * up * up Co-authored-by:Patrick von Platen <patrick@huggingface.co>
-
- 25 Jun, 2021 1 commit
-
-
Stas Bekman authored
-
- 14 Jun, 2021 1 commit
-
-
Stas Bekman authored
* consistent nn. and nn.functional: p4 examples * restore
-
- 09 Jun, 2021 2 commits
-
-
Anton Lozhkov authored
* Working quantizer forward * Working quantizer forward * Clean up unused model parts, test reproducibility * Working quantizer forward * Clean up unused model parts, test reproducibility * Remove custom outputs from the shared ones * correct conversion * correct bug * add first pretrain script * save intermediate * static shapes * save intermediate * finish first pretrain script version * more refactor * remove wanddb * refactor more * improve test * correct perplexity compute bug * finish model implementation * add to docs * finish docs * finish pretraining script * finish pretraining script * remove wandb * finish PR for merge * finish config * finish * make deepspeed work * Apply suggestions from code review Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * apply suggestions * fix flaky test Co-authored-by:
patrickvonplaten <patrick.v.platen@gmail.com> Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
Stas Bekman authored
-
- 08 Jun, 2021 1 commit
-
-
Stas Bekman authored
* wip * wip - but working with https://github.com/microsoft/DeepSpeed/pull/1044 * cleanup * workaround * working 5/8 modes * solve fp32 distributed zero3 * style * sync * sync * rework * deprecation * cleanup * https://github.com/microsoft/DeepSpeed/pull/1044 pr was merged * clean up * add a guide * more prose * more prose * fix * more prose * sub_group_size was too big * Apply suggestions from code review Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * refactor * bug fix * make the true check explicit * new deepspeed release Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
- 12 May, 2021 1 commit
-
-
Philip May authored
-
- 14 Apr, 2021 1 commit
-
-
Nithin Holla authored
Co-authored-by:nithin19 <nithin@amberscript.com>
-
- 30 Mar, 2021 1 commit
-
-
Yih-Dar authored
-
- 22 Mar, 2021 2 commits
-
-
Qiushi Pan authored
Fix typo.
-
Patrick von Platen authored
-
- 21 Mar, 2021 4 commits
-
-
Suraj Patil authored
-
Patrick von Platen authored
-
Patrick von Platen authored
-
Suraj Patil authored
-
- 19 Mar, 2021 3 commits
-
-
Julien Chaumond authored
* wording/typos tweaks * Make model upload instructions simpler
-
Patrick von Platen authored
-
Patrick von Platen authored
* finish * fix * fix * fix * fix
-
- 18 Mar, 2021 2 commits
-
-
Patrick von Platen authored
-
Patrick von Platen authored
-