- 12 Sep, 2023 2 commits
-
-
Younes Belkada authored
import tensorflow inside relevant methods in trainer_utils
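A minimal sketch of the lazy-import pattern (the function name is illustrative, not the actual trainer_utils code): the heavy import moves into the method body, so `import transformers` stays fast for users who never hit the TensorFlow path.

```python
def set_seed_tf(seed: int):
    # Deferred import: the cost of `import tensorflow` is only paid when this
    # method is actually called, not at `import transformers` time.
    import tensorflow as tf

    tf.random.set_seed(seed)
```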
-
Arthur authored
* initial commit * updates * nits * update conversion script * update conversion script * use path to load * add tips etc * some modeling logic * modeling update * more nits * nits * normal layer norm * update config and doc * nits * update doc remove unused * update * fix inits and stuff * fixup * revert wrong changes * updates * more nits * add default config values to the configuration file * fixup happy * update * 2 tests left * update readmes * more nits * slow test and more documentation * update readme * fix licences * styling * use fast if possible when saving tokenizer * remove todo * remove tokenization tests * small last nits * Apply suggestions from code review Co-authored-by:
Matt <Rocketknight1@users.noreply.github.com> * nits to skip the timeout doctest * fix integration test * fix test * update eos token * update to allow fast tokenization * styling * fix CodeLlama as well for the update post processor * Apply suggestions from code review Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * add more copied from statements * update * doc passes doctest * remove `# final layer norm?` * change docstring prompt * update * Update README.md Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * don't doctest the conversion script as it requires more packages * don't init a model in the config * oups * fix doctest --------- Co-authored-by:
Matt <Rocketknight1@users.noreply.github.com> Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
- 11 Sep, 2023 2 commits
-
-
Patrick von Platen authored
* improve import time * Update src/transformers/integrations/__init__.py * sort imports
-
Hang authored
only the main process should call _save when using DeepSpeed ZeRO-3
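A simplified sketch of the guard (generic torch.distributed code, not the Trainer's internals): with ZeRO-3 the state dict is consolidated collectively across ranks, but only the main process may write it, otherwise ranks race on the same files.

```python
import os

import torch
import torch.distributed as dist

def save_checkpoint(state_dict, output_dir):
    # Every rank reaches this point (any collective gathering of the sharded
    # weights must run on all ranks), but only rank 0 performs the write.
    if not dist.is_initialized() or dist.get_rank() == 0:
        torch.save(state_dict, os.path.join(output_dir, "pytorch_model.bin"))
```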
-
- 08 Sep, 2023 4 commits
-
-
Arthur authored
* fix `set_infilling_processor` to properly reset * Add docstring! * fixups * more details in the documentation about the tokenization * style
-
Angela Yi authored
* Ignore warning if tracing with dynamo * fix import error * separate to function * add test
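A hedged sketch of the check (the helper is hypothetical; `torch._dynamo.is_compiling` exists on torch >= 2.0): the warning is skipped while torch.jit or dynamo is tracing, since it would otherwise be emitted for every traced call.

```python
import warnings

import torch

def is_tracing() -> bool:
    # Hypothetical helper in the spirit of the fix: true while torch.jit or
    # torch._dynamo is tracing/compiling the current code.
    if torch.jit.is_tracing():
        return True
    try:
        import torch._dynamo
        return torch._dynamo.is_compiling()
    except (ImportError, AttributeError):  # older torch builds without dynamo
        return False

if not is_tracing():  # only warn in plain eager execution
    warnings.warn("illustrative warning that should not fire during export")
```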
-
Thien Tran authored
* add missing doc for activation dropout * fix doc for SEW-D dropout * deprecate hidden_dropout for SEW-D
-
Alexander Krauck authored
This commit corrects the dropout implementation in Graphormer, aligning it with the original implementation and improving performance. Specifically: 1. The `attention_dropout` variable, intended for use in GraphormerMultiheadAttention, was defined but not used. This has been corrected to use `attention_dropout` instead of the regular `dropout`. 2. The `activation_dropout` for the activations in the feed-forward layers was missing. Instead, the regular `dropout` was used. This commit adds `activation_dropout` to the feed-forward layers. These changes ensure the dropout implementation matches the original Graphormer and delivers empirically better performance.
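A sketch of the corrected wiring under the names used above (not the exact Graphormer source): the feed-forward path applies `activation_dropout` to the activated hidden states rather than reusing the regular `dropout`.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FeedForward(nn.Module):
    def __init__(self, hidden_size: int, ffn_size: int, dropout: float, activation_dropout: float):
        super().__init__()
        self.fc1 = nn.Linear(hidden_size, ffn_size)
        self.fc2 = nn.Linear(ffn_size, hidden_size)
        self.activation_dropout = nn.Dropout(activation_dropout)  # was plain `dropout`
        self.dropout = nn.Dropout(dropout)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.activation_dropout(F.gelu(self.fc1(x)))
        return self.dropout(self.fc2(x))
```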
-
- 07 Sep, 2023 6 commits
-
-
dumpmemory authored
* fix inconsistent loss after resume #25340 * fix typo * clean code * reformatted code * adjust code according to comments * adjust check_dataloader_randomsampler location * return sampler only * handle sampler is None * Update src/transformers/trainer_pt_utils.py thanks @amyeroberts Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com>
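A hedged sketch of the helper hinted at in the bullets (the shipped signature may differ): after accelerate wraps the dataloader, the RandomSampler ends up nested inside batch-sampler wrappers, and resuming has to dig it out to replay the same shuffle order before skipping the already-consumed batches.

```python
def get_dataloader_sampler(dataloader):
    # Recurse through wrapped batch samplers until a plain `.sampler`
    # (e.g. torch.utils.data.RandomSampler) is exposed.
    if getattr(dataloader, "batch_sampler", None) is not None:
        return get_dataloader_sampler(dataloader.batch_sampler)
    return getattr(dataloader, "sampler", None)
```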
-
MyungHa Kwon authored
fix typo
-
raghavanone authored
* Fix vilt config init parameter to match the ones in documentation * Fix the documentation
-
CokeDong authored
* Add tgs metrics * bugfix and black formatting * workaround for token counting * formatting and bugfix * Fix * Add opt-in for tgs metrics * make style and fix error * Fix doc * fix docbuild * hf-doc-build * fix * test * Update src/transformers/training_args.py renaming Co-authored-by:
Zach Mueller <muellerzr@gmail.com> * Update src/transformers/training_args.py renaming Co-authored-by:
Zach Mueller <muellerzr@gmail.com> * Fix some symbols * test * Update src/transformers/trainer_utils.py match naming patterns Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/training_args.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/trainer.py nice Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Fix reviews * Fix * Fix black --------- Co-authored-by:
Zach Mueller <muellerzr@gmail.com> Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com>
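Reading `tgs` as tokens per GPU per second (the expansion and the formula are assumed here; the commit itself does not spell them out), the opt-in metric boils down to:

```python
def tokens_per_second_per_gpu(num_tokens: int, runtime_s: float, num_gpus: int) -> float:
    # e.g. 1_000_000 tokens in 125 s on 8 GPUs -> 1000.0 tokens/s/GPU
    return num_tokens / (runtime_s * num_gpus)
```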
-
Kai authored
-
Zach Mueller authored
* Fix err * Use version check
-
- 06 Sep, 2023 3 commits
-
-
Marc Sun authored
* add new arg for gptq * add tests * add min version autogptq * fix order * skip test * fix * Update src/transformers/modeling_utils.py Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * fix style * change model path --------- Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com>
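A hedged sketch of a minimum-version gate (the floor version below is assumed for illustration, not taken from the commit):

```python
import importlib.metadata

from packaging import version

def require_min_autogptq(minimum: str = "0.4.2"):  # assumed floor for the example
    installed = version.parse(importlib.metadata.version("auto-gptq"))
    if installed < version.parse(minimum):
        raise ImportError(f"GPTQ support requires auto-gptq >= {minimum}, found {installed}")
```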
-
Matt authored
* stash commit * More OPT updates * Update src/transformers/models/opt/modeling_tf_opt.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
Lysandre Debut authored
* Fix revision propagation * Cleaner
-
- 05 Sep, 2023 13 commits
-
-
tju_skywalker authored
* fix convert megatron model too large * fix convert megatron model too large
-
Tanay Mehta authored
* add: potential fix for the Mega chunking bug in decoder-only models * add: decoder with chunking test * add: input_mask passed with input_ids
-
Sanchit Gandhi authored
* [Wav2Vec2 Conformer] Fix inference float16 * fix test * fix test more * clean pipe test
-
Sourab Mangrulkar authored
DeepSpeed resume-from-checkpoint fixes, plus support for combining the DeepSpeed optimizer with an HF scheduler (#25863) * Add support for deepspeed optimizer and HF scheduler * fix bug * fix the import * fix issue with deepspeed scheduler saving for hf optim + hf scheduler scenario * fix loading of hf scheduler when loading deepspeed checkpoint * fix import of `DeepSpeedSchedulerWrapper` * add tests * add the comment and skip the failing tests * address comment
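A hedged illustration of the newly supported combination (all values made up): the DeepSpeed config supplies the optimizer, the "scheduler" block is omitted, and the Trainer falls back to the HF scheduler selected via `lr_scheduler_type`.

```python
from transformers import TrainingArguments

ds_config = {
    "zero_optimization": {"stage": 2},
    "optimizer": {"type": "AdamW", "params": {"lr": 5e-5}},
    # no "scheduler" block: the HF scheduler below is used instead
}

args = TrainingArguments(
    output_dir="out",
    deepspeed=ds_config,         # accepts a dict or a path to a JSON file
    lr_scheduler_type="cosine",  # HF scheduler paired with the DeepSpeed optimizer
)
```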
-
raghavanone authored
* Add TFDebertaV2ForMultipleChoice * Import newer model in main init * Fix import issues * Fix copies * Add doc * Fix tests * Fix copies * Fix docstring
-
andreeahedes authored
* no_split_modules * no_split_modules * inputs_embeds+pos same device * update _no_split_modules * update _no_split_modules
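A sketch of the `inputs_embeds+pos same device` fix (hypothetical forward fragment, not the actual model code): when `device_map` splits the model across devices, the position embeddings must be moved onto the device of `inputs_embeds` before the addition.

```python
# Without the `.to(...)`, a split model can try to add tensors that live on
# different GPUs and fail with a device mismatch error.
position_embeds = self.position_embeddings(position_ids)
hidden_states = inputs_embeds + position_embeds.to(inputs_embeds.device)
```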
-
Abhilash Majumder authored
* patch with accelerate xpu * patch with accelerate xpu * formatting * fix tests * revert ruff unrelated fixes * revert ruff unrelated fixes * revert ruff unrelated fixes * fix test * review fixes * review fixes * black fixed * review commits * review commits * style fix * use pytorch_utils * revert markuplm test
-
Joao Gante authored
-
Sahel Sharify authored
This change iterates through a list of keys rather than dict items while updating the dict elements. Fixes the following error:
File "..../transformers/training_args.py", line 1544, in __post_init__
    for k, v in self.fsdp_config.items():
RuntimeError: dictionary keys changed during iteration
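Minimal reproduction and fix (config values made up): mutating a dict while iterating `.items()` raises the RuntimeError above; iterating a snapshot of the keys is safe.

```python
fsdp_config = {"MIN_NUM_PARAMS": 1_000_000, "xla": False}

# for k, v in fsdp_config.items(): ...  # mutating inside this loop raises
for k in list(fsdp_config.keys()):      # snapshot of the keys: safe to mutate
    fsdp_config[k.lower()] = fsdp_config.pop(k)
```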
-
Yih-Dar authored
* fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Kai authored
rename `doanloading` to `downloading`
-
Huazhong Ji authored
The nn.Identity backward-compatibility shim for PyTorch < 1.1.0 is no longer required, as the minimum PyTorch version we currently support is 1.10.0 (#25974)
-
Susnato Dhar authored
* Update feature_extraction_clap.py * changed all `lenght` to `length`
-
- 04 Sep, 2023 9 commits
-
-
Lysandre authored
-
Younes Belkada authored
* remove SDPA for falcon * revert previous behaviour and add warning * nit * Update src/transformers/models/falcon/modeling_falcon.py Co-authored-by:
Matt <Rocketknight1@users.noreply.github.com> * Update src/transformers/models/falcon/modeling_falcon.py --------- Co-authored-by:
Matt <Rocketknight1@users.noreply.github.com> Co-authored-by:
Lysandre Debut <hi@lysand.re>
-
Lysandre Debut authored
* Put Falcon back * Update src/transformers/models/auto/configuration_auto.py Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update test --------- Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com>
-
David Reguera authored
* Add missing type hints and consistency to `RegNet` models * Add missing type hints and consistency to `TFSamModel` * Add missing type hints to `TFSegformerDecodeHead` * Add missing type hints and consistency to `TransfoXL` family models * Add missing type hints and consistency to `TFWav2Vec2ForSequenceClassification` * Add type hints to `TFXLMModel` * Fix linter * Revert the `RegNet` type hints to be Python 3.8 compliant * Remove the redundant np.ndarray type hint.
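An illustrative signature (not the real RegNet one) showing what "Python 3.8 compliant" means here: `typing.Optional`/`Tuple` rather than the PEP 604 `X | None` unions that only parse on Python 3.10+.

```python
from typing import Optional, Tuple

import torch

def forward(
    pixel_values: torch.Tensor,
    output_hidden_states: Optional[bool] = None,  # not `bool | None` (3.10+ only)
) -> Tuple[torch.Tensor, ...]:
    ...
```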
-
Yih-Dar authored
* fix * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Lysandre Debut authored
-
Matt authored
* Add proper Falcon docs and conversion script * Autodetect the decoder architecture instead of using an arg * Update docs now that we can autodetect * Fix doc error * Add doc to toctree * Quick doc update
-
Omar Sanseviero authored
Follow up from #25599
-
Sanchit Gandhi authored
-
- 01 Sep, 2023 1 commit
-
-
Arthur authored
* some bug fixes * updates * Update code_llama.md Co-authored-by:
Omar Sanseviero <osanseviero@users.noreply.github.com> * Add co author Co-authored-by:
pcuenca <pedro@latenitesoft.com> * add a test * fixup * nits * some updates * fix copies * address comments * nits * nits * fix docstring * Apply suggestions from code review Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * update * add int for https://huggingface.co/spaces/hf-accelerate/model-memory-usage --------- Co-authored-by:
Omar Sanseviero <osanseviero@users.noreply.github.com> Co-authored-by:
pcuenca <pedro@latenitesoft.com> Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-