- 11 Jul, 2024 14 commits
-
-
Apoorv Khandelwal authored
* Change `Trainer.get_optimizer_cls_and_kwargs` to `self.`
* Make `get_optimizer_cls_and_kwargs` an instance method
* Fixing typo
* Revert `get_optimizer_cls_and_kwargs` to staticmethod
* Restore newline to trainer.py EOF
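The back-and-forth above (instance method vs. staticmethod) rests on one Python detail, sketched here with hypothetical trainer classes rather than the real `Trainer`: a `@staticmethod` is still resolved through `self`, so subclass overrides take effect without changing the call sites.

```python
# Minimal sketch (hypothetical classes, not the real Trainer): a @staticmethod
# looked up through `self` still dispatches to a subclass override.
class BaseTrainer:
    @staticmethod
    def get_optimizer_cls_and_kwargs(lr: float):
        return "AdamW", {"lr": lr}

    def describe_optimizer(self, lr: float):
        # Lookup through the instance picks up the subclass override.
        return self.get_optimizer_cls_and_kwargs(lr)


class CustomTrainer(BaseTrainer):
    @staticmethod
    def get_optimizer_cls_and_kwargs(lr: float):
        return "Adafactor", {"lr": lr / 10}


print(CustomTrainer().describe_optimizer(1e-3))  # ('Adafactor', {'lr': 0.0001})
```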
-
t11s authored
fix(SigLip): remove spurious exclusion of first vision output token in classifier
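For context, the "spurious exclusion" is CLS-style slicing applied to a model that has no class token; a rough before/after sketch of the pooling (illustrative tensors, not the actual SigLIP classifier code):

```python
# SigLIP has no [CLS] token, so the classifier head should pool over every
# patch embedding instead of skipping the first one as ViT-style heads do.
import torch

patch_embeddings = torch.randn(2, 196, 768)         # (batch, patches, hidden)
pooled_buggy = patch_embeddings[:, 1:, :].mean(1)    # silently drops a real patch
pooled_fixed = patch_embeddings.mean(1)              # pools over all 196 patches
```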
-
Joao Gante authored
fix sliding cache
-
Arthur authored
* dumb commit
* nit
* update
* something like this
* unpack in modeling utils
* safe import
* oups
* update
* nits
* diff convert gemma
* update
* start propagating
* update other modeling code as well
* update for sliding window models
* nits
* more init cleanups
* styling
* fixup
* noice
* pass fixup
* typo typing_extension -> typing_extensions
* torch.nn.functionnal -> torch.nn.functional
* add to import structure
* unpack
* simplify a bit more for this first version
* nut
* update
* update
* nit
* ease the import of `Unpack`
* remove useless `use_sliding_window`
* no qua please
* protect import?
* style
* [run-slow]
* [run slow] llama,gemma,mistral,mixtral
* remove extra kwargs
* fix llama
* address review comments
* apply diff_model_converter to modeling_gemma.py
* remove cache_position 1
* remove cache_position 2
* some cleaning
* refactor gemma2 as well
* apply review comments
* rename file to modeling_flash_attention_utils.py
* siglip refactor
* remove dead code
* is the hub down?
* still down?
* fix siglip
* fix gemma2
* fatal: Could not read from remote repository.
* fix typo in softcap implem
* flaky
* Failed: Timeout >120.0s
---------
Co-authored-by: fxmarty <9808326+fxmarty@users.noreply.github.com>
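Two of the bullets above ("typo typing_extension -> typing_extensions" and "ease the import of `Unpack`") concern the same compatibility detail; a hedged sketch of such a guard, which may differ from the exact code in transformers:

```python
# `Unpack` only exists in the stdlib from Python 3.11 onward, so older
# interpreters fall back to typing_extensions.
import sys

if sys.version_info >= (3, 11):
    from typing import Unpack
else:
    from typing_extensions import Unpack
```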
-
fxmarty authored
* fix tests
* [test_all] check
* address review comments
-
Omar Salman authored
* Add warning message for and parameters
* Fix when the warning is raised
* Formatting changes
* Improve testing and remove duplicated warning from _fix_key
-
Sangbum Daniel Choi authored
* add gather_use_object arguments
* fix name and pass the CI test for Seq2SeqTrainer
* make style
* make it to functools
* fix typo
* add accelerate version:
* adding warning
* Update src/transformers/trainer.py
  Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
* make style
* Update src/transformers/training_args.py
* check function move to initial part
* add test for eval_use_gather_object
* fix minor
---------
Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
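A hedged usage sketch based on the "add test for eval_use_gather_object" bullet above: the flag asks the evaluation loop to gather arbitrary (non-tensor) objects across processes, and per the same commit it requires a sufficiently recent accelerate.

```python
# Usage sketch; the argument name follows the commit message, but semantics
# and version requirements should be checked against the release notes.
from transformers import Seq2SeqTrainingArguments

args = Seq2SeqTrainingArguments(
    output_dir="out",
    eval_use_gather_object=True,
)
```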
-
Sai-Suraj-27 authored
Fixed the first argument name in a few classmethods.
-
Isotr0py authored
* add missing methods for FuyuForCausalLM
* fix a typo
* format code
* add missing tie_weights
* format code
-
Arthur authored
* Support softcapping
* strictly greater than
* update
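A standalone sketch of tanh-based logit soft-capping as used by Gemma-2-style models (not the library's implementation): values are squashed smoothly into (-cap, cap) rather than hard-clipped.

```python
import torch

def softcap(logits: torch.Tensor, cap: float) -> torch.Tensor:
    # Smoothly bounds logits to the open interval (-cap, cap).
    return cap * torch.tanh(logits / cap)

print(softcap(torch.tensor([-500.0, 0.0, 500.0]), 50.0))
# ~ tensor([-50., 0., 50.]) -- large logits saturate near +/- cap
```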
-
Arthur authored
* preserve the order
* oups
* oups
* nit
* trick
* fix issues
-
Raushan Turganbay authored
* accept kwargs in processors
* return unused kwargs
* fix tests
* typo
* update the other way
-
turboderp authored
* HybridCache: Flip order of alternating global-attn/sliding-attn layers
* HybridCache: Read sliding_window argument from cache_kwargs
* Gemma2Model: Flip order of alternating global-attn/sliding-attn layers
* Code formatting
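Purely illustrative sketch (hypothetical helper names, not the HybridCache code) of the two ideas in the bullets above: an alternating sliding-window/global layer pattern, and taking the window size from `cache_kwargs` instead of a hard-coded default.

```python
def uses_sliding_window(layer_idx: int) -> bool:
    # Every other layer uses sliding-window attention; which parity counts
    # as "sliding" is exactly what the fix flips, so this choice is arbitrary.
    return layer_idx % 2 == 1

def effective_window(cache_kwargs: dict, default: int = 4096) -> int:
    # Prefer the caller-provided window size over the default.
    return cache_kwargs.get("sliding_window", default)

print([uses_sliding_window(i) for i in range(6)])   # [False, True, False, ...]
print(effective_window({"sliding_window": 1024}))   # 1024
```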
-
Raushan Turganbay authored
* update docs
* one more change
-
- 10 Jul, 2024 9 commits
-
-
haikuoxin authored
fix bug: https://github.com/huggingface/transformers/issues/31852
-
Yih-Dar authored
* fix
* [test_all] check before merge
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
NielsRogge authored
* Add resources * Address comments
-
Marc Sun authored
Save sharded checkpoint in Trainer
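The Trainer change itself is internal, but a minimal sketch of what a sharded checkpoint looks like through the public `save_pretrained` API may help (the tiny test checkpoint name is only an example):

```python
from transformers import AutoModel

model = AutoModel.from_pretrained("hf-internal-testing/tiny-random-bert")
# With a small enough max_shard_size, the output directory contains several
# weight shards plus an index JSON mapping parameter names to shards.
model.save_pretrained("out", max_shard_size="200KB")
```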
-
Sai-Suraj-27 authored
Removed duplicate field definitions in classes.
-
Yih-Dar authored
* Revert "Revert "Fix `_init_weights` for `ResNetPreTrainedModel`" (#31868)" This reverts commit b45dd5de . * fix * [test_all] check * fix * [test_all] check * fix * [test_all] check * fix * [test_all] check * fix * [test_all] check * fix * [test_all] check * fix * [test_all] check * fix * [test_all] check * fix * [test_all] check * fix * [test_all] check * fix * [test_all] check * fix * [test_all] check --------- Co-authored-by:
ydshieh <ydshieh@users.noreply.github.com>
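A hedged sketch of the kind of `_init_weights` a convolutional backbone such as ResNet typically uses (illustrative; the exact initialization in transformers may differ):

```python
import torch.nn as nn

def _init_weights(module: nn.Module) -> None:
    if isinstance(module, nn.Conv2d):
        nn.init.kaiming_normal_(module.weight, mode="fan_out", nonlinearity="relu")
    elif isinstance(module, (nn.BatchNorm2d, nn.GroupNorm)):
        nn.init.constant_(module.weight, 1)
        nn.init.constant_(module.bias, 0)

# Apply recursively to every submodule of a small example network.
nn.Sequential(nn.Conv2d(3, 8, 3), nn.BatchNorm2d(8)).apply(_init_weights)
```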
-
Noah Young authored
fix data split file type checks
-
yukionfire authored
-
Raushan Turganbay authored
* add conversion for interleave llava
* remove debug lines
* remove unused imports
* Update src/transformers/models/llava/convert_llava_weights_to_hf.py
  Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* small changes + docs
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
- 09 Jul, 2024 15 commits
-
-
Yun Dai authored
* add warning when using with FSDP full shard
* fix style
* Update src/transformers/training_args.py
  Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/training_args.py
  Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* add hybrid shard warn
* fix style
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
dependabot[bot] authored
Bump certifi in /examples/research_projects/visual_bert
Bumps [certifi](https://github.com/certifi/python-certifi) from 2023.7.22 to 2024.7.4.
- [Commits](https://github.com/certifi/python-certifi/compare/2023.07.22...2024.07.04)
---
updated-dependencies:
- dependency-name: certifi
  dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
-
Mauricio Villegas authored
Update modeling_utils.py
Add return type annotation to `PreTrainedModel.from_pretrained`
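One common way to express such an annotation, shown as a sketch rather than the exact signature the commit added: bind a TypeVar to the class so that `SomeModel.from_pretrained(...)` is typed as returning `SomeModel`.

```python
from typing import Any, Type, TypeVar

T = TypeVar("T", bound="PreTrainedModel")

class PreTrainedModel:  # stand-in for the real class, for illustration only
    @classmethod
    def from_pretrained(cls: Type[T], name_or_path: str, **kwargs: Any) -> T:
        ...
```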
-
dependabot[bot] authored
Bump zipp in /examples/research_projects/decision_transformer
Bumps [zipp](https://github.com/jaraco/zipp) from 3.7.0 to 3.19.1.
- [Release notes](https://github.com/jaraco/zipp/releases)
- [Changelog](https://github.com/jaraco/zipp/blob/main/NEWS.rst)
- [Commits](https://github.com/jaraco/zipp/compare/v3.7.0...v3.19.1)
---
updated-dependencies:
- dependency-name: zipp
  dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
-
Merve Noyan authored
---------
Co-authored-by: Merve Noyan <mervenoyan@Merve-MacBook-Pro.local>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
-
Yih-Dar authored
* init
* test
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Yung-Sung Chuang authored
Co-authored-by: Joao Gante <joao@huggingface.co>
-
chenk authored
Signed-off-by: chenk <hen.keinan@gmail.com>
-
Joao Gante authored
fix test
-
kallewoof authored
-
hatti authored
remove duplicate words
-
NielsRogge authored
Add model
-
fxmarty authored
only test input_embeds, not decoder_input_embeds
-
Raushan Turganbay authored
* deprecate `vocab_size` in other two VLMs
* Update src/transformers/models/fuyu/configuration_fuyu.py
  Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* deprecate until 4.44
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
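A hedged sketch (hypothetical config class, not the actual Fuyu code) of how a deprecated config attribute is usually kept alive until a removal version such as the 4.44 mentioned above:

```python
import warnings

class ExampleVLMConfig:
    def __init__(self, text_vocab_size: int = 32000):
        self._vocab_size = text_vocab_size

    @property
    def vocab_size(self) -> int:
        # Emit a deprecation warning but keep the old attribute working.
        warnings.warn(
            "`vocab_size` is deprecated and will be removed in v4.44; "
            "read it from the text config instead.",
            FutureWarning,
        )
        return self._vocab_size
```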
-
- 08 Jul, 2024 2 commits
-
-
Joao Gante authored
* enable strict signature
* this should not have been deleted
* recurrent_gemma too
-
André Storhaug authored
* Fix wrong accelerator device setup when using MPS
* More robust TrainingArguments MPS handling
* Update training_args.py
* Cleanup
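A hedged sketch of the kind of robustness the fix is after (not the actual TrainingArguments logic): check both that PyTorch was built with MPS support and that the backend is currently usable before selecting the device.

```python
import torch

def pick_device() -> torch.device:
    # Prefer Apple-silicon MPS only when it is both built and available.
    if torch.backends.mps.is_built() and torch.backends.mps.is_available():
        return torch.device("mps")
    if torch.cuda.is_available():
        return torch.device("cuda")
    return torch.device("cpu")

print(pick_device())
```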
-