- 25 Apr, 2024 4 commits
-
-
Arthur authored
* fix codellama conversion * nit
-
Younes Belkada authored
Update ssh-runner.yml
-
Younes Belkada authored
Update push-important-models.yml
-
Younes Belkada authored
* add SSH into our runners workflow * fix * fix * fix * use our previous approaches * forward contrib credits from discussions --------- Co-authored-by: Yih-Dar <ydshieh@users.noreply.github.com>
-
- 24 Apr, 2024 19 commits
-
-
Yih-Dar authored
* better names * run better names * update * update --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Zach Mueller authored
* Non blocking support * Check for optimization * Doc
-
Zach Mueller authored
* Check removing flag for torch * LLM oops * Getting there... * More discoveries * Change * Clean up and prettify * Logic check * Not
-
jeffhataws authored
save_safetensors=True has been the default since release 4.35.0, which then required the TPU hotfix https://github.com/huggingface/transformers/pull/27799 (issue https://github.com/huggingface/transformers/issues/27578). However, when save_safetensors is set to False (compatibility mode), moving the model to CPU generates too many graphs during checkpointing (https://github.com/huggingface/transformers/issues/28438). This PR disables moving the model to CPU when save_safetensors=False.
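The guard described above can be sketched in a few lines (a hypothetical simplification of the Trainer's XLA checkpoint path, not the actual implementation; the function name and flags are stand-ins):

```python
def should_move_to_cpu(save_safetensors: bool, is_torch_xla: bool) -> bool:
    """Sketch of the fix: on XLA/TPU, copying the model to CPU before each
    checkpoint triggers graph regeneration, so only pay that cost when
    safetensors serialization actually needs the CPU copy."""
    if not is_torch_xla:
        # Non-XLA backends are unaffected; no special handling needed.
        return False
    # With save_safetensors=False (compatibility mode), skip the CPU move.
    return save_safetensors
```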
-
Arthur authored
update most of decision transformers research project
-
Yih-Dar authored
fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Gustavo de Rosa authored
* chore(root): Initial commit of Phi-3 files. * fix(root): Fixes Phi-3 missing on readme. * fix(root): Ensures files are consistent. * fix(phi3): Fixes unit tests. * fix(tests): Fixes style of phi-3 test file. * chore(tests): Adds integration tests for Phi-3. * fix(phi3): Removes additional flash-attention usage, e.g., swiglu and rmsnorm. * fix(phi3): Fixes incorrect docstrings. * fix(phi3): Fixes docstring typos. * fix(phi3): Adds support for Su and Yarn embeddings. * fix(phi3): Improves according to the first batch of reviews. * fix(phi3): Uses up_states instead of y in Phi3MLP. * fix(phi3): Uses gemma rotary embedding to support torch.compile. * fix(phi3): Improves how rotary embedding classes are defined. * fix(phi3): Fixes inv_freq not being re-computed for extended RoPE. * fix(phi3): Adds last suggestions to modeling file. * fix(phi3): Splits inv_freq calculation in two lines.
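Several of the bullets above concern the extended RoPE variants (Su/YaRN) and recomputing `inv_freq`. As background, the standard rotary-embedding inverse frequencies look like this (textbook RoPE formula as a sketch, not Phi-3's actual modeling code):

```python
import math

def rope_inv_freq(dim: int, base: float = 10000.0) -> list:
    # Standard RoPE inverse frequencies: 1 / base^(2i / dim) for each even
    # channel pair. Extended-context variants (Su/YaRN) rescale these per
    # frequency band, which is why inv_freq must be re-computed when the
    # context window is extended rather than cached once at init.
    return [1.0 / (base ** ((2 * i) / dim)) for i in range(dim // 2)]
```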
-
Yih-Dar authored
* trigger * remove the last job --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Eduardo Pacheco authored
* Fixed main train issues * Added loss test * Update src/transformers/models/seggpt/modeling_seggpt.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Added missing labels arg in SegGptModel forward * Fixed typo * Added slow test to test loss calculation --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
Marc Sun authored
* fix jamba slow forward for multi-gpu * remove comm * oups * style
-
Anton Vlasjuk authored
* fix clip's/siglip's _init_weights to reflect linear layers in "for image classification" * trigger slow tests
-
Fanli Lin authored
* make device-agnostic * clean code
-
Arthur authored
* nit * nit and fmt skip * fixup * Update src/transformers/convert_slow_tokenizer.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * set to true --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
Pavel Iakubovskii authored
* Add test for square image that fails * Fix for square images * Extend test cases * Fix resizing in tests * Style fixup
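The square-image case above can be sketched as a shortest-edge resize helper (a hypothetical stand-in for the library's image-processor resizing, illustrating why a square input must map to a square output):

```python
def shortest_edge_resize(height: int, width: int, shortest_edge: int):
    # Scale so the shorter side equals `shortest_edge`, preserving aspect
    # ratio. Bugs in this area often come from special-casing
    # height == width instead of letting the general path handle it.
    if height <= width:
        scale = shortest_edge / height
    else:
        scale = shortest_edge / width
    return round(height * scale), round(width * scale)
```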
-
Arthur authored
* nuke * add co-author * add co-author * update card * fixup and fix copies to please our ci * nit fixup * super small nits * remove tokenizer_path from call to `write_model` * always safe serialize by default --------- Co-authored-by: pcuenca <pcuenca@users.noreply.github.com> Co-authored-by: xenova <xenova@users.noreply.github.com>
-
Yih-Dar authored
* You should not pass Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
-
Lysandre Debut authored
Remove mentions of models in the READMEs and link to the documentation page in which they are featured. (#30420) * READMEs * READMEs v2
-
Lysandre Debut authored
* Remove add-new-model in favor of add-new-model-like * nits
-
Lysandre Debut authored
-
- 23 Apr, 2024 16 commits
-
-
Arthur authored
* push legacy to fast as well * super strange * Update src/transformers/convert_slow_tokenizer.py * make sure we are BC * fix Llama test * nit * revert * more test * style * update * small update w.r.t tokenizers * nit * don't split * lol * add a test for `add_prefix_space=False` * fix gemma tokenizer as well * update * fix gemma * nicer failures * fixup * update * fix the example for legacy = False * use `huggyllama/llama-7b` for the PR doctest * nit * use from_slow * fix llama
-
Jiewen Tan authored
* Fix use_cache for xla fsdp * Fix linters
-
Steven Basart authored
torch.run does not exist anywhere as far as I can tell; the distributed launcher is `torchrun` (equivalently, `python -m torch.distributed.run`).
-
Matt authored
* Remove old TF port guide * repo-consistency * Remove some translations as well for consistency * Remove some translations as well for consistency
-
Yih-Dar authored
* fix * try suggestion * update --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Younes Belkada authored
Update Dockerfile
-
Pedro Cuenca authored
-
Wing Lian authored
* fix for itemsize => element_size() for torch backwards compat * improve handling of element counting * Update src/transformers/modeling_utils.py * fixup * Update src/transformers/modeling_utils.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> Co-authored-by: Younes Belkada <younesbelkada@gmail.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
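The itemsize => element_size() swap above is a backwards-compatibility fix: `dtype.itemsize` only exists in newer torch releases, while `Tensor.element_size()` has been available for a long time. A duck-typed sketch of the idea (hypothetical helper, not the actual `modeling_utils.py` code):

```python
def element_byte_size(tensor) -> int:
    # Prefer the long-standing Tensor.element_size() method; fall back to
    # the newer dtype.itemsize attribute only if the method is missing.
    if hasattr(tensor, "element_size"):
        return tensor.element_size()
    return tensor.dtype.itemsize
```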
-
Raushan Turganbay authored
* clean commit history I hope * get kv seq length correctly * PR suggestions * Update src/transformers/testing_utils.py Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com> * add comment * give gpt bigcode its own overridden method * remove code --------- Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
-
Joao Gante authored
scipy pin for jax
-
Fanli Lin authored
* add cuda flag * check for sdpa * add bitsandbytes
-
Nick Doiron authored
fix: link to HF repo tree when a file is missing
-
Russell Klopfer authored
-
Eduardo Pacheco authored
* Added cross attention support * Fixed dtypes * Fixed assumption * Moved to decoder
-
Raushan Turganbay authored
* Add inputs embeds in generation * always scale embeds * fix-copies * fix failing test * fix copies once more * remove embeds for models with scaling * second try to revert * codestyle
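The "always scale embeds" bullet above can be sketched as follows (a pure-Python stand-in for the model's embedding path during generation; names and the scaling rule are illustrative assumptions):

```python
import math

def embed_for_generation(input_ids=None, inputs_embeds=None,
                         embed=None, hidden_size=64, scale_embeds=False):
    # If the caller supplies inputs_embeds, generation should consume them
    # directly instead of re-embedding input_ids; models that scale their
    # token embeddings (e.g. by sqrt(hidden_size)) must apply the same
    # scale to user-provided embeds so both paths agree.
    if inputs_embeds is None:
        inputs_embeds = [embed(t) for t in input_ids]
    if scale_embeds:
        s = math.sqrt(hidden_size)
        inputs_embeds = [[x * s for x in row] for row in inputs_embeds]
    return inputs_embeds
```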
-
Arthur authored
-
- 22 Apr, 2024 1 commit
-
-
Steven Liu authored
* first draft * feedback * static cache snippet * feedback * feedback
-