- 26 Jul, 2024 5 commits
-
-
Sai-Suraj-27 authored
* Refactored to remove unnecessary object base class. * small fix.
-
João Nadkarni authored
* don't log base model architecture in wandb if log model is false * Update src/transformers/integrations/integration_utils.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * convert log model setting into an enum * fix formatting --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
Raushan Turganbay authored
* fix resize when deepspeed * deepspeed uses new embeds * we needed this
-
Raushan Turganbay authored
* llava w/o images * tests
-
Raushan Turganbay authored
* fix * move changes to prompt lookup * add test * set eos in assistant model * style * fix flakiness * changes for new `main` * Update tests/generation/test_utils.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update tests/generation/test_utils.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * add comment to explain --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
- 25 Jul, 2024 9 commits
-
-
Pavel Iakubovskii authored
Fix code snippet for grounding-dino
-
jrhe authored
Allow a specific microphone to be used by the ffmpeg audio pipeline utility functions. Default to using the currently active microphone on Mac (#31846) * use currently active microphone on mac for ffmpeg_microphone * Allow ffmpeg_microphone device to be specified Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
Huazhong Ji authored
* translate philosophy.md to chinese * add the missing link
-
Yih-Dar authored
* fix * [test_all] trigger full CI --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Kashif Rasul authored
fix E721 warnings
-
Kashif Rasul authored
set _supports_param_buffer_assignment to False
-
Austin authored
-
Huazhong Ji authored
remove unnecessary guard code related to PyTorch versions 1.4.2 ~ 1.7.0
-
Sanchit Gandhi authored
* [whisper] fix short-form output type * add test * make style * update long-form tests * fixes * last fix * finalise test
-
- 24 Jul, 2024 11 commits
-
-
Sai-Suraj-27 authored
Replaced deprecated unittest method with the correct one.
-
Matt authored
* No more default chat templates * Add the template to the GPT-SW3 tests since it's not available by default now * Fix GPT2 test * Fix Bloom test * Fix Bloom test * Remove default templates again
-
Penut Chen authored
* support gguf fp16 * support gguf bf16 with pytorch * add gguf f16 test * remove bf16
-
Marc Sun authored
* Fix float8_e4m3fn in modeling_utils * style * fix * comment
-
Raushan Turganbay authored
fix resize when deepspeed
-
Arthur authored
* let's not warn when someone is running a forward without cache + self.training * more models * fixup
-
Joao Gante authored
* relaxed rope check * let's also accept rope_type=None, defaulting to the original implementation * type and rope_type can coexist
-
amyeroberts authored
Remove conversation pipeline tests
-
Dr. Artificial曾小健 authored
* Update qwen2.md outdated description * Update qwen2.md amended * Update qwen2.md Update * Update qwen2.md fix wrong version code, now good to go
-
조준래 authored
fix: default value reflects the runtime environment variables rather than the ones present at import time. (#32153) * fix: default value reflects the runtime environment variables rather than the ones present at import time. * Fix: Change `deterministic` to None by default; use env var if None
-
Rohit Dwivedula authored
* adds: extra_repr() to MambaRMSNorm to include the hidden size of the layer * style fix with ruff:
-
- 23 Jul, 2024 15 commits
-
-
Fanli Lin authored
fix
-
Sai-Suraj-27 authored
Fixed an if condition always evaluating to true.
-
Joao Gante authored
-
Lysandre authored
Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com>
-
Lysandre authored
-
Sai-Suraj-27 authored
* Updated ruff version and fixed the required code according to the latest version. * Updated ruff version and fixed the required code according to the latest version. * Added noqa directive to ignore 1 error shown by ruff
-
RhuiDih authored
* add DataCollatorBatchFlattening * Update data_collator.py * change name * new FA2 flow if position_ids is provided * add comments * minor fix * minor fix data collator * add test cases for models * add test case for data collator * remove extra code * formatting for ruff check and check_repo.py * ruff format ruff format tests src utils * custom_init_isort.py
-
Deep Gandhi authored
Update integration_utils.py Added additional kwarg
-
Alvaro Moran authored
* feat(cache): StaticCache uses index_copy_ to avoid useless copy Using index_copy_ allows for explicit in-place change of the tensor. Some backends (XLA) will otherwise copy the tensor, making the code slower and using more memory. Proposed implementation will end up using less memory and on XLA will result in less compilation, but the change is also quite generic, making no change whatsoever on CUDA or CPU backend. * feat(cache): SlidingWindowCache uses index_copy_ to avoid useless copy Applying the same change done in StaticCache. * fix(cache): fallback of index_copy_ when not implemented * fix(cache): in index_copy_ ensure tensors are on same device * [run slow] llama * fix(cache): add move of cache_position to same device in SlidingWindowCache * Revert "[run slow] llama" This reverts commit 02608dd14253ccd464e31c108e0cd94364f0e8b9.
-
amyeroberts authored
-
Sanchit Gandhi authored
Revert "Incorrect Whisper long-form decoding timestamps (#32003)" This reverts commit cd48553f.
-
Amit Garg authored
* renamed phi3 rope_scaling type * fixed trailing whitespaces * fixed test * added warning * fixed format
-
Alexandre TL authored
* Update README.md * tests: forward ok * backward test done * done testing * removed check. scripts * Update README.md * added use_mambapy arg * fixed typo in warning * protected imports w/ mambapy package * delete pscan.py + raise rather than assert * Update import_utils.py * fix whitespaces and unused import * trailing whitespace + import block unformatted * Update modeling_mamba.py * transpose before pscan * shape comment * ran make style * use_mambapy=False by default Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * ran make fix-copies --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
-
Merve Noyan authored
--------- Co-authored-by: Merve Noyan <mervenoyan@Merve-MacBook-Pro.local>
-
Cyril Vallez authored
Add the lru_cache for speed
-