- 19 Jul, 2024 1 commit
Raushan Turganbay authored
fixes
- 18 Jul, 2024 1 commit
Raushan Turganbay authored
* fix merging
* make chameleon conditional
- 17 Jul, 2024 1 commit
Raushan Turganbay authored
* Chameleon model integration Co-authored-by: Jacob Kahn <jacobkahn1@gmail.com> Co-authored-by: Leonid Shamis <leonid.shamis@gmail.com>
* fix 7B, again. mask away image tokens
* Apply suggestions from code review Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* remove pretrained_config_map
* make fixup passing up to utils/check_config_docstrings.py; vqgan moved to the modeling file
* remove tokenizer (use llama's); remove codechameleon tests
* a few copied from statements and minor changes
* copied from in ChameleonModel
* some copies in ChameleonForCausalLM
* a few more copies
* VQModel moved to ChameleonModel (as opposed to being in the processor)
* ChameleonProcessor ready
* Fix chameleon weights convert
* update conversion script
* clean-up processing
* update modeling a bit
* update
* update (throws error...)
* correct conversion ready
* fix tests
* fix docs
* docs
* ve swin norm
* fix device for vocab map
* add normalization
* update
* update script with rope rotations
* final fix on model conversion
* add slow tests
* more info in docs
* fix repo consistency tests
* fix repo tests
* fix-copies
* hope this will make CI happy
* fix for 30b model
* Update docs/source/en/index.md Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update docs/source/en/model_doc/chameleon.md Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/chameleon/modeling_chameleon.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update docs/source/en/model_doc/chameleon.md Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update docs/source/en/model_doc/chameleon.md Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update docs/source/en/model_doc/chameleon.md Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update docs/source/en/model_doc/chameleon.md Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/auto/configuration_auto.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/chameleon/image_processing_chameleon.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/chameleon/image_processing_chameleon.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/chameleon/image_processing_chameleon.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/chameleon/image_processing_chameleon.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/chameleon/modeling_chameleon.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/chameleon/processing_chameleon.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/chameleon/processing_chameleon.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update tests/models/chameleon/test_modeling_chameleon.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update tests/models/chameleon/test_modeling_chameleon.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update tests/models/chameleon/test_modeling_chameleon.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* address comments
* remove assertion in conversion script
* add image processor test
* not copied
* port changes for qk layernorm
* fix-copies
* read token decorator for tests
* [run-slow] chameleon
* one more read-token
* address some comments
* qk norm changes
* tests and repo check
* moved rope permutations to conversion, YAY!
* fix past kv check
* docs
* layernorm done!
* let's be consistent in naming
* fix slow tests
* weird thing with slow CI, but let's see
* once more try
* remove past-kv as tuple following llama
* ignore
* style
---------
Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>
Co-authored-by: ArthurZucker <arthur.zucker@gmail.com>
Co-authored-by: jacobkahn <jacobkahn1@gmail.com>
Co-authored-by: Leonid Shamis <leonid.shamis@gmail.com>
Co-authored-by: Leonid Shamis <lshamis@meta.com>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Joao Gante <joao@huggingface.co>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
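A hedged usage sketch of the model this PR adds, not code from the PR itself. The checkpoint id and the <image> placeholder follow the Chameleon docs, and the class name reflects the 18 Jul "make chameleon conditional" rename above:

    from PIL import Image
    from transformers import ChameleonForConditionalGeneration, ChameleonProcessor

    processor = ChameleonProcessor.from_pretrained("facebook/chameleon-7b")
    model = ChameleonForConditionalGeneration.from_pretrained("facebook/chameleon-7b")

    image = Image.open("example.png")            # placeholder image path
    prompt = "What do you see here?<image>"      # <image> marks where image tokens go
    inputs = processor(text=prompt, images=image, return_tensors="pt")
    out = model.generate(**inputs, max_new_tokens=40)
    print(processor.decode(out[0], skip_special_tokens=True))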
- 14 Jul, 2024 1 commit
Joao Gante authored
* tmp commit
* shorter
* nit
* explicit kwargs
* propagate changes
* mass propagation with a few manual touches (let's see how CI behaves)
* fix cacheless case
* Update src/transformers/generation/utils.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* make fixup
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
- 11 Jul, 2024 1 commit
Arthur authored
* dumb commit
* nit
* update
* something like this
* unpack in modeling utils
* safe import
* oups
* update
* nits
* diff convert gemma
* update
* start propagating
* update other modeling code as well
* update for sliding window models
* nits
* more init cleanups
* styling
* fixup
* noice
* pass fixup
* typo typing_extension -> typing_extensions
* torch.nn.functionnal -> torch.nn.functional
* add to import structure
* unpack
* simplify a bit more for this first version
* nit
* update
* update
* nit
* ease the import of `Unpack`
* remove useless `use_sliding_window`
* no qua please
* protect import?
* style
* [run-slow]
* [run slow] llama,gemma,mistral,mixtral
* remove extra kwargs
* fix llama
* address review comments
* apply diff_model_converter to modeling_gemma.py
* remove cache_position 1
* remove cache_position 2
* some cleaning
* refactor gemma2 as well
* apply review comments
* rename file to modeling_flash_attention_utils.py
* siglip refactor
* remove dead code
* is the hub down?
* still down?
* fix siglip
* fix gemma2
* fatal: Could not read from remote repository.
* fix typo in softcap implem
* flaky
* Failed: Timeout >120.0s
---------
Co-authored-by: fxmarty <9808326+fxmarty@users.noreply.github.com>
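The `Unpack` refactor is easier to see in miniature. A hedged sketch of the pattern, assuming illustrative field names (the real TypedDict lives in the new modeling_flash_attention_utils.py and may differ):

    import torch
    from typing_extensions import TypedDict, Unpack

    class FlashAttentionKwargs(TypedDict, total=False):
        # illustrative optional keys, not the PR's exact fields
        cumulative_seqlens_q: torch.LongTensor
        cumulative_seqlens_k: torch.LongTensor
        max_length_q: int
        max_length_k: int

    def _flash_attention_forward(
        query: torch.Tensor,
        key: torch.Tensor,
        value: torch.Tensor,
        **kwargs: Unpack[FlashAttentionKwargs],  # type checkers now see the exact optional keys
    ) -> torch.Tensor:
        ...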
- 26 Jun, 2024 1 commit
Younes Belkada authored
* fix llama fsdp
* fixup
* adding FSDP tests for CPU offloading
* fixes
* fix tests
* fix tests
* add it for mixtral
* propagate the changes on other models
* Update src/transformers/models/phi/modeling_phi.py
* Delete utils/testing_scripts/fsdp_cpu_offloading.py
  Remove script - FSDP + CPU offloading is tested in the test suite
* Delete utils/testing_scripts/dummy_fsdp_config.yml
* Update + add cache_positions docstring
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
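For context, a minimal sketch of the setup the fix targets (FSDP with CPU offloading), assuming a torchrun-style launch; the model id is a placeholder and this is not the PR's deleted test script:

    import torch.distributed as dist
    from torch.distributed.fsdp import CPUOffload, FullyShardedDataParallel as FSDP
    from transformers import AutoModelForCausalLM

    dist.init_process_group("nccl")
    model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")
    # shard parameters and keep them offloaded to CPU between uses
    model = FSDP(model, cpu_offload=CPUOffload(offload_params=True))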
- 21 Jun, 2024 1 commit
Raushan Turganbay authored
* tmp
* update models
* revert utils
* delete
* Update src/transformers/models/dbrx/modeling_dbrx.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* modify warning msg
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
- 18 Jun, 2024 1 commit
Kevin Hu authored
* Update modeling_qwen2.py
* Fix llama
* More fixes
- 05 Jun, 2024 1 commit
Cyril Vallez authored
* Fix contrastive_search for new cache structure, and improve performance by removing inefficient torch.stack(torch.split(x, top_k, dim=0))
* Fix _contrastive_search for non-standard cache using ellipsis slicing
* Fix all outputs.logits memory leaks for all decoding strategies!
* Fix small error in _contrastive_search()
* Make all necessary change and revert for the new class
* Apply coding style
* Remove pipes in type hints for compatibility
* correct type hint
* apply style
* Use DynamicCache by default and solve conflicts
* Fix rebase issues
* Add `_supports_dynamic_cache_class` in models for models that support DynamicCache but not other caches to make DynamicCache the default for more models
* Create generation config to return legacy format by default, or to choose not to
* style
* Fix case when use_cache is False
* Remove default DynamicCache in assisted_decoding if assistant_model does not support it + fix _seen_tokens when cropping cache
* Update prepare_inputs_for_generation() for case with empty DynamicCache
* Correct return of args in _assisted_decoding
* Remove EfficientDynamicCache as it is no longer needed
* Correct mistake in generation config
* Move cache logic of assisted decoding to AssistedCandidateGenerator.__init__
* change DynamicCache function names from "split" to "batch_split" for readability + apply coding style
* Remove `_supports_dynamic_cache_class` attribute after rebase
* Correct missing line lost in conflict resolution during rebasing
* Add special case for Jamba
* Fix jamba test
* Coding style
* coding style
* Correct missing import in rebasing
* Simplify _validate_model_kwargs based on removal of _supports_dynamic_cache attribute
* Simplify code paths in _contrastive_search
* coding style
* Update docstrings of cache methods
* Update prepare_inputs_for_generation() -> past_key_values are always Cache objects
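A hedged sketch of the Cache-first flow this commit moves generation toward; the model id is a placeholder:

    from transformers import AutoModelForCausalLM, AutoTokenizer, DynamicCache

    tok = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")
    model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")
    inputs = tok("The capital of France is", return_tensors="pt")

    past = DynamicCache()                              # Cache object, not tuple-of-tuples
    out = model(**inputs, past_key_values=past, use_cache=True)
    legacy = out.past_key_values.to_legacy_cache()     # conversion for legacy consumers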
- 03 Jun, 2024 1 commit
Arthur authored
* fixes
* fix-copies
- 31 May, 2024 1 commit
Arthur authored
* current working example!
* commit regex and result file
* update
* nit
* push the conversion file
* oups
* roadmap and nits
* attempt diffs for 3 files
* persimmon
* nit
* add diff file that is the same as the modeling_llama.py
* fix rope nits
* updates
* updates with converted versions
* give some breathing space to the code
* delete
* update
* update
* push the actual result
* update regex patterns
* update regex patterns
* fix some issues
* fix some issues
* fix some issues
* updates
* updates
* updates
* updates
* updates
* revert changes done to llama
* updates
* update gemma
* updates
* oups
* current state
* current state
* update
* ouiiii
* nit
* clear diffs
* nit
* fixup
* update
* doc 🚀
* 🔥
* for now use gemma
* deal with comments
* style
* handle functions
* deal with assigns
* todos
* process inheritance
* keep decorators?
* deal with duplicates
* fixup
* correctly remove duplicate code
* run ruff post script
* ruff deals pretty well with imports, let's leave it to him
* ah maybe not lol
* for now remove all imports from child.
* nit
* conversion of llama
* okay
* convert starcoder2
* synch with main
* update llama diff
* updates
* https://docs.astral.sh/ruff/rules/redefined-while-unused/ fixes the imports, but needs later version of ruff
* updates
* okay actual state
* non zero exit
* update!
* revert unrelated
* remove other diff files
* updates
* cleanup
* update
* less diff!
* stash
* current updates
* updates
* No need for call
* finished finding deps
* update
* current changes
* current state
* current state
* new status
* nit
* finally
* fixes
* nits
* order is now expected
* use logger info instead of prints
* fixup
* up
* nit
* update
* nits
* update
* correct merge
* update
* update
* update
* add warning
* update caution message
* update
* better merging strategy
* copy class statements :wink
* fixups
* nits
* update
* Apply suggestions from code review Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* nits
* smaller header
* do cleanup some stuff
* even simpler header?
* fixup
* updates
* ruff
* update examples
* nit
* TODO
* state
* OUUUUUUF
* current state
* nits
* final state
* add a readme
* fixup
* remove diff llama
* fix
* nit
* dummy noy funny
* ruff format tests src utils --check
* everless diffs
* less diffs and fix test
* fixes
* naming nit?
* update converter and add supper example
* nits
* updated for function signatures
* update
* update
* add converted dummies
* autoformat
* single target assign fix
* fixup
* fix some imports
* fixes
* don't push them
* `# noqa: F841`
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
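A hedged illustration of the "diff file" idea this commit introduces: a short file declares a model as overrides on top of another, and the converter script expands it into a full modeling_*.py. The class and override below are invented examples, not the PR's actual diff files:

    # diff_gemma.py (hypothetical contents)
    from torch import nn
    from transformers.models.llama.modeling_llama import LlamaForCausalLM, LlamaMLP

    class GemmaMLP(LlamaMLP):
        def __init__(self, config):
            super().__init__(config)
            # only the delta vs. llama is written out; the rest is inherited
            self.act_fn = nn.GELU(approximate="tanh")

    class GemmaForCausalLM(LlamaForCausalLM):
        pass

    # utils/diff_model_converter.py (named in the message) then generates the fully
    # expanded modeling_gemma.py; the exact CLI invocation is not shown here.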
- 23 May, 2024 1 commit
Raushan Turganbay authored
* clean-up
* Update src/transformers/cache_utils.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/cache_utils.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/cache_utils.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* fixup
* Update tests/quantization/quanto_integration/test_quanto.py Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* Update src/transformers/generation/configuration_utils.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* more suggestions
* mapping if torch available
* run tests & add 'support_quantized' flag
* fix jamba test
* revert, will be fixed by another PR
* codestyle
* HQQ and versatile cache classes
* final update
* typo
* make tests happy
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
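A hedged sketch of the quantized-KV-cache generation API this work builds toward; the flag and config names follow the feature's documentation and may differ in this exact commit:

    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")
    model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")
    inputs = tok("Long prompts get cheaper when the KV cache is quantized.", return_tensors="pt")

    out = model.generate(
        **inputs,
        max_new_tokens=32,
        cache_implementation="quantized",                 # int-quantized KV cache
        cache_config={"backend": "quanto", "nbits": 4},   # quanto is one supported backend
    )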
- 20 May, 2024 3 commits
Longjie Zheng authored
* first version
* fix sliding window
* fix style
* add sliding window cache
* fix style
* address comments
* fix test
* fix style
* move sliding window check inside cache init
* revert changes on irrelevant files & add comment on SlidingWindowCache
* address comments & fix style
* fix style
* update causal mask
* [run-slow] mistral
* [run-slow] mistral
* [run-slow] mistral
* [run-slow] mistral
* [run-slow] mistral
* [run-slow] llama
* [run-slow] mistral
* [run-slow] mistral
* [run-slow] mistral
* revert CI from a10 to t4
* wrap up
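A hedged usage sketch of the feature being added: a bounded, fixed-size cache for sliding-window models such as Mistral. The generate() flag follows later documentation of the feature:

    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-v0.1")
    model = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-v0.1")
    inputs = tok("Sliding-window attention bounds KV-cache growth:", return_tensors="pt")

    # the cache never grows past the model's sliding window, regardless of output length
    out = model.generate(**inputs, max_new_tokens=64, cache_implementation="sliding_window")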
Benjamin Warner authored
* add torch.compile dynamic support
* Add SDPA dynamic shapes compile test & improve SDPA comment
* comment consistency
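What "dynamic support" buys, in a minimal self-contained sketch (a toy module, not the PR's test):

    import torch
    from torch import nn

    model = nn.Linear(16, 16)
    compiled = torch.compile(model, dynamic=True)   # mark shapes dynamic up front
    for seq_len in (8, 32, 128):
        compiled(torch.randn(seq_len, 16))          # varying lengths without per-shape recompiles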
Joseph Enguehard authored
* Add MistralForTokenClassification
* Add tests and docs
* Add token classification for Mixtral and Qwen2
* Save llama for token classification draft
* Add token classification support for Llama, Gemma, Persimmon, StableLm and StarCoder2
* Formatting
* Add token classification support for Qwen2Moe model
* Add dropout layer to each ForTokenClassification model
* Add copied from in tests
* Update src/transformers/models/llama/modeling_llama.py Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* Propagate suggested changes
* Style
---------
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
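A hedged usage sketch of the new heads via the Auto classes; the checkpoint and label count are placeholders:

    from transformers import AutoModelForTokenClassification, AutoTokenizer

    tok = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-v0.1")
    model = AutoModelForTokenClassification.from_pretrained(
        "mistralai/Mistral-7B-v0.1", num_labels=9          # e.g. CoNLL-style NER tags
    )
    inputs = tok("Hugging Face is based in New York City", return_tensors="pt")
    logits = model(**inputs).logits                        # (batch, seq_len, num_labels)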
- 17 May, 2024 1 commit
amyeroberts authored
* Remove deprecated logic and warnings
* Add back some code that seems to be important...
* Let's just add all the nllb stuff back; removing it is a bit more involved
* Remove kwargs
* Remove more kwargs
- 16 May, 2024 1 commit
Joao Gante authored
* jamba cache
* new flag
* generate exception
- 15 May, 2024 1 commit
Edoardo Cetin authored
* Fix llama model forward function with output_attentions=True, same-length encoded sequence.
* Fix style
* propagate fix to modeling_cohere, gemma, dbrx, and olmo (which copy the same sdpa masking logic from llama)
* Fix style
* ignore unnecessary sdpa mask converter when output_attentions=True
* add tests checking sdpa and eager outputs match when output_attentions=True
* Split if statements in two lines Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Fix formatting
* Add fix to new jetmoe model
* Add missing output_attentions argument to jetmoe mask creation
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
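The case being fixed, as a hedged sketch: SDPA cannot return attention weights, so requesting them has to take the eager path while still matching SDPA's hidden-state outputs. The model id is a placeholder:

    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")
    model = AutoModelForCausalLM.from_pretrained(
        "meta-llama/Llama-2-7b-hf", attn_implementation="sdpa"
    )
    inputs = tok("hello there", return_tensors="pt")

    out = model(**inputs, output_attentions=True)   # falls back to eager attention internally
    attn = out.attentions                           # per-layer (batch, heads, q_len, kv_len)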
- 13 May, 2024 1 commit
Poedator authored
* 4d mask fixes
* Update custom 4D mask logic
* test moved to mixin
* extra tests 4d mask
* upd 4d mask and StaticCache handling
* added Mask4DTestHard to mistral tests
* post-rebase fixes
* test fixes for StaticCache
* make fix-copies
* upd 1 after #30476
* fix common tests
* rm elif attention_mask.dim() == 4:
* tests combined, fixed, mixtral supported
* bigbird style chg reverted
* rm if attention_mask.dim() == 2
* modeling_llama formatting chg
---------
Co-authored-by: Joao Gante <joao@huggingface.co>
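A hedged sketch of the custom 4D attention masks these fixes harden. Per the modeling code of the period, a 4D mask is taken as already inverted, i.e. additive, with 0.0 where attention is allowed and -inf where it is masked:

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")
    model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")
    inputs = tok("custom masks allow packed sequences", return_tensors="pt")

    q_len = inputs.input_ids.shape[1]
    # (batch, 1, query_len, kv_len): a causal mask built by hand
    mask_4d = torch.triu(torch.full((1, 1, q_len, q_len), float("-inf")), diagonal=1)
    out = model(inputs.input_ids, attention_mask=mask_4d)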
- 09 May, 2024 1 commit
Raushan Turganbay authored
kv_cache is no longer a model attribute
- 08 May, 2024 1 commit
Joao Gante authored
- 03 May, 2024 1 commit
Mayank Mishra authored
* add bias
* fix quality
- 02 May, 2024 1 commit
Michael Benayoun authored
- 30 Apr, 2024 1 commit
Joao Gante authored
- 29 Apr, 2024 1 commit
Benjamin Warner authored
* Reenable SDPA's FA2 during training with torch.compile
* fix Olmo's SDPA FA2 dispatching too
* update formatting
* improved SDPA comment
* formatting and explanatory comment
* is_causal if statement to one-liner
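The "one-liner" dispatch, sketched; this mirrors the pattern in the llama-family SDPA path, simplified here:

    import torch
    import torch.nn.functional as F

    def sdpa_attn(q, k, v, attention_mask=None):
        # take the flash-friendly is_causal path only when there is no explicit
        # mask and we are not decoding a single token
        is_causal = attention_mask is None and q.shape[2] > 1
        return F.scaled_dot_product_attention(
            q, k, v, attn_mask=attention_mask, is_causal=is_causal
        )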
- 22 Apr, 2024 1 commit
Arthur authored
* nit to make sure cache positions are not sliced
* fix other models
* nit
* style
- 18 Apr, 2024 2 commits
Younes Belkada authored
FIX: Fixes unexpected behaviour for Llava / LLama & AWQ Fused modules + revert #30070 at the same time (#30317)
* Update awq.py
* style
* revert felix PR
* fix
* add felix comments
- 17 Apr, 2024 1 commit
fxmarty authored
* tentatively re-enable FA2 + SDPA
* better comment
* _ignore_causal_mask_sdpa as staticmethod
* type hints
* use past_seen_tokens instead
* enable copied from for sdpa
* ruff
* llama simplifications on review
* remove unnecessary self.is_causal check
* fix copies
* cleaning
* precise message
* better doc
* add test
* simplify
* Update src/transformers/models/llama/modeling_llama.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/models/llama/modeling_llama.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/models/llama/modeling_llama.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* style
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
- 05 Apr, 2024 1 commit
Michael Benayoun authored
* [WIP] fix fx
* [WIP] fix fx
* [WIP] fix fx
* [WIP] fix fx
* [WIP] fix fx
* Apply changes to other models
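For context, "fx" here is transformers' torch.fx symbolic-tracing support, which these fixes keep working for llama-family models. A hedged sketch of the entry point; the model id is a placeholder:

    from transformers import AutoModelForCausalLM
    from transformers.utils.fx import symbolic_trace

    model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")
    traced = symbolic_trace(model, input_names=["input_ids", "attention_mask"])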
- 30 Mar, 2024 1 commit
TechxGenus authored
fix awq quant
- 28 Mar, 2024 1 commit
Arthur authored
* fix bc?
* nit
- 21 Mar, 2024 1 commit
Joao Gante authored
* always convert the mask
* rebase and fix copies
- 20 Mar, 2024 1 commit
Arthur authored
* attempt to fix
* the actual fix that works with compilation!
* this?
* temporary update
* nit?
* dispatch to memory efficient?
* update both models that have static cache support
* fix copies fix compile
* make sure fix
* fix cohere and gemma
* fix beams?
* nit
* slipped through the cracks
* nit
* nits
* update
* fix-copies
* skip failing tests
* nits
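A hedged sketch of the static-cache plus torch.compile recipe these fixes target; the gemma checkpoint is a placeholder and the flags follow the docs of the period:

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained("google/gemma-2b")
    model = AutoModelForCausalLM.from_pretrained("google/gemma-2b")
    inputs = tok("Compile-friendly decoding:", return_tensors="pt")

    model.generation_config.cache_implementation = "static"   # fixed-shape KV cache
    model.forward = torch.compile(model.forward, mode="reduce-overhead", fullgraph=True)
    out = model.generate(**inputs, max_new_tokens=32)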
- 19 Mar, 2024 1 commit
Joao Gante authored
* partial 4d masks
* Apply suggestions from code review Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
- 14 Mar, 2024 1 commit
Joao Gante authored
- 13 Mar, 2024 1 commit
Joao Gante authored
- 08 Mar, 2024 1 commit
liangjs authored
* fix stablelm dropout argument type error
* fix docs of _flash_attention_forward
* fix all docs of _flash_attention_forward
* fix docs of _flash_attention_forward in starcoder2
---------
Co-authored-by: oliang <oliang@tencent.com>
- 07 Mar, 2024 1 commit
Joao Gante authored
- 06 Mar, 2024 1 commit
Park Jun authored
* Fix: Disable torch.autocast in RotaryEmbedding of Gemma and LLaMa for MPS devices
* Update src/transformers/models/gemma/modeling_gemma.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update llama and gemma rope to use cpu on mps device
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
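A hedged sketch of the workaround described: build the rotary-embedding tables in float32 on CPU, outside any autocast context, then move them to the mps device. The dimensions below are illustrative, not the models' configs:

    import torch

    device = "mps" if torch.backends.mps.is_available() else "cpu"
    dim, max_pos, base = 64, 128, 10000.0
    inv_freq = 1.0 / (base ** (torch.arange(0, dim, 2, dtype=torch.float32) / dim))
    t = torch.arange(max_pos, dtype=torch.float32)       # positions, computed on CPU on purpose
    freqs = torch.outer(t, inv_freq)
    cos, sin = freqs.cos().to(device), freqs.sin().to(device)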