- 19 Oct, 2023 1 commit
-
-
Younes Belkada authored
Support FA-2 + right padding + forward
-
- 18 Oct, 2023 2 commits
-
-
Younes Belkada authored
revert
-
Younes Belkada authored
* final fix for FA2 dtype
* try
* oops
* Update src/transformers/models/falcon/modeling_falcon.py
  Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* apply fix everywhere
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
-
- 13 Oct, 2023 1 commit
-
-
Younes Belkada authored
* fix fa-2 import
* nit
-
- 11 Oct, 2023 1 commit
-
-
Billy Bradley authored
In assisted decoding, pass model_kwargs to model's forward call (fix prepare_input_for_generation in all models) (#25242)

* In assisted decoding, pass model_kwargs to model's forward call
  Previously, assisted decoding would ignore any additional kwargs that it doesn't explicitly handle. This was inconsistent with other generation methods, which pass the model_kwargs through prepare_inputs_for_generation and forward the returned dict to the model's forward call.
  The prepare_inputs_for_generation method needs to be amended in all models, as previously it only kept the last input ID when a past_key_values was passed.
* Improve variable names in _extend_attention_mask
* Refactor extending token_type_ids into a function
* Replace deepcopy with copy to optimize performance
* Update new persimmon model with llama changes for assisted generation
* Update new mistral model for assisted generation with prepare_inputs_for_generation
* Update position_ids creation in falcon prepare_inputs_for_generation to support assisted generation
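A rough, hedged sketch of the `prepare_inputs_for_generation` pattern this change moves the models towards (illustrative only, not the exact code merged for #25242): slice off the prefix already covered by the cache instead of keeping only the last token, and hand back a dict that keeps the extra generation kwargs instead of dropping them.

```python
import torch


def prepare_inputs_for_generation(
    input_ids: torch.Tensor,
    past_key_values=None,
    attention_mask: torch.Tensor = None,
    **kwargs,
):
    """Illustrative sketch of the pattern described above."""
    if past_key_values is not None:
        past_length = past_key_values[0][0].shape[2]
        # Assisted decoding forwards several candidate tokens at once, so keep every
        # token not yet covered by the cache rather than only the last one.
        if input_ids.shape[1] > past_length:
            input_ids = input_ids[:, past_length:]
        else:
            input_ids = input_ids[:, -1:]

    position_ids = kwargs.get("position_ids")
    if attention_mask is not None and position_ids is None:
        # Rebuild position ids from the padding mask on the fly.
        position_ids = attention_mask.long().cumsum(-1) - 1
        position_ids.masked_fill_(attention_mask == 0, 1)
        if past_key_values is not None:
            position_ids = position_ids[:, -input_ids.shape[1]:]

    # Unlike the old behaviour, the extra kwargs (e.g. use_cache) are forwarded, not dropped.
    return {
        "input_ids": input_ids,
        "position_ids": position_ids,
        "past_key_values": past_key_values,
        "attention_mask": attention_mask,
        "use_cache": kwargs.get("use_cache"),
    }
```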
-
- 06 Oct, 2023 1 commit
-
-
fxmarty authored
* remove unnecessary unsqueeze-squeeze in llama
* correct other models
* fix
* revert gpt_neox_japanese
* fix copies
* fix test
-
- 03 Oct, 2023 1 commit
-
-
Younes Belkada authored
* add FA-2 support for mistral
* fixup
* add sliding windows
* fixing few nits
* v1 slicing cache - logits do not match
* add comment
* fix bugs
* more mem efficient
* add warning once
* add warning once
* oops
* fixup
* more comments
* copy
* add safety checker
* fixup
* Update src/transformers/models/mistral/modeling_mistral.py
  Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* copied from
* up
* raise when padding side is right
* fixup
* add doc + few minor changes
* fixup
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
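A hedged usage sketch of the two user-facing constraints called out above (half-precision weights and left padding); the checkpoint name and the `use_flash_attention_2` flag match the API around this release and may differ in later versions.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mistralai/Mistral-7B-v0.1"  # example checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id)
# FA-2 cannot recover per-token positions from right-padded batches during generation,
# hence the "raise when padding side is right" guard: pad on the left instead.
tokenizer.padding_side = "left"
tokenizer.pad_token = tokenizer.eos_token

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,   # FA-2 kernels require fp16 or bf16
    use_flash_attention_2=True,   # flag name as of this release
    device_map="auto",
)
```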
-
- 27 Sep, 2023 1 commit
-
-
Chris Bamford authored
* [Mistral] Mistral-7B-v0.1 support
* fixing names
* slightly longer test
* fixups
* not_doctested
* wrongly formatted references
* make fixuped
---------
Co-authored-by: Timothee Lacroix <t@eugen.ai>
Co-authored-by: timlacroix <t@mistral.ai>
-
- 22 Sep, 2023 2 commits
-
-
Younes Belkada authored
* v1
* oops
* working v1
* fixup
* add some TODOs
* fixup
* padding support + try with module replacement
* nit
* alternative design
* oops
* add `use_cache` support for llama
* v1 falcon
* nit
* a bit of refactor
* nit
* nits nits
* add v1 padding support falcon (even though it seemed to work before)
* nit
* falcon works
* fixup
* v1 tests
* nit
* fix generation llama flash
* update tests
* fix tests + nits
* fix copies
* fix nit
* test- padding mask
* style
* add more mem efficient support
* Update src/transformers/modeling_utils.py
  Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* fixup
* nit
* fixup
* remove it from config when saving
* fixup
* revert docstring
* add more checks
* use values
* oops
* new version
* fixup
* add same trick for falcon
* nit
* add another test
* change tests
* fix issues with GC and also falcon
* fixup
* oops
* Update src/transformers/models/falcon/modeling_falcon.py
  Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* add init_rope
* updates
* fix copies
* fixup
* fixup
* more clarification
* fixup
* right padding tests
* add docs
* add FA in docker image
* more clarifications
* add some figures
* add todo
* rectify comment
* Change to FA2
* Update docs/source/en/perf_infer_gpu_one.md
  Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* split in two lines
* change test name
* add more tests
* some clean up
* remove `rearrange` deps
* add more docs
* revert changes on dockerfile
* Revert "revert changes on dockerfile"
  This reverts commit 8d72a66b4b9b771abc3f15a9b9506b4246d62d8e.
* revert changes on dockerfile
* Apply suggestions from code review
  Co-authored-by: Lysandre Debut <hi@lysand.re>
* address some comments
* docs
* use inheritance
* Update src/transformers/testing_utils.py
  Co-authored-by: Lysandre Debut <hi@lysand.re>
* fixup
* Apply suggestions from code review
  Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/modeling_utils.py
* final comments
* clean up
* style
* add cast + warning for PEFT models
* fixup
---------
Co-authored-by: Felix Marty <9808326+fxmarty@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Lysandre Debut <hi@lysand.re>
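The "safety checker" mentioned above boils down to a set of preconditions that must hold before the FA-2 code path is enabled. A minimal sketch of that kind of guard (the function name is illustrative, not the internal helper):

```python
import importlib.util

import torch


def check_flash_attn_2_preconditions(torch_dtype: torch.dtype, device: torch.device) -> None:
    # Illustrative guards only; the real checks in transformers are more involved.
    if importlib.util.find_spec("flash_attn") is None:
        raise ImportError("Flash Attention 2 requires the `flash-attn` package to be installed.")
    if torch_dtype not in (torch.float16, torch.bfloat16):
        raise ValueError("Flash Attention 2 only supports fp16 and bf16 weights; cast the model first.")
    if device.type != "cuda":
        raise ValueError("Flash Attention 2 kernels only run on CUDA devices.")
```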
-
Yih-Dar authored
fix doc CI
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
- 18 Sep, 2023 1 commit
-
-
Sanchit Gandhi authored
fix copies
-
- 12 Sep, 2023 1 commit
-
-
Arthur authored
* initial commit
* updates
* nits
* update conversion script
* update conversion script
* use path to load
* add tips etc
* some modeling logic
* modeling update
* more nits
* nits
* normal layer norm
* update config and doc
* nits
* update doc remove unused
* update
* fix inits and stuff
* fixup
* revert wrong changes
* updates
* more nits
* add default config values to the configuration file
* fixup happy
* update
* 2 tests left
* update readmes
* more nits
* slow test and more documentation
* update readme
* fix licences
* styling
* use fast if possible when saving tokenizer
* remove todo
* remove tokenization tests
* small last nits
* Apply suggestions from code review
  Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
* nits to skip the timeout doctest
* fix integration test
* fix test
* update eos token
* update to allow fast tokenization
* styling
* fix codeLlama as well for the update post processor
* Apply suggestions from code review
  Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* add more copied from statements
* update
* doc passes doctest
* remove `# final layer norm?`
* change docstring prompt
* update
* Update README.md
  Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* don't doctest the conversion script as it requires more packages
* don't init a model in the config
* oups
* fix doctest
---------
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
- 25 Aug, 2023 1 commit
-
-
Arthur authored
* add all
* Revert "Delete .github directory"
  This reverts commit 9b0ff7b052e2b20b629a26fb13606b78a42944d1.
* make conversion script backward compatible
* fixup
* more styling
* copy to llama changes
* fix repo consistency
* nits
* document correct classes
* updates
* more fixes
* nits
* update auto mappings
* add readmes
* small updates
* llama-code replace with llama_code
* make fixup
* updates to the testing suite
* fix fast nits
* more small fixes
* fix decode
* fix template processing
* properly reset the normalizer
* nits processor
* tokenization tests pass
* styling
* last tests
* additional nits
* one test is left
* nits
  Co-authored-by: faabian <faabian@users.noreply.github.com>
* update failing test
* fixup
* remove decode infilling, users should handle it on their own after generation, padding can be a problem
* update
* make test slow and more meaningful
* fixup
* doc update
* fixup
* Apply suggestions from code review
* add kwargs doc
* tokenizer requires `requires_backend`
* type requires_backends
* CodeLlama instead of LlamaCode
* more name changes
* nits
* make doctests happy
* small pipeline nits
* last nit
* Apply suggestions from code review
  Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* update
* add codellama to toctree
---------
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
- 18 Aug, 2023 1 commit
-
-
Arthur authored
* nit
* update
* make sure use_default_system_prompt is saved
* update checkpointing
* consistency
* use_default_system_prompt for test
-
- 25 Jul, 2023 1 commit
-
-
Arthur authored
* support left padding
* nit
* Update src/transformers/models/gpt_neox/modeling_gpt_neox.py
* Update src/transformers/models/gpt_neox/modeling_gpt_neox.py
-
- 21 Jul, 2023 1 commit
-
-
Arthur authored
remove persistent tensor
-
- 19 Jul, 2023 1 commit
-
-
Younes Belkada authored
* add possibility to disable TP
* fixup
* adapt from offline discussions
-
- 18 Jul, 2023 1 commit
-
-
Arthur authored
* add llama
* add other readmes
* update padding id in readme
* add link to paper
* fix paths and tokenizer
* more nits
* styling
* fit operation in 2 lines when possible
* nits
* Apply suggestions from code review
  Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* add form
* update readme
* update readme, we don't have a default pad token
* update test and tokenization
* LLaMA instead of Llama
* nits
* add expected text
* add greedy output
* styling
* Update src/transformers/models/llama/modeling_llama.py
  Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* sequential device map
* skip relevant changes
---------
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
- 13 Jul, 2023 2 commits
-
-
Joao Gante authored
* add rope_scaling
* tmp commit
* add gptneox
* add tests
* GPTNeoX can now handle long inputs, so the pipeline test was wrong
* Update src/transformers/models/open_llama/configuration_open_llama.py
  Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* remove ntk
* remove redundant validation
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
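The `rope_scaling` option lands as a config dict with a scaling strategy and a factor. A small hedged example (the checkpoint name is only illustrative):

```python
from transformers import AutoConfig, AutoModelForCausalLM

# rope_scaling takes a "type" ("linear" or "dynamic", since NTK was removed in this PR)
# and a "factor" > 1.0 that stretches the usable context window.
config = AutoConfig.from_pretrained("meta-llama/Llama-2-7b-hf")
config.rope_scaling = {"type": "linear", "factor": 2.0}  # roughly 2x the original context
model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf", config=config)
```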
-
Liyang90 authored
* Update modeling_llama.py
  Removing unnecessary `device=device`
* fix in all occurrences of _make_causal_mask
-
- 04 Jul, 2023 1 commit
-
-
Prathik Rao authored
* open llama fp16 bug fix
* bug fix
* bug fixed
* make style
* Update modeling_llama.py
* apply formatting
* Address amy's comment
---------
Co-authored-by: Prathik Rao <prathikrao@microsoft.com@orttrainingdev8.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>
Co-authored-by: root <root@orttrainingdev8.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>
-
- 27 Jun, 2023 1 commit
-
-
Sylvain Gugger authored
* Preliminary work on some models
* Fix test load missing and make sure nonpersistent buffers are tested
* Always ignore nonpersistent buffers if in state_dict
* Treat models
* More models
* Treat remaining models
* Fix quality
* Fix tests
* Remove draft
* This test is not needed anymore
* Fix copies
* Fix last test
* Newly added models
* Fix last tests
* Address review comments
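For context, a non-persistent buffer is registered with `persistent=False`, so it never appears in `state_dict` and has to be ignored (not reported as missing) when loading. A minimal sketch, with a made-up module name:

```python
import torch
from torch import nn


class RotaryCache(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        inv_freq = 1.0 / (10000 ** (torch.arange(0, dim, 2).float() / dim))
        # persistent=False keeps the buffer out of state_dict, so checkpoints stay
        # loadable even if the cached values change between library versions.
        self.register_buffer("inv_freq", inv_freq, persistent=False)


print("inv_freq" in RotaryCache(64).state_dict())  # False
```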
-
- 22 Jun, 2023 1 commit
-
-
Younes Belkada authored
Revert "Fix gradient checkpointing + fp16 autocast for most models (#24247)" This reverts commit 285a4801.
-
- 21 Jun, 2023 1 commit
-
-
Younes Belkada authored
* fix gc bug
* continue PoC on OPT
* fixes
* :exploding_head:
* fix tests
* remove pytest.mark
* fixup
* forward contrib credits from discussions
* forward contrib credits from discussions
* reverting changes on untouched files.
---------
Co-authored-by: zhaoqf123 <zhaoqf123@users.noreply.github.com>
Co-authored-by: 7eu7d7 <7eu7d7@users.noreply.github.com>
-
- 15 Jun, 2023 1 commit
-
-
Fei Wang authored
* Fix LLaMa beam search when using parallelize
  Same issue as T5 #11717
* fix code format in modeling_llama.py
* fix format of _reorder_cache in modeling_llama.py
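The fix follows the same pattern as T5 (#11717): under model parallelism the per-layer caches can live on different devices, so the beam indices must be moved to each cached tensor's device before reordering. A hedged sketch of such a `_reorder_cache`:

```python
import torch


def _reorder_cache(past_key_values, beam_idx: torch.Tensor):
    # beam_idx is moved to the device of each past state so index_select
    # never mixes tensors from different GPUs.
    return tuple(
        tuple(
            past_state.index_select(0, beam_idx.to(past_state.device))
            for past_state in layer_past
        )
        for layer_past in past_key_values
    )
```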
-
- 13 Jun, 2023 1 commit
-
-
Sylvain Gugger authored
* First test
* Add info for all models
* style
* Repo consistency
* Fix last model and cleanup prints
* Repo consistency
* Use consistent function for detecting tied weights
-
- 12 Jun, 2023 1 commit
-
-
fxmarty authored
* fix dtype init
* fix copies
* fix fixcopies mess
* edit forward as well
* copy
-
- 08 Jun, 2023 1 commit
-
-
Serge Panev authored
* Fix typo in Llama docstrings
  Signed-off-by: Serge Panev <spanev@nvidia.com>
* Update
  Signed-off-by: Serge Panev <spanev@nvidia.com>
* make style
  Signed-off-by: Serge Panev <spanev@nvidia.com>
---------
Signed-off-by: Serge Panev <spanev@nvidia.com>
-
- 31 May, 2023 1 commit
-
-
Sylvain Gugger authored
-
- 22 May, 2023 2 commits
-
-
Tim Dettmers authored
* Fixed bug where LLaMA layer norm would change input type.
* make fix-copies
---------
Co-authored-by: younesbelkada <younesbelkada@gmail.com>
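The bug was that the LLaMA RMS norm handed fp32 activations to the next fp16/bf16 layer. A hedged sketch of the fixed pattern: normalize in fp32 for stability, then cast back to the input dtype.

```python
import torch
from torch import nn


class RMSNorm(nn.Module):
    """Illustrative RMS norm that preserves the caller's dtype."""

    def __init__(self, hidden_size: int, eps: float = 1e-6):
        super().__init__()
        self.weight = nn.Parameter(torch.ones(hidden_size))
        self.variance_epsilon = eps

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        input_dtype = hidden_states.dtype
        hidden_states = hidden_states.to(torch.float32)
        variance = hidden_states.pow(2).mean(-1, keepdim=True)
        hidden_states = hidden_states * torch.rsqrt(variance + self.variance_epsilon)
        # The fix: cast back so downstream half-precision layers see their expected dtype.
        return self.weight * hidden_states.to(input_dtype)
```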
-
zspo authored
* Fix tensor device while attention_mask is not None
* Fix tensor device while attention_mask is not None
-
- 24 Apr, 2023 1 commit
-
-
othertea authored
-
- 17 Apr, 2023 2 commits
-
-
Kunhao ZHENG authored
fix-squeeze-tuple
-
fpgaminer authored
-
- 07 Apr, 2023 1 commit
-
-
Shikhar Chauhan authored
* (feat): Move labels to the same device as logits
* Trigger CI
* Trigger CI
* Trigger CI
* (feat): Making changes for Blip2
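With `device_map="auto"` the `lm_head` (and therefore the logits) can sit on a different GPU than the labels, so the loss computation moves the labels over first. A hedged sketch of the pattern:

```python
import torch
from torch.nn import CrossEntropyLoss


def causal_lm_loss(logits: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
    # Move labels to the logits' device before the usual shift-by-one loss.
    labels = labels.to(logits.device)
    shift_logits = logits[..., :-1, :].contiguous()
    shift_labels = labels[..., 1:].contiguous()
    loss_fct = CrossEntropyLoss()
    return loss_fct(shift_logits.view(-1, shift_logits.size(-1)), shift_labels.view(-1))
```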
-
- 31 Mar, 2023 1 commit
-
-
Nicolas Patry authored
* Making sure we can use safetensors to serialize all the time.
* Expanding the tests for increased coverage.
* Update the test.
* Getting current state of affairs.
* Tentative fix.
* Fixing black version.
* Fixing the worst offenders.
* Try to modify less files.
* Fixing blip_2 (Weird solution right now).
* Fixing deta.
* Fix blip ?
* Missing extra newline.
* No deta modification.
* Adding some comments.
* Apply suggestions from code review
  Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Addressing comments.
* Addressing comments.
* creating warn_once.
* Warning_once !
---------
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
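In user terms, this work is what lets `save_pretrained` emit safetensors reliably. A small hedged example (the checkpoint name and output path are only illustrative):

```python
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("gpt2")
# safe_serialization=True writes model.safetensors instead of pytorch_model.bin;
# models with shared/tied tensors are the usual blocker, which the expanded tests cover.
model.save_pretrained("./gpt2-safetensors", safe_serialization=True)
reloaded = AutoModelForCausalLM.from_pretrained("./gpt2-safetensors")
```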
-
- 30 Mar, 2023 1 commit
-
-
Joao Gante authored
* Llama now supports max_position_embeddings
* Save config; Cosmetic edits
-
- 28 Mar, 2023 1 commit
-
-
Jeff Rasley authored
* ensure causal_mask is created directly on device
* add copy tag to opt, update bart implementation
* add device to all _make_causal_mask copies
* formatting fixes
* more manual fixes due to unlinked versions of _prepare_decoder_attention_mask
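"Created directly on device" means the mask is allocated with `device=` from the start rather than built on CPU and copied over on every forward pass. A hedged sketch of a `_make_causal_mask` in that style:

```python
import torch


def _make_causal_mask(input_ids_shape, dtype: torch.dtype, device: torch.device, past_key_values_length: int = 0):
    # Allocate the additive causal mask on the target device up front.
    bsz, tgt_len = input_ids_shape
    mask = torch.full((tgt_len, tgt_len), torch.finfo(dtype).min, device=device)
    mask_cond = torch.arange(mask.size(-1), device=device)
    mask.masked_fill_(mask_cond < (mask_cond + 1).view(mask.size(-1), 1), 0)
    mask = mask.to(dtype)
    if past_key_values_length > 0:
        # Cached positions are always visible, so prepend zeros for them.
        mask = torch.cat(
            [torch.zeros(tgt_len, past_key_values_length, dtype=dtype, device=device), mask], dim=-1
        )
    return mask[None, None, :, :].expand(bsz, 1, tgt_len, tgt_len + past_key_values_length)
```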
-
- 27 Mar, 2023 2 commits
-
-
Joao Gante authored
-
кѳѳsнī authored
balanced 8bit memory
-