- 01 Feb, 2024 1 commit
-
-
Steven Liu authored
* backbones * fix path * fix paths * fix code snippet * fix links
-
- 25 Jan, 2024 1 commit
-
-
Merve Noyan authored
Update backbones.md
-
- 24 Jan, 2024 1 commit
-
-
Steven Liu authored
* config * optim * pre deploy * deploy * save weights, memory, troubleshoot, non-Trainer * done
-
- 12 Jan, 2024 1 commit
-
-
Joao Gante authored
-
- 02 Jan, 2024 1 commit
-
-
Dean Wyatte authored
update docs around mixing hf scheduler with deepspeed optimizer
-
- 20 Dec, 2023 1 commit
-
-
Steven Liu authored
* fsdp, debugging, gpu selection * fix hfoption * fix
-
- 18 Dec, 2023 1 commit
-
-
Steven Liu authored
* doc fix friday * deprecated objects * update not_doctested * update toctree
-
- 15 Dec, 2023 2 commits
-
-
Steven Liu authored
* mps docs * toctree
-
Steven Liu authored
* first draft * add to toctree * edits * feedback
-
- 11 Dec, 2023 1 commit
-
-
Merve Noyan authored
* Initial commit for AutoBackbone & Backbone * Added timm and clarified out_indices * Swapped the example to out_indices * fix toctree * Update autoclass_tutorial.md * Update backbones.md * Update autoclass_tutorial.md * Add dummy torch input instead * Add dummy torch input * Update autoclass_tutorial.md * Update backbones.md * minor fix * Update docs/source/en/main_classes/backbones.md Co-authored-by:
Maria Khalusova <kafooster@gmail.com> * Update docs/source/en/autoclass_tutorial.md Co-authored-by:
Maria Khalusova <kafooster@gmail.com> * Added illustrations and explained backbone & neck * Update docs/source/en/main_classes/backbones.md Co-authored-by:
Maria Khalusova <kafooster@gmail.com> * Update backbones.md --------- Co-authored-by:
Maria Khalusova <kafooster@gmail.com>
-
- 28 Nov, 2023 1 commit
-
-
Steven Liu authored
* first draft * benchmarks * feedback
-
- 27 Nov, 2023 1 commit
-
-
Peter Pan authored
* docs: replace torch.distributed.run by torchrun `transformers` now officially support pytorch >= 1.10. The entrypoint `torchrun` is present from 1.10 onwards. Signed-off-by:
Peter Pan <Peter.Pan@daocloud.io> * Update src/transformers/trainer.py with @ArthurZucker's suggestion Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> --------- Signed-off-by:
Peter Pan <Peter.Pan@daocloud.io> Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com>
-
- 24 Nov, 2023 2 commits
-
-
fxmarty authored
* reflect RoCm support in the documentation * Update docs/source/en/main_classes/trainer.md Co-authored-by:
Lysandre Debut <hi@lysand.re> * fix review comments * use ROCm instead of RoCm --------- Co-authored-by:
Lysandre Debut <hi@lysand.re>
-
Sourab Mangrulkar authored
* add code changes 1. Refactor FSDP 2. Add `--save_only_model` option: When checkpointing, whether to only save the model, or also the optimizer, scheduler & rng state. 3. Bump up the minimum `accelerate` version to `0.21.0` * quality * fix quality? * Revert "fix quality?" This reverts commit 149330a6abc078827be274db84c8a2d26a76eba1. * fix fsdp doc strings * fix quality * Update src/transformers/training_args.py Co-authored-by:
Zach Mueller <muellerzr@gmail.com> * please fix the quality issue 😅 * Apply suggestions from code review Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com> * address comment * simplify conditional check as per the comment * update documentation --------- Co-authored-by:
Zach Mueller <muellerzr@gmail.com> Co-authored-by:
Benjamin Bossan <BenjaminBossan@users.noreply.github.com>
-
- 20 Nov, 2023 1 commit
-
-
Peter Pan authored
Signed-off-by: Peter Pan <Peter.Pan@daocloud.io>
-
- 13 Nov, 2023 1 commit
-
-
adismort14 authored
Update pipelines.md
-
- 09 Nov, 2023 1 commit
-
-
Dave Berenbaum authored
* dvclive trainer callback * style fixes * dvclive link fixes
-
- 06 Nov, 2023 2 commits
-
-
Maria Khalusova authored
* fixed links with 404 * make style
-
Arthur authored
-
- 01 Nov, 2023 2 commits
-
-
Marc Sun authored
* add_ xllamav2 arg * add test * style * add check * add doc * replace by use_exllama_v2 * fix tests * fix doc * style * better condition * fix logic * add deprecate msg * deprecate exllama * remove disable_exllama from the linter * remove * fix warning * Revert the commits deprecating exllama * deprecate disable_exllama for use_exllama * fix * fix loading attribute * better handling of args * remove disable_exllama from init and linter * Apply suggestions from code review Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * better arg * fix warning * Apply suggestions from code review Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * switch to dict * Apply suggestions from code review Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * style * nits * style * better tests * style --------- Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
Younes Belkada authored
* working v1 * oops * Update src/transformers/modeling_utils.py Co-authored-by:
Marc Sun <57196510+SunMarc@users.noreply.github.com> * fixup * oops * push * more changes * add docs * some fixes * fix copies * add v1 doc * added installation guide * relax constraints * revert * attempt llm-awq * oops * oops * fixup * raise error when incorrect cuda compute capability * nit * add instructions for llm-awq * fixup * fix copies * fixup and docs * change * few changes + add demo * add v1 tests * add autoawq in dockerfile * finalize * Update tests/quantization/autoawq/test_awq.py * fix test * fix * fix issue * Update src/transformers/integrations/awq.py Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update docs/source/en/main_classes/quantization.md Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update docs/source/en/main_classes/quantization.md Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/integrations/awq.py Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/integrations/awq.py Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * add link to example script * Update docs/source/en/main_classes/quantization.md Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * add more content * add more details * add link to quantization docs * camel case + change backend class name * change to string * fixup * raise errors if libs not installed * change to `bits` and `group_size` * nit * nit * Apply suggestions from code review Co-authored-by:
Marc Sun <57196510+SunMarc@users.noreply.github.com> * disable training * address some comments and fix nits * fix * final nits and fix tests * adapt to our new runners * make fix-copies * Update src/transformers/utils/quantization_config.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/utils/quantization_config.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/integrations/awq.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/integrations/awq.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * move to top * add conversion test * final nit * add more elaborated test --------- Co-authored-by:
Marc Sun <57196510+SunMarc@users.noreply.github.com> Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
- 31 Oct, 2023 2 commits
-
-
Younes Belkada authored
* add v1 neftune * use `unwrap_model` instead * add test + docs * Apply suggestions from code review Co-authored-by:
Zach Mueller <muellerzr@gmail.com> * more details * fixup * Update docs/source/en/main_classes/trainer.md Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * refactor a bit * more elaborated test * fix unwrap issue --------- Co-authored-by:
Zach Mueller <muellerzr@gmail.com> Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
Vivek Khandelwal authored
* Add support for loading GPTQ models on CPU Right now, we can only load the GPTQ Quantized model on the CUDA device. The attribute `gptq_supports_cpu` checks if the current auto_gptq version is the one which has the cpu support for the model or not. The larger variants of the model are hard to load/run/trace on the GPU and that's the rationale behind adding this attribute. Signed-Off By: Vivek Khandelwal <vivek@nod-labs.com> * Update quantization.md * Update quantization.md * Update quantization.md
-
- 30 Oct, 2023 1 commit
-
-
Rockerz authored
* add * add * add * Add deepspeed.md * Add * add * Update docs/source/ja/main_classes/callback.md Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/ja/main_classes/output.md Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/ja/main_classes/pipelines.md Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/ja/main_classes/processors.md Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/ja/main_classes/processors.md Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/ja/main_classes/text_generation.md Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/ja/main_classes/processors.md Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update logging.md * Update toctree.yml * Update docs/source/ja/main_classes/deepspeed.md Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Add suggesitons * m * Update docs/source/ja/main_classes/trainer.md Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update toctree.yml * Update Quantization.md * Update docs/source/ja/_toctree.yml Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update toctree.yml * Update docs/source/en/main_classes/deepspeed.md Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/main_classes/deepspeed.md Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> --------- Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com>
-
- 27 Oct, 2023 1 commit
-
- 26 Oct, 2023 1 commit
-
-
Marc Sun authored
* add_ xllamav2 arg * add test * style * add check * add doc * replace by use_exllama_v2 * fix tests * fix doc * style * better condition * fix logic * add deprecate msg
-
- 25 Oct, 2023 1 commit
-
-
Younes Belkada authored
* add `MaskGenerationPipeline` in docs * Update __init__.py * fix repo consistency and clarify docstring * add on check docstirngs * actually we do have a tf sam * oops
-
- 24 Oct, 2023 1 commit
-
-
Leandro von Werra authored
* add info on TRL docs * add TRL link * tweak text * tweak text
-
- 16 Oct, 2023 1 commit
-
-
Shreyas S authored
Update feature_extractor.md
-
- 12 Oct, 2023 2 commits
-
-
Heinz-Alexander Fuetterer authored
-
Lysandre Debut authored
* Logger level Co-authored-by:
Sahil Bhosale <sahilbhosale63@live.com> Co-authored-by:
Adithya4720 <hegdeadithyak@gmail.com> Co-authored-by:
Sachin Singh <sachinishu02@gmail.com> Co-authored-by:
Riya Dhanduke <113622644+riiyaa24@users.noreply.github.com> * More comprehensive documentation --------- Co-authored-by:
Sahil Bhosale <sahilbhosale63@live.com> Co-authored-by:
Adithya4720 <hegdeadithyak@gmail.com> Co-authored-by:
Sachin Singh <sachinishu02@gmail.com> Co-authored-by:
Riya Dhanduke <113622644+riiyaa24@users.noreply.github.com>
-
- 11 Oct, 2023 1 commit
-
-
Ben Gubler authored
* feat: update callback doc to explain disabling callbacks using report_to * docs: update report_to docstring
-
- 10 Oct, 2023 1 commit
-
-
Tuowei Wang authored
-
- 22 Sep, 2023 1 commit
-
-
LeviVasconcelos authored
* Add image to image pipeline Add image to image pipeline * remove swin2sr from tf auto * make ImageToImage importable * make style make style make style make style * remove tf support * remove nonused imports * fix postprocessing * add important comments; add unit tests * add documentation * remove support for TF * make fixup * fix typehint Image.Image * fix documentation code * address review request; fix unittest type checking * address review request; fix unittest type checking * make fixup * address reviews * Update src/transformers/pipelines/image_to_image.py Co-authored-by:
NielsRogge <48327001+NielsRogge@users.noreply.github.com> * enhance docs * make style * make style * improve docetest time * improve docetest time * Update tests/pipelines/test_pipelines_image_to_image.py Co-authored-by:
Nicolas Patry <patry.nicolas@protonmail.com> * Update tests/pipelines/test_pipelines_image_to_image.py Co-authored-by:
Nicolas Patry <patry.nicolas@protonmail.com> * make fixup * undo faulty merge * undo faulty merge * add image-to-image to test pipeline mixin * Update src/transformers/pipelines/image_to_image.py Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update tests/pipelines/test_pipelines_image_to_image.py Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * improve docs --------- Co-authored-by:
NielsRogge <48327001+NielsRogge@users.noreply.github.com> Co-authored-by:
Nicolas Patry <patry.nicolas@protonmail.com> Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com>
-
- 15 Sep, 2023 1 commit
-
-
Matt authored
* Put tokenizer methods in the right alphabetical order in the docs * Quick tweak to ConversationalPipeline * Typo fixes in the developer doc * make fixup
-
- 14 Sep, 2023 1 commit
-
-
Matt authored
* First commit while I figure this out * make fixup * Remove unused method * Store prompt attrib * Fix prompt argument for tests * Make same changes in fast tokenizer * Remove global prompts from fast tokenizer too * stash commit * stash commit * Migrate PromptConfig to its True Final Location * Replace Conversation entirely with the new class * Import/dependency fixes * Import/dependency fixes * Change format for lots of default prompts * More default prompt fixups * Revert llama old methods so we can compare * Fix some default configs * Fix some default configs * Fix misspelled kwarg * Fixes for Blenderbot * make fixup * little rebase cleanup * Add basic documentation * Quick doc fix * Truncate docstring for now * Add handling for the case when messages is a single string * Quick llama merges * Update conversational pipeline and tests * Add a couple of legacy properties for backward compatibility * More legacy handling * Add docstring for build_conversation_input_ids * Restructure PromptConfig * Let's start T E M P L A T I N G * Refactor all default configs to use templates instead * Revert changes to the special token properties since we don't need them anymore * More class templates * Make the sandbox even sandier * Everything replaced with pure templating * Remove docs for PromptConfig * Add testing and optional requirement boilerplate * Fix imports and make fixup * Fix LLaMA tests and add Conversation docstring * Finally get LLaMA working with the template system * Finally get LLaMA working with the template system * make fixup * make fixup * fmt-off for the long lists of test tokens * Rename method to apply_chat_template for now * Start on documentation * Make chat_template a property that reads through to the default if it's not set * Expand docs * Expand chat templating doc some more * trim/lstrip blocks by default and update doc * Few doc tweaks * rebase cleanup * Clarify docstring * rebase cleanup * rebase cleanup * make fixup * Quick doc edit * Reformat 
the standard template to match ChatML * Re-add PEFT check * Update docs/source/en/chat_templating.md Co-authored-by:
Patrick von Platen <patrick.v.platen@gmail.com> * Add apply_chat_template to the tokenizer doc * make fixup * Add doc links * Fix chat links * Fix chat links * Explain system messages in the doc * Add chat template test * Proper save-loading for chat template attribute * Add test skips for layout models * Remove _build_conversation_input_ids, add default_chat_template to code_llama * Make sure all LLaMA models are using the latest template * Remove default_system_prompt block in code_llama because it has no default prompt * Update ConversationPipeline preprocess * Add correct #Copied from links to the default_chat_templates * Remove unneeded type checking line * Add a dummy mark_processsed method * Reorganize Conversation to have **deprecated_kwargs * Update chat_templating.md * Quick fix to LLAMA tests * Small doc tweaks * Add proper docstrings and "copied from" statements to all default chat templates * Merge use_default_system_prompt support for code_llama too * Improve clarity around self.chat_template * Docstring fix * Fix blenderbot default template * More doctest fix * Break out some tokenizer kwargs * Update doc to explain default templates * Quick tweaks to tokenizer args * Cleanups for tokenizer args * Add note about cacheing * Quick tweak to the chat-templating doc * Update the LLaMA template with error checking and correct system message embedding * make fixup * make fixup * add requires_jinja * Cleanup to expected output formatting * Add cacheing * Fix typo in llama default template * Update LLaMA tests * Update documentation * Improved legacy handling in the Conversation class * Update Jinja template with proper error handling * Quick bugfix * Proper exception raising * Change cacheing behaviour so it doesn't try to pickle an entire Jinja env * make fixup * rebase cleanup --------- Co-authored-by:
Patrick von Platen <patrick.v.platen@gmail.com>
-
- 13 Sep, 2023 1 commit
-
-
Maria Khalusova authored
* last hidden state clarification * feedback addressed
-
- 05 Sep, 2023 1 commit
-
-
Julien Chaumond authored
-
- 29 Aug, 2023 2 commits
-
-
Aman Gupta Karmani authored
-
Arup De authored
* add FSDP config option to enable activation-checkpointing * update docs * add checks and remove redundant code * fix formatting error
-