- 07 Aug, 2024 7 commits
-
-
Jiwook Han authored
* docs: ko: tasks/images_feature_extraction.md * feat: nmt draft * fix: manual edits * fix: manual edits * fix: manual edits * fix: manual edits * feat: manual edits * Update docs/source/ko/tasks/image_feature_extraction.md Co-authored-by:
Jihun Lim <31366038+heuristicwave@users.noreply.github.com> * Update docs/source/ko/tasks/image_feature_extraction.md Co-authored-by:
Jihun Lim <31366038+heuristicwave@users.noreply.github.com> * fix: manual edits --------- Co-authored-by:
Jihun Lim <31366038+heuristicwave@users.noreply.github.com>
-
Sungmin Oh authored
* docs: ko: quantization/quanto.md * feat: nmt draft * fix: resolve suggestions Co-authored-by:
SeungYoun Lee <84276596+win2dvp21@users.noreply.github.com> Co-authored-by:
Minki Kim <100768622+1kmmk1@users.noreply.github.com> Co-authored-by:
김준재 <55151385+junejae@users.noreply.github.com> * fix: resolve suggestions Co-authored-by:
SeungYoun Lee <84276596+win2dvp21@users.noreply.github.com> --------- Co-authored-by:
SeungYoun Lee <84276596+win2dvp21@users.noreply.github.com> Co-authored-by:
Minki Kim <100768622+1kmmk1@users.noreply.github.com> Co-authored-by:
김준재 <55151385+junejae@users.noreply.github.com>
-
Chaewon Song authored
* docs: ko: tasks/prompting.md * feat: nmt-draft * fix: update translation in prompting.md * fix: update toctree.yml * fix: manual edits * fix: toctree edits * fix: resolve suggestions Co-authored-by:
boyunJang <gobook1234@naver.com> Co-authored-by:
Harheem Kim <49297157+harheem@users.noreply.github.com> Co-authored-by:
timdalxx <48753785+jeongiin@users.noreply.github.com> --------- Co-authored-by:
boyunJang <gobook1234@naver.com> Co-authored-by:
Harheem Kim <49297157+harheem@users.noreply.github.com> Co-authored-by:
timdalxx <48753785+jeongiin@users.noreply.github.com>
-
Minki Kim authored
* fix: manual edits * fix: manual edits2 * fix: delete files * fix: resolve suggestions Co-authored-by:
Sungmin Oh <fabxoe.kor@gmail.com> Co-authored-by:
SeungYoun Lee <84276596+win2dvp21@users.noreply.github.com> Co-authored-by:
김준재 <55151385+junejae@users.noreply.github.com> * fix: resolve suggestions Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> --------- Co-authored-by:
Sungmin Oh <fabxoe.kor@gmail.com> Co-authored-by:
SeungYoun Lee <84276596+win2dvp21@users.noreply.github.com> Co-authored-by:
김준재 <55151385+junejae@users.noreply.github.com> Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com>
-
Joao Gante authored
* logits * words
-
Aymeric Roucher authored
* Allow optional use of grammars to constrain generation
-
Raushan Turganbay authored
* gemma2 fallback to dynamic cache * Update src/transformers/models/gemma2/modeling_gemma2.py Co-authored-by:
Joao Gante <joaofranciscocardosogante@gmail.com> * Update src/transformers/models/gemma2/modeling_gemma2.py Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * raise error and dont fallback to dynamic cache * prev will break most forward calls/tests * Update src/transformers/models/gemma2/modeling_gemma2.py Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * update * fix copies --------- Co-authored-by:
Joao Gante <joaofranciscocardosogante@gmail.com> Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com>
-
- 06 Aug, 2024 7 commits
-
-
HyunJi Shin authored
* docs: ko: tasks/image_to_image.md * feat: nmt draft * fix: manual edits * fix: resolve suggestions Co-authored-by:
Jihun Lim <31366038+heuristicwave@users.noreply.github.com> Co-authored-by:
Jiwook Han <33192762+mreraser@users.noreply.github.com> * fix: handle remaining suggestions Co-authored-by:
Jiwook Han <33192762+mreraser@users.noreply.github.com> --------- Co-authored-by:
Jihun Lim <31366038+heuristicwave@users.noreply.github.com> Co-authored-by:
Jiwook Han <33192762+mreraser@users.noreply.github.com>
-
boyunJang authored
* docs: ko: tasks/idefics.md * feat: nmt draft * fix: manual edits * fix: resolve suggestions Co-authored-by:
Chaewon Song <chaewon1019@ewhain.net> Co-authored-by:
Harheem Kim <49297157+harheem@users.noreply.github.com> Co-authored-by:
timdalxx <48753785+jeongiin@users.noreply.github.com> --------- Co-authored-by:
Chaewon Song <chaewon1019@ewhain.net> Co-authored-by:
Harheem Kim <49297157+harheem@users.noreply.github.com> Co-authored-by:
timdalxx <48753785+jeongiin@users.noreply.github.com>
-
timdalxx authored
* docs: ko: tasks/mask_generation.md * feat: nmt draft * fix : toc local * fix : manual edits * fix : ko-toctree * fix: resolve suggestions Co-authored-by:
boyunJang <gobook1234@naver.com> Co-authored-by:
Chaewon Song <chaewon1019@ewhain.net> * fix: resolve suggestions Co-authored-by:
boyunJang <gobook1234@naver.com> Co-authored-by:
Chaewon Song <chaewon1019@ewhain.net> * fix: resolve suggestions * fix: resolve suggestions * fix: resolve suggestions --------- Co-authored-by:
boyunJang <gobook1234@naver.com> Co-authored-by:
Chaewon Song <chaewon1019@ewhain.net>
-
Chris Toukmaji authored
Update nllb.md
-
Pablo Montalvo authored
* add new model like * draft cuda forward - mismatched keys (sharding on conv1) * match keys successfully * fix split * get generation/forward running (wrong gens, norm?) * :update * some refactoring * fixes * works up until copy to cache * fix * update * NON WORKING VERSION * version that work? * nit * fix config * fix conversion script * working cuda forward * nit * update * simplifcation * make mamba slow simple work * no einops * todo * fix style * no einops * update fix no einsum * nit * remove einops * bug: scan_output differs strongly * add rms norm option * fix fast + slow generation with and w/o cache ✔ * draft integration tests * remove a big chunk of the einsum * fix slow, fast generations, without any einsum * fix copies * fix structure * fix up modeling and tests * fix tests * clamping is indeed worse * recover mamba2 cache test * fix copies * no cache position (yet) * fix tf tests * fix matmul for generate * fixup * skip cache tests for now * [run-slow]mamba2 * tune out hidden states for padding * test batched generation * propagate attention mask changes * fix past length * fix integration test * style * address comments * update readme * add mamba2 version check * fix tests * [run-slow]mamba2 * skip edge tests * [run-slow]mamba2 * last fixup * [run-slow]mamba2 * update README --------- Co-authored-by:
Arthur Zucker <arthur.zucker@gmail.com>
-
Ao Tang authored
* Add nemotron support * fix inference * add unit test * add layernorm1p as a class to avoid meta device mismatch * test fixed * Add copied_from statements * remove pretraining_tp args * remove nemotronlayernorm * force LN computation done in FP32 * remove nemotrontokenizer and use llamatokenizer * license update * add option for kv_channels for minitron8b * remove assert * o_proj fixed * o_proj reshape * add gated_proj option * typo * remove todos * fix broken test after merging latest main * remove nezha/nat after meging main * chnage default config to 15b model * add nemo conversion script * rename conversion script * remove gate_proj option * pr comment resolved * fix unit test * rename kv_channels to head_dim * resolve PR issue * add nemotron md * fix broken tests * refactor rope for nemotron * test fix * remove linearscaling * whitespace and import * fix some copied-from * code style fix * reformatted * add position_embedding to nemotronattention * rope refactor to only use config, copied-from fix * format * Run make fix-copies * nemotron md with autodoc * doc fix * fix order * pass check_config_docstrings.py * fix config_attributes * remove all llama BC related code * Use PreTrainedTokenizerFast * ruff check examples * conversion script update * add nemotron to toctree
-
Raushan Turganbay authored
* draft * updates * works? * try adding python example in hidden section * another try * hwo do i render python * format as html code? * Update docs/source/en/kv_cache.md Co-authored-by:
Joao Gante <joaofranciscocardosogante@gmail.com> * Update docs/source/en/kv_cache.md Co-authored-by:
Joao Gante <joaofranciscocardosogante@gmail.com> * Update docs/source/en/kv_cache.md Co-authored-by:
Joao Gante <joaofranciscocardosogante@gmail.com> * Update docs/source/en/kv_cache.md Co-authored-by:
Joao Gante <joaofranciscocardosogante@gmail.com> * Update docs/source/en/kv_cache.md Co-authored-by:
Joao Gante <joaofranciscocardosogante@gmail.com> * one more small update * should render hidden secrtion now * add outputs * fix links * check links * update all links * update with offloaded cache * all cache is importable, so they appear in docs * fix copies * docstring... --------- Co-authored-by:
Joao Gante <joaofranciscocardosogante@gmail.com>
-
- 05 Aug, 2024 1 commit
-
-
Francisco Kurucz authored
-
- 02 Aug, 2024 1 commit
-
-
Raushan Turganbay authored
nits
-
- 01 Aug, 2024 2 commits
-
-
Nikos Karampatziakis authored
* Initial implementation of OffloadedCache * enable usage via cache_implementation * Address feedback, add tests, remove legacy methods. * Remove flash-attn, discover synchronization bugs, fix bugs * Prevent usage in CPU only mode * Add a section about offloaded KV cache to the docs * Fix typos in docs * Clarifications and better explanation of streams
-
Sanchit Gandhi authored
* [whisper] compile compatibility with long-form decoding * clarify comment * fix after rebase * finalise * fix bsz * fix cache split * remove contiguous * style * finish * update doc * prevent cuda graph trace
-
- 30 Jul, 2024 2 commits
-
-
Joao Gante authored
* doc formatting nits * ignore non-autodocs * Apply suggestions from code review Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/esm/modeling_esm.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/esm/modeling_esm.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * make fixup --------- Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
Gilad Turok authored
Docs: fix GaLore optimizer example Fix incorrect usage of GaLore optimizer in Transformers trainer code example. The GaLore optimizer uses low-rank gradient updates to reduce memory usage. GaLore is quite popular and is implemented by the authors in [https://github.com/jiaweizzhao/GaLore](https://github.com/jiaweizzhao/GaLore). A few months ago GaLore was added to the HuggingFace Transformers library in https://github.com/huggingface/transformers/pull/29588. Documentation of the Trainer module includes a few code examples of how to use GaLore. However, the `optim_target_modules` argument to the `TrainingArguments` function is incorrect, as discussed in https://github.com/huggingface/transformers/pull/29588#issuecomment-2006289512. This pull request fixes this issue.
-
- 29 Jul, 2024 3 commits
-
-
Aymeric Roucher authored
* Add stream_to_gradio method for running agent in gradio demo
-
Joao Gante authored
* mvp * added test (a few models need fixes) * fix a few test cases * test nits * harder test 😈 * revert changes in stablelm * test with improved condition * add todo * tmp commit * merged with main * nits * add todo * final corrections * add docs for generation compilation * docs nits * add tip * PR suggestions * add more details to the compilation docs * fix cache positions * cache is now init in generate; update docs * tag test as flaky * docs * post rebase make fixup and other nits * remove unintended changes * whisper (encoder-decoder) not supported * move token default updates to ; add tests for token defaults * push changes * manual rebase * chameleon doesn't support this * fix test_static_cache_mha_mqa_gqa (broken in another PR) * docs: dynamic is better with end-to-end compilation
-
Sai-Suraj-27 authored
Fixed a link in docs.
-
- 25 Jul, 2024 2 commits
-
-
Pavel Iakubovskii authored
Fix code snippet for grounding-dino
-
Huazhong Ji authored
* translate philosophy.md to chinese * add the missing link
-
- 24 Jul, 2024 2 commits
-
-
Matt authored
* No more default chat templates * Add the template to the GPT-SW3 tests since it's not available by default now * Fix GPT2 test * Fix Bloom test * Fix Bloom test * Remove default templates again
-
Dr. Artificial曾小健 authored
* Update qwen2.md outdated description * Update qwen2.md amended * Update qwen2.md Update * Update qwen2.md fix wrong version code, now good to go
-
- 23 Jul, 2024 4 commits
-
-
Fanli Lin authored
fix
-
RhuiDih authored
* add DataCollatorBatchFlattening * Update data_collator.py * change name * new FA2 flow if position_ids is provided * add comments * minor fix * minor fix data collator * add test cases for models * add test case for data collator * remove extra code * formating for ruff check and check_repo.py * ruff format ruff format tests src utils * custom_init_isort.py
-
Raushan Turganbay authored
* pad on right if training * docs * add tests
-
James Thewlis authored
* Add llama3-llava-next-8b to llava_next conversion script Adds support for the lmms-lab/llama3-llava-next-8b model to the convert_llava_next_weights_to_hf.py script, along with an example prompt generated from the llava_llama_3 conv_template in the LLaVA-NeXT repo. * Exclude <|begin_of_text|> from prompt example This token gets added automatically, so it should not be included in the prompt example. * Add llava-next-72b and llava-next-110b Adds the Qwen-based LLaVA-Next models to the conversion script, along with changes to load the models on multiple GPUs for inference. * Add llama3 and qwen prompt formats to docs * Chat prompt and padding side left for llama3 batched * update * Update src/transformers/models/llava_next/convert_llava_next_weights_to_hf.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/llava_next/convert_llava_next_weights_to_hf.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * remove code * better naming --------- Co-authored-by:
raushan <raushan@huggingface.co> Co-authored-by:
Raushan Turganbay <raushan.turganbay@alumni.nu.edu.kz> Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com>
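The note in the commit above that `<|begin_of_text|>` "gets added automatically" can be illustrated with a toy sketch. It does not use the real tokenizer: `toy_encode` and `count_bos` are hypothetical helpers, and whitespace splitting stands in for actual tokenization. The only assumption carried over from the commit message is that the encoder always prepends the token itself.

```python
BOS = "<|begin_of_text|>"


def toy_encode(prompt: str) -> list[str]:
    """Hypothetical stand-in for a tokenizer that always prepends BOS."""
    return [BOS] + prompt.split()


def count_bos(tokens: list[str]) -> int:
    """Count how many BOS tokens ended up in the encoded sequence."""
    return sum(1 for token in tokens if token == BOS)


# Prompt without the token: the encoder adds exactly one BOS.
good = toy_encode("Hello world")

# Prompt that already contains the token: BOS now appears twice.
bad = toy_encode(BOS + " Hello world")

print(count_bos(good), count_bos(bad))
```

Under that assumption, writing the token into the prompt text duplicates it, which is why the commit drops it from the prompt example.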
-
- 22 Jul, 2024 4 commits
-
-
Marc Sun authored
* Add new quant method * update * fix multi-device * add test * add offload * style * style * add simple example * initial doc * docstring * style again * works ? * better docs * switch to non persistant * remove print * fix init * code review
-
Bertrand Thia authored
* minor edits and clarifications * address comment Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> --------- Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com>
-
Woojun Jung authored
update `ko/_toctree.yml` and remove `custom_tools.md`
-
Lucain authored
-
- 19 Jul, 2024 5 commits
-
-
Raushan Turganbay authored
fixes
-
Merve Noyan authored
* Add image-text-to-text task page * Update docs/source/en/tasks/image_text_to_text.md Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/tasks/image_text_to_text.md Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/tasks/image_text_to_text.md Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/tasks/image_text_to_text.md Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/tasks/image_text_to_text.md Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/tasks/image_text_to_text.md Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/tasks/image_text_to_text.md Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/tasks/image_text_to_text.md Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/tasks/image_text_to_text.md Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/tasks/image_text_to_text.md Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/tasks/image_text_to_text.md Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Address comments * Fix heading * Update docs/source/en/tasks/image_text_to_text.md Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update docs/source/en/tasks/image_text_to_text.md Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update docs/source/en/tasks/image_text_to_text.md Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update docs/source/en/tasks/image_text_to_text.md Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update docs/source/en/tasks/image_text_to_text.md Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update docs/source/en/tasks/image_text_to_text.md Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Address comments * Update image_text_to_text.md --------- Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
Merve Noyan authored
* Fixes * Let's not use auto
-
Raushan Turganbay authored
fix chat format
-
NielsRogge authored
* Improve docs * Fix docs * Fix code snippet
-