"vscode:/vscode.git/clone" did not exist on "e1cec43415e72c9853288d4e9325b734d36dd617"
- 05 Aug, 2024 3 commits
-
-
Ita Zaporozhets authored
* save total_vocab_size = vocab_size + user added tokens to speed up operation * updating length when added_tokens_decoder is set * add test len(tokenizer)
-
Raushan Turganbay authored
fix phi
-
TechInterMezzo authored
* fix: SeamlessM4TFeatureExtractor stride remainder * Added attention mask size test * Reran ruff for style correction
-
- 02 Aug, 2024 1 commit
-
-
Joao Gante authored
tests! :D
-
- 01 Aug, 2024 7 commits
-
-
Zach Mueller authored
* Test this zach * Test for improper init w/o zero3 * Move back * Apply suggestions from code review Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Get rid of stars in warning * Make private * Make clear --------- Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
OsamaS99 authored
* fixed hybrid cache init, added test * Fix Test Typo --------- Co-authored-by:Aaron Haag <aaron.haag@siemens.com>
-
Nikos Karampatziakis authored
* Initial implementation of OffloadedCache * enable usage via cache_implementation * Address feedback, add tests, remove legacy methods. * Remove flash-attn, discover synchronization bugs, fix bugs * Prevent usage in CPU only mode * Add a section about offloaded KV cache to the docs * Fix typos in docs * Clarifications and better explanation of streams
-
Omar Salman authored
* Fix conflicting key in init kwargs in PreTrainedTokenizerBase * Update code to check for callable key in save_pretrained * Apply PR suggestions * Invoke CI * Updates based on PR suggestion
-
Ita Zaporozhets authored
-
Lunwen He authored
* Remove size check between attn_weights and kv_seq_len * add unit tests
-
Sanchit Gandhi authored
* [whisper] compile compatibility with long-form decoding * clarify comment * fix after rebase * finalise * fix bsz * fix cache split * remove contiguous * style * finish * update doc * prevent cuda graph trace
-
- 31 Jul, 2024 4 commits
-
-
fxmarty authored
* draft * apply changes to all relevant archs * rerun ci - check_docstrings.py failing? * fix docstring * move 2D->4D mask creation to modeling file * repo consistency * fix the batch size = 1 case - calling contiguous is not enough * nit * style * propagate to gemma/gemma-2 * prepare inputs for gemma generation * implement test and tiny fix in gemma2 * Update src/transformers/models/bloom/modeling_bloom.py Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * fix copies * ci pass * fix gemma's test_compile_static_cache tests * flaky * retrigger ci --------- Co-authored-by:
sanchit-gandhi <sanchit@huggingface.co> Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com>
-
amyeroberts authored
* Fix FA2 call for Perceiver layer * [run_slow] idefics2 * [run_slow] idefics2 * [run_slow] idefics2 * Fix up * [run_slow] idefics2 * [run_slow] idefics2 * [run_slow] idefics2
-
Joao Gante authored
fix 💩
-
Raushan Turganbay authored
* enable flash-attn & static cache * this works, not the prev * fix for sliding window layers * not needed anymore
-
- 30 Jul, 2024 1 commit
-
-
Joshua Lochner authored
* Remove user-defined tokens which can be obtained through merges * Remove debug line * formatting * Refactor spm slow -> fast converter * revert unnecessary refactor * set comprehension * remove test files * Use `vocab_scores` * Always replace spiece underline with space in decode * we no longer need token filtering * Add save fast load slow unit test * Remove tokenizers version check * Remove duplicate code * Make `<start_of_turn>` and `<end_of_turn>` special tokens * Bias merge priority with length if score is the same * Add unit test for merge priority * CI
-
- 29 Jul, 2024 5 commits
-
-
Guang Yang authored
-
Sanchit Gandhi authored
* [pipeline] fix padding for 1-d tensors * add test * make style * Update tests/pipelines/test_pipelines_automatic_speech_recognition.py Co-authored-by:
Kamil Akesbi <45195979+kamilakesbi@users.noreply.github.com> * Update tests/pipelines/test_pipelines_automatic_speech_recognition.py --------- Co-authored-by:
Kamil Akesbi <45195979+kamilakesbi@users.noreply.github.com>
-
Kamil Akesbi authored
* fix _fix_key in PreTrainedModel * fix _find_longest_common_sequence * add test * remove result.json * nit * update test
-
Joao Gante authored
* mvp * added test (a few models need fixes) * fix a few test cases * test nits * harder test 😈 * revert changes in stablelm * test with improved condition * add todo * tmp commit * merged with main * nits * add todo * final corrections * add docs for generation compilation * docs nits * add tip * PR suggestions * add more details to the compilation docs * fix cache positions * cache is now init in generate; update docs * tag test as flaky * docs * post rebase make fixup and other nits * remove unintended changes * whisper (encoder-decoder) not supported * move token default updates to ; add tests for token defaults * push changes * manual rebase * chameleon doesn't support this * fix test_static_cache_mha_mqa_gqa (broken in another PR) * docs: dynamic is better with end-to-end compilation
-
Raushan Turganbay authored
* bloom dynamic cache * bloom follows standard cache format * no skips for bloom anymore * use cache position when possible * clean up * codestyle * Update src/transformers/models/bloom/modeling_bloom.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/bloom/modeling_bloom.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/bloom/modeling_bloom.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * pr comments * isinstance fix * address comments * make musicgen test happy * [run-slow] bloom --------- Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
- 26 Jul, 2024 5 commits
-
-
Raushan Turganbay authored
* fix * fix prev test (half of failures) * [run-slow] llama, gemma2 * [run-slow] llama, gemma2
-
Fanli Lin authored
[tests] fix `static` cache implementation is not compatible with `attn_implementation==flash_attention_2` (#32039) * add flash attention check * fix * fix
-
Sai-Suraj-27 authored
* Refactored to remove unnecessary object base class. * small fix.
-
Raushan Turganbay authored
* llava w/o images * tests
-
Raushan Turganbay authored
* fix * move changes to prompt lookup * add test * set eos in assistant model * style * fix flakiness * changes for new `main` * Update tests/generation/test_utils.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update tests/generation/test_utils.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * add comment to explain --------- Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
- 25 Jul, 2024 3 commits
-
-
Yih-Dar authored
* fix * [test_all] trigger full CI --------- Co-authored-by:ydshieh <ydshieh@users.noreply.github.com>
-
Kashif Rasul authored
fix E721 warnings
-
Sanchit Gandhi authored
* [whisper] fix short-form output type * add test * make style * update long-form tests * fixes * last fix * finalise test
-
- 24 Jul, 2024 5 commits
-
-
Sai-Suraj-27 authored
Replaced deprecated unittest method with the correct one.
-
Matt authored
* No more default chat templates * Add the template to the GPT-SW3 tests since it's not available by default now * Fix GPT2 test * Fix Bloom test * Fix Bloom test * Remove default templates again
-
Penut Chen authored
* support gguf fp16 * support gguf bf16 with pytorch * add gguf f16 test * remove bf16
-
Joao Gante authored
* relaxed rope check * lets also accept rope_type=None, defaulting to the original implementation * type and rope_type can coexist
-
amyeroberts authored
Remove conversation pipeline tests
-
- 23 Jul, 2024 6 commits
-
-
Sai-Suraj-27 authored
* Updated ruff version and fixed the required code according to the latest version. * Updated ruff version and fixed the required code according to the latest version. * Added noqa directive to ignore 1 error shown by ruff
-
RhuiDih authored
* add DataCollatorBatchFlattening * Update data_collator.py * change name * new FA2 flow if position_ids is provided * add comments * minor fix * minor fix data collator * add test cases for models * add test case for data collator * remove extra code * formatting for ruff check and check_repo.py * ruff format ruff format tests src utils * custom_init_isort.py
-
Sanchit Gandhi authored
Revert "Incorrect Whisper long-form decoding timestamps (#32003)" This reverts commit cd48553f.
-
Amit Garg authored
* renamed phi3 rope_scaling type * fixed trailing whitespaces * fixed test * added warning * fixed format
-
Merve Noyan authored
--------- Co-authored-by:Merve Noyan <mervenoyan@Merve-MacBook-Pro.local>
-
Ita Zaporozhets authored
* gguf conversion forces add_prefix_space=False for llama3, this is not required and forces from_slow, which fails. changing to None + test * typo * clean test
-