"examples/research_projects/distillation/train.py" did not exist on "c51e533a5febe3fae2bb33b060f6b1f36a92e003"
- 08 Apr, 2024 7 commits
-
JINO ROHIT authored
-
Fanli Lin authored
* add bnb flag
* move maker
* add accelerator maker
-
Haz Sameen Shahgir authored
updated examples/pytorch/language-modeling scripts and requirements.txt to require datasets>=2.14.0 (#30120)

updated requirements.txt and require_version() calls in examples/pytorch/language-modeling to require datasets>=2.14.0
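The same gating pattern appears in each affected script; a minimal sketch of the call (the hint string is illustrative):

```python
# Sketch of the version gate used by the example scripts; the hint text is an assumption.
from transformers.utils.versions import require_version

# Raises ImportError with the hint below if datasets < 2.14.0 is installed.
require_version(
    "datasets>=2.14.0",
    "To fix: pip install -r examples/pytorch/language-modeling/requirements.txt",
)
```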
-
Howard Liberty authored
* Make MLFlow version detection more robust and handle mlflow-skinny
* Make function name clearer and refactor the logic
* Further refactor
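mlflow-skinny ships the same `mlflow` module under a different distribution name, so module detection alone succeeds while metadata lookup fails. A minimal sketch of distribution-aware detection (an illustration, not the exact transformers code):

```python
import importlib.metadata
from typing import Optional


def get_mlflow_version() -> Optional[str]:
    # Query both distribution names; mlflow-skinny installs the "mlflow"
    # module but registers its metadata under "mlflow-skinny".
    for dist_name in ("mlflow", "mlflow-skinny"):
        try:
            return importlib.metadata.version(dist_name)
        except importlib.metadata.PackageNotFoundError:
            continue
    return None  # MLflow is not installed at all
```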
-
Xu Song authored
-
vaibhavagg303 authored
* add _torch_extract_fbank_features_batch function in feature_extractor_whisper
* reformat feature_extraction_whisper.py file
* handle batching in single function
* add gpu test & doc
* add batch test & device in each __call__
* add device arg in doc string

Co-authored-by: vaibhav.aggarwal <vaibhav.aggarwal@sprinklr.com>
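A usage sketch of the batched, device-aware path added here (checkpoint and dummy audio are illustrative):

```python
import numpy as np
from transformers import WhisperFeatureExtractor

feature_extractor = WhisperFeatureExtractor.from_pretrained("openai/whisper-tiny")

# Two dummy 1-second clips at 16 kHz stand in for real audio.
audio_batch = [np.zeros(16000, dtype=np.float32), 0.1 * np.ones(16000, dtype=np.float32)]

inputs = feature_extractor(
    audio_batch,
    sampling_rate=16000,
    return_tensors="pt",
    device="cuda",  # the new device argument; needs the torch backend
)
print(inputs.input_features.shape)  # (batch, n_mels, frames)
```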
-
Cylis authored
-
- 05 Apr, 2024 13 commits
-
Raushan Turganbay authored
* clean-up whisper kwargs
* failing test
-
Yih-Dar authored
* fix
* fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Kola authored
* Add docstrings and types for MambaCache
* Update src/transformers/models/mamba/modeling_mamba.py
* Update src/transformers/models/mamba/modeling_mamba.py
* Update src/transformers/models/mamba/modeling_mamba.py
* make fixup
* import copy in generation_whisper
* ruff
* Revert "make fixup"

This reverts commit c4fedd6f60e3b0f11974a11433bc130478829a5c.
-
Yih-Dar authored
* separate jobs
* separate jobs
* use channel name directly instead of ID
* use channel name directly instead of ID
* use channel name directly instead of ID

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Michael Benayoun authored
* [WIP] fix fx
* [WIP] fix fx
* [WIP] fix fx
* [WIP] fix fx
* [WIP] fix fx
* Apply changes to other models
-
Yih-Dar authored
* fix
* fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
miRx923 authored
Update quantizer_bnb_4bit.py: the ValueError string should say "....you need to set `llm_int8_enable_fp32_cpu_offload=True`...." instead of "`load_in_8bit_fp32_cpu_offload=True`" (#30013)

* Update quantizer_bnb_4bit.py

There is a mistake in the ValueError on line 86 of quantizer_bnb_4bit.py: the error string should say "....you need to set `llm_int8_enable_fp32_cpu_offload=True`...." instead of "load_in_8bit_fp32_cpu_offload=True". The BitsAndBytesConfig() arguments were updated, but the ValueError in quantizer_bnb_4bit.py was not.

* Update quantizer_bnb_4bit.py

Changed the ValueError string "...you need to set load_in_8bit_fp32_cpu_offload=True..." to "....you need to set llm_int8_enable_fp32_cpu_offload=True....".
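For reference, the flag the corrected message points to lives on `BitsAndBytesConfig`; a sketch (the model name is an arbitrary example):

```python
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    llm_int8_enable_fp32_cpu_offload=True,  # the flag named in the fixed ValueError
)

# With device_map="auto", modules that spill to CPU are kept in fp32.
model = AutoModelForCausalLM.from_pretrained(
    "facebook/opt-1.3b",  # arbitrary example checkpoint
    quantization_config=quantization_config,
    device_map="auto",
)
```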
-
Marc Sun authored
fix bnb test
-
NielsRogge authored
* Add image processor to trainer
* Replace tokenizer=image_processor everywhere
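Assuming the new argument landed as `image_processor`, the wiring changes roughly as in this sketch (`model` and `train_dataset` are placeholders):

```python
from transformers import AutoImageProcessor, Trainer, TrainingArguments

image_processor = AutoImageProcessor.from_pretrained("google/vit-base-patch16-224")

trainer = Trainer(
    model=model,                      # placeholder: your vision model
    args=TrainingArguments(output_dir="out"),
    train_dataset=train_dataset,      # placeholder: your dataset
    image_processor=image_processor,  # previously: tokenizer=image_processor
)
```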
-
Adam Louly authored
* fix mixtral onnx export
* fix qwen model
-
Wang, Yi authored
* if output is tuple like facebook/hf-seamless-m4t-medium, waveform is the first element

Signed-off-by: Wang, Yi <yi.a.wang@intel.com>

* add test and fix batch issue

Signed-off-by: Wang, Yi <yi.a.wang@intel.com>

* add dict output support for seamless_m4t

Signed-off-by: Wang, Yi <yi.a.wang@intel.com>
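A sketch of the output handling this describes (`model` and `inputs` are placeholders; the dict key is an assumption):

```python
output = model.generate(**inputs)  # placeholder model/inputs

if isinstance(output, tuple):
    # e.g. facebook/hf-seamless-m4t-medium: waveform is the first element
    waveform = output[0]
else:
    waveform = output["waveform"]  # dict-style output; key name is an assumption
```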
-
Yih-Dar authored
skip test_encode_decode_fast_slow_all_tokens for now

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Yih-Dar authored
Add whisper

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
- 04 Apr, 2024 3 commits
-
Saurabh Dash authored
* changes
* addressing comments
* smol fix
-
byi8220 authored
* Defaulted IdeficsProcessor padding to 'longest', removed manual padding
* make fixup
* Defaulted processor call to padding=False
* Add padding to processor call in IdeficsModelIntegrationTest as well
* Defaulted IdeficsProcessor padding to 'longest', removed manual padding
* make fixup
* Defaulted processor call to padding=False
* Add padding to processor call in IdeficsModelIntegrationTest as well
* redefaulted padding=longest again
* fixup/doc
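Usage sketch of the final default (`prompts` is a placeholder for interleaved text/image prompts):

```python
from transformers import IdeficsProcessor

processor = IdeficsProcessor.from_pretrained("HuggingFaceM4/idefics-9b")

# padding="longest" is now the default; shown explicitly for clarity.
inputs = processor(prompts, padding="longest", return_tensors="pt")
```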
-
byi8220 authored
* implement convert_mamba_ssm_checkpoint_to_pytorch
* Add test test_model_from_mamba_ssm_conversion
* moved convert_ssm_config_to_hf_config to inside mamba_ssm_available check
* fix skipif clause
* moved skips to inside test since skipif decorator isn't working for some reason
* Added validation
* removed test
* fixup
* only compare logits
* remove weight rename
* Update src/transformers/models/mamba/convert_mamba_ssm_checkpoint_to_pytorch.py
* nits

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
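A hedged sketch of the conversion flow the bullets describe; the config key names and the no-rename assumption come from the bullets above, not from the script itself:

```python
from transformers import MambaConfig, MambaForCausalLM


def convert_checkpoint(original_config: dict, original_state_dict: dict) -> MambaForCausalLM:
    # Map the mamba_ssm config onto MambaConfig, as
    # convert_ssm_config_to_hf_config does (key names are assumptions).
    hf_config = MambaConfig(
        vocab_size=original_config["vocab_size"],
        hidden_size=original_config["d_model"],
        num_hidden_layers=original_config["n_layer"],
    )
    model = MambaForCausalLM(hf_config)
    # Per the "remove weight rename" bullet, keys are assumed to match 1:1.
    model.load_state_dict(original_state_dict)
    return model
```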
-
- 03 Apr, 2024 12 commits
-
Jacky Lee authored
feat: enable multi-device for efficientnet
-
Zach Mueller authored
* Docstring to note about zero init
* Check for accelerate
* Change conditional return
* Tweak
* Add new accelerate-specific zero3 check
* Fix import
* Revert to RTFM
* Update src/transformers/modeling_utils.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
Arthur authored
* fix
* sort imports
-
Raushan Turganbay authored
quick fix
-
Steven Liu authored
new audio file
-
Raushan Turganbay authored
* fix vipllava generation * consistent llava code * revert llava tests changes
-
Ondřej Cífka authored
* Fix is_scores_logprobs in WhisperNoSpeechDetection
* Add test_whisper_longform_no_speech_detection
* Fix typo
-
Ondřej Cífka authored
* Fix generate_with_fallback **kwargs
* Change pop to get
* Delete keys from kwargs to prevent overriding generation_config
* Revert to passing kwargs by reference, but make a (shallow) copy
* dict -> copy.copy
* Add test_whisper_longform_multi_batch_beam
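The shallow-copy point generalizes; a minimal illustration of the bug class (names are illustrative, not Whisper's actual code):

```python
import copy


def one_attempt(kwargs: dict) -> None:
    # Copy before mutating so retries see the caller's original kwargs.
    local = copy.copy(kwargs)
    local.pop("num_beams", None)  # safe: only the copy is modified


options = {"num_beams": 5, "temperature": 0.0}
one_attempt(options)
assert "num_beams" in options  # caller's dict survives each fallback attempt
```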
-
Ren Xuancheng authored
qwen2: fixed tokens starting with # in slow tokenizer; add tests

Co-authored-by: jklj077 <17811943+jklj077@users.noreply.github.com>
-
Miguel Almeida authored
To address NaN logit outputs for certain combinations of the `image_size`, `patch_size` and `depths` configuration parameters, an assertion now ensures that the resulting `window_size` in the model's Self Attention class is greater than 1, preventing division by zero when normalizing `relative_coords_table`. Fix: #28675
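A toy illustration of the failure mode and the guard (a sketch, not the model's exact code):

```python
def normalized_coords(window_size: int) -> list:
    # The guard this fix adds, in spirit: window_size == 1 would make the
    # (window_size - 1) denominator zero and propagate NaNs into the logits.
    if window_size <= 1:
        raise ValueError("window_size must be greater than 1")
    coords = range(-(window_size - 1), window_size)
    return [c / (window_size - 1) for c in coords]
```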
-
fxmarty authored
* fix encodec onnx export for musicgen
* simplification
* fix quality
* better style
-
Yih-Dar authored
update Co-authored-by:ydshieh <ydshieh@users.noreply.github.com>
-
- 02 Apr, 2024 5 commits
-
Mario Šaško authored
-
Joao Gante authored
* fix norm
* fix logits processors doctests
-
Nicolas Patry authored
* Hard error when ignoring tensors. (#27484)
* [WIP] Hard error when ignoring tensors.
* Better selection/error when saving a checkpoint.
  - Find all names we should normally drop (those are in the transformers config)
  - Find all disjoint tensors (for those we can safely trigger a copy to get rid of the sharing before saving)
  - Clone those disjoint tensors getting rid of the issue
  - Find all identical names (those should be declared in the config but we try to find them all anyway.)
  - For all identical names:
    - If they are in the config, just ignore them; everything is fine
    - If they are not, warn about them.
  - For all remainder tensors which are shared yet neither identical NOR disjoint, raise a hard error.
* Adding a failing test on `main` that passes here.
* We don't need to keep the subfolder logic in this test.
* Apply suggestions from code review
* Add small tests.
* Dead variable.
* Fixup.
* Fixing tied_Weights_keys on generic models.
* Fixup + T5 encoder/decoder tying (with different layers)
* Code quality.
* Dynamic member.
* trigger
* Fixing encoder name for other types of encoder/decoder combos.
* Fix scoping.
* Update .github/workflows/self-scheduled.yml
* Fixing the tied_weights after the call.

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
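The disjoint/identical classification above hinges on spotting parameters that share storage. A minimal sketch of that detection, assuming a plain PyTorch state dict (transformers itself relies on safetensors helpers for this):

```python
from collections import defaultdict

import torch


def find_shared_tensors(state_dict: dict) -> list:
    # Group parameter names by the data pointer of their underlying
    # storage; groups larger than one share memory.
    groups = defaultdict(set)
    for name, tensor in state_dict.items():
        if isinstance(tensor, torch.Tensor) and tensor.device.type != "meta":
            groups[tensor.untyped_storage().data_ptr()].add(name)
    return [names for names in groups.values() if len(names) > 1]
```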
-
Minsub Lee (Matt) authored
* Fix skip_special_tokens process for Wav2Vec2CTCTokenizer._decode
* Fix skip_special_tokens for Wav2Vec2CTCTokenizer._decode
* Exclude pad_token filtering since it is used as CTC-blank token
* Add small test for skip_special_tokens
* Update decoding test for added new token
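Usage sketch (`predicted_ids` is a placeholder for argmax-decoded CTC ids):

```python
from transformers import Wav2Vec2CTCTokenizer

tokenizer = Wav2Vec2CTCTokenizer.from_pretrained("facebook/wav2vec2-base-960h")

# With the fix, skip_special_tokens=True no longer strips the pad token
# before CTC grouping; pad doubles as the CTC blank between repeats.
text = tokenizer.decode(predicted_ids, skip_special_tokens=True)
```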
-
Michael authored
-