- 11 Oct, 2023 8 commits
-
-
Lysandre Debut authored
-
Sourab Mangrulkar authored
-
Shivanand authored
* remove from utils * updated doc string * only in the model * Update src/transformers/models/swin/modeling_swin.py Co-authored-by:
Yih-Dar <2521628+ydshieh@users.noreply.github.com> * Update src/transformers/models/swin/modeling_swin.py Co-authored-by:
Yih-Dar <2521628+ydshieh@users.noreply.github.com> --------- Co-authored-by:
Yih-Dar <2521628+ydshieh@users.noreply.github.com>
-
Patrick von Platen authored
* [Assistant Generation] Improve enc dec * save more * Fix logit processor checks * Clean * make style * fix deprecation * fix generation test * Apply suggestions from code review * fix biogpt * make style
-
Yih-Dar authored
* copied statement for test files --------- Co-authored-by:ydshieh <ydshieh@users.noreply.github.com>
-
Ben Gubler authored
* feat: update callback doc to explain disabling callbacks using report_to * docs: update report_to docstring
-
Billy Bradley authored
In assisted decoding, pass model_kwargs to model's forward call (fix prepare_input_for_generation in all models) (#25242) * In assisted decoding, pass model_kwargs to model's forward call Previously, assisted decoding would ignore any additional kwargs that it doesn't explicitly handle. This was inconsistent with other generation methods, which pass the model_kwargs through prepare_inputs_for_generation and forward the returned dict to the model's forward call. The prepare_inputs_for_generation method needs to be amended in all models, as previously it only kept the last input ID when a past_key_values was passed. * Improve variable names in _extend_attention_mask * Refactor extending token_type_ids into a function * Replace deepcopy with copy to optimize performance * Update new persimmon model with llama changes for assisted generation * Update new mistral model for assisted generation with prepare_inputs_for_generation * Update position_ids creation in falcon prepare_inputs_for_generation to support assisted generation
-
Thien Tran authored
* set encoder's PE as non-trainable * freeze flax * init sinusoids * add test for non-trainable embed positions * simplify TF encoder embed_pos * revert tf * clean up * add sinusoidal init for jax * make consistent sinusoidal function * fix dtype * add default dtype * use numpy for sinusoids. fix jax * add sinusoid init for TF * fix * use custom embedding * use specialized init for each impl * fix sinusoids init. add test for pytorch * fix TF dtype * simplify sinusoid init for flax and tf * add tests for TF * change default dtype to float32 * add sinusoid test for flax * Update src/transformers/models/whisper/modeling_flax_whisper.py Co-authored-by:
Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * Update src/transformers/models/whisper/modeling_tf_whisper.py Co-authored-by:
Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * move sinusoidal init to _init_weights --------- Co-authored-by:
sanchit-gandhi <sanchit@huggingface.co> Co-authored-by:
Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
-
- 10 Oct, 2023 6 commits
-
-
Roy Hvaara authored
`jnp.array` is a function, not a type: https://jax.readthedocs.io/en/latest/_autosummary/jax.numpy.array.html so it never makes sense to use `jnp.array` in a type annotation. Presumably the intent was to write `jnp.ndarray` aka `jax.Array`. Co-authored-by:
Peter Hawkins <phawkins@google.com>
-
jheitmann authored
-
th茅o gigant authored
* fix a typo in flax t5 attention * fix the typo in flax longt5 attention
-
Pavarissy authored
* Your commit message here * fix LlamaConfig docstring * run make fixup * fix formatting after review reformat of the file to prevent script issues * rerun make fixup after reformat
-
Tuowei Wang authored
-
jiqing-feng authored
* control first downsample stride * reduce first only works for ResNetBottleNeckLayer * fix param name * fix style
-
- 09 Oct, 2023 10 commits
-
-
Isaac Chung authored
fix docstrings for vanilla clip
-
Lysandre Debut authored
* Fix stale bot * Comments
-
Alex Bzdel authored
* removed donutimageprocessor from objects_to_ignore * added docstring for donutimageprocessor * readding donut file * moved docstring to correct location
-
Isaac Chung authored
fix docstring for CLIPImageProcessor
-
Isaac Chung authored
* fix docstrings for CLIP configs * black formatted
-
tom white authored
* fix typos in idefics.md Two typos found in reviewing this documentation. 1) max_new_tokens=4, is not sufficient to generate "Vegetables" as indicated - you will get only "Veget". (incidentally - some mention of how to select this value might be useful as it seems to change in each example) 2) inputs = processor(prompts, return_tensors="pt").to(device) as inputs need to be on the same device (as they are in all other examples on the page) * Update idefics.md Change device to cuda explicitly to match other examples
-
Yih-Dar authored
fix avoid oom Co-authored-by:ydshieh <ydshieh@users.noreply.github.com>
-
D. Carpintero authored
* fix OpenAI GPT, GPT-2 links * fix Llama2 link
-
Shreyas S authored
Update test_integration.py Fixed malapropism clone>copy
-
NielsRogge authored
* Convert checkpoints * Update doc test * Address comment
-
- 06 Oct, 2023 11 commits
-
-
Jabasukuriputo Wang authored
-
Yih-Dar authored
example fix docstring Co-authored-by:ydshieh <ydshieh@users.noreply.github.com>
-
Arthur authored
* make sure eos and bos are properly handled for fast tokenizer * fix code llama as well * nits * fix the conversion script as well * fix failing test
-
statelesshz authored
* remove SharedDDP as it was drepracated * apply review suggestion * make style * Oops,forgot to remove the compute_loss context manager in Seq2SeqTrainer. * remove the unnecessary conditional statement * keep the logic of IPEX * clean code * mix precision setup & make fixup --------- Co-authored-by:statelesshz <jihuazhong1@huawei.com>
-
Yih-Dar authored
* fix * fix * Fix * Fix --------- Co-authored-by:ydshieh <ydshieh@users.noreply.github.com>
-
rui-ren authored
-
Matt authored
-
fxmarty authored
* remove unnecessary unsqueeze-squeeze in llama * correct other models * fix * revert gpt_neox_japanese * fix copie * fix test
-
Tianqi Liu authored
* Update tokenization_code_llama_fast.py * Update test_tokenization_code_llama.py * Update test_tokenization_code_llama.py
-
Towdo authored
-
Ramiro Leal-Cavazos authored
* Remove unnecessary `view` of `position_ids` in `modeling_llama` When `position_ids` is `None`, its value is generated using `torch.arange`, which creates a tensor of size `(seq_length + past_key_values_length) - past_key_values_length = seq_length`. The tensor is then unsqueezed, resulting in a tensor of shape `(1, seq_length)`. This means that the last `view` to a tensor of shape `(-1, seq_length)` is a no-op. This commit removes the unnecessary view. * Remove no-op `view` of `position_ids` in rest of transformer models
-
- 05 Oct, 2023 5 commits
-
-
Yih-Dar authored
Fix Co-authored-by:ydshieh <ydshieh@users.noreply.github.com>
-
Maria Khalusova authored
* build the table in index.md with links to the model_doc * removed list generation on index.md * fixed missing models * make style
-
Yih-Dar authored
Fix Co-authored-by:ydshieh <ydshieh@users.noreply.github.com>
-
eajechiloae authored
don't close clearml task if it was created externally
-
Marvin Gabler authored
* feat: close #26566, changed model & config files to accept arbitary in and out channels * updated docstrings * fix: linter error * fix: update Copy docstrings * fix: linter update * fix: rename num_channels_in to num_channels to prevent breaking changes * fix: make num_channels_out None per default * Update src/transformers/models/swin2sr/configuration_swin2sr.py Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * fix: update tests to include num_channels_out * fix:linter * fix: remove normalization with precomputed rgb values when #input_channels!=#output_channels --------- Co-authored-by:
marvingabler <marvingabler@outlook.de> Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com>
-