- 23 Feb, 2024 6 commits
-
-
Yih-Dar authored
* Use torch 2.2 for daily CI (model tests) * update * update --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Matt authored
* stash commit * stash commit * It works! * Remove unnecessary change * We don't actually need the cache_dir! * Update docstring * Add test * Add test with custom cache dir too * Update model repo path
-
Arthur authored
* update model doc qwen2 * Update docs/source/en/model_doc/qwen2.md Co-authored-by:
Younes Belkada <49240599+younesbelkada@users.noreply.github.com> --------- Co-authored-by:
Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
-
Alessandro Palla authored
* Fix issue 29206 * Fix style
-
Amin authored
* Fix missing translation in README_ru * Update README_ru.md Co-authored-by:
Maria Khalusova <kafooster@gmail.com> --------- Co-authored-by:
Maria Khalusova <kafooster@gmail.com>
-
cchen-dialpad authored
* fix(mlflow): check mlflow version to use the flag * fix indent * add log_params async and fix quality
-
- 22 Feb, 2024 3 commits
-
-
fxmarty authored
* fix torch.export.export for llama * do not change doc title * make fix copies
-
NielsRogge authored
* Improve docs * Improve chat template
-
Sanchit Gandhi authored
* fix modelling code * add tests * fix tests * add some logit tests * style * fix fix
-
- 21 Feb, 2024 8 commits
-
-
Andrei Panferov authored
* training version check * warn old aqlm * aqlm 1.0.2 real * docs
-
Younes Belkada authored
fix bad rebase
-
Arthur authored
* inital commit * update * update conversion checkpoint * update conversion script * nits * some fixes * nits * merge * fix permute * nits * fix * nits * nits * nits * fix rope * fix both rope * nites * style * make sure flax works * fix flax init code * fix foward * nits * print flax generation out * current code * nits * SIIIIIIIIIIIIIIIIIII * update * add new tokenizer * correct fast tokenizer * fix conversion * more comments * fix modeling and conversion * nits and nits * nits testing * add some tokenization tests * add some edge cases * add slow tests and fix them * fixup * fix copies for modeling * fix copies * add 7B slow tests * fix * fix * fix tests * make tokenizer cis go green * styling * last tokenizer nits * update jax tests * fix flax for 7b * add jit testing
* cleanups * isolated nit, inv_freq for rotary_emb.inv_freq * propagate to jax * Apply suggestions from code review Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * adjust test * fix conversion script * change name * correct file names * update conversion script * Fix bos and eos token ids in the model configuration (#3) * update modelling * update conversion script * add static cache for gemma * fix sdpa generate * fix batched * multiple fixes * fix FA2 * final fix * Rename a few missing strings and filenames (#4) * merge with upstream main * fix copies * fix copies * fix fixup * fix fixup * fix * fix * final tests * fix fx gemma tests * fix fx bf16/fp16 tests * update slow fx tests * fx slow tests: one logits, one generation * move jit test standalone * Apply suggestions from code review * nits * tokenizer updates * more tokenization updates: custom GemmaSentencepieceExtrator * style * Update src/transformers/cache_utils.py * Update src/transformers/models/gemma/__init__.py * Update tests/models/gemma/test_modeling_flax_gemma.py * small nits * style * update tokenization test * fix the rotary embedding * with style * fix slow tests * WARNING this commit might be very important for precisions * Update tests/models/gemma/test_modeling_flax_gemma.py * Update src/transformers/models/gemma/configuration_gemma.py Co-authored-by:
Lysandre Debut <hi@lysand.re> * Update src/transformers/models/gemma/modeling_flax_gemma.py Co-authored-by:
Lysandre Debut <hi@lysand.re> * small nits here and there! * forgotten nit * remove on the fly computation of inv_freq * revert previous change, let's be safe and for now re-compute freq cis to make sure it's in float * Apply suggestions from code review Co-authored-by:
Pedro Cuenca <pedro@huggingface.co> * Update src/transformers/models/gemma/convert_gemma_weights_to_hf.py Co-authored-by:
Pedro Cuenca <pedro@huggingface.co> * Update src/transformers/models/gemma/convert_gemma_weights_to_hf.py Co-authored-by:
Pedro Cuenca <pedro@huggingface.co> * Update tests/models/gemma/test_modeling_gemma.py Co-authored-by:
Pedro Cuenca <pedro@huggingface.co> * Update tests/models/gemma/test_modeling_gemma.py Co-authored-by:
Pedro Cuenca <pedro@huggingface.co> * Update tests/models/gemma/test_modeling_gemma.py Co-authored-by:
Pedro Cuenca <pedro@huggingface.co> * Update tests/models/gemma/test_modeling_flax_gemma.py Co-authored-by:
Pedro Cuenca <pedro@huggingface.co> * Update tests/models/gemma/test_modeling_gemma.py Co-authored-by:
Pedro Cuenca <pedro@huggingface.co> * Update tests/models/gemma/test_modeling_gemma.py Co-authored-by:
Pedro Cuenca <pedro@huggingface.co> * Update tests/models/gemma/test_tokenization_gemma.py Co-authored-by:
Pedro Cuenca <pedro@huggingface.co> * Update tests/models/gemma/test_tokenization_gemma.py Co-authored-by:
Pedro Cuenca <pedro@huggingface.co> * Update tests/models/gemma/test_tokenization_gemma.py Co-authored-by:
Pedro Cuenca <pedro@huggingface.co> * Update tests/models/gemma/test_tokenization_gemma.py Co-authored-by:
Pedro Cuenca <pedro@huggingface.co> * Update tests/models/gemma/test_modeling_gemma.py Co-authored-by:
Pedro Cuenca <pedro@huggingface.co> * Update tests/models/gemma/test_modeling_gemma.py Co-authored-by:
Pedro Cuenca <pedro@huggingface.co> * Update tests/models/gemma/test_modeling_gemma.py Co-authored-by:
Pedro Cuenca <pedro@huggingface.co> * Update tests/models/gemma/test_modeling_gemma.py Co-authored-by:
Pedro Cuenca <pedro@huggingface.co> * Update tests/models/gemma/test_modeling_gemma.py Co-authored-by:
Pedro Cuenca <pedro@huggingface.co> * nit conversion script link * fix some tests * add not doctest and pr doctest * repo consistency * fix last CIs
🚀 * update all readmes --------- Co-authored-by: younesbelkada <younesbelkada@gmail.com> Co-authored-by:
Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> Co-authored-by:
Pedro Cuenca <pedro@huggingface.co> Co-authored-by:
Younes Belkada <49240599+younesbelkada@users.noreply.github.com> Co-authored-by:
sanchit-gandhi <sanchit@huggingface.co> Co-authored-by:
Lysandre Debut <hi@lysand.re>
-
amyeroberts authored
Safe getattr
-
Ekaterina Aidova authored
* support SDPA Attention in stablelm * add integration test * add fallback for output_attentions * Update src/transformers/models/stablelm/modeling_stablelm.py Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update tests/models/stablelm/test_modeling_stablelm.py Co-authored-by:
Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * Update src/transformers/models/stablelm/modeling_stablelm.py Co-authored-by:
Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * handle non-contiguous states --------- Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by:
Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
-
fxmarty authored
* fix compatibility * working version * cleanup * sanity checks * more sanity * working version WITH refactor * working without API change * cleanup & tests pass * more cleaning * fix test * fix tests * Update src/transformers/generation/utils.py Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * smaller comment * update comment * update comment --------- Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com>
-
Joao Gante authored
-
Arthur Zucker authored
-
- 20 Feb, 2024 20 commits
-
-
amyeroberts authored
* Add pool option * PR comments - error message and exact outputs check
-
Fernando Pérez-García authored
Fix drop path not being used
-
Gustavo Isturiz authored
added image_captioning version in es and included in toctree file
-
Joao Gante authored
-
Pablo Montalvo authored
* draft processor arg capture * add missing vivit model * add new common test for image preprocess signature * fix quality * fix up * add back missing validations * quality * move info level to warning for unused kwargs
-
JB (Don) authored
-
Yih-Dar authored
nice job Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Taylor Jackle Spriggs authored
* add support for siglip and chinese-clip model training with contrastive-image-text example * codebase fixups
-
amyeroberts authored
* Revert "Add tie_weights() to LM heads and set bias in set_output_embeddings() (#28948)" This reverts commit 725f4ad1. * Revert "Patch to skip failing `test_save_load_low_cpu_mem_usage` tests (#29043)" This reverts commit 4156f517.
-
Arthur authored
* add add_dummy_prefix_space option to slow * checking kwargs might be better. Should be there for all spm tokenizer IMO * nits * fix copies * more copied * nits * add prefix space * nit * nits * Update src/transformers/convert_slow_tokenizer.py * fix inti * revert wrong styling * fix * nits * style * updates * make sure we use slow tokenizer for conversion instead of looking for the decoder * support llama ast well * update llama tokenizer fast * nits * nits nits nits * update the doc * update * update to fix tests * skip unrelated tailing test * Update src/transformers/convert_slow_tokenizer.py * add proper testing * test decode as well * more testing * format * fix llama test * Apply suggestions from code review
-
Younes Belkada authored
* handle peft + compiled models * add tests * fixup * adapt from suggestions * clarify comment
-
Arthur authored
* only compile when needed * fix mra as well * fix yoso as well * update * rempve comment * Update src/transformers/models/deformable_detr/modeling_deformable_detr.py * Update src/transformers/models/deformable_detr/modeling_deformable_detr.py * opps * Update src/transformers/models/deta/modeling_deta.py * nit
-
Joao Gante authored
-
Joao Gante authored
-
Younes Belkada authored
* forgot to push the changes for 4bit .. * trigger CI
-
Pablo Montalvo authored
* abstract image processor arg checks. * fix signatures and quality * add validate_ method to rescale-prone processors * add more validations * quality * quality * fix formatting Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * fix formatting Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * fix formatting Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Fix formatting mishap Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * fix crop_size compatibility * fix default mutable arg * fix segmentation map + image arg validity * remove segmentation check from arg validation * fix quality * fix missing segmap * protect PILImageResampling type * Apply suggestions from code review Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * add back segmentation maps check --------- Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
Younes Belkada authored
* add RMSProp to Trainer * revert some change * Update src/transformers/trainer.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
Erich Schubert authored
Move misplaced line, improve code comment
-
Arthur authored
* default to use it * style
-
Nilesh authored
* Fixed nll with label_smoothing to nll * Resolved conflict by rebase * Fixed nll with label_smoothing to nll * Resolved conflict by rebase * Added label_smoothing to config file * Fixed nits
-
- 19 Feb, 2024 3 commits
-
-
Shijie Wu authored
* report grad_norm during training * support getting grad_norm from deepspeed
-
Sadra Barikbin authored
* Update base.py * Fix a typo
-
Titus authored
* generated text on A10G * generated text in CI * Apply suggestions from code review add explanatory comments Co-authored-by:
Younes Belkada <49240599+younesbelkada@users.noreply.github.com> --------- Co-authored-by:
Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
-