- 01 Mar, 2024 10 commits
-
-
David Valente authored
* Correct zero division error in inverse sqrt scheduler * default timescale to 10_000
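The scheduler fix can be sketched as below; `inverse_sqrt_lambda` and its exact warmup/shift handling are a hypothetical stand-in for the library's real schedule, assuming the zero-division came from an unset timescale:

```python
import math

def inverse_sqrt_lambda(step, num_warmup_steps=0, timescale=None):
    # Hypothetical sketch: default the timescale to 10_000 when no
    # warmup is given, so a zero warmup no longer divides by zero.
    if timescale is None:
        timescale = num_warmup_steps or 10_000
    if step < num_warmup_steps:
        # linear warmup
        return step / max(1, num_warmup_steps)
    shift = timescale - num_warmup_steps
    # decay proportional to 1/sqrt(step) after warmup
    return 1.0 / math.sqrt((step + shift) / timescale)
```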
-
Zach Mueller authored
* Fix deprecated arg issue * Trainer check too * Check for dict or dataclass * Simplify, make config always AcceleratorConfig * Upstream to Trainer
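A minimal sketch of the "always AcceleratorConfig" normalization described above; the cut-down dataclass and field names here are illustrative, not the real `transformers` class:

```python
from dataclasses import dataclass

@dataclass
class AcceleratorConfig:
    # Cut-down stand-in for the Trainer's accelerator config.
    split_batches: bool = False
    even_batches: bool = True

def to_accelerator_config(value):
    # Accept None, a plain dict (e.g. parsed from JSON), or an
    # already-built dataclass, and always return an AcceleratorConfig.
    if value is None:
        return AcceleratorConfig()
    if isinstance(value, AcceleratorConfig):
        return value
    if isinstance(value, dict):
        return AcceleratorConfig(**value)
    raise TypeError(f"unsupported accelerator_config type: {type(value)!r}")
```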
-
Marc Sun authored
-
Jingya HUANG authored
enable subfolder
-
amyeroberts authored
* Fix yolos processing * Add back slow marker - protects for pycocotools in slow * Slow decorator goes above copied from header
-
Sanchit Gandhi authored
* [Whisper Tok] Update integration test * make style
-
Arthur authored
* use the generation config * fixup
-
Younes Belkada authored
* fix ESM 8bit * Apply suggestions from code review Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * fixup --------- Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com>
-
Leon Engländer authored
* LlamaForQuestionAnswering self.transformer->self.model * fix "Copied from" string * Llama QA model: set base_model_prefix = "transformer"
-
Song Fuchang authored
Expose `offload_buffers` parameter of `accelerate` to `PreTrainedModel.from_pretrained` method (#28755)
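The change amounts to threading one keyword through to accelerate's dispatch step; the stubs below only illustrate that plumbing and are not the real `from_pretrained` or `dispatch_model` signatures:

```python
def dispatch_model_stub(model, device_map=None, offload_buffers=False):
    # Stand-in for accelerate's dispatch call: just record its arguments.
    return {"device_map": device_map, "offload_buffers": offload_buffers}

def from_pretrained_stub(name, device_map=None, offload_buffers=False):
    # After the change, from_pretrained forwards offload_buffers so that
    # buffers (not only parameters) are offloaded along with the weights.
    model = object()
    return dispatch_model_stub(
        model, device_map=device_map, offload_buffers=offload_buffers
    )
```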
-
- 29 Feb, 2024 6 commits
-
-
Lucain authored
-
NielsRogge authored
Fix issue
-
Yih-Dar authored
* more fixes * more fixes --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Younes Belkada authored
Update test_modeling_llama.py
-
Younes Belkada authored
fix failing tests for peft integration
-
Younes Belkada authored
change starcoder2 path to correct one
-
- 28 Feb, 2024 14 commits
-
-
Michael authored
* [i18n-zh] Sync source/zh/index.md * apply review comments
-
fxmarty authored
* better unmask implementation * comment * typo * bug report pytorch * cleanup * fix import * add back example * retrigger ci * come on
-
Marc Sun authored
* [CI] Quantization workflow * build dockerfile * fix dockerfile * update self-scheduled.yml * test build dockerfile on push * fix torch install * update to python 3.10 * update aqlm version * uncomment build dockerfile * tests if the scheduler works * fix docker * do not trigger on push again * add additional runs * test again * all good * style * Update .github/workflows/self-scheduled.yml Co-authored-by:
Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * test build dockerfile with torch 2.2.0 * fix extra * clean * revert changes * Revert "revert changes" This reverts commit 4cb52b8822da9d1786a821a33e867e4fcc00d8fd. * revert correct change --------- Co-authored-by:
Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
-
jiqing-feng authored
Co-authored-by: Joao Gante <joao@huggingface.co>
-
Daniel Han authored
* Update modeling_llama.py Llama - Force float32 since bfloat16 loses precision on long contexts * Update modeling_llama.py * Update modeling_gemma.py Fix RoPE and logits.float() * @torch.no_grad() * @torch.no_grad() * Cos, Sin to float32 * cos, sin to float32 * Update src/transformers/models/gemma/modeling_gemma.py Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/models/llama/modeling_llama.py Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * Resolve PR conflicts * Fix RoPE for llama * Revert "Fix RoPE for llama" This reverts commit b860a22dab9bb01cd15cb9a3220abeaefad3e458. * Fix RoPE for llama * RoPE device * Autocast device type * RoPE * RoPE isinstance --------- Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com>
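The core of the precision fix above can be sketched in plain Python: compute the rotary (RoPE) angles at full precision and only downcast the resulting cos/sin tables at the end; `rope_cos_sin` and its signature are illustrative, not the model code:

```python
import math

def rope_cos_sin(position, dim, base=10000.0):
    # Angles are computed at full precision; in bfloat16, nearby
    # positions at long context lengths collapse to the same angle,
    # which is the precision loss the fix avoids.
    inv_freq = [base ** (-(2 * i) / dim) for i in range(dim // 2)]
    angles = [position * f for f in inv_freq]
    cos = [math.cos(a) for a in angles]
    sin = [math.sin(a) for a in angles]
    return cos, sin
```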
-
Joao Gante authored
-
Leonardo Emili authored
* Set output_router_logits=False in prepare_inputs_for_generation for mixtral * Add output_router_logits=False to prepare_inputs_for_generation for mixtral * Fix style
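A hedged sketch of the Mixtral change; the function below is a simplified stand-in for the model's `prepare_inputs_for_generation`. Router logits only feed the auxiliary load-balancing loss during training, so generation turns them off to save memory:

```python
def prepare_inputs_for_generation(input_ids, past_key_values=None, **kwargs):
    # Simplified stand-in: with a cache, only the last token is fed.
    if past_key_values is not None:
        input_ids = [row[-1:] for row in input_ids]
    return {
        "input_ids": input_ids,
        "past_key_values": past_key_values,
        # Router logits are only needed for the load-balancing loss
        # during training; disable them at inference.
        "output_router_logits": False,
    }
```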
-
Arthur authored
* remove control flow * update gptneox * update .... * nits * Actually let's just break. Otherwise we are silently failing which imo is not optimal * version BC * fix tests * fix eager causal * nit * add a test * style * nits * nits * more nits for the test * update and fix * make sure cuda graphs are not skipped * read token is needed for meta llama * update! * fixup * compile test should be slow * fix the fix-copies * style
-
Arthur authored
* remove warning * add co-author * update --------- Co-authored-by: hiaoxui <hiaoxui@users.noreply.github.com>
-
Arthur authored
fix wrapper
-
fxmarty authored
* remove numpy usage from owlvit * fix init owlv2 * style
-
Younes Belkada authored
* put hf token in gemma tests * update suggestion * add to flax * revert * fix * fixup * forward contrib credits from discussion --------- Co-authored-by: ArthurZucker <ArthurZucker@users.noreply.github.com>
-
Jared Van Bortel authored
-
RaymondLi0 authored
* Copy model * changes * misc * fixes * add embed and residual dropout (#30) * misc * remove rms norm and gated MLP * remove copied mentions where its not a copy anymore * remove unused _shape * copied from mistral instead * fix copies * fix copies * add not doctested * fix * fix copyright * Update docs/source/en/model_doc/starcoder2.md Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/models/starcoder2/configuration_starcoder2.py Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/models/starcoder2/configuration_starcoder2.py Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * fix doc * revert some changes * add fa2 tests * fix styling nit * fix * push dummy docs --------- Co-authored-by:
Joel Lamy-Poirier <joel.lamy-poirier@servicenow.com> Co-authored-by:
younesbelkada <younesbelkada@gmail.com> Co-authored-by:
Younes Belkada <49240599+younesbelkada@users.noreply.github.com> Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com>
-
- 27 Feb, 2024 10 commits
-
-
Michael authored
* [i18n-zh] Translate fsdp.md into Chinese Signed-off-by:
windsonsea <haifeng.yao@daocloud.io> * apply suggestions from Fan-Lin --------- Signed-off-by:
windsonsea <haifeng.yao@daocloud.io>
-
Sadra Barikbin authored
Co-authored-by: Joao Gante <joao@huggingface.co>
-
Raushan Turganbay authored
-
Marc Sun authored
* Add compatibility with mps device * fix * typo and style
-
Yih-Dar authored
update Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Fanli Lin authored
* add xpu for benchmark * no auto_map * use require_torch_gpu * use gpu * revert * revert * fix style
-
fxmarty authored
fix
-
Merve Noyan authored
* Image Feature Extraction docs * Update docs/source/en/tasks/image_feature_extraction.md Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update image_feature_extraction.md * Update docs/source/en/tasks/image_feature_extraction.md Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update docs/source/en/tasks/image_feature_extraction.md Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Address comments * Update docs/source/en/tasks/image_feature_extraction.md Co-authored-by:
Maria Khalusova <kafooster@gmail.com> * Update docs/source/en/tasks/image_feature_extraction.md Co-authored-by:
Maria Khalusova <kafooster@gmail.com> * Update docs/source/en/tasks/image_feature_extraction.md Co-authored-by:
Maria Khalusova <kafooster@gmail.com> * Update docs/source/en/tasks/image_feature_extraction.md Co-authored-by:
Maria Khalusova <kafooster@gmail.com> * Update docs/source/en/tasks/image_feature_extraction.md Co-authored-by:
Maria Khalusova <kafooster@gmail.com> * Update docs/source/en/tasks/image_feature_extraction.md Co-authored-by:
Maria Khalusova <kafooster@gmail.com> * Update docs/source/en/tasks/image_feature_extraction.md Co-authored-by:
Maria Khalusova <kafooster@gmail.com> * Update docs/source/en/tasks/image_feature_extraction.md Co-authored-by:
Maria Khalusova <kafooster@gmail.com> * Update image_feature_extraction.md * Update image_feature_extraction.md Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> Co-authored-by:
Maria Khalusova <kafooster@gmail.com>
-
Andrei Panferov authored
Cleaner Cache `dtype` and `device` extraction for CUDA graph generation for quantizers compatibility (#29079) * input_layernorm as the beacon of hope * cleaner dtype extraction * AQLM + CUDA graph test * is available check * shorter text test
-
regisss authored
-