- 14 Dec, 2023 1 commit
-
-
Arthur authored
-
- 13 Dec, 2023 10 commits
-
-
Marc Sun authored
* add inside_layer_modules arg * fix * change to modules_to_quantize_inside_block * fix * remane again * Apply suggestions from code review Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * better docsting * fix again with less explanation * Update src/transformers/utils/quantization_config.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * style --------- Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
Rockerz authored
* upfaste * Update * Update docs/source/ja/model_doc/deformable_detr.md Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/ja/model_doc/data2vec.md Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/ja/model_doc/cvt.md Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * add suggestions * Toctree update * remove git references * Update docs/source/ja/_toctree.yml Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/ja/model_doc/decision_transformer.md Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> --------- Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com>
-
Lysandre authored
-
Aaron Jimenez authored
* Add glossary to es/_toctree.yml * Add glossary.md to es/ * A section translated * B and C section translated * Fix typo in en/glossary.md C section * D section translated | Add a extra line in en/glossary.md * E and F section translated | Fix typo in en/glossary.md * Fix words preentrenado * H and I section translated | Fix typo in en/glossary.md * L section translated * M and N section translated * P section translated * R section translated * S section translated * T section translated * U and Z section translated | Fix TensorParallel link in both files * Fix word
-
Zach Mueller authored
* Fix bug * Write test * Keep back old modification for grad accum steps * Whitespace... * Whitespace again * Race condition * Wait for everyone
-
Arthur authored
* fix expected values * style * test is slow
-
Arindam Jati authored
* fix slow tests * revert formatting --------- Co-authored-by:
Arindam Jati <arindam.jati@ibm.com> Co-authored-by:
Kashif Rasul <kashif.rasul@gmail.com>
-
Younes Belkada authored
* v1 * add-new-model-like * revert * fix forward and conversion script * revert * fix copies * fixup * fix * Update docs/source/en/index.md * Apply suggestions from code review * push * fix * fixes here and there * up * fixup and fix tests * Apply suggestions from code review * add docs * fixup * fixes * docstring * add docstring * fixup * docstring * fixup * nit * docs * more copies * fix copies * nit * update test
-
Arthur authored
* [`Whisper`] raise better erros fixes #27893 * update torch as well
-
Arthur authored
* nits * nits * actual fix * style * ze fix * fix fix fix style
-
- 12 Dec, 2023 7 commits
-
-
Dave Berenbaum authored
-
Stas Bekman authored
-
fxmarty authored
* fix sdpa with non-contiguous inputs for gpt_bigcode * fix other archs * add currently comment * format
-
Matt authored
* Improve the error printed when loading an unrecognized architecture * Improve the error printed when loading an unrecognized architecture * Raise a ValueError instead because KeyError prints weirdly * make fixup
-
saswatmeher authored
Update the link for vision encoder decoder doc used by FlaxVisionEncoderDecoderModel link.
-
Arthur authored
* fix loss computation * compute on GPU if possible
-
Joao Gante authored
Co-authored-by:Arthur <48595927+ArthurZucker@users.noreply.github.com>
-
- 11 Dec, 2023 22 commits
-
-
Anthony Susevski authored
* fixed typos (issue 27919) * Update docs/source/en/tasks/knowledge_distillation_for_image_classification.md Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
dancingpipi authored
* Support PeftModel signature inspect * Use get_base_model() to get the base model --------- Co-authored-by:shujunhua1 <shujunhua1@jd.com>
-
Steven Liu authored
streamline
-
NielsRogge authored
Update formats
-
Younes Belkada authored
up
-
Adam Louly authored
* fix no sequence length models error * block size check --------- Co-authored-by: Adam Louly <adamlouly@microsoft.com@orttrainingdev9.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>
-
Ashish Tawari authored
Update modeling_timesformer.py Fixing typo to correct the stochastic depth decay rule
-
Chenhao Xu authored
fix bug: cost matrix is infeasible
-
rjenc29 authored
* fix a typo and add an illustrative test * appease black * reduce code duplication and add Annotion type back with a pending deprecation warning * remove unused code * change warning type * black formatting fix * change enum deprecation approach to support 3.8 and earlier * add stacklevel * fix black issue * fix ruff issues * fix ruff issues * move tests to own mixin * include yolos * fix black formatting issue * fix black formatting issue * use logger instead of warnings and include target version for deprecation
-
Ella Charlaix authored
* add deepspeed scheduled test for amd * fix image * add dockerfile * add comment * enable tests * trigger * remove trigger for this branch * trigger * change runner env to trigger the docker build image test * use new docker image * remove test suffix from docker image tag * replace test docker image with original image * push new image * Trigger * add back amd tests * fix typo * add amd tests back * fix * comment until docker image build scheduled test fix * remove deprecated deepspeed build option * upgrade torch * update docker & make tests pass * Update docker/transformers-pytorch-deepspeed-amd-gpu/Dockerfile * fix * tmp disable test * precompile deepspeed to avoid timeout during tests * fix comment * trigger deepspeed tests with new image * comment tests * trigger * add sklearn dependency to fix slow tests * enable back other tests * final update --------- Co-authored-by:
Felix Marty <felix@hf.co> Co-authored-by:
F茅lix Marty <9808326+fxmarty@users.noreply.github.com> Co-authored-by:
ydshieh <ydshieh@users.noreply.github.com>
-
Yih-Dar authored
fix Co-authored-by:ydshieh <ydshieh@users.noreply.github.com>
-
Peter G枚tz authored
"text input must of type" -> "text input must be of type"
-
Timon K盲ch authored
fix parameter count in readme
-
NielsRogge authored
* Update import message * Update message
-
Zach Mueller authored
* Fix test for multi-GPU * WIth CPU handle
-
Merve Noyan authored
* Initial commit for AutoBackbone & Backbone * Added timm and clarified out_indices * Swapped the example to out_indices * fix toctree * Update autoclass_tutorial.md * Update backbones.md * Update autoclass_tutorial.md * Add dummy torch input instead * Add dummy torch input * Update autoclass_tutorial.md * Update backbones.md * minor fix * Update docs/source/en/main_classes/backbones.md Co-authored-by:
Maria Khalusova <kafooster@gmail.com> * Update docs/source/en/autoclass_tutorial.md Co-authored-by:
Maria Khalusova <kafooster@gmail.com> * Added illustrations and explained backbone & neck * Update docs/source/en/main_classes/backbones.md Co-authored-by:
Maria Khalusova <kafooster@gmail.com> * Update backbones.md --------- Co-authored-by:
Maria Khalusova <kafooster@gmail.com>
-
YQ authored
* use logger.warning_once to avoid massive outputs when training/finetuning longformer * update more
-
vijaye12 authored
* docstring corrections * style make --------- Co-authored-by:vijaye12 <vijaye12@in.ibm.com>
-
Arthur authored
* up * up * test * logits ok * up * up * few fixes * conversion script * up * nits * nits * update * nuke * more updates * nites * fix many issues * nit * scatter * nit * nuke megablocks * nits * fix conversion script * nit * remove * nits * nit * update * oupsssss * change * nits device * nits * fixup * update * merge * add copied from * fix the copy mentions * update tests * more fixes * nits * conversion script * add parts of the readme * Update tests/models/mixtral/test_modeling_mixtral.py Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * new test + conversion script * Apply suggestions from code review Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * Apply suggestions from code review * fix * fix copies * fix copies * ooops * fix config * Apply suggestions from code review * fix nits * nit * add copies * add batched tests * docs * fix flash attention * let's add more verbose * add correct outputs * support router ouptus * ignore copies where needed * fix * cat list if list is given for now * nits * Update docs/source/en/model_doc/mixtral.md * finish router refactoring * fix forward * fix expected values * nits * fixup * fix * fix bug * fix * fix dtype mismatch * fix * grrr grrr I support item assignment * fix CI * docs * fixup * remove some copied form * fix weird diff * skip doctest fast on the config and modeling * mark that is supports flash attention in the doc * update * Update src/transformers/models/mixtral/modeling_mixtral.py Co-authored-by:
Lysandre Debut <hi@lysand.re> * Update docs/source/en/model_doc/mixtral.md Co-authored-by:
Lysandre Debut <hi@lysand.re> * revert router logits config issue * update doc accordingly * Update src/transformers/models/mixtral/convert_mixtral_weights_to_hf.py * nits * use torch testing asssert close * fixup * doc nits --------- Co-authored-by:
younesbelkada <younesbelkada@gmail.com> Co-authored-by:
Younes Belkada <49240599+younesbelkada@users.noreply.github.com> Co-authored-by:
Lysandre Debut <hi@lysand.re>
-
Arthur authored
* Skip nn.Module.reset_parameters * Actually skip * Check quality * Maybe change all inits * Fix init issues: only modify public functions * Add a small test for now * Style * test updates * style * nice tes * style * make it even faster * one more second * remove fx icompatible * Update tests/test_modeling_common.py Co-authored-by:
Lysandre Debut <hi@lysand.re> * Update tests/test_modeling_common.py Co-authored-by:
Lysandre Debut <hi@lysand.re> * skip * fix quality * protect the import --------- Co-authored-by:
Lysandre Debut <hi@lysand.re>
-
fxmarty authored
fix sdpa dispatch
-
NielsRogge authored
* More improvements * Improve variable names * Update READMEs, improve docs
-