- 22 Apr, 2024 1 commit
-
-
zhong zhuang authored
* [FEAT]: EETQ quantizer support * Update quantization.md * Update docs/source/en/main_classes/quantization.md Co-authored-by:
Marc Sun <57196510+SunMarc@users.noreply.github.com> * Update docs/source/en/quantization.md Co-authored-by:
Marc Sun <57196510+SunMarc@users.noreply.github.com> * Update docs/source/en/quantization.md Co-authored-by:
Marc Sun <57196510+SunMarc@users.noreply.github.com> * Update src/transformers/integrations/__init__.py Co-authored-by:
Marc Sun <57196510+SunMarc@users.noreply.github.com> * Update src/transformers/integrations/__init__.py Co-authored-by:
Marc Sun <57196510+SunMarc@users.noreply.github.com> * Update src/transformers/integrations/eetq.py Co-authored-by:
Marc Sun <57196510+SunMarc@users.noreply.github.com> * Update src/transformers/integrations/eetq.py Co-authored-by:
Marc Sun <57196510+SunMarc@users.noreply.github.com> * Update src/transformers/integrations/eetq.py Co-authored-by:
Marc Sun <57196510+SunMarc@users.noreply.github.com> * Update tests/quantization/eetq_integration/test_eetq.py Co-authored-by:
Marc Sun <57196510+SunMarc@users.noreply.github.com> * Update src/transformers/quantizers/auto.py Co-authored-by:
Marc Sun <57196510+SunMarc@users.noreply.github.com> * Update src/transformers/quantizers/auto.py Co-authored-by:
Marc Sun <57196510+SunMarc@users.noreply.github.com> * Update src/transformers/quantizers/auto.py Co-authored-by:
Marc Sun <57196510+SunMarc@users.noreply.github.com> * Update src/transformers/quantizers/quantizer_eetq.py Co-authored-by:
Marc Sun <57196510+SunMarc@users.noreply.github.com> * Update tests/quantization/eetq_integration/test_eetq.py Co-authored-by:
Marc Sun <57196510+SunMarc@users.noreply.github.com> * Update src/transformers/quantizers/quantizer_eetq.py Co-authored-by:
Marc Sun <57196510+SunMarc@users.noreply.github.com> * Update tests/quantization/eetq_integration/test_eetq.py Co-authored-by:
Marc Sun <57196510+SunMarc@users.noreply.github.com> * Update tests/quantization/eetq_integration/test_eetq.py Co-authored-by:
Marc Sun <57196510+SunMarc@users.noreply.github.com> * [FEAT]: EETQ quantizer support * [FEAT]: EETQ quantizer support * remove whitespaces * update quantization.md * style * Update docs/source/en/quantization.md Co-authored-by:
Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * add copyright * Update quantization.md * Update docs/source/en/quantization.md Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update docs/source/en/quantization.md Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Address the comments by amyeroberts * style --------- Co-authored-by:
Marc Sun <57196510+SunMarc@users.noreply.github.com> Co-authored-by:
Marc Sun <marc@huggingface.co> Co-authored-by:
Younes Belkada <49240599+younesbelkada@users.noreply.github.com> Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
- 09 Apr, 2024 1 commit
-
-
Marc Sun authored
* revert back to torch 2.1.1 * run test * switch to torch 2.2.1 * udapte dockerfile * fix awq tests * fix test * run quanto tests * update tests * split quantization tests * fix * fix again * final fix * fix report artifact * build docker again * Revert "build docker again" This reverts commit 399a5f9d9308da071d79034f238c719de0f3532e. * debug * revert * style * new notification system * testing notfication * rebuild docker * fix_prev_ci_results * typo * remove warning * fix typo * fix artifact name * debug * issue fixed * debug again * fix * fix time * test notif with faling test * typo * issues again * final fix ? * run all quantization tests again * remove name to clear space * revert modfiication done on workflow * fix * build docker * build only quant docker * fix quantization ci * fix * fix report * better quantization_matrix * add print * revert to the basic one
-
- 15 Mar, 2024 1 commit
-
-
Marc Sun authored
* start integration * fix * add and debug tests * update tests * make pytorch serialization works * compatible with device_map and offload * fix tests * make style * add ref * guard against safetensors * add float8 and style * fix is_serializable * Fix shard_checkpoint compatibility with quanto * more tests * docs * adjust memory * better * style * pass tests * Update src/transformers/modeling_utils.py Co-authored-by:
Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * add is_safe_serialization instead * Update src/transformers/quantizers/quantizer_quanto.py Co-authored-by:
Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * add QbitsTensor tests * fix tests * simplify activation list * Update docs/source/en/quantization.md Co-authored-by:
David Corvoysier <david.corvoysier@gmail.com> * better comment * Update tests/quantization/quanto_integration/test_quanto.py Co-authored-by:
David Corvoysier <david.corvoysier@gmail.com> * Update tests/quantization/quanto_integration/test_quanto.py Co-authored-by:
David Corvoysier <david.corvoysier@gmail.com> * find and fix edge case * Update docs/source/en/quantization.md Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * pass weights_only_kwarg instead * fix shard_checkpoint loading * simplify update_missing_keys * Update tests/quantization/quanto_integration/test_quanto.py Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * recursion to get all tensors * block serialization * skip serialization tests * fix * change by cuda:0 for now * fix regression * update device_map * fix doc * add noteboon * update torch_dtype * update doc * typo * typo * remove comm --------- Co-authored-by:
Younes Belkada <49240599+younesbelkada@users.noreply.github.com> Co-authored-by:
David Corvoysier <david.corvoysier@gmail.com> Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by:
Younes Belkada <younesbelkada@gmail.com>
-
- 05 Mar, 2024 1 commit
-
-
Ilyas Moutawwakil authored
* added exllama kernels support for awq models * doc * style * Update src/transformers/modeling_utils.py Co-authored-by:
Marc Sun <57196510+SunMarc@users.noreply.github.com> * refactor * moved exllama post init to after device dispatching * bump autoawq version * added exllama test * style * configurable exllama kernels * copy exllama_config from gptq * moved exllama version check to post init * moved to quantization dockerfile --------- Co-authored-by:
Marc Sun <57196510+SunMarc@users.noreply.github.com>
-
- 28 Feb, 2024 1 commit
-
-
Marc Sun authored
* [CI] Quantization workflow * build dockerfile * fix dockerfile * update self-cheduled.yml * test build dockerfile on push * fix torch install * udapte to python 3.10 * update aqlm version * uncomment build dockerfile * tests if the scheduler works * fix docker * do not trigger on psuh again * add additional runs * test again * all good * style * Update .github/workflows/self-scheduled.yml Co-authored-by:
Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * test build dockerfile with torch 2.2.0 * fix extra * clean * revert changes * Revert "revert changes" This reverts commit 4cb52b8822da9d1786a821a33e867e4fcc00d8fd. * revert correct change --------- Co-authored-by:
Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
-