"...git@developer.sourcefind.cn:kecinstone/2024-pra-vllm.git" did not exist on "4b1ac23f53d0e714a4a48d2c8058438405c0fd07"
-
Evgeny Tsykunov authored
* Introduce QuantizerBase

Signed-off-by: Evgeny <etsykunov@nvidia.com>

* Expose as a first-class API

Signed-off-by: Evgeny <etsykunov@nvidia.com>

* Undo QuantizerBase

Signed-off-by: Evgeny <etsykunov@nvidia.com>

* Make Quantizer a base class without implementations

Signed-off-by: Evgeny <etsykunov@nvidia.com>

* Support CustomRecipe and CustomRecipeState

Signed-off-by: Evgeny <etsykunov@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Resolving comments: quantize impl, num_quantizers, defaults

Signed-off-by: Evgeny <etsykunov@nvidia.com>

* Quantizer factories

Signed-off-by: Evgeny <etsykunov@nvidia.com>

* Add tests

Signed-off-by: Evgeny <etsykunov@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* QuantizedTensorBase _get_quantizer() + quantize_()

Signed-off-by: Evgeny <etsykunov@nvidia.com>

* Experimental note + LayerNormMLP fix

Signed-off-by: Evgeny <etsykunov@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* tensor._internal -> tensor.base

Signed-off-by: Evgeny <etsykunov@nvidia.com>

* Expose

Signed-off-by: Evgeny <etsykunov@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Minor import fix

Signed-off-by: Evgeny <etsykunov@nvidia.com>

* Single quantizer factory with roles

Signed-off-by: Evgeny <etsykunov@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* More context for qfactory, fwd/bwd_roles

Signed-off-by: Evgeny <etsykunov@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Minor

Signed-off-by: Evgeny <etsykunov@nvidia.com>

* Rename *Base -> *Storage quantized tensors

Signed-off-by: Evgeny <etsykunov@nvidia.com>

* make_quantizers() will take roles from the operation

Signed-off-by: Evgeny <etsykunov@nvidia.com>

* Improve tests and fix missing imports

Signed-off-by: Evgeny <etsykunov@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Apply suggestions from code review

Signed-off-by: Tim Moon <4406448+timmoon10@users.noreply.github.com>

* Merge main followup

Signed-off-by: Evgeny <etsykunov@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: Evgeny <etsykunov@nvidia.com>
Signed-off-by: Evgeny Tsykunov <etsykunov@nvidia.com>
Signed-off-by: Tim Moon <4406448+timmoon10@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Tim Moon <4406448+timmoon10@users.noreply.github.com>
7022d50f