Tim Moon authored
* Add base class for tensor proxies

Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Move tensor detaching logic to tensor proxy base class

Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Use Python wrappers to PyTorch extensions

Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Include transpose caching logic in proxy encode function

Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Debug dimension mismatch with amax history

Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Move dequantize logic to proxy_decode func

Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Rename to "QuantizedTensor"

Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Rename "proxy_detach" to "detach"

Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Include transpose cache in detach and clone funcs

Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Fix linter warnings

Signed-off-by: Tim Moon <tmoon@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update FP8 workspaces with QuantizedTensor functions

Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Move logic for FP8 transpose cache in FP8 workspaces to base class

Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Remove cast-transpose logic from linear op

Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Remove unnecessary args for Float8Tensor when using FP8 attr dict

Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Remove __torch_function__ to QuantizedTensor

Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Fix linter warnings

Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Update tests/pytorch/test_float8tensor.py

Signed-off-by: Tim Moon <4406448+timmoon10@users.noreply.github.com>

* Debug FP8 transpose test

Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Debug cast functions

Signed-off-by: Tim Moon <tmoon@nvidia.com>

---------

Signed-off-by: Tim Moon <tmoon@nvidia.com>
Signed-off-by: Tim Moon <4406448+timmoon10@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Kirthi Shankar Sivamani <ksivamani@nvidia.com>
2d57db8b