"vllm/vscode:/vscode.git/clone" did not exist on "c06170cc8e324f4fe6a0c26b57d09e8c958e11bc"
[PyTorch] Add option in activation ops to cache input in FP8 (#1665)
* Add option to cache activation input in FP8

Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Avoid casting to FP8 transpose

Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Skip input caching if device is not supported

Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Add documentation that FP8 input caching is experimental

Signed-off-by: Tim Moon <tmoon@nvidia.com>

---------

Signed-off-by: Tim Moon <tmoon@nvidia.com>