[PyTorch] Linear op avoids saving input tensor if weight grad is not needed (#1817)
* Linear op avoids saving input tensor if weight grad is not needed

Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Linear op forward avoids producing quantized tensors with unnecessary usages

Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Fix linter warnings

Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Avoid unnecessary usages in fused linear ops

Signed-off-by: Tim Moon <tmoon@nvidia.com>

---------

Signed-off-by: Tim Moon <tmoon@nvidia.com>
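The memory saving comes from the structure of the linear backward pass: the input gradient `dL/dX = dL/dY @ W` needs only the weight, while the input tensor `X` is consumed solely by the weight gradient `dL/dW = dL/dY^T @ X`. So when the weight is frozen, the forward pass can skip saving the input entirely. Below is a minimal sketch of the idea in plain PyTorch; `LinearNoInputSave` is a hypothetical illustration of the technique, not the actual TransformerEngine code from this PR.

```python
import torch

class LinearNoInputSave(torch.autograd.Function):
    """Hypothetical sketch: skip saving the input when dL/dW is not needed."""

    @staticmethod
    def forward(ctx, inp, weight):
        # The input is only consumed by the weight-gradient GEMM, so stash it
        # for backward only when the weight actually requires a gradient.
        ctx.weight_requires_grad = weight.requires_grad
        if ctx.weight_requires_grad:
            ctx.save_for_backward(inp, weight)
        else:
            ctx.save_for_backward(weight)
        return inp @ weight.t()

    @staticmethod
    def backward(ctx, grad_output):
        if ctx.weight_requires_grad:
            inp, weight = ctx.saved_tensors
            # dL/dW = dL/dY^T @ X (flatten any leading batch dims first)
            grad_weight = (
                grad_output.reshape(-1, grad_output.shape[-1]).t()
                @ inp.reshape(-1, inp.shape[-1])
            )
        else:
            (weight,) = ctx.saved_tensors
            grad_weight = None
        # dL/dX = dL/dY @ W needs only the weight, never the input.
        grad_input = grad_output @ weight
        return grad_input, grad_weight


# Usage: with a frozen weight, backward still works, but the input tensor
# was never saved, which reduces activation memory.
x = torch.randn(32, 8, requires_grad=True)
w = torch.randn(16, 8, requires_grad=False)  # frozen weight, e.g. during fine-tuning
y = LinearNoInputSave.apply(x, w)
y.sum().backward()
```

The same reasoning presumably motivates the second bullet: if the backward pass will never run the weight-gradient GEMM, the forward pass also need not produce the quantized copies of the input whose only purpose is to feed that GEMM.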