[PyTorch] Add record_stream and untyped_storage func op in QuantizedTensor (#2144)

* [PyTorch] Add record_stream and untyped_storage func op in QuantizedTensor

Signed-off-by: xiaoxi-wangfj <690912414@qq.com>

* Update transformer_engine/pytorch/tensor/float8_blockwise_tensor.py

Co-authored-by: Tim Moon <4406448+timmoon10@users.noreply.github.com>
Signed-off-by: xiaoxi-wangfj <690912414@qq.com>

* Update transformer_engine/pytorch/tensor/float8_blockwise_tensor.py

Co-authored-by: Tim Moon <4406448+timmoon10@users.noreply.github.com>
Signed-off-by: xiaoxi-wangfj <690912414@qq.com>

---------

Signed-off-by: xiaoxi-wangfj <690912414@qq.com>
Co-authored-by: Tim Moon <4406448+timmoon10@users.noreply.github.com>

Commit 81c363bf, authored Oct 17, 2025 by xiaoxi-wangfj, committed by GitHub Oct 16, 2025

Showing with 28 additions and 0 deletions

transformer_engine/pytorch/tensor/float8_blockwise_tensor.py +28 -0

--- a/transformer_engine/pytorch/tensor/float8_blockwise_tensor.py
+++ b/transformer_engine/pytorch/tensor/float8_blockwise_tensor.py
@@ -403,6 +403,21 @@ class Float8BlockwiseQTensor(Float8BlockwiseQTensorStorage, QuantizedTensor):
        # pylint: disable=missing-function-docstring
        return _ReshapeFunc.apply(self, shape)
+    def untyped_storage(self) -> torch.UntypedStorage:
+        """Return the underlying UntypedStorage of the FP8 data.
+        Note that FP8 block-scaled tensor may involve multiple
+        buffers: row-wise FP8 data, row-wise scales, column-wise FP8
+        data, column-wise scales. The UntypedStorage of the row-wise
+        FP8 data is returned if it exists, and otherwise the
+        UntypedStorage of the column-wise FP8 data.
+        """
+        data = self._rowwise_data if self._rowwise_data is not None else self._columnwise_data
+        if data is not None:
+            return data.untyped_storage()
+        return torch.UntypedStorage(0, device=self.device)
    @classmethod
    def __torch_dispatch__(cls, func, types, args, kwargs=None):
@@ -427,6 +442,19 @@ class Float8BlockwiseQTensor(Float8BlockwiseQTensorStorage, QuantizedTensor):
                )
            return Float8BlockwiseQTensor.make_like(tensor)
+        # record stream op
+        if func == torch.ops.aten.record_stream.default:
+            qt, stream = args
+            for t in (
+                qt._rowwise_data,
+                qt._columnwise_data,
+                qt._rowwise_scale_inv,
+                qt._columnwise_scale_inv,
+            ):
+                if t is not None and t.is_cuda:
+                    t.record_stream(stream)
+            return None
        # Default case
        return super().__torch_dispatch__(func, types, args, kwargs)
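
The two additions above follow the same pattern: a block-scaled FP8 tensor owns up to four optional buffers, so `untyped_storage` picks one representative buffer and `record_stream` fans out to every buffer that exists. The sketch below illustrates that control flow only. It is pure Python and does not use PyTorch or Transformer Engine; `FakeBuffer`, `FakeBlockwiseTensor`, `primary_buffer`, and `record_stream_all` are hypothetical stand-ins invented for this illustration, and the stream is represented by a plain string.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class FakeBuffer:
    """Hypothetical stand-in for a tensor buffer (not a real torch API)."""

    name: str
    is_cuda: bool = True
    recorded: List[str] = field(default_factory=list)

    def record_stream(self, stream: str) -> None:
        # A real tensor would notify the CUDA caching allocator here;
        # this stand-in only logs which stream was recorded.
        self.recorded.append(stream)


@dataclass
class FakeBlockwiseTensor:
    """Hypothetical container for the four optional buffers named in the diff."""

    rowwise_data: Optional[FakeBuffer] = None
    columnwise_data: Optional[FakeBuffer] = None
    rowwise_scale_inv: Optional[FakeBuffer] = None
    columnwise_scale_inv: Optional[FakeBuffer] = None


def primary_buffer(qt: FakeBlockwiseTensor) -> Optional[FakeBuffer]:
    """Same choice as untyped_storage: row-wise data first, else column-wise."""
    return qt.rowwise_data if qt.rowwise_data is not None else qt.columnwise_data


def record_stream_all(qt: FakeBlockwiseTensor, stream: str) -> List[str]:
    """Same loop as the record_stream branch: visit each present CUDA buffer."""
    visited = []
    for buf in (
        qt.rowwise_data,
        qt.columnwise_data,
        qt.rowwise_scale_inv,
        qt.columnwise_scale_inv,
    ):
        if buf is not None and buf.is_cuda:
            buf.record_stream(stream)
            visited.append(buf.name)
    return visited


# Usage: a tensor that only holds column-wise data and scales.
qt = FakeBlockwiseTensor(
    columnwise_data=FakeBuffer("col"),
    columnwise_scale_inv=FakeBuffer("col_scale"),
)
print(primary_buffer(qt).name)
print(record_stream_all(qt, "side_stream"))
```

In the real class, recording only the representative buffer would leave the scale buffers and the other data layout free for reuse by the allocator while the side stream may still be reading them, which is presumably why the dispatch branch iterates over all four.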