Perform an atomic maximum on the value stored at dst, with an optional memory order.
If memory_order is None, the runtime extern "AtomicMax" is called without an explicit memory-order id. Otherwise, the provided memory_order string is mapped to a numeric id via the module's memory-order map, and that id is passed to the extern.
Parameters:
dst (Buffer): Destination buffer/address to apply the atomic max.
value (PrimExpr): Value to compare/store atomically.
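The optional memory-order handling described above can be sketched in plain Python. This is only an illustration: `MEMORY_ORDER_IDS` and `build_atomic_max_args` are hypothetical names, and the numeric ids are assumed to follow the declaration order of C++ `std::memory_order`; the module's actual map may differ.

```python
# Hypothetical stand-in for the module's memory-order map. The ids are an
# assumption: they mirror the declaration order of C++ std::memory_order.
MEMORY_ORDER_IDS = {
    "relaxed": 0,
    "consume": 1,
    "acquire": 2,
    "release": 3,
    "acq_rel": 4,
    "seq_cst": 5,
}


def build_atomic_max_args(dst, value, memory_order=None):
    """Return the argument tuple an "AtomicMax" extern call would receive.

    Hypothetical helper: it models only the argument selection, not the
    actual extern call.
    """
    if memory_order is None:
        # No explicit memory-order id is passed to the extern.
        return ("AtomicMax", dst, value)
    if memory_order not in MEMORY_ORDER_IDS:
        raise ValueError(f"unknown memory order: {memory_order!r}")
    # Map the string to its numeric id and append it to the arguments.
    return ("AtomicMax", dst, value, MEMORY_ORDER_IDS[memory_order])
```

Calling `build_atomic_max_args("dst", "value")` yields a three-element tuple, while passing `memory_order="release"` appends the mapped id as a fourth element.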
Atomically add `value` into `dst`, returning a handle to the operation.
Two paths are supported: a scalar/addressed extern atomic add when neither argument exposes extents, and a tile-region-based atomic add for Buffer/BufferRegion/BufferLoad inputs. If both arguments are plain Buffers, their shapes must be structurally equal. If at least one side exposes extents, the extents are aligned (missing dimensions are treated as size 1); an assertion fails if extents cannot be deduced. The optional `memory_order` (one of "relaxed", "consume", "acquire", "release", "acq_rel", "seq_cst") applies only to the direct extern `AtomicAdd` path, taken when no extents are available; the tile-region path ignores it.
Returns:
PrimExpr: A handle representing the atomic addition operation.
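The extent-alignment rule for atomic add can be illustrated with a small sketch. `align_extents` is a hypothetical helper, and padding the shorter side with leading 1s is an assumption; the docstring only says that missing dimensions are treated as size 1, not where they are inserted.

```python
def align_extents(value_extents, dst_extents):
    """Pad the shorter extent list with 1s so both sides have equal rank.

    Hypothetical helper. Returns None when neither side exposes extents,
    which corresponds to the scalar/addressed extern path.
    """
    if value_extents is None and dst_extents is None:
        return None  # no extents: direct extern AtomicAdd path
    value = list(value_extents or [])
    dst = list(dst_extents or [])
    rank = max(len(value), len(dst))
    # Assumption: missing dimensions are added at the front (leading 1s).
    value = [1] * (rank - len(value)) + value
    dst = [1] * (rank - len(dst)) + dst
    return value, dst
```

Under these assumptions, a value region with extents `[4]` and a destination region with extents `[2, 4]` align to `[1, 4]` and `[2, 4]`.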
"""Reshapes the input buffer to the specified shape.
Args:
src (Buffer): Input buffer to be reshaped
shape (List[PrimExpr]): New shape for the buffer
...
...
@@ -284,7 +284,7 @@ def view(src: Buffer,
dtype:Union[str,None]=None)->Buffer:
"""
Return a Tensor view of the input buffer with an optional new shape and dtype.
If `shape` is None the source buffer's shape is used; if `dtype` is None the source buffer's dtype is used. The returned buffer shares the same underlying data as `src` (no copy).
Compute the cumulative sum of `src` along `dim`, writing results to `dst`.
Negative `dim` indices are normalized (Python-style). If `dst` is None, the operation is performed in place on `src`. Raises ValueError when `dim` is out of bounds for `src.shape`. When `src.scope() == "local.fragment"`, this delegates to `cumsum_fragment`; otherwise it emits the `tl.cumsum` intrinsic.
Returns:
tir.Call: A handle to the emitted cumulative-sum operation.
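The `dim` normalization and the cumulative sum itself can be modeled on plain nested lists. This is a host-side sketch of the semantics only, not the emitted intrinsic; `normalize_dim` and `cumsum_2d` are hypothetical names, and treating `-ndim <= dim < ndim` as the valid range is an assumption based on the "Python-style" wording.

```python
from itertools import accumulate


def normalize_dim(dim, ndim):
    """Map a possibly negative axis to its non-negative form (hypothetical)."""
    if not -ndim <= dim < ndim:
        raise ValueError(f"dim {dim} is out of bounds for {ndim} dimensions")
    return dim % ndim


def cumsum_2d(rows, dim=0):
    """Cumulative sum of a 2-D list of lists along `dim`, returning a new list."""
    dim = normalize_dim(dim, 2)
    if dim == 1:
        # Accumulate within each row.
        return [list(accumulate(row)) for row in rows]
    # dim == 0: accumulate down each column, then transpose back to rows.
    columns = [list(accumulate(col)) for col in zip(*rows)]
    return [list(row) for row in zip(*columns)]
```

For example, `cumsum_2d([[1, 2], [3, 4]], dim=-1)` accumulates along the last axis and gives `[[1, 3], [3, 7]]`, while `dim=0` gives `[[1, 2], [4, 6]]`.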
Convert a flat (linear) index into multi-dimensional coordinates for a given shape.
Given a linear index and a shape (sequence of dimension extents), returns a list of coordinates (one per dimension) such that converting those coordinates back to a linear index using the usual row-major / C-order formula yields the original index. The computation iterates from the last dimension to the first using modulo and integer division, then reverses the collected coordinates.
Parameters:
index (int or PrimExpr): The flat index to convert.
shape (Sequence[int]): The extents of each dimension (length >= 1).
Returns:
list[PrimExpr]: Coordinates for each dimension in the same order as `shape`.
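The conversion described above can be sketched for plain Python integers. The function names here are hypothetical, and the actual helper operates on `PrimExpr` values rather than ints. `coords_to_linear` is included only to demonstrate the row-major round trip.

```python
def linear_to_coords(index, shape):
    """Convert a flat row-major index into per-dimension coordinates."""
    coords = []
    # Walk from the last (fastest-varying) dimension to the first.
    for extent in reversed(shape):
        coords.append(index % extent)
        index //= extent
    # Coordinates were collected last-to-first; restore the order of `shape`.
    coords.reverse()
    return coords


def coords_to_linear(coords, shape):
    """Inverse mapping using the usual row-major (C-order) formula."""
    index = 0
    for coord, extent in zip(coords, shape):
        index = index * extent + coord
    return index
```

With shape `(2, 3, 4)`, the flat index 17 maps to `[1, 1, 1]`, since 1*12 + 1*4 + 1 = 17.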