"sgl-kernel/git@developer.sourcefind.cn:zhaoyu6/sglang.git" did not exist on "81372f3bef65c0dda79849233d8ff9abc1e5d078"
Yu Cheng authored
- Introduced a new intrinsic `ptx_cp_async_barrier_noinc` for handling the `cp.async.mbarrier.arrive.noinc` operation in TileLang.
- Updated the CUDA code generation to support the new barrier operation.
- Added a corresponding function in the TileLang Python API for ease of use.
- Enhanced the barrier handling in CUDA templates to include the new no-increment operation, improving synchronization capabilities in parallel execution contexts.
ae9b7063