    [Ck tile] support rmsnorm and related fusion (#1605) · 3d609534
    rocking authored
    * Add reduce2d new api
    
    * Prevent users from using cross-warp reduction
    
    * Fix bug of std calculation
    
    * Add rmsnorm2d
    
    * Add rmsnorm small example
    
    * Remove static assert to prevent compile fail
    
    * Add script to test performance and correctness
    
    * Add missing cmake change
    
    * refine naming
    
    * refine example of rmsnorm
    
    * Fix bug of rmsnorm
    
    * Refine naming
    
    * Fix cmake
    
    * clang format
    
    * Refine pipeline name
    
    * Add add_rmsnorm2d_rdquant kernel
    
    * Add reduce op
    
    * host verification
    
    * Fix bug of one pass pipeline
    
    * Refine tile size
    
    * Add two pass pipeline
    
    * Rename two pass to three pass
    
    * Fix bug of kSaveX == false
    
    * Add instance library
    
    * Add test script
    
    * Fix bug of x verification
    
    * Add save_x to trait
    
    * Add README
    
    * Move reduce2d into reduce folder
    
    * Fix bug of Welford when the number of M warps > 1
    
    * remove redundant comment
    
    * 1. move 06_rmsnorm2d to 10_rmsnorm2d
    2. move 07_add_rmsnorm2d_rdquant to 11_add_rmsnorm2d_rdquant
    
    * clang format and add missing header
    
    * Add host validation of add + layernorm2d + rsquant
    
    * Revert "Add host validation of add + layernorm2d + rsquant"
    
    This reverts commit 936cb457978b928b90eff89a08fcdb7dc8bbed67.
    
    * Remove deprecated flag
host.hpp 1.58 KB