    Universal streamk with atomics (#1360) · 75e622f0
    Harisankar Sadasivan authored
    * universal streamk with atomics, with ckProfiler support. grid_size and the streamk strategy are tunable. A grid_size of -1 sets the number of workgroups (#WGs) to maximum occupancy x num_CUs. The implementation supports several streamk policies: 1-tile, 2-tile, 3-tile and 4-tile. A streamk strategy of -1 selects the default streamk policy (4-tile).
    
    * Update README.md
    
    * fixing clang-format issues
    
    * removed conflicts in struct members between streamk and universal streamk
    
    * corrected arg parsing for streamk and universal streamk
    
    * added stream-k policies for 3 tile and 4 tile
    
    * fixed argument type issue with parsing cmd args
    
    * made the changes suggested in PR review: removed comments and corrected copyright
    
    * file permissions updated
    
    * added default value support for grid_size and streamk-policy selection set to -1
    
    * print messages for arguments
    
    * print messages for arguments1
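
    The default handling described in the first item above (grid_size of -1 and streamk strategy of -1) can be sketched as follows. This is a minimal illustration, not the code from this commit: the names StreamKPolicy, resolve_grid_size and resolve_policy are hypothetical, and the mapping of strategy values 1 to 4 onto tile counts is an assumption made for the example.

```cpp
// Hypothetical sketch; none of these names come from Composable Kernel.

// One enumerator per streamk policy named in the commit message.
// ASSUMPTION: the numeric values mirror the tile counts.
enum class StreamKPolicy { OneTile = 1, TwoTile = 2, ThreeTile = 3, FourTile = 4 };

// grid_size == -1 means "use the default": maximum occupancy x number of CUs.
// Any other value is taken as the workgroup count requested by the user.
int resolve_grid_size(int grid_size, int max_occupancy, int num_cus)
{
    return grid_size == -1 ? max_occupancy * num_cus : grid_size;
}

// streamk_sel == -1 means "use the default policy", which is 4-tile.
// ASSUMPTION: other values are valid tile counts in the range 1..4.
StreamKPolicy resolve_policy(int streamk_sel)
{
    return streamk_sel == -1 ? StreamKPolicy::FourTile
                             : static_cast<StreamKPolicy>(streamk_sel);
}
```

    With these helpers, a device with 104 CUs and a maximum occupancy of 2 would resolve grid_size = -1 to 208 workgroups, while an explicit grid_size is passed through unchanged.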
profile_gemm_universal_streamk.cpp 5.5 KB