    [Example] Add sparse gqa decode example (#332) · 8fdfdf03
    Yuqing Xia authored
    
    
    * add example gqa decode wgmma pipelined
    
    * add sparse gqa
    
    * support num split
    
    * support num split
    
    * add if condition
    
    * add heuristic num split
    
    * clean code
    
    * add ref
    
    * fix bug
    
    * add torch ref
    
    * fix bug
    
    * integrate to torch
    
    * symbolic
    
    * clean mask
    
    * rm actual_num_blocks
    
    * clean code
    
    * get num_sm via torch
    
    * add sparse gqa decode example
    
    * format
    
    * rm example_gqa_decode_wgmma_pipelined.py
    
    * Add license headers to example scripts
    
    * format
    
    * Remove commented-out cache disabling lines
    
    ---------
    Co-authored-by: Lei Wang <34334180+LeiWang1999@users.noreply.github.com>
heuristic.py 2.06 KB