"...composable_kernel_onnxruntime.git" did not exist on "8a4b59785b4f5ba48468d53618ca270c5da599a7"
  • [Dev] Add MLA and GQA decode examples (#109) · 40faabb1
    Yu Cheng authored
    * [CI][Test] Add test cases for tilelang transform MultiVersionBuffer and WarpSpecialized
    
    * Relax the mismatch ratio restrictions in the flash_linear_attention and mha tests
    
    * [Dev] Add mha backward example
    
    * [Dev] Add mla decode example
    
    * bug fix
    
    * Add triton impl
    
    * Add gqa decode example
    
    * [Dev] Add GQA decode example
    
    * lint
    
    * delete unused triton example
    
    * set default profiler to 'auto'
example_gqa_decode.py 16.6 KB