Fix performance issue when passing tensor descriptor from host to kernel by void pointers (#27)
* Use address_space(4) in the kernel signature to fix the performance issue when passing a tensor descriptor from host to kernel by (void) pointer
* Remove the pass-by-pointer option (only pass by value or by void* remain)
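
A minimal sketch of the idea in the first bullet, not the code from this commit. It assumes a clang-based HIP compiler on an AMD GPU target, where `address_space(4)` denotes the constant address space; qualifying the descriptor pointer this way presumably lets the compiler treat the descriptor as read-only constant data. The names `CONSTANT`, `KERNEL`, `HOST_DEVICE`, `TensorDesc2D`, `element_offset`, `to_generic`, and `copy_2d` are hypothetical and chosen for illustration. Outside a HIP compile the macros expand to nothing, so the sketch also builds as ordinary C++.

```cpp
#include <cstdint>

// Hypothetical macros: under a HIP compile they add the clang address-space
// qualifier and the usual function attributes; otherwise they vanish.
#if defined(__HIP__)
#define CONSTANT __attribute__((address_space(4)))
#define KERNEL __global__
#define HOST_DEVICE __host__ __device__
#else
#define CONSTANT
#define KERNEL
#define HOST_DEVICE
#endif

// Hypothetical 2-D tensor descriptor: lengths and strides per dimension.
struct TensorDesc2D
{
    std::int32_t length[2];
    std::int32_t stride[2];
};

// Linear offset of element (i, j) according to the descriptor's strides.
HOST_DEVICE inline std::int32_t element_offset(const TensorDesc2D& d,
                                               std::int32_t i,
                                               std::int32_t j)
{
    return i * d.stride[0] + j * d.stride[1];
}

// Convert the address-space-qualified void pointer back to a typed pointer.
// A C-style cast is used because it is the form clang accepts for changing
// a pointer's address space.
template <typename T>
HOST_DEVICE inline const T* to_generic(const void CONSTANT* p)
{
    return (const T*)p;
}

// The descriptor arrives as a void pointer qualified with CONSTANT in the
// kernel signature. For brevity the body is a serial loop rather than a
// thread-indexed one.
KERNEL void copy_2d(const float* src, float* dst, const void CONSTANT* p_desc)
{
    const TensorDesc2D desc = *to_generic<TensorDesc2D>(p_desc);

    for(std::int32_t i = 0; i < desc.length[0]; ++i)
    {
        for(std::int32_t j = 0; j < desc.length[1]; ++j)
        {
            const std::int32_t off = element_offset(desc, i, j);
            dst[off] = src[off];
        }
    }
}
```

On the host side of a real HIP build, the descriptor would first be copied to device memory, and the device pointer would then be cast to `const void CONSTANT*` at the launch site; that part is omitted here because it depends on the surrounding launch code.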