one authored
- Introduce kernel_launch_overhead.cu to measure kernel launch latency, system throughput, CPU dispatch overhead, and GPU dispatch time.
- Create Makefile for building the benchmark with support for nvcc and hipcc.
- Add run-all.sh script to execute the benchmark with specified device settings.
0fe0b01f