Benchmarks: Micro benchmark - add ncu profile support in cublaslt-gemm (#740)
**Description**

This PR adds NCU (NVIDIA Nsight Compute) profiling support to the cublaslt-gemm micro benchmark, enabling detailed kernel analysis including DRAM throughput, compute throughput, and launch arguments.

**Major Revision**

- Add --enable_ncu_profiling and --profiling_metrics options for NCU profiling.
- Modify command execution to run the benchmark under NCU when profiling is enabled.
- Update result parsing to handle both standard and NCU-profiled output formats.
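The command-execution change can be pictured roughly as follows. This is a minimal sketch, not the code from this PR: the helper name `build_command`, the example binary arguments, and the choice to pass the metric list through `ncu --metrics` are assumptions made for illustration. Only the two option names come from the description above.

```python
import shlex
from typing import List, Optional


def build_command(
    base_cmd: List[str],
    enable_ncu_profiling: bool = False,
    profiling_metrics: Optional[List[str]] = None,
) -> str:
    """Return the shell command for one benchmark run.

    Hypothetical helper: when profiling is enabled, the benchmark
    command is prefixed with the Nsight Compute CLI (`ncu`).
    """
    if not enable_ncu_profiling:
        # Standard path: run the benchmark binary directly.
        return shlex.join(base_cmd)

    ncu_prefix = ["ncu"]
    if profiling_metrics:
        # Assumption: metrics are forwarded as one comma-separated list.
        ncu_prefix += ["--metrics", ",".join(profiling_metrics)]
    return shlex.join(ncu_prefix + base_cmd)


if __name__ == "__main__":
    # Illustrative arguments only; the real binary's flags may differ.
    cmd = ["cublaslt_gemm", "-m", "4096", "-n", "4096", "-k", "4096"]
    print(build_command(cmd))
    print(build_command(
        cmd,
        enable_ncu_profiling=True,
        profiling_metrics=["dram__throughput.avg.pct_of_peak_sustained_elapsed"],
    ))
```

Keeping the prefix logic in one place means the non-profiled command stays byte-for-byte unchanged, so the existing output parser only needs a second branch for the NCU-formatted output.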
New file: tests/data/cublaslt_ncu.log (mode 0 → 100644)