    Benchmarks: Add gpu-hpl and gpu-hpl-mxp micro benchmarks (#15) · 4fa10f4d
    one authored
    Add gpu-hpl and gpu-hpl-mxp micro benchmarks backed by rocHPL and rocHPL-MxP.
    
    Implement a shared GPU HPL base that:
    - Generates per-workload HPL dat files and parses the corresponding output files.
    - Supports common HPL inputs such as process grid, matrix size, block size, broadcast topology, warmup, iterations, and reduce operator.
    - Adds rocHPL-specific tuning parameters for gpu-hpl.
    - Formats metric keys from input-derived workload attributes.
    - Reports `flops`, `time`, and `tests_pass` metrics with warmup-aware aggregation.
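
As a rough illustration of the last two points, the sketch below shows one way metric keys could be built from workload attributes and how warmup runs could be excluded before aggregating. It is not the code in this commit: the function names, the key layout, the per-run dictionary fields, and the choice of the mean as the aggregate are all assumptions made for the example.

```python
from statistics import mean


def format_metric_key(metric, attrs):
    """Join workload attributes and the metric name into one key.

    The '<name><value>_..._<metric>' layout is a hypothetical convention,
    not the naming scheme used by the benchmark itself.
    """
    parts = [f"{name}{value}" for name, value in attrs.items()]
    return "_".join(parts + [metric])


def aggregate(runs, warmup):
    """Aggregate per-run results after discarding the first `warmup` runs.

    Each run is assumed to be a dict with 'flops', 'time', and 'passed'.
    """
    if warmup < 0:
        raise ValueError("warmup must be non-negative")
    measured = runs[warmup:]
    if not measured:
        raise ValueError("no measured runs left after discarding warmup")
    return {
        "flops": mean(r["flops"] for r in measured),
        "time": mean(r["time"] for r in measured),
        # Report 1 only if every measured run passed its residual check.
        "tests_pass": int(all(r["passed"] for r in measured)),
    }


if __name__ == "__main__":
    runs = [
        {"flops": 1.0e12, "time": 10.0, "passed": True},  # warmup run
        {"flops": 2.0e12, "time": 5.0, "passed": True},
        {"flops": 4.0e12, "time": 3.0, "passed": True},
    ]
    attrs = {"N": 45312, "NB": 384, "P": 2, "Q": 4}
    for name, value in aggregate(runs, warmup=1).items():
        print(format_metric_key(name, attrs), value)
```

Slicing off the warmup runs before any statistics are computed keeps slow first iterations from pulling down the reported `flops`; whether failed warmup runs should also affect `tests_pass` is a design choice this sketch leaves out.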
    
    Add benchmark registrations, parser tests, sample output fixtures, documentation, and recommended configurations for gpu-hpl and gpu-hpl-mxp.
    
    Update rocHPL and rocHPL-MxP third-party integration with build patches, install targets, and SuperBench run helper scripts.
    
    Also update gpu-hpcg metric naming to use `flops` instead of `gflops`, remove the standalone domain and verification-style metrics from the documented metric set, and refresh the Hygon HPCG documentation and config references accordingly.
.gitmodules 1.73 KB