    Tune HIP PME kernel launch block sizes · 20e4b551
    one authored
    Use explicit 128-thread block launches for selected HIP PME kernels that
    benefit from larger blocks.  Keep the platform default block size unchanged,
    and leave small-system grid indexing and charge spreading on the existing
    default launch configuration.
    
    On HIP, the heuristic always launches finishSpreadCharge with 128 threads
    per block, and launches findAtomGridIndex and gridSpreadCharge with 128
    threads per block only for larger systems.  The same rules apply to both
    the Coulomb PME and the LJPME dispersion paths.  Interpolation and energy
    evaluation launches remain unchanged.
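
The selection rule described above can be sketched roughly as follows. This is an illustrative sketch only, not the code from this commit: the enum, the function name choosePmeBlockSize, and in particular the atom-count cutoff kLargeSystemAtomThreshold are hypothetical, since the commit message does not state how "larger systems" is defined.

```cpp
#include <cassert>

// Hypothetical kernel labels; the real code in CommonCalcNonbondedForce.cpp
// may identify kernels differently.
enum class PmeKernel {
    FindAtomGridIndex,
    GridSpreadCharge,
    FinishSpreadCharge,
    Other  // interpolation, energy evaluation, and anything else
};

// Assumed cutoff for "larger systems". The commit message gives no value,
// so this number is a placeholder for illustration only.
constexpr int kLargeSystemAtomThreshold = 100000;

// Block size the commit message says is used for the tuned kernels.
constexpr int kTunedBlockSize = 128;

// Returns the thread-block size to launch a PME kernel with.
// defaultBlockSize is the platform default, which is never modified here.
int choosePmeBlockSize(PmeKernel kernel, bool isHip, int numAtoms,
                       int defaultBlockSize) {
    assert(defaultBlockSize > 0);
    if (!isHip)
        return defaultBlockSize;  // other platforms keep their default
    switch (kernel) {
        case PmeKernel::FinishSpreadCharge:
            // Always uses the larger block on HIP.
            return kTunedBlockSize;
        case PmeKernel::FindAtomGridIndex:
        case PmeKernel::GridSpreadCharge:
            // Larger block only once the system is big enough.
            return numAtoms >= kLargeSystemAtomThreshold ? kTunedBlockSize
                                                         : defaultBlockSize;
        default:
            // Interpolation and energy evaluation are left unchanged.
            return defaultBlockSize;
    }
}
```

Under these assumptions the same function would be called from both the Coulomb PME and the LJPME dispersion launch sites, so the two paths stay consistent without duplicating the rule.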
CommonCalcNonbondedForce.cpp 68.5 KB