    Enable split PME streams for HIP LJPME · b1a21fd4
    one authored
    Run Coulomb and dispersion reciprocal PME work on separate HIP queues for
    LJPME when PME streams are enabled.  Use separate grids, sorters, events, and
    energy buffers so the two reciprocal branches can overlap safely.
    
    Keep the behavior HIP-only: CUDA profiling on an RTX 4090 showed that the
    same split increased PME spread/list contention and regressed apoa1ljpme.
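
The idea of giving each reciprocal branch its own resources can be illustrated with a small host-side sketch. This is not the code from this commit: `PmeBranch`, `runBranch`, and `runSplitPme` are hypothetical names, `std::async` merely stands in for two HIP queues, and the "spread" step is a toy accumulation. It only shows why separate grids and energy buffers let the two branches run concurrently without synchronizing until the final join.

```cpp
#include <cstddef>
#include <functional>
#include <future>
#include <numeric>
#include <vector>

// Hypothetical stand-in for one reciprocal-space branch. Each branch owns its
// own grid and energy buffer, so nothing is shared between the two tasks.
struct PmeBranch {
    std::vector<double> grid;
    double energy = 0.0;
    explicit PmeBranch(std::size_t gridSize) : grid(gridSize, 0.0) {}
};

// Toy "spread then reduce" step (an assumption, not the real PME kernels).
// It writes only to the branch it is given.
void runBranch(PmeBranch& branch, const std::vector<double>& weights) {
    for (std::size_t i = 0; i < weights.size(); ++i)
        branch.grid[i % branch.grid.size()] += weights[i];
    branch.energy = std::accumulate(branch.grid.begin(), branch.grid.end(), 0.0);
}

// Launch the Coulomb and dispersion branches as independent tasks, then wait
// on both before combining energies (the role an event wait plays on a GPU).
double runSplitPme(const std::vector<double>& charges,
                   const std::vector<double>& c6) {
    PmeBranch coulomb(8), dispersion(8);
    auto coulombDone = std::async(std::launch::async, runBranch,
                                  std::ref(coulomb), std::cref(charges));
    auto dispersionDone = std::async(std::launch::async, runBranch,
                                     std::ref(dispersion), std::cref(c6));
    coulombDone.get();
    dispersionDone.get();
    return coulomb.energy + dispersion.energy;
}
```

If both branches wrote into a single shared grid or energy buffer instead, the two tasks would race and an extra synchronization point would be needed between them, which removes the overlap the split is meant to provide.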
CommonCalcNonbondedForce.cpp 70.4 KB