    Optimize computeNonbonded · 67f5644d
    Anton Gorenko authored
    * All AMD GPUs support shuffle, double precision and 64-bit int atomics;
    * Remove unused code: !ENABLE_SHUFFLE code paths in nonbonded.hip;
    * Use intrinsics in single-precision;
    * Use realToFixedPoint (faster float32-to-int64);
    * Remove shared atomIndices, use shuffles;
    * Check early whether atoms are within the cutoff range: sometimes all
      lanes in a warp can skip the computation, and single pairs can skip
      useless atomics with zero values;
    * Remove volatile skipTiles access, use shuffles;
    * Distribute work for warps in a strided order;
    * Skip warps that may still be busy in the first loop;
    * Unify conditions for excluded atoms with `includeInteraction`;
    * Move multiprocessors to HipContext;
    * Increase number of warps for computeNonbonded;
    * Disable packed math for >=MI200 (it affects the performance of some
      kernels, such as computeGKForces in amoebagk);
    * Remove defaultOptimizationOptions and createModule's optimizationFlags
      as they are never used;
    * Support -save-temps.
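
Two of the items above, the float32-to-int64 fixed-point conversion and the strided distribution of work across warps, can be illustrated with a small host-side C++ sketch. This is not the code from the commit: the function names `realToFixedPointSketch` and `tilesForWarp`, the 2^32 scale factor, and the reading of "strided order" as "warp w takes tiles w, w + numWarps, ..." are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Assumed fixed-point scale of 2^32 (a hypothetical choice for this sketch).
constexpr float FIXED_SCALE = 4294967296.0f;

// Convert a float32 value to a 64-bit fixed-point integer (truncating toward zero).
// Integer additions are exact, so the order of accumulation does not change the sum.
inline int64_t realToFixedPointSketch(float x) {
    // The scaled value must fit into the int64 range.
    assert(x > -2147483648.0f && x < 2147483648.0f);
    return static_cast<int64_t>(x * FIXED_SCALE);
}

// Tiles handled by one warp under one possible strided scheme (an assumption):
// warp w takes tiles w, w + numWarps, w + 2 * numWarps, ...
inline std::vector<int> tilesForWarp(int warp, int numWarps, int numTiles) {
    assert(numWarps > 0 && warp >= 0 && warp < numWarps);
    std::vector<int> tiles;
    for (int tile = warp; tile < numTiles; tile += numWarps)
        tiles.push_back(tile);
    return tiles;
}
```

With a strided scheme like this, neighbouring warps work on neighbouring tiles at the same time, rather than each warp owning one contiguous block of tiles.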
HipContext.h 23.7 KB