  1. 05 Sep, 2024 1 commit
      Optimize computeNonbonded · 67f5644d
      Anton Gorenko authored
      * All AMD GPUs support shuffle, double precision and 64-bit int atomics;
      * Remove unused code: !ENABLE_SHUFFLE code paths in nonbonded.hip;
      * Use intrinsics in single-precision;
      * Use realToFixedPoint (faster float32-to-int64);
      * Remove shared atomIndices, use shuffles;
      * Check early whether atoms are within the cutoff range: sometimes
        all lanes in a warp can skip the computation, and individual pairs
        can skip useless atomics with zero values;
      * Remove volatile skipTiles access, use shuffles;
      * Distribute work for warps in a strided order;
      * Skip warps that may still be busy in the first loop;
      * Unify conditions for excluded atoms with `includeInteraction`;
      * Move multiprocessors to HipContext;
      * Increase number of warps for computeNonbonded;
      * Disable packed math for >=MI200 (it affects the performance of some
        kernels, such as computeGKForces in amoebagk);
      * Remove defaultOptimizationOptions and createModule's optimizationFlags
        as they are never used;
      * Support -save-temps.
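To illustrate the fixed-point conversion and the "skip useless atomics with zero values" items in the commit message, here is a minimal host-side C++ sketch. It is not the code from this commit: the 2^32 scale factor and the names `toFixedPoint`, `fromFixedPoint`, and `accumulate` are assumptions made for illustration, and a plain `+=` stands in for the 64-bit integer atomic add that a GPU kernel would use.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Assumed scale factor (2^32); the real kernels may use a different constant.
constexpr double kScale = 4294967296.0;

// Convert a float to a signed 64-bit fixed-point value (truncates toward zero).
// Hypothetical host-side stand-in for a device function such as realToFixedPoint.
inline std::int64_t toFixedPoint(float x) {
    // Magnitudes of 2^31 or more would overflow int64 after scaling.
    assert(std::fabs(x) < 2147483648.0f);
    return static_cast<std::int64_t>(static_cast<double>(x) * kScale);
}

// Convert an accumulated fixed-point value back to floating point.
inline double fromFixedPoint(std::int64_t v) {
    return static_cast<double>(v) / kScale;
}

// Add one contribution to an accumulator, skipping zero values.
// Returns true if the accumulator was updated. On a GPU the update would be
// an atomic add, so skipping zeros avoids unnecessary memory traffic.
inline bool accumulate(std::int64_t& buffer, float value) {
    if (value == 0.0f) {
        return false;  // nothing to add
    }
    buffer += toFixedPoint(value);
    return true;
}
```

Because the accumulation is done in integers, the sum does not depend on the order in which contributions arrive, which is one reason fixed-point buffers suit atomic updates from many threads.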