  1. 05 Sep, 2024 1 commit
    • Optimize computeNonbonded · 67f5644d
      Anton Gorenko authored
      * All AMD GPUs support shuffle, double precision and 64-bit int atomics;
      * Remove unused code: !ENABLE_SHUFFLE code paths in nonbonded.hip;
      * Use intrinsics in single-precision;
      * Use realToFixedPoint (faster float32-to-int64);
      * Remove shared atomIndices, use shuffles;
      * Check early whether atoms are within the cutoff range: sometimes all
        lanes in a warp can skip the computation, and single pairs can skip
        useless atomics with zero values;
      * Remove volatile skipTiles access, use shuffles;
      * Distribute work for warps in a strided order;
      * Skip warps that may still be busy in the first loop;
      * Unify conditions for excluded atoms with `includeInteraction`;
      * Move multiprocessors to HipContext;
      * Increase number of warps for computeNonbonded;
      * Disable packed math for >=MI200 (it affects performance of some
        kernels like computeGKForces of amoebagk);
      * Remove defaultOptimizationOptions and createModule's optimizationFlags
        as they are never used;
      * Support -save-temps.
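
The fixed-point conversion and the early cutoff check mentioned above can be illustrated with a small host-side sketch. It is not the kernel code from this commit: the helper names (`realToFixedPointSketch`, `withinCutoff`, `accumulatePair`) are hypothetical, the 2^32 scale factor is an assumption, and the plain cast does not reproduce whatever fast path the real `realToFixedPoint` uses. The reason for accumulating in integers is that integer addition is associative, so the sum does not depend on the order in which atomics land.

```cpp
#include <cstdint>

// Assumed scale: 2^32 fixed-point steps per unit of force.
constexpr double FIXED_POINT_SCALE = 4294967296.0;

// Hypothetical stand-in for a float32-to-int64 conversion (plain cast, no fast path).
inline std::int64_t realToFixedPointSketch(float x) {
    return static_cast<std::int64_t>(static_cast<double>(x) * FIXED_POINT_SCALE);
}

// Convert an accumulated fixed-point value back to a real number.
inline double fixedPointToReal(std::int64_t v) {
    return static_cast<double>(v) / FIXED_POINT_SCALE;
}

// Early test: compare the squared distance with the squared cutoff, no sqrt needed.
inline bool withinCutoff(float dx, float dy, float dz, float cutoff) {
    return dx * dx + dy * dy + dz * dz < cutoff * cutoff;
}

// Skip the accumulation (an atomic add on the GPU) when the pair is out of
// range or contributes exactly zero.
inline void accumulatePair(std::int64_t& forceSum, float pairForce, bool inRange) {
    if (!inRange || pairForce == 0.0f)
        return;
    forceSum += realToFixedPointSketch(pairForce);
}
```

On a GPU the same test would be evaluated per lane, and a warp-wide vote could let the whole warp skip a tile when no lane is in range; that part is not shown here.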
  2. 01 Sep, 2024 3 commits
    • Optimize sorting kernels and tune block sizes · 7279c539
      Anton Gorenko authored
      * Compile kernels with max block size of 256 threads:
        The default hipcc behavior since ROCm 4.2 is to compile kernels
        with 1024 threads unless __launch_bounds__ is specified. This
        significantly increases register pressure especially in heavy kernels
        (double precision, for example), requiring register spilling;
      * Optimize computeRange by using multiple blocks for reduction;
      * Use blocks of 1024 threads for computeBucketPositions - it is executed
        as a single work group so larger block size is faster;
      * Sort up to lenghtNextPow2 instead of blockDim.x (faster for short
        buckets);
      * Optimize sortShortList2;
      * Optimize sortBuckets with bit instructions;
      * Decrease bucket size for non-uniform sorting: too many buckets may
        have sizes too large to sort in shared memory;
      * Add more sizes in tests.
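
The idea of sorting only up to the next power of two of the bucket length, rather than the full block size, can be sketched on the host as follows. This is an illustration under assumptions, not the kernel from this commit: `nextPow2` and `bitonicSortPadded` are hypothetical names, and the choice of a bitonic network is an assumption made for the example. The inner loop over `i` stands in for what would be one GPU thread per element.

```cpp
#include <algorithm>
#include <cstddef>
#include <limits>
#include <vector>

// Smallest power of two that is >= n (returns 1 for n == 0).
inline std::size_t nextPow2(std::size_t n) {
    std::size_t p = 1;
    while (p < n)
        p <<= 1;
    return p;
}

// Ascending bitonic sort over a length padded to the next power of two.
// The number of compare-exchange stages depends on the padded length only,
// so a short bucket needs far fewer stages than a full block would.
inline void bitonicSortPadded(std::vector<int>& data) {
    const std::size_t length = data.size();
    const std::size_t padded = nextPow2(length);
    // Pad with the maximum value so the padding ends up at the tail.
    data.resize(padded, std::numeric_limits<int>::max());
    for (std::size_t k = 2; k <= padded; k <<= 1) {
        for (std::size_t j = k >> 1; j > 0; j >>= 1) {
            for (std::size_t i = 0; i < padded; ++i) {
                const std::size_t partner = i ^ j;
                if (partner > i) {
                    const bool ascending = (i & k) == 0;
                    if ((data[i] > data[partner]) == ascending)
                        std::swap(data[i], data[partner]);
                }
            }
        }
    }
    data.resize(length); // drop the padding again
}
```

For a bucket of 5 elements the network runs over 8 slots (3 merge phases) instead of 256 slots (8 merge phases), which is where the saving for short buckets comes from in this sketch.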
    • Clean up CMake scripts for HIP platform · aca24d5f
      Anton Gorenko authored
      * Remove the per-target link libraries, include and link directories,
        and compile flags; instead let CMake handle them by linking the
        main library to hip::host hiprtc::hiprtc hip::hipfft;
      * Fix: custom command without ADD_CUSTOM_TARGET and ADD_DEPENDENCIES is
        executed for both static and shared targets;
      * Remove IF(APPLE) parts.
    • Add hipification of CUDA platform · 89d2ff0e
      Anton Gorenko authored
      Port changes in CUDA backend to HIP
      
      Fix a warning about arithmetic operations on void* in HipArray::uploadSubArray
      
      Fix "Error Initializing context ROCm 5.3.0"
      https://github.com/StreamHPC/openmm-hip/issues/3

      hipDeviceSetCacheConfig returns hipErrorNotSupported on 5.3

      Co-authored-by: Nick Curtis <nicholas.curtis@amd.com>
  3. 19 Aug, 2024 1 commit
  4. 06 Apr, 2024 1 commit
  5. 24 Feb, 2024 1 commit
  6. 23 Feb, 2024 1 commit
  7. 17 Feb, 2024 1 commit
  8. 02 Feb, 2024 1 commit
  9. 18 Jan, 2024 1 commit
  10. 20 Dec, 2023 2 commits
  11. 14 Dec, 2023 1 commit
  12. 12 Dec, 2023 1 commit
  13. 11 Dec, 2023 1 commit
  14. 02 Nov, 2023 1 commit
  15. 31 Oct, 2023 1 commit
  16. 24 Oct, 2023 2 commits
  17. 16 Oct, 2023 1 commit
    • WIP - looking for a way to optimise performance of creating contexts by removing temporary arrays (and their associated mallocs/frees) (#4261) · 03ed8ff2
      Christopher Woods authored
      
      * Suggesting a "haveSameParameters" function for CustomNonbondedForce which could be
      used to avoid creating temporary copies of arrays when testing if particles are
      the same.
      
      Also updating "getParticleParameters" so that it re-uses the memory of the
      passed vector argument, rather than deallocating and reallocating it
      via a copy.
      
      * Revert "Suggesting a "haveSameParameters" function for CustomNonbondedForce which could be"
      
      This reverts commit e80ec2d2e9981abb90711636bf3a78d0c49e43fc.
      
      * Moved to `thread_local static` as suggested to prevent new vector allocations on each function call.
      
      Updated `getParameters` and `getBondParameters` to re-use the memory from the argument rather
      than re-allocating via the copy.
      
      * Forgot to reuse the memory for the groups...
      
      * Reverted back the manual copies via memcpy as they aren't needed. Looking at the header
      file and benchmarking shows that std::vector does the right thing.
      
      * Confined `thread_local static` only to ForceInfo methods, and have also put declarations
      for multiple variables back onto a single line
      
      * Removed `thread_local static` from the constructor
      
      * Moved constructor declarations back into the for loop
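
The allocation-avoidance pattern described in this commit message can be sketched as below. `ToyForce` and `areParticlesIdentical` are hypothetical stand-ins written for this example, not the OpenMM classes touched by the commit. The sketch relies on two things: a `thread_local static` vector is constructed once per thread rather than on every call, and copy-assigning into an existing `std::vector` typically reuses its capacity when it is large enough.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical force object holding one parameter vector per particle.
class ToyForce {
public:
    void addParticle(const std::vector<double>& params) {
        particles.push_back(params);
    }
    // Assigns into the caller's vector, so its existing storage can be reused.
    void getParticleParameters(std::size_t index, std::vector<double>& out) const {
        out = particles[index];
    }
private:
    std::vector<std::vector<double>> particles;
};

// Compares two particles without creating new temporaries on every call.
inline bool areParticlesIdentical(const ToyForce& force, std::size_t a, std::size_t b) {
    // One pair of scratch vectors per thread, kept alive between calls.
    thread_local static std::vector<double> paramsA, paramsB;
    force.getParticleParameters(a, paramsA);
    force.getParticleParameters(b, paramsB);
    return paramsA == paramsB;
}
```

Because the scratch vectors are per thread, concurrent callers do not share them; the trade-off is that their memory stays allocated for the lifetime of each thread.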
  18. 14 Oct, 2023 2 commits
  19. 28 Sep, 2023 1 commit
  20. 16 Sep, 2023 1 commit
  21. 04 Sep, 2023 1 commit
  22. 02 Sep, 2023 1 commit
  23. 01 Sep, 2023 1 commit
  24. 28 Aug, 2023 1 commit
  25. 18 Aug, 2023 2 commits
  26. 02 Aug, 2023 1 commit
    • Draft integration of the Alchemical Transfer Method (ATM) plugin (#4110) · d8c67699
      Emilio Gallicchio authored
      
      
      * Draft integration of the Alchemical Transfer Method (ATM) plugin
      
      * Attempt to store and retrieve forces--does not compile
      
      * Implement addForce()/getForce() methods
      
      * Throw exception when specifying properties without a Platform (#4130)
      
      * Fixed DOF calculation for NoseHooverIntegrator (#4128)
      
      * Fix variance in documentation of VerletIntegrator (#4138)
      
      * Python API for ATMForce
      
      * Fixed compilation error
      
      * Minor cleanup of formatting and documentation
      
      * Files for ATMForce test cases
      
      * More cleanup
      
      * Removed variable groups
      
      * Test ATMForce with two particles
      
      * More tests for ATMForce plus fixes
      
      * Added missing header
      
      * Rework interface to pass displacements as vector of parameters
      
      * Revert "Rework interface to pass displacements as vector of parameters"
      
      This reverts commit 5e092031f31ded1137b677588f007add1c2d6f82.
      
      * Test with nonbonded force
      
      * Allow energy expression to be customized
      
      * Optional displacements at the initial state
      
      * Fixed compilation error when building the C wrapper
      
      * Address edge case of default energy expression
      
      * Consistent naming of the variables of the displacement states
      
      * Test of soft core function of the default energy expression
      
      * Mark addForce() as taking ownership
      
      * Initial Python test for ATMForce
      
      * Test custom expressions
      
      * Expanded C++ API documentation for ATMForce
      
      * Energy parameter derivatives
      
      * Serialization for ATMForce
      
      * Documentation, cleanup, and fixes
      
      * Fixed typos
      
      * getPerturbationEnergy() computes energy
      
      * Another test case
      
      * Minor edits
      
      ---------
      Co-authored-by: Peter Eastman <peastman@stanford.edu>
      Co-authored-by: Michael Plainer <plainer@ymail.com>
  27. 24 Jul, 2023 1 commit
  28. 21 Jul, 2023 1 commit
  29. 20 Jul, 2023 1 commit
  30. 14 Jul, 2023 1 commit
  31. 23 Jun, 2023 1 commit
  32. 12 Jun, 2023 1 commit
  33. 31 May, 2023 1 commit
  34. 23 May, 2023 1 commit