1. 06 May, 2026 1 commit
    • one's avatar
      Optimize HIP pair-list handling for CDNA LJPME · 939ecf28
      one authored
      - Use bitwise prefix accounting when storing sparse interactions as single pairs in the HIP pair-list kernel. This reduces the number of ballot operations needed to compute per-lane single-pair offsets.
      - For HIP CDNA single precision, raise MAX_BITS_FOR_PAIRS to 8 so more sparse interactions are emitted as single pairs instead of full tiles. Keep the existing double precision and RDNA thresholds unchanged.
      - Also simplify the HIP LJPME direct correction by computing alpha^2*r2
      939ecf28
  2. 29 Apr, 2026 1 commit
  3. 16 Apr, 2026 2 commits
  4. 14 Dec, 2025 1 commit
    • Anton Gorenko's avatar
      Support ROCm 7 (#5162) · 07b738c5
      Anton Gorenko authored
      * Remove std::enable_if, warpRotateLeft is always used with TILE_SIZE
      
      * Do not use built-in warpSize in constexpr contexts
      
      Starting from ROCm 7 warpSize is no longer constexpr.
      findInteractingBlocks.hip uses it for sizes of __shared__ arrays.
      
      * Check if hipHostMallocNumaUser is allowed before using it
      07b738c5
  5. 21 Oct, 2025 1 commit
  6. 23 Sep, 2025 1 commit
  7. 09 Jul, 2025 1 commit
  8. 05 May, 2025 1 commit
    • Peter Eastman's avatar
      Common implementation of NonbondedForce (#4922) · 2443dcee
      Peter Eastman authored
      * Use common API for kernels
      
      * More code uses common interface
      
      * Bug fixes
      
      * Unified interface for sorting
      
      * Simplified interface for FFT
      
      * Use common event API for synchronization
      
      * Minor changes to make code more consistent between platforms
      
      * Common implementation of NonbondedForce
      
      * Bug fixes
      
      * Flag to enable list of single pairs
      
      * CUDA and OpenCL use common implementation of NonbondedForce
      
      * Fixed compilation error
      
      * HIP uses common implementation of NonbondedForce
      2443dcee
  9. 14 Apr, 2025 1 commit
    • Peter Eastman's avatar
      DPDIntegrator (#4799) · de180e4e
      Peter Eastman authored
      * Created DPDIntegrator class
      
      * Reference implementation of DPDIntegrator
      
      * Build neighbor list for DPDIntegrator
      
      * Minor fixes
      
      * Documentation for DPDIntegrator
      
      * Python API for DPDIntegrator
      
      * Preliminary OpenCL implementation of DPDIntegrator
      
      * Enable USE_PERIODIC
      
      * Use updated positions in DPD thermostat
      
      * Working on neighbor list for OpenCL DPDIntegrator
      
      * ReorderListener for particle types
      
      * Serialization for DPDIntegrator
      
      * CUDA implementation of DPDIntegrator
      
      * HIP implementation of DPDIntegrator
      
      * Fixed compile error in Python wrapper
      
      * Fixed compile error in wrappers
      
      * Fixed uninitialized memory in reference neighbor list
      
      * Added DPDIntegrator to C++ API docs
      
      * Fixed incorrect launch size
      
      * Fixed nan in DPD random number generator
      
      * Minor optimizations
      
      * Improved load balancing
      
      * Fixed an indexing error
      
      * Neighbor list uses the maximum cutoff of any force
      
      * Fixed HIP compilation error
      
      * Fixed access to invalid memory
      
      * Added test case for diffusion coefficient
      
      * Try to debug segfaults on CI
      
      * Debugging
      
      * Debugging
      
      * Debugging
      
      * Debugging
      
      * Debugging
      
      * Debugging
      
      * Possible fix
      
      * Debugging
      
      * Debugging
      
      * Debugging
      
      * Use correct block size on CPU OpenCL
      
      * Workaround for bug in Intel's OpenCL for CPUs
      
      * Removed an unnecessary define
      
      * Removed debugging code
      
      * Include Dart
      
      * More Intel workarounds
      
      * Workaround for error in NVIDIA OpenCL
      de180e4e
  10. 05 Sep, 2024 6 commits
  11. 01 Sep, 2024 2 commits
    • Anton Gorenko's avatar
      Optimize sorting kernels and tune block sizes · 7279c539
      Anton Gorenko authored
      * Compile kernels with max block size of 256 threads:
        The default hipcc behavior since ROCm 4.2 is to compile kernels
        with 1024 threads unless __launch_bounds__ is specified. This
        significantly increases register pressure especially in heavy kernels
        (double precision, for example), requiring register spilling;
      * Optimize computeRange by using multiple blocks for reduction;
      * Use blocks of 1024 threads for computeBucketPositions - it is executed
        as a single work group so larger block size is faster;
      * Sort up-to lenghtNextPow2 instead of blockDim.x (faster for short
        buckets);
      * Optimize sortShortList2;
      * Optimize sortBuckets with bit instructions;
      * Decrease bucket size for non-uniform sorting: too many buckets may
        have sizes too large to sort in shared memory;
      * Add more sizes in tests.
      7279c539
    • Anton Gorenko's avatar
      Add hipification of CUDA platform · 89d2ff0e
      Anton Gorenko authored
      Port changes in CUDA backend to HIP
      
      Fix a warning about arithmetic operations on void* in HipArray::uploadSubArray
      
      Fix "Error Initializing context ROCm 5.3.0"
      
          https://github.com/StreamHPC/openmm-hip/issues/3
      
      
          hipDeviceSetCacheConfig returns hipErrorNotSupported on 5.3
      Co-authored-by: default avatarNick Curtis <nicholas.curtis@amd.com>
      89d2ff0e
  12. 11 Dec, 2023 1 commit
  13. 24 Jul, 2023 1 commit
  14. 23 May, 2023 1 commit
  15. 02 Mar, 2023 1 commit
  16. 12 Aug, 2022 1 commit
  17. 13 Apr, 2022 1 commit
  18. 24 Mar, 2022 1 commit
  19. 04 Mar, 2022 1 commit
  20. 27 Jan, 2022 1 commit
  21. 27 Dec, 2021 1 commit
  22. 22 May, 2021 1 commit
    • Peter Eastman's avatar
      Converted AMOEBA to common platform (#3120) · 8e8923a7
      Peter Eastman authored
      * Began converting AMOEBA to common platform
      
      * Beginning of OpenCL platform for AMOEBA
      
      * Converted AmoebaVdwForce to common platform
      
      * Cleaned up reference AMOEBA tests
      
      * Began converting AmoebaMultipoleForce to common platform
      
      * Continue converting AmoebaMultipoleForce to common platform
      
      * Bug fixes
      
      * Bug fix
      
      * Continue converting AmoebaMultipoleForce to common platform
      
      * Converting AmoebaMultipoleForce and AmoebaGeneralizedKirkwoodForce to common platform
      
      * Converting AmoebaMultipoleForce and AmoebaGeneralizedKirkwoodForce to common platform
      
      * Creating OpenCL version of AmoebaMultipoleForce and AmoebaGeneralizedKirkwoodForce
      
      * Creating OpenCL version of AmoebaMultipoleForce and AmoebaGeneralizedKirkwoodForce
      
      * Creating OpenCL version of AmoebaMultipoleForce and AmoebaGeneralizedKirkwoodForce
      
      * Converted arrays from real3 to real
      
      * Bug fix to OpenCL AmoebaGeneralizedKirkwoodForce
      
      * Fixes for AMD GPUs
      
      * Began converting HippoNonbondedForce to common platform
      
      * Continuing to convert HippoNonbondedForce to common platform
      
      * Continuing to convert HippoNonbondedForce to common platform
      
      * Working on unifying PME kernels
      
      * Fixed error on devices without 64 bit atomics
      
      * Unified PME kernels
      
      * Converted HippoNonbondedForce to common platform
      
      * Creating OpenCL implementation of HippoNonbondedForce
      
      * Continuing OpenCL implementation of HippoNonbondedForce
      
      * Mostly finished OpenCL implementation of HippoNonbondedForce
      
      * Eliminated three component vector types in host code
      
      * Fix errors on CPU OpenCL
      
      * Skip double precision tests for AMOEBA on OpenCL
      
      * Bug fixes
      
      * Bug fixes
      
      * Fixed compilation error
      8e8923a7
  23. 18 Feb, 2021 1 commit
  24. 25 Sep, 2020 1 commit
  25. 10 Sep, 2020 1 commit
  26. 20 Aug, 2020 1 commit
  27. 08 Jan, 2020 1 commit
    • peastman's avatar
      Common compute framework to unify CUDA and OpenCL code (#2488) · edbc8407
      peastman authored
      * Began creating common compute framework to unify code between CUDA and OpenCL
      
      * Began OpenCL implementation of common compute framework
      
      * Common implementation of CMMotionRemover
      
      * CUDA implementation of common compute interface
      
      * Converted HarmonicBondForce to common compute API
      
      * Converted standard bonded forces to common compute API
      
      * Converted ExpressionUtilities to common compute API
      
      * Created ComputeParameterSet
      
      * Converted custom bonded forces to common compute API
      
      * Converted CustomCentroidBondForce to common compute API
      
      * Converted CustomManyParticleForce to common compute API
      
      * Moved lots of duplicate code from CudaContext and OpenCLContext to ComputeContext
      
      * Converted GayBerneForce to common compute API
      
      * Removed obsolete kernels
      
      * Converted verlet integrators to common compute API
      
      * Converted Langevin and Brownian integrators to common compute API
      
      * Converted CustomIntegrator to common compute API
      
      * Converted CustomNonbondedForce to common compute API
      
      * Removed uses of a deprecated API
      
      * Fixed failing test cases
      
      * Converted GBSAOBCForce to common compute API
      
      * Began converting CustomGBForce to common compute API
      
      * Finished converting CustomGBForce to common compute API
      
      * Merged duplicated code in CudaIntegrationUtilities and OpenCLIntegrationUtilities
      
      * Converted RMSDForce and AndersenThermostat to common compute API
      
      * Converted CustomHbondForce to common compute API
      
      * Merged scripts for encoding kernel sources
      
      * Converted Drude plugin to common compute API
      
      * Fixed errors in CMake scripts
      
      * Attempt at fixing errors on Windows
      
      * Added discussion of common compute API to developer guide
      
      * Added Windows export macro for common classes
      
      * Fixed error in CMMotionRemover
      
      * Ubdated travis to newer Ubuntu version
      
      * Fixed errors on CPU OpenCL
      
      * Fixed Windows linking errors
      
      * Added missing pragma for 32 bit atomics
      
      * Replaced long long with mm_long
      
      * More fixes to Windows linking
      
      * Bug fix
      edbc8407
  28. 09 Apr, 2019 1 commit
    • peastman's avatar
      Created HippoNonbondedForce (#2296) · 1eec1e15
      peastman authored
      * Created API for HIPPO force field
      
      * Beginning of reference implementation of HIPPO
      
      * Continuing reference implementation of HIPPO
      
      * Continuing reference implementation of HIPPO
      
      * Continuing reference implementation of HIPPO
      
      * Continuing reference implementation of HIPPO
      
      * Continuing reference implementation of HIPPO
      
      * Continuing reference implementation of HIPPO
      
      * Continuing reference implementation of HIPPO
      
      * Completed reference of HIPPO with no cutoff
      
      * Beginning cutoffs/PME for reference implementation of HIPPO
      
      * Continuing PME for reference implementation of HIPPO
      
      * Continuing PME for reference implementation of HIPPO
      
      * Continuing PME for reference implementation of HIPPO
      
      * Completed reference implementation of HIPPO
      
      * Cleanup and optimization to HIPPO reference
      
      * Further cleanup to HIPPO
      
      * Combined direct space interactions into a single loop
      
      * Compute direct space interactions in quasi-internal frame
      
      * Beginning of CUDA implementation of HIPPO
      
      * Continuing CUDA implementation of HIPPO
      
      * Continuing CUDA implementation of HIPPO
      
      * Continuing CUDA implementation of HIPPO
      
      * Continuing CUDA implementation of HIPPO
      
      * Continuing CUDA implementation of HIPPO
      
      * Continuing CUDA implementation of HIPPO
      
      * Continuing CUDA implementation of HIPPO
      
      * Continuing CUDA implementation of HIPPO
      
      * Continuing CUDA implementation of HIPPO
      
      * Continuing CUDA implementation of HIPPO
      
      * Continuing CUDA implementation of HIPPO
      
      * Continuing CUDA implementation of HIPPO
      
      * Continuing CUDA implementation of HIPPO
      
      * Finished CUDA implementation of HIPPO
      
      * More features and test cases for HippoNonbondedForce
      
      * Serialization and Python API for HippoNonbondedForce
      
      * Fixed sign error in computing forces
      1eec1e15
  29. 16 Mar, 2018 1 commit
  30. 15 Mar, 2018 1 commit
  31. 12 Mar, 2018 1 commit
  32. 12 Feb, 2018 1 commit
  33. 02 Dec, 2016 1 commit