1. 23 Jun, 2023 1 commit
  2. 12 Jun, 2023 1 commit
  3. 31 May, 2023 1 commit
  4. 23 May, 2023 1 commit
  5. 18 May, 2023 1 commit
  6. 14 May, 2023 1 commit
  7. 05 May, 2023 1 commit
  8. 27 Apr, 2023 1 commit
  9. 24 Apr, 2023 1 commit
  10. 13 Apr, 2023 1 commit
  11. 30 Mar, 2023 1 commit
  12. 02 Mar, 2023 1 commit
  13. 25 Feb, 2023 1 commit
  14. 14 Feb, 2023 1 commit
  15. 13 Feb, 2023 1 commit
  16. 09 Feb, 2023 1 commit
  17. 31 Jan, 2023 4 commits
  18. 29 Nov, 2022 1 commit
  19. 11 Nov, 2022 1 commit
  20. 09 Nov, 2022 1 commit
  21. 12 Sep, 2022 1 commit
  22. 08 Sep, 2022 1 commit
  23. 31 Aug, 2022 1 commit
  24. 17 Aug, 2022 1 commit
  25. 12 Aug, 2022 1 commit
  26. 09 Aug, 2022 1 commit
  27. 02 Aug, 2022 1 commit
  28. 22 Jul, 2022 1 commit
    • Final HIP Platform implementation for AMD GPUs on ROCm (#3338) · a39fa14a
      Adel Johar authored
      
      
      * Support kernel files with extensions of any length (like .hip)
      
      * Do not replace symbols in single-line comments
      
      * Add OPENMM_BUILD_COMMON CMake option
      
      This allows building and installing the common platform files even if
      the CUDA or OpenCL platforms are not built.
      This is required for the HIP platform (openmm-hip) if the ROCm OpenCL
      packages are not installed.
      
      * Add an option for the Python wrapper to install into user packages
      
      OPENMM_PYTHON_USER_INSTALL is OFF by default.
      
      * Support FFT backends in Amoeba plugin
      
      The HIP platform supports FFT backends. This commit moves
      findLegalFFTDimension to ComputeContext so that platforms can provide
      their own implementations.
      
      * Compatibility for common platform with new HIP platform
      
      * Do not use volatile with private and local AtomData parameters on HIP
      
      With volatile, the generated code is not optimal: for example, the
      compiler generates flat_load instructions instead of ds_read.
      
      * Tune launch bounds for PME grid-related kernels and add a workaround for RDNA
      
      Force the compiler to use all registers for gridSpreadCharge and
      gridInterpolateForce by limiting max waves per EU to 1 on CDNA GPUs.
      RDNA GPUs work better without this limit.
      
      * Optimize atom data structs in GBSA and Amoeba on HIP
      
      Manually rearrange fields, add padding, and force alignment for
      faster access to shared memory: ds_read and ds_write may
      be slower if addresses are not aligned to 16 bytes.
      Co-authored-by: Anton Gorenko <anton@streamhpc.com>
      Co-authored-by: Nick Curtis <nicholas.curtis@amd.com>
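
The last item of the commit message above (rearranging fields, adding padding, forcing alignment) can be illustrated with a minimal host-side C++ sketch. The struct names, fields, and the helper `allElementsAligned16` are hypothetical and are not taken from the OpenMM sources; the sketch only shows how the layout of one array element changes, assuming a 4-byte `float`. It does not measure ds_read or ds_write performance.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical per-atom record (an assumption for illustration, not the
// actual OpenMM struct). Seven floats give 28 bytes with 4-byte alignment,
// so consecutive array elements cannot all start on 16-byte boundaries.
struct AtomDataNaive {
    float x, y, z;
    float q;
    float fx, fy, fz;
};

// Rearranged version: fields grouped into 16-byte chunks, one explicit
// padding field, and a forced 16-byte alignment.
struct alignas(16) AtomDataAligned {
    float x, y, z, q;   // bytes 0..15
    float fx, fy, fz;   // bytes 16..27
    float padding;      // bytes 28..31, keeps the size a multiple of 16
};

static_assert(sizeof(AtomDataNaive) == 28, "assumes 4-byte float");
static_assert(alignof(AtomDataAligned) == 16, "forced alignment");
static_assert(sizeof(AtomDataAligned) == 32, "two 16-byte chunks");
static_assert(offsetof(AtomDataAligned, fx) == 16, "second chunk starts at 16");

// Returns true if every element of the array starts on a 16-byte boundary.
template <typename T>
bool allElementsAligned16(const T* data, std::size_t count) {
    for (std::size_t i = 0; i < count; ++i) {
        if (reinterpret_cast<std::uintptr_t>(&data[i]) % 16 != 0) {
            return false;
        }
    }
    return true;
}
```

With the aligned layout every element of an array starts at a multiple of 16 bytes, which is the property the commit message says ds_read and ds_write benefit from. With the naive layout the stride is 28 bytes, so at most every fourth element can be 16-byte aligned.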
  29. 15 Jul, 2022 1 commit
  30. 30 Jun, 2022 1 commit
    • Use PocketFFT (#3667) · 1dac981a
      Peter Eastman authored
      * Use PocketFFT instead of FFTW
      
      * Minor cleanup
      
      * Use PocketFFT instead of fftpack for reference platform
      
      * Remove FFTW as a dependency
      
      * Converted a test case to use PocketFFT
      
      * Fixed an incorrect comment
  31. 28 Jun, 2022 1 commit
  32. 22 Jun, 2022 1 commit
  33. 21 Jun, 2022 1 commit
  34. 10 Jun, 2022 1 commit
  35. 01 Jun, 2022 1 commit
    • fix divergence in barriers (#3621) · 7af08783
      Xavier Hallade authored
      Without this fix, we see cases in which not all work-items in a thread group end up hitting the same number of barriers, which leads to a hang in OpenCL GPU execution.
  36. 19 May, 2022 1 commit
  37. 17 May, 2022 1 commit