Commits · dbf25377ea52f68a607ae4f9d53360c18fcea353 · tsoc / openmm

30 Mar, 2023 1 commit
- Improved load balancing between GPUs (#4013) · dbf25377
  Peter Eastman authored Mar 30, 2023
  
02 Mar, 2023 1 commit

Initialize tilesAfterReorder properly (#3984) · 48b6abc1

Anton Gorenko authored Mar 02, 2023

tilesAfterReorder may contain a garbage value. If that value is large,
updateNeighborListSize does not force the atoms to be reordered after
25 steps in extreme cases.
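
A minimal sketch of why the uninitialized value matters, assuming the check compares the current tile count against a baseline recorded after the last reorder. `ReorderTracker`, its `update` method, and the 1.1 growth threshold are hypothetical stand-ins for the real bookkeeping around `updateNeighborListSize`, not a copy of it.

```cpp
// Hypothetical, simplified stand-in for the reorder bookkeeping.
// The names and the 1.1 growth threshold are assumptions for illustration.
struct ReorderTracker {
    unsigned int tilesAfterReorder;

    // The default of 0 models the fixed, properly initialized state.
    // Passing a large value models a garbage (uninitialized) baseline.
    explicit ReorderTracker(unsigned int initialTiles = 0)
        : tilesAfterReorder(initialTiles) {}

    // Returns true when the caller should force an atom reorder.
    bool update(int stepsSinceReorder, unsigned int currentTiles) {
        if (stepsSinceReorder == 0 || tilesAfterReorder == 0) {
            tilesAfterReorder = currentTiles;  // record a fresh baseline
            return false;
        }
        // Force a reorder only if enough steps have passed and the tile
        // count has grown noticeably relative to the baseline.
        return stepsSinceReorder > 25 && currentTiles > 1.1 * tilesAfterReorder;
    }
};
```

With a zero baseline the first call records the real tile count, so later growth is detected. With a huge garbage baseline the growth test can never succeed, which matches the symptom described in the commit message.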


25 Feb, 2023 1 commit
- [macOS GPU Support] Tune dispatching of persistent threads for Apple silicon GPUs (#3978) · fa893467
  Philip Turner authored Feb 24, 2023
```
* Use 768 instead of 384 threads in generic kernels

* Use 1536 instead of 1024 threads in force kernels.
```
14 Feb, 2023 1 commit
- Remove unused offset variables (#3961) · 611bd817
  bdenhollander authored Feb 13, 2023
```
- These appear to have been copied and pasted from getPositions and never removed
```
13 Feb, 2023 1 commit
- [macOS GPU Support] Fix `MTLCommandBuffer` bottlenecks for Apple silicon GPUs (#3960) · d5a4ce06
  Philip Turner authored Feb 13, 2023
```
* Flushing optimization

* Remove unnecessary checks
```
09 Feb, 2023 1 commit
- Profiling of OpenCL kernels (#3954) · 8528d8eb
  Peter Eastman authored Feb 09, 2023
```
* Profiling of OpenCL kernels

* Minor improvements to profiling
```
31 Jan, 2023 4 commits
- Use CompiledExpression for CustomCVForce energy expression (#3898) · 17b61225
  Peter Eastman authored Jan 31, 2023
  
- Update OpenCLContext.cpp (#3917) · a1a32466
  Philip Turner authored Jan 31, 2023
  
- Optimized reducing energy (#3902) · 2975f44b
  Peter Eastman authored Jan 31, 2023
  
- Use VkFFT for OpenCL (#3934) · e0c80069
  Peter Eastman authored Jan 31, 2023
```
* Use VkFFT for OpenCL

* Updated comments for OpenCLFFT3D
```
29 Nov, 2022 1 commit
- Fix compilation error in OpenCL kernels on some platforms (#3857) · b01017d6
  Peter Eastman authored Nov 29, 2022
  
11 Nov, 2022 1 commit

Initialize pinnedBuffer at OpenCLContext creation (#3842) · 3a212464

Charlles R. A. Abreu authored Nov 11, 2022

This initialization is needed to prevent a segfault during object destruction in certain circumstances (e.g., when a `Force` is changed before calling `Context.reinitialize()` and this change causes the corresponding `ForceImpl` to throw an exception).
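
The failure mode can be sketched as follows, assuming the buffer is a raw pointer that is normally allocated in a later initialization step and released in the destructor. `MiniContext`, `initialize`, and `hasBuffer` are hypothetical names; only `pinnedBuffer` comes from the commit title.

```cpp
#include <cstddef>

// Hypothetical miniature of a context that allocates a staging buffer lazily.
class MiniContext {
public:
    // The fix in spirit: give the pointer a defined value at construction,
    // so the destructor is safe even if initialize() is never reached.
    MiniContext() : pinnedBuffer(nullptr) {}
    MiniContext(const MiniContext&) = delete;
    MiniContext& operator=(const MiniContext&) = delete;

    // Deleting a null pointer is a no-op, so no extra check is needed.
    ~MiniContext() { delete[] pinnedBuffer; }

    // Later setup step; an exception thrown before this runs must not
    // leave the destructor looking at an indeterminate pointer.
    void initialize(std::size_t bytes) {
        delete[] pinnedBuffer;
        pinnedBuffer = new char[bytes];
    }

    bool hasBuffer() const { return pinnedBuffer != nullptr; }

private:
    char* pinnedBuffer;
};
```

If the pointer were left uninitialized, destroying the object before `initialize` ran would pass an arbitrary address to `delete[]`, which is undefined behavior and typically shows up as a crash.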


09 Nov, 2022 1 commit
- Fixed error in CustomNonbondedForce on CPU (#3840) · a6a05ee6
  Peter Eastman authored Nov 09, 2022
  
12 Sep, 2022 1 commit
- Ensure valid atom order after loading a checkpoint (#3771) · a056d5a3
  Peter Eastman authored Sep 12, 2022
  
08 Sep, 2022 1 commit
- Ensure valid atom order after loading a checkpoint · c29e203b
  peastman authored Sep 08, 2022
  
31 Aug, 2022 1 commit
- Detect NaN to avoid infinite loop in JAMA::Eigenvalue (#3758) · 0e104b49
  David Williams authored Aug 30, 2022
  
17 Aug, 2022 1 commit
- Improved support for devices without 64 bit atomics (#3737) · ae686364
  Peter Eastman authored Aug 17, 2022
  
12 Aug, 2022 1 commit
- Improved logic for deciding when to reorder atoms (#3721) · 48664a1f
  Peter Eastman authored Aug 12, 2022
  
09 Aug, 2022 1 commit
- Fixed compilation error with multiple CustomNonbondedForces (#3729) · 14e878a9
  Peter Eastman authored Aug 08, 2022
  
02 Aug, 2022 1 commit
- UseBlockingSync defaults to false (#3720) · a2f4ab8b
  Peter Eastman authored Aug 02, 2022
  
22 Jul, 2022 1 commit

Final HIP Platform implementation for AMD GPUs on ROCm (#3338) · a39fa14a

Adel Johar authored Jul 22, 2022



* Support kernel files with extensions of any length (like .hip)

* Do not allow symbols to be replaced in single-line comments

* Add OPENMM_BUILD_COMMON CMake option

This option allows the common platform files to be built and installed
even if the CUDA or OpenCL platforms are not built.
This is required for the HIP platform (openmm-hip) if the ROCm OpenCL
packages are not installed.

* Add an option for the Python wrapper to install into user packages

OPENMM_PYTHON_USER_INSTALL is OFF by default.

* Support FFT backends in Amoeba plugin

The HIP platform supports FFT backends. This commit moves
findLegalFFTDimension to ComputeContext so that platforms can have their
own implementations.

* Compatibility for common platform w/ new HIP platform

* Do not use volatile with private and local AtomData parameters on HIP

Otherwise the generated code is not optimal; for example, the compiler
generates flat_load instructions instead of ds_read.

* Tune launch bounds for PME grid-related kernels and add a workaround for RDNA

Force the compiler to use all registers for gridSpreadCharge and
gridInterpolateForce by limiting max waves per EU to 1 on CDNA GPUs.
RDNA GPUs work better without this limit.

* Optimize atom data structs in GBSA and Amoeba on HIP

Manually rearrange fields, add padding, and force alignment to get
faster access to shared memory: ds_read and ds_write may be slower
if addresses are not aligned to 16 bytes.
Co-authored-by: Anton Gorenko <anton@streamhpc.com>
Co-authored-by: Nick Curtis <nicholas.curtis@amd.com>
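
The padding and alignment idea from the last item can be illustrated in plain host C++. This is only a sketch: the struct names and fields are hypothetical, not the actual GBSA or Amoeba atom data, and the real change targets HIP device code rather than host code.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical per-atom record with natural layout: five floats, 20 bytes.
// In an array, every other element starts at an address that is not a
// multiple of 16.
struct AtomDataUnpadded {
    float x, y, z;
    float q;
    float radius;
};

// Same fields with forced 16-byte alignment and explicit padding, so each
// array element starts on a 16-byte boundary.
struct alignas(16) AtomDataPadded {
    float x, y, z, q;   // first 16-byte chunk
    float radius;
    float padding[3];   // explicit padding to fill the second chunk
};

static_assert(sizeof(AtomDataUnpadded) == 20, "five packed floats");
static_assert(alignof(AtomDataPadded) == 16, "forced alignment");
static_assert(sizeof(AtomDataPadded) % 16 == 0, "size is a multiple of 16");
```

The trade-off is memory for access speed: the padded struct uses 32 bytes instead of 20, in exchange for every element being addressable with aligned 16-byte loads and stores.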


15 Jul, 2022 1 commit
- Kernel source headers included in installation (#3700) · 8d9a656d
  Charlles R. A. Abreu authored Jul 14, 2022
  
30 Jun, 2022 1 commit

Use PocketFFT (#3667) · 1dac981a

Peter Eastman authored Jun 30, 2022

* Use PocketFFT instead of FFTW

* Minor cleanup

* Use PocketFFT instead of fftpack for reference platform

* Remove FFTW as a dependency

* Converted a test case to use PocketFFT

* Fixed an incorrect comment


28 Jun, 2022 1 commit
- Fixed freeze when using multiple GPUs (#3668) · 3d62421b
  Peter Eastman authored Jun 28, 2022
  
22 Jun, 2022 1 commit
- add missing header and fix logic for including cmath (#3658) · e9534c15
  Mike Henry authored Jun 22, 2022
  
21 Jun, 2022 1 commit
- Reduced the cost of updating tabulated functions (#3649) · 8292bb3a
  Peter Eastman authored Jun 21, 2022
  
10 Jun, 2022 1 commit
- Prevent Windows from defining macros that interfere with other code (#3637) · 55ae9d7f
  Mike Henry authored Jun 10, 2022
```
* Add fix to prevent Windows from defining macros that interfere with other code

* add fix to the tippy top of the file
```
01 Jun, 2022 1 commit

fix divergence in barriers (#3621) · 7af08783

Xavier Hallade authored Jun 01, 2022

Without this fix, we see cases in which not all work-items in a thread group end up hitting the same number of barriers, which leads to a hang in OpenCL GPU execution.


19 May, 2022 1 commit
- Vectorized calculating long range correction coefficient (#3606) · c3c8ec55
  Peter Eastman authored May 19, 2022
  
17 May, 2022 1 commit
- Very minor optimizations (#3602) · 109f6b25
  Peter Eastman authored May 17, 2022
  
11 May, 2022 1 commit
- Added FAQ links to error messages (#3600) · fb036060
  Peter Eastman authored May 11, 2022
```
* Added FAQ links to error messages

* Added missing Windows export
```
17 Apr, 2022 1 commit

Vectorize nonbonded interactions with no cutoff (#3575) · db4cefd4

Peter Eastman authored Apr 17, 2022

* Vectorize NonbondedForce with no cutoff

* Vectorize CustomNonbondedForce with no cutoff

* Memory efficient dense neighbor list

* Fixed errors


15 Apr, 2022 1 commit
- Fixed inconsistency in computing kinetic energy (#3574) · 11a1982a
  Peter Eastman authored Apr 15, 2022
  
14 Apr, 2022 1 commit

Vectorized CpuCustomNonbondedForce (#3568) · c8981916

Peter Eastman authored Apr 14, 2022

* Began vectorizing CustomNonbondedForce

* Refactored CpuCustomNonbondedForce to support multiple vector sizes

* AVX implementation of CpuCustomNonbondedForce

* Fixed compilation errors


13 Apr, 2022 1 commit
- Use blocking sync when creating events (#3561) · fe21d5ee
  Peter Eastman authored Apr 13, 2022
  
09 Apr, 2022 1 commit
- Fixed race condition with multiple GPUs (#3560) · 6fb1c8a4
  Peter Eastman authored Apr 09, 2022
  
28 Mar, 2022 1 commit
- Workaround for PyTorch bug (#3533) · 7e00f556
  Peter Eastman authored Mar 28, 2022
  
24 Mar, 2022 1 commit
- Removed code for CUDA devices without shuffle (#3528) · c7af17c8
  Peter Eastman authored Mar 24, 2022
  
07 Mar, 2022 1 commit
- Add realToFixedPoint to all platforms (#3504) · 434d7afb
  Anton Gorenko authored Mar 08, 2022
```
This allows the HIP platform to use a faster float-to-int64 conversion.
```
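
For context, a real-to-fixed-point conversion of this kind might look like the sketch below. Only the name `realToFixedPoint` comes from the commit title. The 2^32 scale factor, the signature, and the helper `fixedPointToReal` are assumptions made for illustration, and a platform-specific version could replace the plain cast with a faster intrinsic.

```cpp
#include <cstdint>

// Assumed scale factor: 32 fractional bits.
constexpr double FIXED_POINT_SCALE = 4294967296.0;  // 2^32

// Convert a real value to 64-bit fixed point. The scaled value must fit
// in a signed 64-bit integer; out-of-range inputs are not handled here.
inline std::int64_t realToFixedPoint(double x) {
    return static_cast<std::int64_t>(x * FIXED_POINT_SCALE);
}

// Hypothetical inverse, used here only to check round trips.
inline double fixedPointToReal(std::int64_t v) {
    return static_cast<double>(v) / FIXED_POINT_SCALE;
}
```

One reason to accumulate in fixed point is that integer addition is associative, so a sum does not depend on the order in which threads contribute to it, unlike floating-point accumulation.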
04 Mar, 2022 1 commit
- Minor optimizations to computing single pairs (#3494) · e581f42b
  Peter Eastman authored Mar 04, 2022
```
* Minor optimizations to computing single pairs

* Adjusted MAX_BITS_FOR_PAIRS on Ampere
```