Commits · 20e4b551e60cee701801e3617f80f2ffe4ff45a9 · tsoc / openmm

11 May, 2026 1 commit

Tune HIP PME kernel launch block sizes · 20e4b551

one authored May 12, 2026

Use explicit 128-thread block launches for selected HIP PME kernels that
benefit from larger blocks. Keep the platform default block size unchanged,
and leave small-system grid indexing and charge spreading on the existing
default launch configuration.

The heuristic applies 128-thread launches to finishSpreadCharge on HIP, and
uses 128-thread launches for findAtomGridIndex and gridSpreadCharge only for
larger systems. Coulomb PME and LJPME dispersion paths are handled in
parallel, while interpolation and energy evaluation remain unchanged.
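
The selection rule described in this message could be sketched roughly as follows. This is an illustration only, not the HIP platform's code: the function name `pick_block_size`, the use of an atom count as the size measure, the `default_block` value, and the `large_system_atoms` cutoff are all assumptions, since the message does not state the default block size or where "larger systems" begins.

```python
# Illustrative sketch only; not the actual OpenMM HIP implementation.
# Kernel names are taken from the commit message. The default block size and
# the "large system" cutoff below are hypothetical placeholders.

TUNED_BLOCK = 128  # explicit block size named in the commit message


def pick_block_size(kernel, num_atoms, default_block=64, large_system_atoms=100_000):
    """Return the thread-block size to launch `kernel` with."""
    if kernel == "finishSpreadCharge":
        # Always launched with 128 threads on HIP.
        return TUNED_BLOCK
    if kernel in ("findAtomGridIndex", "gridSpreadCharge"):
        # Only larger systems get 128 threads; small systems keep the default.
        return TUNED_BLOCK if num_atoms >= large_system_atoms else default_block
    # Interpolation, energy evaluation, and all other kernels are unchanged.
    return default_block
```

Under these assumed values, `pick_block_size("gridSpreadCharge", 10)` keeps the default launch size, while the same kernel on a system above the cutoff is launched with 128 threads.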


10 May, 2026 1 commit

Tune HIP neighbor-list launch heuristics · 4d20b76e

one authored May 10, 2026

Apply heuristics for HIP neighbor-list construction:
use fewer nonbonded force blocks for small neighbor-list systems, use two
tiles per batch for larger atom-block counts, and increase the
findBlocksWithInteractions thread block size for small atom-block counts.

Standard concurrent validation shows no clear per-case regression and a
small geomean throughput improvement over the current blocksPerCU baseline.
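
A minimal sketch of how the three heuristics above might fit together is shown here. Every name and number in it is hypothetical: the message gives no thresholds or scale factors, so `small_limit`, `large_limit`, the halving of force blocks, and the doubling of the thread block size are placeholders. It also uses the atom-block count as the single size measure for simplicity. Only the "two tiles per batch" value comes from the message.

```python
from dataclasses import dataclass


@dataclass
class NeighborListLaunch:
    force_blocks: int         # number of nonbonded force thread blocks
    tiles_per_batch: int      # tiles processed per batch
    find_blocks_threads: int  # findBlocksWithInteractions thread block size


def choose_launch(num_atom_blocks, base_force_blocks, base_threads,
                  small_limit=1024, large_limit=8192):
    """Pick launch parameters; all thresholds and factors are assumed."""
    small = num_atom_blocks < small_limit
    large = num_atom_blocks >= large_limit
    return NeighborListLaunch(
        # Fewer force blocks for small systems (halving is an assumption).
        force_blocks=max(1, base_force_blocks // 2) if small else base_force_blocks,
        # Two tiles per batch for larger atom-block counts (from the message).
        tiles_per_batch=2 if large else 1,
        # Larger thread block for small atom-block counts (doubling is assumed).
        find_blocks_threads=base_threads * 2 if small else base_threads,
    )
```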


06 May, 2026 2 commits

Add wave64 LDS spreading in HIP LJ-PME · 4e7070c2
one authored Apr 30, 2026


Optimize HIP pair-list handling for CDNA LJPME · 939ecf28

one authored May 06, 2026

- Use bitwise prefix accounting when storing sparse interactions as single pairs in the HIP pair-list kernel. This reduces the number of ballot operations needed to compute per-lane single-pair offsets.
- For HIP CDNA single precision, raise MAX_BITS_FOR_PAIRS to 8 so more sparse interactions are emitted as single pairs instead of full tiles. Keep the existing double precision and RDNA thresholds unchanged.
- Also simplify the HIP LJPME direct correction by computing alpha^2*r2
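
The first two points can be illustrated with a generic mask-based prefix count, written here in plain Python rather than HIP. This is a sketch of the general technique, not the kernel's code: a single ballot yields one bit per lane, and each lane derives its output offset by counting the set bits below its own position. The helper names and the `<=` comparison against `MAX_BITS_FOR_PAIRS` are assumptions. Only the value 8 for HIP CDNA single precision comes from the message.

```python
MAX_BITS_FOR_PAIRS = 8  # value the commit message gives for HIP CDNA single precision


def popcount(x):
    """Number of set bits in a non-negative integer."""
    return bin(x).count("1")


def store_as_single_pairs(interaction_flags):
    """Assumed rule: sparse flag words are emitted as single pairs, not a full tile."""
    return popcount(interaction_flags) <= MAX_BITS_FOR_PAIRS


def lane_offset(ballot_mask, lane):
    """Exclusive prefix count: set bits of `ballot_mask` strictly below `lane`."""
    return popcount(ballot_mask & ((1 << lane) - 1))
```

With a mask of `0b101101`, lanes 0, 2, 3, and 5 are active and `lane_offset` assigns them offsets 0, 1, 2, and 3, all derived from one mask without a further ballot per lane.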


29 Apr, 2026 1 commit
- Remove redundant code, refactor large block threshold · 14f1b515
  one authored Apr 29, 2026
  
24 Apr, 2026 1 commit
- Avoid host wait in PME post-sync · c2d9cc7b
  one authored Apr 24, 2026
  
17 Apr, 2026 2 commits

Split LJ-PME atom-grid sorting from Coulomb PME · c1d643e2

one authored Apr 17, 2026

Avoid forcing Coulomb PME to re-sort whenever LJ-PME is enabled, and give dispersion PME its own atom-grid index and sort state so the performance impact can be measured independently.
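
The idea of separate sort state can be pictured with a small data-structure sketch. The class and field names below are hypothetical and do not correspond to the actual OpenMM classes. The point is only that each PME path owns its own atom-grid index and its own "is sorted" flag, so invalidating one does not force a re-sort of the other.

```python
from dataclasses import dataclass, field


@dataclass
class GridSortState:
    """Atom-grid index for one PME path, plus a flag for whether it is sorted."""
    atom_grid_index: list = field(default_factory=list)
    sorted_valid: bool = False

    def invalidate(self):
        self.sorted_valid = False

    def sort(self, indices):
        self.atom_grid_index = sorted(indices)
        self.sorted_valid = True


@dataclass
class PmeState:
    # Each path has independent state (hypothetical layout).
    coulomb: GridSortState = field(default_factory=GridSortState)
    dispersion: GridSortState = field(default_factory=GridSortState)
```

In this layout, calling `invalidate()` on the dispersion state leaves the Coulomb state's sorted index untouched, which is what allows the cost of each path to be measured on its own.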


Add hipFFT backend for testing · b1a1c54c
one authored Apr 17, 2026


16 Apr, 2026 4 commits
- Tune sorting threshold in Coulomb PME (v3.1) · ee4ca894
  one authored Apr 16, 2026
  
- tune cutoff padding · 33ae1570
  one authored Apr 10, 2026
  
- tune computeNonbonded launch params · 16f10eff
  one authored Apr 10, 2026
  
- fix hiprtc target · 9c6732f0
  one authored Apr 10, 2026
  
10 Apr, 2026 1 commit
- Created ReplicaExchangeSampler (#5257) · efd89169
  Peter Eastman authored Apr 10, 2026
```
* Created ReplicaExchangeSampler

* Improvements to ReplicaExchangeSampler
```
07 Apr, 2026 1 commit
- Retry error when checking for broken links (#5258) · 64254c09
  Peter Eastman authored Apr 07, 2026
  
06 Apr, 2026 2 commits
- PythonForce can be restricted to a subset of particles (#5246) · 9aae4bb3
  Peter Eastman authored Apr 06, 2026
```
* PythonForce can be restricted to a subset of particles

* Fix exception with CUDA
```
- Reduced memory used by SerializationNode (#5249) · 6717a85c
  Peter Eastman authored Apr 06, 2026
  
02 Apr, 2026 2 commits
- Fixed pressure calculation in MonteCarloFlexibleBarostat (#5251) · 1f2203f6
  Peter Eastman authored Apr 02, 2026
  
- Infrastructure for multistate sampling (#5231) · 9f17d0f8
  Peter Eastman authored Apr 02, 2026
```
* Infrastructure for multistate sampling

* Added computeRelativeEnergies()
```
31 Mar, 2026 1 commit
- atomFlags switched to unordered_map for performance (#5247) · e4728a21
  ramdoys authored Mar 31, 2026
```
* optimization, switch to unordered map

* Remove emplace, reduce reserve allocation
```
30 Mar, 2026 1 commit
- Avoid multiple forces running on the worker thread at once (#5243) · 5d8d5874
  Peter Eastman authored Mar 30, 2026
  
27 Mar, 2026 2 commits
- Take line search energy difference on CPU before reducing precision (#5242) · 1c528ca8
  Evan Pretti authored Mar 27, 2026
  
- Cache coefficients for long range correction (#5239) · 26df7a87
  Peter Eastman authored Mar 27, 2026
```
* Cache coefficients for long range correction

* updateParametersInContext() clears cache
```
26 Mar, 2026 1 commit
- Fix leaked variable causing spurious template constraint assignment (#5236) · 13c568d0
  Jeff Wagner authored Mar 26, 2026
```
* fix 5234 and add test

* clean up docstring and standardize test name
```
12 Mar, 2026 1 commit
- Update installation instructions (#5227) · b55e6088
  Peter Eastman authored Mar 12, 2026
  
05 Mar, 2026 1 commit
- Fix for compilation error on some AMD GPUs (#5224) · 068495f3
  Peter Eastman authored Mar 04, 2026
  
26 Feb, 2026 1 commit
- Fixed bug in selecting HIS variant (#5221) · 3e8a62ba
  Peter Eastman authored Feb 26, 2026
  
24 Feb, 2026 1 commit
- Improved documentation on CMMotionRemover (#5219) · a1579238
  Peter Eastman authored Feb 24, 2026
  
19 Feb, 2026 1 commit
- Fixed issue that caused inefficient sorting when a block contained only one atom (#5215) · 2c287f10
  Peter Eastman authored Feb 19, 2026
```
* Fixed issue that caused inefficient sorting when a block contained only one atom

* Add the fix to OpenCL and HIP
```
17 Feb, 2026 1 commit
- Added two more synonyms for HOH (#5213) · acf36fd6
  Peter Eastman authored Feb 17, 2026
```
* Added two more synonyms for HOH

* Change H20 to H2O
```
16 Feb, 2026 1 commit

Update patch documentation (#5208) · bc6fe729

Yulian Manchev authored Feb 16, 2026



* Update patch documentation

Clarified the definition and purpose of patches.

* Fix typo in RemoveExternalBond tag description

* Update wording in patches

* A few edits to the description of patches

---------
Co-authored-by: Peter Eastman <peter.eastman@gmail.com>


11 Feb, 2026 1 commit
- Specify extras_require on all platforms (#5207) · 10b23bf0
  Peter Eastman authored Feb 11, 2026
  
10 Feb, 2026 3 commits

Update version number to 8.5 (#5210) · 017fca83
Peter Eastman authored Feb 10, 2026

Avoid error in updateParametersInContext() when there are no exceptions (#5209) · 122dbde2
Peter Eastman authored Feb 10, 2026


GPU implementation of L-BFGS (#5198) · 4ab645ea

Evan Pretti authored Feb 10, 2026

* Make reference/CPU minimizer into a kernel

* Add per-platform support for GPU minimization

* Initial implementation of GPU minimization

* Fixes

* Increase robustness when initial gradient is huge

* Handle overflow leading to non-finite values gracefully

* Handle large forces in single precision more robustly

* Optimize kernels

* Fix kernel launch size

* Update banner years

* Don't create MinimizeKernel until first minimization requested

* Make some compile-time constants into kernel arguments

* Consolidate scale calculation kernel

* Condense alpha/beta reduction kernels using atomics

* Condense line search dot kernels with reductions

* Remove a download, and download grad norm separately

* Asynchronously check lbfgs convergence condition

* Restructure line search to avoid download waiting

* Start line search preemptively in case CPU evaluation is not needed

* In rare cases, constraint error might not decrease after one optimization round

* Better handling of unsupported 64-bit atomics, use FLT_MAX

* Pick gradient mode based on GPU vs. CPU evaluation

* Rework getDiff/getScale reduction, remove reduceBuffer

* Older CUDA might not like float hex literals

* Fix error in a comment


09 Feb, 2026 2 commits

Residue templates can specify constraints (#5197) · 834b1294
Peter Eastman authored Feb 09, 2026
```
* Residue templates can specify constraints

* Patched template generation preserves constraints
```

API for querying devices (#5192) · add95438

Peter Eastman authored Feb 09, 2026

* API for querying devices

* CUDA and HIP implementations of getDevices()

* Fix test failures

* Fix test failures

* CUDA returns correct devices even if no context has been created

* Return a single device for Reference and CPU

* Fix CI failure


30 Jan, 2026 1 commit

Templates can match whole molecules (#5181) · 8eeee16d

Peter Eastman authored Jan 30, 2026

* Templates can match whole molecules

* addExtraParticles() supports molecule templates

* Documentation on molecule templates

* Bug fix


14 Jan, 2026 1 commit
- Fix memory error in PythonForce (#5182) · 421eb42a
  Peter Eastman authored Jan 13, 2026
  
08 Jan, 2026 1 commit
- Fix error reading CONECT records in PDB with multiple models (#5179) · e688b86e
  Peter Eastman authored Jan 07, 2026
  
30 Dec, 2025 1 commit
- Fix typos discovered by codespell in Python files (#5173) · 82703dff
  Christian Clauss authored Dec 30, 2025
  