Commits · 4ab645ea6911bf29e9ec6fd13b50ba1e196cbc40 · tsoc / openmm

10 Feb, 2026 1 commit

GPU implementation of L-BFGS (#5198) · 4ab645ea

Evan Pretti authored Feb 10, 2026

* Make reference/CPU minimizer into a kernel

* Add per-platform support for GPU minimization

* Initial implementation of GPU minimization

* Fixes

* Increase robustness when initial gradient is huge

* Handle overflow leading to non-finite values gracefully

* Handle large forces in single precision more robustly

* Optimize kernels

* Fix kernel launch size

* Update banner years

* Don't create MinimizeKernel until first minimization requested

* Make some compile-time constants into kernel arguments

* Consolidate scale calculation kernel

* Condense alpha/beta reduction kernels using atomics

* Condense line search dot kernels with reductions

* Remove a download, and download grad norm separately

* Asynchronously check lbfgs convergence condition

* Restructure line search to avoid download waiting

* Start line search preemptively in case CPU evaluation is not needed

* In rare cases, constraint error might not decrease after one optimization round

* Better handling of unsupported 64-bit atomics, use FLT_MAX

* Pick gradient mode based on GPU vs. CPU evaluation

* Rework getDiff/getScale reduction, remove reduceBuffer

* Older CUDA might not like float hex literals

* Fix error in a comment

4ab645ea