- 10 Feb, 2026 1 commit
-
-
Evan Pretti authored
* Make reference/CPU minimizer into a kernel
* Add per-platform support for GPU minimization
* Initial implementation of GPU minimization
* Fixes
* Increase robustness when initial gradient is huge
* Handle overflow leading to non-finite values gracefully
* Handle large forces in single precision more robustly
* Optimize kernels
* Fix kernel launch size
* Update banner years
* Don't create MinimizeKernel until first minimization requested
* Make some compile-time constants into kernel arguments
* Consolidate scale calculation kernel
* Condense alpha/beta reduction kernels using atomics
* Condense line search dot kernels with reductions
* Remove a download, and download grad norm separately
* Asynchronously check lbfgs convergence condition
* Restructure line search to avoid download waiting
* Start line search preemptively in case CPU evaluation is not needed
* In rare cases, constraint error might not decrease after one optimization round
* Better handling of unsupported 64-bit atomics, use FLT_MAX
* Pick gradient mode based on GPU vs. CPU evaluation
* Rework getDiff/getScale reduction, remove reduceBuffer
* Older CUDA might not like float hex literals
* Fix error in a comment
-
- 11 Dec, 2025 1 commit
-
-
Evan Pretti authored
* Basic LCPO support
* Add basic test for LCPO from a prmtop file
* API for LCPOForce
* Started LCPO reference implementation
* Finished reference forces & test cases
* Use other test for finite difference since grid might have discontinuous forces
* Reference platform formatting
* Initial implementation of CPU platform
* Bugfixes
* More vectorization and improve neighbor list query speed
* Parallelize part of neighbor search
* Check box size for LCPO with periodic boundary conditions
* Fixes for updating parameters in context
* GBSAOBCForce doesn't use first & last indices for updates, so no need for this optimization here
* Changes to neighbor checking and optimization
* Fixes and minor changes
* Add global surface tension parameter
* Only process half of the pairs in the neighbor list
* Remove unnecessary checks
* Initial version of common platform implementation
* Asynchronously download neighbor list size
* Debugging
* Do pair precomputation in copyPairsToNeighborList
* Recompute interactions instead of scanning neighbor list in inner loop
* Condense position array before computations
* Also make neighbor count download asynchronous on device
* Fixes for kernel launching
* Topology-based LCPO parameter assignment
* Fixes, and use test system for LCPO with nucleic acids
* Always raise instead of warn when LCPO parameters can't be assigned
* Use Amber convention for phosphates
-
- 23 Sep, 2025 1 commit
-
-
Evan Pretti authored
* Replace SimTK-containing file headers
* Update file headers for new Tinker reader files added
-
- 12 Sep, 2025 1 commit
-
-
Evan Pretti authored
* Initial implementation of C++ API
* Add kernel interface and information for API generation
* API updates for updating electrode parameters
* Add serialization proxy for ConstantPotentialForce
* Update file headers
* Add CG error tolerance and fix units on getCharges() return value
* Initial implementation of matrix solver
* Fixes and conjugate gradient solver
* Try to fix Linux and Windows builds
* Make sure charge constraint target is on total charge
* Restore handling of exceptions like NonbondedForce since they won't involve electrode atoms
* Ameliorate numerical instability in constrained conjugate gradient
* Fix uninitialized pointers, memory leak, and style
* Set CG tolerance units in Python API
* Test ConstantPotentialForce serialization
* Read/write ExceptionsUsePeriodicBoundaryConditions as bool
* Improve constrained conjugate gradient robustness to roundoff error accumulation
* Recompute matrix if electrode atoms move due to setPositions()
* Tolerance is now in gradient (potential) units again
* Add neutralizing background correction
* Add Python API tests
* Fixes for CG and nonbonded exceptions
* Add initial tests checking against existing NonbondedForce behavior
* Expand test suite and fix some implementation issues
* Add additional tests using larger reference system
* Add Gaussian test
* Finish test against reference computation
* CPU platform implementation
* Fixes for compilation on some platforms
* Fixes for constant potential with AVX/AVX2
* Test linking CPU PME library to constant potential test directly
* Older SWIG versions don't support Python set to C++ set conversion
* Add user guide entry
* Increase speed of reference test
* Conditional building constant potential CPU test is unreliable
* Debugging
* Miscellaneous fixes and improvements for CI
* Cache charges so solver will not run if system and coordinates have not changed
* Preconditioner flag, stability, and automatic detection improvements
* Add GPU platform-specific constant potential kernel classes
* PME and device-host I/O changes to support constant potential
* Initial common constant potential implementation
* Constant potential fixes:
* Fix preconditioner PME position/charge save/restore logic
* Fix reduction synchronization in constant potential solver kernels
* Add double-float accumulation for conjugate gradient solver when double unsupported by hardware
* Improve conditioning of a test system, and make sure particles are in or out of cutoff for consistency and ease of comparing between platforms
* Reorder guess charges for CG when atom reordering changes positions
* Remove PME queue for now
* Trying to debug optimized direct space derivative kernel
* Remove extraneous debugging lines
* Style updates; just make CPU preconditioner double precision
* Debugging updated optimized direct derivatives kernel for all but OpenCL CPU
* OpenCL CPU implementation of direct space derivatives, and cleanup
* Try to make test even shorter to not time out on CI
* Temporary - Debugging
* Debugging
* Debugging
* Debugging
* Debugging
* Remove debugging code and fix reduction synchronization
* Fix other reductions
* Debugging - are tests hanging or just slow on CI?
* Debugging
* Debugging
* Fix macro for case when double precision is available on hardware
* Remove changes for debugging again
* Try to improve matrix solver cache locality by uploading transpose
* Fixes for atom ordering and periodic images
* Can't rely on reorder listener for cell offset updates
* Test reducing number of contexts and timing for CI
* Debugging
* Remove timing code and revert debugging changes
* Matrix solver and plasma term optimizations
* Reduce CG solver kernel calls and downloads
* Don't read back convergence flag from global memory
* Update PME due to refactoring in master branch
* Faster matrix solver (1st step)
* Faster matrix solver for CUDA
* Faster matrix solver compatibility with non-CUDA platforms
* Matrix solver fixes
* Use warp shuffle reductions when possible
* Attempt to work around intermittent compiler crash in Intel CPU OpenCL
* Optimize CG solver kernel 1
* Rework CG solver so some kernels can use more than 1 block
* Don't run out of shared memory
* Asynchronously download convergence flag while clearing buffers

---------

Co-authored-by: Evan Pretti <pretti@sh03-17n15.int>
-
- 10 Sep, 2024 1 commit
-
-
Peter Eastman authored
* Unified lots of parallel computation code between platforms
* Unified test code between platforms
* Eliminated duplicated timing code
-
- 16 Sep, 2023 1 commit
-
-
Peter Eastman authored
* Implemented CustomCPPForceImpl
* Documentation for CustomCPPForceImpl
* Attempt at fixing Windows compilation error
* Improved documentation
-
- 20 Aug, 2020 1 commit
-
-
peastman authored
* Fixed range overflow with very large numbers of atoms
* More fixes to overflow with large numbers of atoms
* Fix test failures
-
- 14 Feb, 2020 1 commit
-
-
Peter Eastman authored
-
- 21 Oct, 2019 1 commit
-
-
peastman authored
-
- 26 Jan, 2017 1 commit
-
-
Peter Eastman authored
-
- 13 Jan, 2017 1 commit
-
-
peastman authored
-
- 02 May, 2016 1 commit
-
-
peastman authored
-
- 26 Feb, 2016 1 commit
-
-
peastman authored
-
- 15 Jan, 2016 1 commit
-
-
Peter Eastman authored
-
- 04 Nov, 2015 1 commit
-
-
Peter Eastman authored
-
- 24 Sep, 2015 1 commit
-
-
peastman authored
-
- 23 Sep, 2015 2 commits
-
-
Peter Eastman authored
-
Peter Eastman authored
-
- 22 Sep, 2015 2 commits
-
-
Peter Eastman authored
-
Peter Eastman authored
-
- 21 Sep, 2015 1 commit
-
-
Peter Eastman authored
-
- 03 Sep, 2015 1 commit
-
-
peastman authored
-
- 27 Aug, 2015 1 commit
-
-
peastman authored
-
- 12 Aug, 2015 1 commit
-
-
Peter Eastman authored
-
- 07 Jul, 2015 1 commit
-
-
Peter Eastman authored
-
- 24 Feb, 2015 1 commit
-
-
peastman authored
-
- 08 Jan, 2015 1 commit
-
-
Peter Eastman authored
-
- 19 Dec, 2014 1 commit
-
-
peastman authored
-
- 17 Dec, 2014 1 commit
-
-
peastman authored
-
- 16 Dec, 2014 1 commit
-
-
peastman authored
-
- 10 Dec, 2014 1 commit
-
-
peastman authored
-
- 26 Nov, 2014 1 commit
-
-
peastman authored
-
- 13 Oct, 2014 1 commit
-
-
peastman authored
-
- 06 Oct, 2014 1 commit
-
-
peastman authored
-
- 04 Sep, 2014 1 commit
-
-
peastman authored
-
- 29 Aug, 2014 1 commit
-
-
peastman authored
CustomManyParticleForce offers two different "permutation modes". Implemented it for Reference and CPU platforms.
-
- 22 Aug, 2014 1 commit
-
-
peastman authored
-
- 15 Aug, 2014 1 commit
-
-
peastman authored
-
- 14 Aug, 2014 1 commit
-
-
peastman authored
-
- 25 Jul, 2014 1 commit
-
-
peastman authored
-