- 09 Jul, 2025 1 commit
-
-
Peter Eastman authored
* Fixed bug in computing pressure (#4980)
* Add function parsing to custom multiparticle force (#4986)
* Merge pull request #4989 from epretti/fix-charmm-water-models
  Update CHARMM36 2024 and add water models
---------
Co-authored-by: Ezra Greenberg <120955867+egreenberg7@users.noreply.github.com>
Co-authored-by: Evan Pretti <pretti@stanford.edu>
-
- 07 Jun, 2025 2 commits
-
-
Anton Gorenko authored
* Use fixed point spread charge on RDNA4 as it is faster
  Even though RDNA4 (gfx12) has global_atomic_add_f32, micro-benchmarks and OpenMM benchmarks show that it is very slow compared to global_atomic_add_u64.
* Add a workaround for fixed point gridSpreadCharge on RDNA4
  Workaround for rare cases in which a few values of pmeGrid are very large and incorrect. The cause is unknown. It is also unknown why this workaround, or other unrelated changes such as adding a printf, helps.
-
Anton Gorenko authored
* Add a workaround for infinite loop in computeNonbonded (HIP)
  computeNonbonded hangs in some tests (without neighbor list). Reproducible on ROCm 6.4 and 6.4.1 (maybe on older versions too) on various architectures (both CDNA and RDNA).
  Affected tests: TestHipATMForce, TestHipMonteCarloBarostat, TestHipNonbondedForce, TestHipVirtualSites.
  Disassembly shows that the compiler splits branches of `if (skipBase+tgx < NUM_TILES_WITH_EXCLUSIONS)` and does `SHFL(skipTiles, TILE_SIZE-1) < pos` checks in them separately, even though `__builtin_amdgcn_ds_bpermute` is a convergent function. Apparently in this case not all lanes participate in each call.
* Simplify includeTile check using ballot
-
- 02 Jun, 2025 1 commit
-
-
Peter Eastman authored
-
- 25 May, 2025 1 commit
-
-
Peter Eastman authored
-
- 24 May, 2025 1 commit
-
-
Emilio Gallicchio authored
* set box vectors of the inner contexts before atom reordering
* test for changing box vectors
-
- 23 May, 2025 1 commit
-
-
Peter Eastman authored
* Optimized setPositions() and setVelocities()
* Fix test failures
-
- 20 May, 2025 1 commit
-
-
Pier Fiedorowicz authored
* Fix GPU memory leak
* Undo CUDA change
-
- 05 May, 2025 1 commit
-
-
Peter Eastman authored
* Use common API for kernels
* More code uses common interface
* Bug fixes
* Unified interface for sorting
* Simplified interface for FFT
* Use common event API for synchronization
* Minor changes to make code more consistent between platforms
* Common implementation of NonbondedForce
* Bug fixes
* Flag to enable list of single pairs
* CUDA and OpenCL use common implementation of NonbondedForce
* Fixed compilation error
* HIP uses common implementation of NonbondedForce
-
- 02 May, 2025 1 commit
-
-
Peter Eastman authored
-
- 28 Apr, 2025 2 commits
-
-
Peter Eastman authored
* Unified interface for queues
* Simplified stream handling in CudaFFT3D
* HIP implementation of ComputeQueue
-
Peter Eastman authored
* Added computeCurrentPressure() to MonteCarloBarostat
* Use instantaneous temperature to compute pressure
* Added computeCurrentPressure() to MonteCarloAnisotropicBarostat
* Added computeCurrentPressure() to MonteCarloMembraneBarostat
* Fixed compilation error
* Fixed error in typemap
* Added documentation on computing pressure
* Fixed CUDA compilation errors
* Made test case more robust
* Made a test case more robust
* Added computeCurrentPressure() to MonteCarloFlexibleBarostat
* Fixed compilation error
* More documentation on computing pressure
-
- 25 Apr, 2025 1 commit
-
-
Peter Eastman authored
* Unified interface for FFTs
* AMOEBA uses unified interface for FFTs
* HIP implementation of common FFT interface
-
- 23 Apr, 2025 1 commit
-
-
Peter Eastman authored
* Add correction for self energy of neutralizing plasma
* Fixed compilation errors
* Update total charge in copyParametersToContext()
* Bug fixes
* Fixed compilation errors in HIP
* Bug fix
-
- 14 Apr, 2025 1 commit
-
-
Peter Eastman authored
* Created DPDIntegrator class
* Reference implementation of DPDIntegrator
* Build neighbor list for DPDIntegrator
* Minor fixes
* Documentation for DPDIntegrator
* Python API for DPDIntegrator
* Preliminary OpenCL implementation of DPDIntegrator
* Enable USE_PERIODIC
* Use updated positions in DPD thermostat
* Working on neighbor list for OpenCL DPDIntegrator
* ReorderListener for particle types
* Serialization for DPDIntegrator
* CUDA implementation of DPDIntegrator
* HIP implementation of DPDIntegrator
* Fixed compile error in Python wrapper
* Fixed compile error in wrappers
* Fixed uninitialized memory in reference neighbor list
* Added DPDIntegrator to C++ API docs
* Fixed incorrect launch size
* Fixed nan in DPD random number generator
* Minor optimizations
* Improved load balancing
* Fixed an indexing error
* Neighbor list uses the maximum cutoff of any force
* Fixed HIP compilation error
* Fixed access to invalid memory
* Added test case for diffusion coefficient
* Try to debug segfaults on CI
* Debugging
* Debugging
* Debugging
* Debugging
* Debugging
* Debugging
* Possible fix
* Debugging
* Debugging
* Debugging
* Use correct block size on CPU OpenCL
* Workaround for bug in Intel's OpenCL for CPUs
* Removed an unnecessary define
* Removed debugging code
* Include Dart
* More Intel workarounds
* Workaround for error in NVIDIA OpenCL
-
- 21 Mar, 2025 1 commit
-
-
Peter Eastman authored
-
- 14 Mar, 2025 1 commit
-
-
Peter Eastman authored
* Began splitting CommonKernels into multiple files
* Moved two more kernels into separate files
* Moved two integrators into separate files
* Fix compilation error on Windows
-
- 11 Mar, 2025 1 commit
-
-
Emilio Gallicchio authored
* reset overflowed state energies at the alchemical endpoints
* address formatting, complete clash test
* Fixed indentation
---------
Co-authored-by: Peter Eastman <peter.eastman@gmail.com>
-
- 10 Mar, 2025 1 commit
-
-
Peter Eastman authored
* Replace pthreads with C++ threads
* Try to fix CI errors
* Try including -pthread linker option
-
- 05 Mar, 2025 1 commit
-
-
Peter Eastman authored
-
- 04 Mar, 2025 1 commit
-
-
Peter Eastman authored
-
- 13 Jan, 2025 1 commit
-
-
Peter Eastman authored
-
- 16 Dec, 2024 1 commit
-
-
Peter Eastman authored
-
- 27 Nov, 2024 1 commit
-
-
Peter Eastman authored
* CPU platform checkpoints random number generator
* Fix Windows compilation error
* Another Windows compilation error
-
- 26 Nov, 2024 1 commit
-
-
Peter Eastman authored
* Use Intel OpenCL for CI
* Set environment variables
* Try to get CI to run
* Debugging
* Debugging
* Fixes for Intel OpenCL
-
- 22 Nov, 2024 1 commit
-
-
Peter Eastman authored
* updateParametersInContext() can modify parameter offsets
* Reordering respects parameter offsets
* Implemented for CUDA and HIP
-
- 11 Nov, 2024 1 commit
-
-
Peter Eastman authored
* Reduced memory use while identifying molecule groups
* Further reduce memory use
-
- 01 Nov, 2024 1 commit
-
-
Peter Eastman authored
-
- 09 Oct, 2024 1 commit
-
-
Peter Eastman authored
-
- 23 Sep, 2024 1 commit
-
-
Anton Gorenko authored
* PME_ORDER threads process one atom;
* PME_ORDER threads access consecutive addresses;
* No need to permute z indices with zindexTable;
* finishSpreadCharge is needed only with fixed point charge spreading;
-
- 10 Sep, 2024 3 commits
-
-
Peter Eastman authored
-
Peter Eastman authored
-
Peter Eastman authored
* Unified lots of parallel computation code between platforms
* Unified test code between platforms
* Eliminated duplicated timing code
-
- 06 Sep, 2024 1 commit
-
-
Peter Eastman authored
* Optimize CustomNonbondedForce.updateParametersInContext()
* Optimized uploading changed values to GPU
* Optimized updateParametersInContext() for lots of bonded forces
* Optimized updateParametersInContext() for CustomExternalForce
* Optimized updateParametersInContext() for NonbondedForce
* Code changes for HIP platform
-
- 05 Sep, 2024 6 commits
-
-
peastman authored
-
Peter Eastman authored
-
Anton Gorenko authored
CustomCPPForceImpl for writing forces in C++
https://github.com/openmm/openmm/commit/9a0db72
https://github.com/openmm/openmm/pull/4231

Virtual sites can depend on other virtual sites
https://github.com/openmm/openmm/commit/71f4b3f

Use LF-Middle for LangevinIntegrator and VariableLangevinIntegrator
https://github.com/openmm/openmm/commit/86988b9

Merged more code into common platform
https://github.com/openmm/openmm/commit/5739788
* Common implementation of BondedUtilities
* Common implementation of UpdateStateDataKernel

Fixed periodic box changing from rectangular to triclinic
https://github.com/openmm/openmm/commit/75d4f29
-
Anton Gorenko authored
Skip neighbor list for very small systems
https://github.com/openmm/openmm/pull/4070

Store bounding box sizes in half precision
https://github.com/openmm/openmm/commit/2ae50f9

Use large blocks to optimize building the neighbor list
https://github.com/openmm/openmm/commit/3955033

Improved sorting of blocks when building neighbor list
https://github.com/openmm/openmm/commit/796ffaa

Fixed bug in large blocks optimization with triclinic boxes
https://github.com/openmm/openmm/commit/4c10732

Optimize sorting of non-uniformly distributed data
https://github.com/openmm/openmm/commit/71d9bb1

Co-authored-by: bdenhollander <44237618+bdenhollander@users.noreply.github.com>
-
Anton Gorenko authored
Co-authored-by: Emilio Gallicchio <emilio.gallicchio@gmail.com>
-
Anton Gorenko authored
Use a small kernel for copying interactionCounts to host memory
hipMemcpy's CopyDeviceToHost operation has higher latency.

Do not set stream and event blocking/spin related flags
Let the runtime choose the best option because overriding does not improve performance in most cases.

Remove NULL streams and use nonblocking streams explicitly

Make HipContext::pushAsCurrent/popAsCurrent thread-safe, as they can be called simultaneously from different threads via ContextSelector.

Allow peer access to be enabled more than once (if there are multiple simulations one after another, like in benchmark.py).

Create peerCopyStream on a corresponding device

Use two-speed load balancing for multi GPU runs
The first 100 steps do coarse balancing, the next 100 do fine tuning. Also ignore the slowest device (usually 0) if its fraction has reached 0 (i.e. no work can be transferred to other devices) and balance the other devices.

Do not download interactionCounts in parallel nonbonded tasks
This is not required because updateNeighborListSize has been called and the valid flag changed.

Initialize tilesAfterReorder properly
It may contain a garbage value, and if it is large then updateNeighborListSize does not force atom reordering after 25 steps in extreme cases.
-