Commits · 70771a516bef96c0a69e19a4c9eff60aa36f7404 · tsoc / openmm


05 Sep, 2024 3 commits

Improve latencies, handling of streams and events, multi-GPU support · 70771a51

Anton Gorenko authored Aug 25, 2024

Use a small kernel for copying interactionCounts to host memory

    hipMemcpy's CopyDeviceToHost operation has higher latency.

Do not set stream and event blocking/spin related flags

    Let the runtime choose the best option because overriding does not
    improve performance in most cases.

Remove NULL streams and use nonblocking streams explicitly

Make HipContext::pushAsCurrent/popAsCurrent thread-safe as they can be
called simultaneously from different threads via ContextSelector.
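
The thread-safety requirement above can be illustrated with a minimal Python sketch. It is not the HipContext implementation: `DeviceContext`, `push_as_current`, `pop_as_current`, and `context_selector` are hypothetical names, and the "current device" is modelled as a thread-local value rather than a real HIP call. The point is only that state shared between threads is guarded by a lock, while each thread keeps its own stack of previously selected devices.

```python
import threading
from contextlib import contextmanager


class DeviceContext:
    """Hypothetical stand-in for a per-device context (not OpenMM code)."""

    # Models the runtime's per-thread "current device" setting.
    _current = threading.local()

    def __init__(self, device_index):
        self.device_index = device_index
        self._lock = threading.Lock()
        self._saved = {}  # thread id -> stack of previously current devices

    def push_as_current(self):
        previous = getattr(DeviceContext._current, "device", None)
        with self._lock:  # the dict is shared by all threads using this context
            self._saved.setdefault(threading.get_ident(), []).append(previous)
        DeviceContext._current.device = self.device_index

    def pop_as_current(self):
        with self._lock:
            previous = self._saved[threading.get_ident()].pop()
        DeviceContext._current.device = previous


@contextmanager
def context_selector(ctx):
    """Scoped selection: push on entry, always pop on exit."""
    ctx.push_as_current()
    try:
        yield ctx
    finally:
        ctx.pop_as_current()
```

Nested `with context_selector(...)` blocks restore the previously selected device on exit, independently in each thread.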

Allow peer access to be enabled more than once (if there are multiple
simulations one after another, like in benchmark.py).

Create peerCopyStream on a corresponding device

Use two-speed load balancing for multi GPU runs

    The first 100 steps do coarse balancing; the next 100 do fine tuning.
    Also ignore the slowest device (usually 0) if its fraction has
    reached 0 (i.e. no more work can be transferred to other devices) and
    balance the other devices.
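
A rough sketch of the two-speed idea, in Python for brevity. This is an illustration, not the code from this commit: the function name `rebalance`, the step sizes `0.01` and `0.001`, and the "move work from the slowest to the fastest device" rule are all assumptions. Only the 100/100 step split and the rule about ignoring a slowest device whose fraction is 0 come from the message above.

```python
def rebalance(fractions, times, step):
    """Return updated per-device work fractions after one balancing step.

    fractions: share of the work assigned to each device (sums to 1)
    times:     measured time each device took on the last step
    step:      index of the current simulation step
    """
    fractions = list(fractions)
    if step >= 200:
        return fractions  # balancing is finished after both phases

    # Assumed step sizes: coarse for steps 0-99, fine for steps 100-199.
    delta = 0.01 if step < 100 else 0.001

    candidates = list(range(len(fractions)))
    slowest = max(candidates, key=lambda i: times[i])
    if fractions[slowest] <= 0.0:
        # The slowest device has no work left to give away: ignore it
        # and balance the remaining devices among themselves.
        candidates.remove(slowest)
        if len(candidates) < 2:
            return fractions
        slowest = max(candidates, key=lambda i: times[i])

    fastest = min(candidates, key=lambda i: times[i])
    if fastest == slowest:
        return fractions

    moved = min(delta, fractions[slowest])
    fractions[slowest] -= moved
    fractions[fastest] += moved
    return fractions
```

For example, `rebalance([0.0, 0.6, 0.4], [3.0, 2.0, 1.0], step=0)` leaves device 0 at zero and shifts a coarse increment of work from device 1 to device 2.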

Do not download interactionCounts in parallel nonbonded tasks

    This is not required because updateNeighborListSize has been called
    and valid flag changed.

Initialize tilesAfterReorder properly

    It may contain a garbage value; if that value is large,
    updateNeighborListSize does not force atoms to be reordered after
    25 steps in extreme cases.

70771a51

Port changes from the main repository · ecc2d258

Anton Gorenko authored Aug 25, 2024

Use cuCtxPushCurrent() and cuCtxPopCurrent() for selecting CUDA context

    https://github.com/openmm/openmm/pull/3258

Fixed uninitialized memory access

    https://github.com/openmm/openmm/issues/3392
    https://github.com/openmm/openmm/pull/3399

Fixed potential invalid memory access

    See https://github.com/openmm/openmm/pull/3428

Improved temperature reporting for Drude particles

    https://github.com/openmm/openmm/pull/3486
    https://github.com/openmm/openmm/commit/a5e42f5

Fixed race condition with multiple GPUs

    https://github.com/openmm/openmm/commit/6fb1c8a41edff980862750bc086f6a204eb50941

Use blocking sync when creating events

    https://github.com/openmm/openmm/commit/fe21d5ee4f14673a4ea38b7244991772a64ceec2

Very minor optimizations

    https://github.com/openmm/openmm/commit/109f6b2535da4e0c0dd88007d6ca06b4add2ce81

Use PocketFFT

    https://github.com/openmm/openmm/commit/1dac981a63300a2a53a7925f570995914f7163ed

Improved logic for deciding when to reorder atoms

    https://github.com/openmm/openmm/commit/48664a1f1a4490a4dabc277757545ac070e7b898

Ensure valid atom order after loading a checkpoint

    https://github.com/openmm/openmm/commit/a056d5a3754e193105409afa12c9f0c9a2d972a2

Improve performance running on multiple GPUs

    https://github.com/openmm/openmm/commit/0c82c2647de98da5c6dab7bf7a7b8b19705aadc0

Fixed errors when running on multiple GPUs

    https://github.com/openmm/openmm/commit/ed9df876d43c037c08d4762721e73e5caae086d9

Optimized reducing energy

    https://github.com/openmm/openmm/commit/2975f44

ecc2d258

Always use hipRTC, support Windows · b9c45d45

Anton Gorenko authored Aug 25, 2024

* Unload all loaded modules in HipContext's destructor.
  HIP modules keep file descriptors open, but OpenMM never unloaded
  modules, leaking these file descriptors. This can crash
  some scripts, such as test-openmm-platforms from openmmtools.
* ROCm 6.0 defines operator* for complex types (which are typedefs for
  float2 and double2); these conflict with the operators defined for vectors.
  This is fixed in newer ROCm versions.
* Revert HIP_DYNAMIC_SHARED back to extern __shared__ (the macro is
  in the headers).
* Reduce the speed of the HIP platform if there are no HIP devices in
  the system.

b9c45d45

01 Sep, 2024 1 commit

Add hipification of CUDA platform · 89d2ff0e

Anton Gorenko authored Aug 25, 2024

Port changes in CUDA backend to HIP

Fix a warning about arithmetic operations on void* in HipArray::uploadSubArray

Fix "Error Initializing context ROCm 5.3.0"

    https://github.com/StreamHPC/openmm-hip/issues/3

    hipDeviceSetCacheConfig returns hipErrorNotSupported on 5.3

Co-authored-by: Nick Curtis <nicholas.curtis@amd.com>

89d2ff0e

02 Feb, 2024 1 commit

Virtual sites can depend on other virtual sites (#4348) · 71f4b3fc

Peter Eastman authored Feb 02, 2024

* Reference platform supports nested virtual sites

* Common platform supports nested virtual sites

* Fixed force distribution from nested virtual sites

* Fixed test failures
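
To illustrate what "nested" means here, the following Python sketch resolves virtual-site positions when a site's parents may themselves be virtual sites. It is not the Reference or Common platform code: `resolve_positions` is a hypothetical helper, and every site is assumed to be a simple weighted average of its parents, which is only one of the site types OpenMM supports.

```python
def resolve_positions(positions, sites):
    """Compute positions of virtual sites, parents before children.

    positions: dict mapping real particle index -> (x, y, z)
    sites:     dict mapping virtual site index -> list of (parent, weight);
               a parent may be a real particle or another virtual site
    """
    resolved = dict(positions)
    visiting = set()  # sites currently being resolved, to detect cycles

    def resolve(index):
        if index in resolved:
            return resolved[index]
        if index in visiting:
            raise ValueError(f"virtual site {index} depends on itself")
        visiting.add(index)
        pos = [0.0, 0.0, 0.0]
        for parent, weight in sites[index]:
            parent_pos = resolve(parent)  # recurse into nested sites first
            for k in range(3):
                pos[k] += weight * parent_pos[k]
        visiting.discard(index)
        resolved[index] = tuple(pos)
        return resolved[index]

    for index in sites:
        resolve(index)
    return resolved
```

Forces would need the reverse traversal: a force on a nested site is first distributed to its parents, and any parent that is itself a virtual site passes its share on in turn.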

71f4b3fc

13 Apr, 2022 1 commit
- Use blocking sync when creating events (#3561) · fe21d5ee
  Peter Eastman authored Apr 13, 2022
  
  fe21d5ee
04 Oct, 2021 1 commit

Use cuCtxPushCurrent() and cuCtxPopCurrent() for selecting CUDA context (#3258) · c456dd54

Peter Eastman authored Oct 04, 2021

* Use cuCtxPushCurrent() and cuCtxPopCurrent() for selecting CUDA context

* Fixed errors in amoeba code

* Fixed more errors in context selection

c456dd54

03 Sep, 2020 1 commit

CCMA with a small number of constraints uses a single kernel (#2818) · b3d98469

peastman authored Sep 03, 2020

* CCMA with a small number of constraints uses a single kernel

* Fixed compilation errors in kernel

* Fixed compilation errors in kernel

* Further optimizations to CCMA with few constraints

b3d98469

08 Jan, 2020 1 commit

Common compute framework to unify CUDA and OpenCL code (#2488) · edbc8407

peastman authored Jan 08, 2020

* Began creating common compute framework to unify code between CUDA and OpenCL

* Began OpenCL implementation of common compute framework

* Common implementation of CMMotionRemover

* CUDA implementation of common compute interface

* Converted HarmonicBondForce to common compute API

* Converted standard bonded forces to common compute API

* Converted ExpressionUtilities to common compute API

* Created ComputeParameterSet

* Converted custom bonded forces to common compute API

* Converted CustomCentroidBondForce to common compute API

* Converted CustomManyParticleForce to common compute API

* Moved lots of duplicate code from CudaContext and OpenCLContext to ComputeContext

* Converted GayBerneForce to common compute API

* Removed obsolete kernels

* Converted verlet integrators to common compute API

* Converted Langevin and Brownian integrators to common compute API

* Converted CustomIntegrator to common compute API

* Converted CustomNonbondedForce to common compute API

* Removed uses of a deprecated API

* Fixed failing test cases

* Converted GBSAOBCForce to common compute API

* Began converting CustomGBForce to common compute API

* Finished converting CustomGBForce to common compute API

* Merged duplicated code in CudaIntegrationUtilities and OpenCLIntegrationUtilities

* Converted RMSDForce and AndersenThermostat to common compute API

* Converted CustomHbondForce to common compute API

* Merged scripts for encoding kernel sources

* Converted Drude plugin to common compute API

* Fixed errors in CMake scripts

* Attempt at fixing errors on Windows

* Added discussion of common compute API to developer guide

* Added Windows export macro for common classes

* Fixed error in CMMotionRemover

* Updated travis to newer Ubuntu version

* Fixed errors on CPU OpenCL

* Fixed Windows linking errors

* Added missing pragma for 32 bit atomics

* Replaced long long with mm_long

* More fixes to Windows linking

* Bug fix

edbc8407

24 Oct, 2019 3 commits
- Correct CUDA NHC implementation · cca3e11a
  Andy Simmonett authored Aug 27, 2019
  
  cca3e11a
- Add Drude Nose-Hoover capability, with tests · f5df1076
  Andy Simmonett authored May 29, 2019
  
  f5df1076
- Add Nose-Hoover chain serialization on CUDA · 6278ef5f
  Andy Simmonett authored May 14, 2019
  
  6278ef5f
24 Oct, 2018 1 commit
- Fixed a case of uninitialized memory · 6bebfc4a
  Peter Eastman authored Oct 24, 2018
  
  6bebfc4a
03 Jul, 2018 1 commit
- Code cleanup to CUDA platform · b441c4a7
  Peter Eastman authored Jul 03, 2018
  
  b441c4a7
12 Feb, 2018 1 commit
- Began converting CudaArrays. · b8c86406
  Peter Eastman authored Feb 12, 2018
  
  b8c86406
18 Jul, 2017 1 commit
- CUDA implementation of LocalCoordinatesSite depending on arbitrary particles · b2222de3
  Peter Eastman authored Jul 18, 2017
  
  b2222de3
13 Jan, 2017 1 commit
- Eliminated RealOpenMM type · a783b996
  peastman authored Jan 13, 2017
  
  a783b996
04 Nov, 2015 1 commit
- Finished implementing CompoundIntegrator · 5fa6fbc1
  Peter Eastman authored Nov 04, 2015
  
  5fa6fbc1
27 Aug, 2015 1 commit
- Python 2/3 compatibility in single code base, plus python 3 testing on travis. · b7088b74
  peastman authored Aug 10, 2015
  
  b7088b74
20 Aug, 2015 1 commit
- Fixed an unnecessary exception in SETTLE code · 1342158e
  peastman authored Aug 20, 2015
  
  1342158e
25 Mar, 2015 1 commit
- Parallelized extraction of inverse matrix for CCMA · 787869c3
  peastman authored Mar 25, 2015
  
  787869c3
05 Jan, 2015 1 commit
- Attempt to move the implementation of separate osrngseed() calls for each · e04a7368
  Jason Swails authored Jan 05, 2015
```
Context into the platform rather than hacked into getRandomNumberSeed...

Someone please check this.
```
  e04a7368
24 Apr, 2014 1 commit
- Created CUDA and OpenCL implementations of LocalCoordinatesSite · c88213f8
  peastman authored Apr 24, 2014
  
  c88213f8
01 Oct, 2013 1 commit
- Ignore constraints that connect two massless particles (feature request 1915) · 803faa81
  peastman authored Oct 01, 2013
  
  803faa81
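
As a small illustration of the rule in this commit title, the Python sketch below filters a constraint list before it is handed to a solver. The helper name and the `(i, j, distance)` tuple layout are assumptions for the example, not the platform's data structures.

```python
def drop_massless_constraints(constraints, masses):
    """Keep a constraint unless both of its particles are massless.

    constraints: list of (i, j, distance) tuples
    masses:      list of particle masses; 0.0 marks a massless particle
    """
    return [
        (i, j, distance)
        for (i, j, distance) in constraints
        if masses[i] != 0.0 or masses[j] != 0.0
    ]
```

A constraint with one massless and one massive particle is kept; only the case where neither end can move is skipped.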
12 Apr, 2013 1 commit
- When computing kinetic energy, make sure the shifted velocities are... · 820a6baa
  Peter Eastman authored Apr 12, 2013
```
When computing kinetic energy, make sure the shifted velocities are perpendicular to the constraints
```
  820a6baa
08 Apr, 2013 1 commit
- Reduced memory use for random numbers, eliminated unnecessary calculation when... · 54b7eec9
  Peter Eastman authored Apr 08, 2013
```
Reduced memory use for random numbers, eliminated unnecessary calculation when generating random numbers
```
  54b7eec9
05 Apr, 2013 1 commit
- Improved performance of CCMA · c0a43bfc
  Peter Eastman authored Apr 05, 2013
  
  c0a43bfc
22 Mar, 2013 1 commit
- Merged 5.1Optimizations branch back to trunk · 93c467b2
  Peter Eastman authored Mar 22, 2013
  
  93c467b2
26 Feb, 2013 1 commit
- No commit message · eded2c6c
  Yutong Zhao authored Feb 26, 2013
  eded2c6c
18 Dec, 2012 1 commit
- Fixed compilation errors on some older compilers · 6a83f3ee
  Peter Eastman authored Dec 18, 2012
  
  6a83f3ee
14 Dec, 2012 1 commit

When converting to fixed point, multiply by 0x100000000 instead of 0xFFFFFFFF.... · 18355094

Peter Eastman authored Dec 14, 2012

When converting to fixed point, multiply by 0x100000000 instead of 0xFFFFFFFF.  This should be (very very slightly) more accurate, since its reciprocal can be exactly represented in floating point.
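
The claim about the reciprocal can be checked with a short Python sketch using exact rational arithmetic. The helper names are illustrative and the truncating conversion is an assumption; the original code is not shown on this page.

```python
from fractions import Fraction

SCALE_POW2 = 0x100000000  # 2**32, the new multiplier
SCALE_OLD = 0xFFFFFFFF    # 2**32 - 1, the old multiplier


def reciprocal_is_exact(scale):
    """True if the double closest to 1/scale equals 1/scale exactly."""
    return Fraction(1.0 / scale) == Fraction(1, scale)


def to_fixed(x, scale=SCALE_POW2):
    """Convert a float to a fixed-point integer (truncating; assumed)."""
    return int(x * scale)


def from_fixed(n, scale=SCALE_POW2):
    """Convert back by multiplying with the reciprocal of the scale."""
    return n * (1.0 / scale)
```

`1 / 2**32` is a power of two and so is stored exactly in binary floating point, whereas `1 / (2**32 - 1)` has to be rounded, which adds a small error to every conversion back from fixed point.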

18355094

25 Oct, 2012 1 commit
- Kinetic energy is computed by the Integrator so it can adjust for leapfrog displacements · 925b00ec
  Peter Eastman authored Oct 25, 2012
  
  925b00ec
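
For context, in a leapfrog scheme the stored velocities are offset from the positions by half a time step, so a kinetic energy computed directly from them does not correspond to the same instant as the potential energy. The Python sketch below shows one way an integrator could correct for this, assuming the velocities lag by half a step and are shifted forward with a half-step force kick. The function name and this particular correction are assumptions, not a copy of the integrator code.

```python
def kinetic_energy_at_full_step(masses, half_step_velocities, forces, dt):
    """Estimate kinetic energy at time t from velocities stored at t - dt/2.

    Each velocity is advanced by half a step using the current force
    (v + 0.5 * dt * f / m) before the energy is accumulated.
    """
    total = 0.0
    for m, v, f in zip(masses, half_step_velocities, forces):
        if m == 0.0:
            continue  # massless particles carry no kinetic energy
        shifted = [vk + 0.5 * dt * fk / m for vk, fk in zip(v, f)]
        total += 0.5 * m * sum(c * c for c in shifted)
    return total
```

Because the size and direction of the shift depend on the integration scheme, it is natural for the integrator rather than the context to own this calculation.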
18 Oct, 2012 1 commit
- Assorted cleanup and bug fixes · 1de2e2a0
  Peter Eastman authored Oct 18, 2012
  
  1de2e2a0
02 Oct, 2012 1 commit
- Began implementing new mixed precision model that does integration in double... · 9ad85ebd
  Peter Eastman authored Oct 02, 2012
```
Began implementing new mixed precision model that does integration in double precision and force evaluation in single precision
```
  9ad85ebd
28 Sep, 2012 1 commit
- Renamed cuda2 platform to cuda · 58b094ce
  Peter Eastman authored Sep 28, 2012
  
  58b094ce
03 Jul, 2012 1 commit
- Fixed bugs when using double precision · cf112a25
  Peter Eastman authored Jul 03, 2012
  
  cf112a25
28 Jun, 2012 2 commits
- Optimization: use mapped memory to communicate when CCMA is converged · f0c2e89c
  Peter Eastman authored Jun 28, 2012
  
  f0c2e89c
- Merged lots of kernels into fewer files to reduce compilation time · 1afb079f
  Peter Eastman authored Jun 28, 2012
  
  1afb079f
19 Jun, 2012 1 commit
- Continuing to implement new CUDA platform: virtual sites · bf1f6f32
  Peter Eastman authored Jun 19, 2012
  
  bf1f6f32
16 Jun, 2012 1 commit

Continuing to implement new CUDA platform: constraints, LangevinIntegrator,... · ecbbf442

Peter Eastman authored Jun 16, 2012

Continuing to implement new CUDA platform: constraints, LangevinIntegrator, BrownianIntegrator, VariableLangevinIntegrator, VariableVerletIntegrator

ecbbf442