- 14 Dec, 2025 1 commit
-
-
Anton Gorenko authored
* Remove std::enable_if, warpRotateLeft is always used with TILE_SIZE
* Do not use built-in warpSize in constexpr contexts
  Starting from ROCm 7, warpSize is no longer constexpr. findInteractingBlocks.hip uses it for sizes of __shared__ arrays.
* Check if hipHostMallocNumaUser is allowed before using it
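The warpSize point can be illustrated with a host-side C++ sketch. It is not the code from the commit: `runtimeWarpSize` is a hypothetical stand-in for the built-in `warpSize`, `TileScratch` and `tilesPerWarp` are illustrative names, and a `TILE_SIZE` of 32 is assumed. The idea is that array extents come from a compile-time constant, while the run-time warp size is used only in ordinary arithmetic.

```cpp
#include <cassert>

// Assumption: a compile-time tile width standing in for TILE_SIZE in the kernels.
constexpr int TILE_SIZE = 32;

// Hypothetical stand-in for the built-in warpSize, which is only known at
// run time from ROCm 7 on (64 on wave64 hardware, 32 on wave32 hardware).
int runtimeWarpSize(bool wave64) {
    return wave64 ? 64 : 32;
}

// An array extent must be a constant expression, so it is derived from
// TILE_SIZE rather than from the run-time warp size.
struct TileScratch {
    int values[TILE_SIZE];
};
static_assert(sizeof(TileScratch::values) == TILE_SIZE * sizeof(int),
              "extent is fixed at compile time");

// The run-time warp size can still be used in ordinary arithmetic.
int tilesPerWarp(bool wave64) {
    const int ws = runtimeWarpSize(wave64);
    assert(ws % TILE_SIZE == 0);  // a warp is assumed to hold whole tiles
    return ws / TILE_SIZE;
}
```

In a real kernel the same split would apply to `__shared__` arrays: their sizes need a constant, and anything that depends on the actual wavefront width has to be computed at run time.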
-
- 07 Jun, 2025 1 commit
-
-
Anton Gorenko authored
* Add a workaround for infinite loop in computeNonbonded (HIP)
  computeNonbonded hangs in some tests (without neighbor list). Reproducible on ROCm 6.4 and 6.4.1 (maybe on older versions too) on various architectures (both CDNA and RDNA).
  Affected tests: TestHipATMForce, TestHipMonteCarloBarostat, TestHipNonbondedForce, TestHipVirtualSites.
  Disassembly shows that the compiler splits branches of `if (skipBase+tgx < NUM_TILES_WITH_EXCLUSIONS)` and does `SHFL(skipTiles, TILE_SIZE-1) < pos` checks in them separately, even though `__builtin_amdgcn_ds_bpermute` is a convergent function. Apparently in this case not all lanes participate in each call.
* Simplify includeTile check using ballot
-
- 05 Sep, 2024 2 commits
-
-
Anton Gorenko authored
Skip neighbor list for very small systems
https://github.com/openmm/openmm/pull/4070
Store bounding box sizes in half precision
https://github.com/openmm/openmm/commit/2ae50f9
Use large blocks to optimize building the neighbor list
https://github.com/openmm/openmm/commit/3955033
Improved sorting of blocks when building neighbor list
https://github.com/openmm/openmm/commit/796ffaa
Fixed bug in large blocks optimization with triclinic boxes
https://github.com/openmm/openmm/commit/4c10732
Optimize sorting of non-uniformly distributed data
https://github.com/openmm/openmm/commit/71d9bb1
Co-authored-by: bdenhollander <44237618+bdenhollander@users.noreply.github.com>
-
Anton Gorenko authored
* All AMD GPUs support shuffle, double precision and 64-bit int atomics;
* Remove unused code: !ENABLE_SHUFFLE code paths in nonbonded.hip;
* Use intrinsics in single-precision;
* Use realToFixedPoint (faster float32-to-int64);
* Remove shared atomIndices, use shuffles;
* Check early if atoms are in the cutoff range, sometimes all lanes in a warp can skip computations, single pairs can also skip useless atomics with zero values;
* Remove volatile skipTiles access, use shuffles;
* Distribute work for warps in a strided order;
* Skip warps that may be still busy in the first loop;
* Unify conditions for excluded atoms with `includeInteraction`;
* Move multiprocessors to HipContext;
* Increase number of warps for computeNonbonded;
* Disable packed math for >=MI200 (it affects performance of some kernels like computeGKForces of amoebagk);
* Remove defaultOptimizationOptions and createModule's optimizationFlags as they are never used;
* Support -save-temps.
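As a rough illustration of "distribute work for warps in a strided order", the host-side C++ sketch below compares two ways of assigning tile indices to warps. The function names and the contiguous baseline are assumptions made for the comparison. The actual kernel logic is not shown in this log.

```cpp
#include <vector>

// Contiguous split (assumed baseline): each warp gets one consecutive range of tiles.
std::vector<int> contiguousTiles(int warp, int numWarps, int numTiles) {
    const long long begin = static_cast<long long>(warp) * numTiles / numWarps;
    const long long end = static_cast<long long>(warp + 1) * numTiles / numWarps;
    std::vector<int> tiles;
    for (long long t = begin; t < end; ++t)
        tiles.push_back(static_cast<int>(t));
    return tiles;
}

// Strided split: warp w takes tiles w, w + numWarps, w + 2*numWarps, ...
std::vector<int> stridedTiles(int warp, int numWarps, int numTiles) {
    std::vector<int> tiles;
    for (int t = warp; t < numTiles; t += numWarps)
        tiles.push_back(t);
    return tiles;
}
```

With 4 warps and 10 tiles, warp 1 gets tiles 2, 3, 4 under the contiguous split and tiles 1, 5, 9 under the strided one, so neighbouring tiles end up on different warps.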
-
- 01 Sep, 2024 1 commit
-
-
Anton Gorenko authored
Port changes in CUDA backend to HIP
Fix a warning about arithmetic operations on void* in HipArray::uploadSubArray
Fix "Error Initializing context ROCm 5.3.0"
https://github.com/StreamHPC/openmm-hip/issues/3
hipDeviceSetCacheConfig returns hipErrorNotSupported on 5.3
Co-authored-by: Nick Curtis <nicholas.curtis@amd.com>
-
- 23 May, 2023 1 commit
-
-
Peter Eastman authored
* Skip neighbor list for very small systems
* Fixed typos
* Don't skip box size check when not using neighbor list
* Made test larger to ensure it uses neighbor list
-
- 24 Mar, 2022 1 commit
-
-
Peter Eastman authored
-
- 07 Mar, 2022 1 commit
-
-
Anton Gorenko authored
This change allows the HIP platform to use a faster float-to-int64 conversion.
-
- 04 Mar, 2022 1 commit
-
-
Peter Eastman authored
* Minor optimizations to computing single pairs
* Adjusted MAX_BITS_FOR_PAIRS on Ampere
-
- 10 Sep, 2020 1 commit
-
-
peastman authored
-
- 20 Aug, 2020 1 commit
-
-
peastman authored
* Fixed range overflow with very large numbers of atoms
* More fixes to overflow with large numbers of atoms
* Fix test failures
-
- 01 Jul, 2020 1 commit
-
-
Peter Eastman authored
-
- 15 Sep, 2017 1 commit
-
-
Peter Eastman authored
-
- 13 Oct, 2016 2 commits
-
-
Peter Eastman authored
-
Peter Eastman authored
-
- 14 Sep, 2016 1 commit
-
-
Peter Eastman authored
-
- 02 Sep, 2016 1 commit
-
-
peastman authored
-
- 27 Jul, 2016 1 commit
-
-
Peter Eastman authored
-
- 25 May, 2016 1 commit
-
-
Peter Eastman authored
-
- 02 Oct, 2015 1 commit
-
-
Peter Eastman authored
-
- 21 Sep, 2015 1 commit
-
-
Peter Eastman authored
-
- 07 Jul, 2015 1 commit
-
-
Peter Eastman authored
-
- 05 Jan, 2015 1 commit
-
-
Peter Eastman authored
-
- 13 Oct, 2014 1 commit
-
-
peastman authored
-
- 09 Oct, 2014 1 commit
-
-
peastman authored
-
- 12 Aug, 2014 1 commit
-
-
peastman authored
-
- 27 Jun, 2014 1 commit
-
-
peastman authored
-
- 07 Jan, 2014 1 commit
-
-
peastman authored
-
- 04 Jun, 2013 1 commit
-
-
peastman authored
Converted the array containing atom block indices for the neighbor list from ushort2 to int. This removes the hard limit of 2 million atoms.
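The 2 million figure can be reproduced with a little arithmetic, sketched here in C++. The block size of 32 atoms is an assumption (it is not stated in the commit message), and `maxAtoms` is a hypothetical helper. The point is that a 16-bit block index can address at most 65536 blocks.

```cpp
#include <cstdint>
#include <limits>

// Assumption: each atom block holds 32 atoms.
constexpr std::int64_t ATOMS_PER_BLOCK = 32;

// Largest atom count addressable when block indices run from 0 to maxBlockIndex.
constexpr std::int64_t maxAtoms(std::int64_t maxBlockIndex) {
    return (maxBlockIndex + 1) * ATOMS_PER_BLOCK;
}

// One 16-bit component of a ushort2 versus a 32-bit int block index.
constexpr std::int64_t LIMIT_USHORT =
    maxAtoms(std::numeric_limits<std::uint16_t>::max());
constexpr std::int64_t LIMIT_INT =
    maxAtoms(std::numeric_limits<std::int32_t>::max());

static_assert(LIMIT_USHORT == 2097152, "65536 blocks of 32 atoms");
```

Under that assumption the 16-bit limit is 2,097,152 atoms, which matches the "2 million" in the message, while a 32-bit index pushes the limit far beyond any practical system size.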
-
- 29 May, 2013 1 commit
-
-
Yutong Zhao authored
-
- 24 May, 2013 3 commits
-
-
peastman authored
-
Yutong Zhao authored
-
Yutong Zhao authored
-
- 22 May, 2013 1 commit
-
-
Yutong Zhao authored
-
- 16 May, 2013 1 commit
-
-
Yutong Zhao authored
-
- 19 Apr, 2013 1 commit
-
-
Yutong Zhao authored
Fixes a hard-to-catch bug where a boundingBoxSize grows between timesteps without triggering a rebuild of the neighbor list. This affects the use of singlePeriodicCopy in the nonbonded force.
-
- 10 Apr, 2013 1 commit
-
-
Peter Eastman authored
-
- 22 Mar, 2013 1 commit
-
-
Peter Eastman authored
-
- 14 Dec, 2012 1 commit
-
-
Peter Eastman authored
When converting to fixed point, multiply by 0x100000000 instead of 0xFFFFFFFF. This should be (very very slightly) more accurate, since its reciprocal can be exactly represented in floating point.
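The claim about the reciprocal can be checked with a small C++ sketch. The helper names (`toFixed`, `fromFixed`, `reciprocalIsExact`) are hypothetical and double precision is used for simplicity, so this only illustrates the arithmetic. It is not the conversion code from the kernels. `std::fma` computes `scale * r - 1` with a single rounding, so the result is zero only when the stored reciprocal is exact.

```cpp
#include <cmath>
#include <cstdint>

constexpr double SCALE_POW2 = 4294967296.0;  // 0x100000000 = 2^32
constexpr double SCALE_OLD = 4294967295.0;   // 0xFFFFFFFF = 2^32 - 1

// Convert a real value to 64-bit fixed point (truncating toward zero).
inline std::int64_t toFixed(double x, double scale) {
    return static_cast<std::int64_t>(x * scale);
}

// Convert back by multiplying with the reciprocal of the scale.
inline double fromFixed(std::int64_t v, double scale) {
    return static_cast<double>(v) * (1.0 / scale);
}

// True when the floating-point value of 1/scale is the exact reciprocal.
inline bool reciprocalIsExact(double scale) {
    const double r = 1.0 / scale;
    return std::fma(scale, r, -1.0) == 0.0;
}
```

Because 2^32 is a power of two, its reciprocal 2^-32 is stored exactly, and converting back from fixed point adds no rounding error of its own. The reciprocal of 2^32 - 1 has no finite binary expansion, so it is always rounded.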
-
- 05 Oct, 2012 1 commit
-
-
Peter Eastman authored
-