- 01 Sep, 2024 2 commits
-
-
Anton Gorenko authored
* Compile kernels with max block size of 256 threads: The default hipcc behavior since ROCm 4.2 is to compile kernels with 1024 threads unless __launch_bounds__ is specified. This significantly increases register pressure, especially in heavy kernels (double precision, for example), requiring register spilling;
* Optimize computeRange by using multiple blocks for reduction;
* Use blocks of 1024 threads for computeBucketPositions - it is executed as a single work group so larger block size is faster;
* Sort up to lenghtNextPow2 instead of blockDim.x (faster for short buckets);
* Optimize sortShortList2;
* Optimize sortBuckets with bit instructions;
* Decrease bucket size for non-uniform sorting: too many buckets may have sizes too large to sort in shared memory;
* Add more sizes in tests.
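The item about sorting up to `lenghtNextPow2` instead of `blockDim.x` can be illustrated with a small sketch. This is not code from the commit: `nextPow2` and `bitonicStages` are hypothetical helper names, and the stage count assumes a bitonic-style sorting network, which the commit message does not state. It needs C++14 or later for the `constexpr` loop.

```cpp
#include <cstdint>

// Hypothetical helper, not from the commit: smallest power of two >= n.
// Valid for n <= 2^31; larger inputs would overflow the 32-bit result.
constexpr std::uint32_t nextPow2(std::uint32_t n) {
    if (n <= 1) return 1;
    std::uint32_t v = n - 1;
    // Smear the highest set bit into every lower position, then add one.
    v |= v >> 1;
    v |= v >> 2;
    v |= v >> 4;
    v |= v >> 8;
    v |= v >> 16;
    return v + 1;
}

// Hypothetical helper: number of compare-exchange stages in a bitonic
// network over `size` elements, where `size` is a power of two:
// log2(size) * (log2(size) + 1) / 2.
constexpr unsigned bitonicStages(std::uint32_t size) {
    unsigned log2Size = 0;
    while ((std::uint32_t{1} << log2Size) < size) ++log2Size;
    return log2Size * (log2Size + 1) / 2;
}
```

Under these assumptions, a 20-element bucket handled by a 256-thread block is padded to `nextPow2(20)`, which is 32 elements and `bitonicStages(32)` = 15 stages, rather than 256 elements and `bitonicStages(256)` = 36 stages.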
-
Anton Gorenko authored
Port changes in CUDA backend to HIP
Fix a warning about arithmetic operations on void* in HipArray::uploadSubArray
Fix "Error Initializing context ROCm 5.3.0" https://github.com/StreamHPC/openmm-hip/issues/3
hipDeviceSetCacheConfig returns hipErrorNotSupported on 5.3
Co-authored-by: Nick Curtis <nicholas.curtis@amd.com>
-
- 27 Dec, 2021 1 commit
-
-
Peter Eastman authored
* Optimized CudaSort for non-uniformly distributed data
* Optimized OpenCLSort for non-uniformly distributed data
* Further tuned distributing elements between buckets
* Copied optimizations over to OpenCL
-
- 08 Oct, 2020 1 commit
-
-
peastman authored
-
- 16 May, 2018 1 commit
-
-
peastman authored
-
- 03 May, 2018 1 commit
-
-
peastman authored
-
- 12 Feb, 2018 1 commit
-
-
Peter Eastman authored
-
- 12 Sep, 2016 1 commit
-
-
Peter Eastman authored
-
- 14 Nov, 2014 1 commit
-
-
Peter Eastman authored
-
- 22 Mar, 2013 1 commit
-
-
Peter Eastman authored
-
- 28 Sep, 2012 1 commit
-
-
Peter Eastman authored
-
- 16 Jun, 2012 1 commit
-
-
Peter Eastman authored
Continuing to implement new CUDA platform: constraints, LangevinIntegrator, BrownianIntegrator, VariableLangevinIntegrator, VariableVerletIntegrator
-
- 05 Jun, 2012 1 commit
-
-
Peter Eastman authored
-