- 01 Sep, 2024 2 commits
-
-
Anton Gorenko authored
* Compile kernels with a max block size of 256 threads: since ROCm 4.2, hipcc by default compiles kernels for 1024 threads unless __launch_bounds__ is specified. This significantly increases register pressure, especially in heavy kernels (double precision, for example), and forces register spilling.
* Optimize computeRange by using multiple blocks for the reduction.
* Use blocks of 1024 threads for computeBucketPositions: it is executed as a single work group, so a larger block size is faster.
* Sort up to lenghtNextPow2 instead of blockDim.x (faster for short buckets).
* Optimize sortShortList2.
* Optimize sortBuckets with bit instructions.
* Decrease bucket size for non-uniform sorting: too many buckets may have sizes too large to sort in shared memory.
* Add more sizes in tests.
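The "sort up to the next power of two" item can be illustrated with a small host-side sketch. It assumes the in-block sort is a bitonic-style network, which this log does not confirm; the helper names `nextPow2` and `bitonicPasses` are hypothetical and are not taken from the commit.

```cpp
#include <cstdint>

// Hypothetical helper: smallest power of two >= n, computed with bit
// operations only. Valid for 1 <= n <= 2^31; larger inputs wrap to 0.
inline std::uint32_t nextPow2(std::uint32_t n) {
    if (n <= 1) return 1;
    n -= 1;            // so that exact powers of two map to themselves
    n |= n >> 1;       // smear the highest set bit into all lower bits
    n |= n >> 2;
    n |= n >> 4;
    n |= n >> 8;
    n |= n >> 16;
    return n + 1;
}

// Number of compare-exchange passes in a bitonic sorting network over
// `len` elements, where len = 2^k: k * (k + 1) / 2.
// Assumption: the kernel uses a bitonic-style network.
inline std::uint32_t bitonicPasses(std::uint32_t len) {
    std::uint32_t k = 0;
    while ((1u << k) < len) ++k;
    return k * (k + 1) / 2;
}
```

Under that assumption, a bucket of 5 elements padded to 8 needs 6 passes, while padding it to a 256-thread block would need 36, which is consistent with the "faster for short buckets" note.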
-
Anton Gorenko authored
* Port changes in CUDA backend to HIP * Fix a warning about arithmetic operations on void* in HipArray::uploadSubArray * Fix "Error Initializing context ROCm 5.3.0" (https://github.com/StreamHPC/openmm-hip/issues/3): hipDeviceSetCacheConfig returns hipErrorNotSupported on 5.3 Co-authored-by: Nick Curtis <nicholas.curtis@amd.com>
-
- 11 Dec, 2023 1 commit
-
-
Peter Eastman authored
* Improved sorting of blocks when building neighbor list * Improved block sorting for OpenCL * Made sort keys more evenly distributed
-
- 24 Jul, 2023 1 commit
-
-
Peter Eastman authored
* Use large blocks to optimize building the neighbor list * Large blocks optimization for OpenCL * Fix test failures * Select whether to use large blocks based on system size
-
- 23 May, 2023 1 commit
-
-
Peter Eastman authored
* Skip neighbor list for very small systems * Fixed typos * Don't skip box size check when not using neighbor list * Made test larger to ensure it uses neighbor list
-
- 02 Mar, 2023 1 commit
-
-
Anton Gorenko authored
It may contain a garbage value, and if that value is large, updateNeighborListSize does not force the atoms to be reordered after 25 steps in extreme cases.
-
- 12 Aug, 2022 1 commit
-
-
Peter Eastman authored
-
- 13 Apr, 2022 1 commit
-
-
Peter Eastman authored
-
- 24 Mar, 2022 1 commit
-
-
Peter Eastman authored
-
- 04 Mar, 2022 1 commit
-
-
Peter Eastman authored
* Minor optimizations to computing single pairs * Adjusted MAX_BITS_FOR_PAIRS on Ampere
-
- 27 Jan, 2022 1 commit
-
-
Peter Eastman authored
* Fixed potential invalid memory access * Fixed exception
-
- 27 Dec, 2021 1 commit
-
-
Peter Eastman authored
* Optimized CudaSort for non-uniformly distributed data * Optimized OpenCLSort for non-uniformly distributed data * Further tuned distributing elements between buckets * Copied optimizations over to OpenCL
-
- 22 May, 2021 1 commit
-
-
Peter Eastman authored
* Began converting AMOEBA to common platform * Beginning of OpenCL platform for AMOEBA * Converted AmoebaVdwForce to common platform * Cleaned up reference AMOEBA tests * Began converting AmoebaMultipoleForce to common platform * Continue converting AmoebaMultipoleForce to common platform * Bug fixes * Bug fix * Continue converting AmoebaMultipoleForce to common platform * Converting AmoebaMultipoleForce and AmoebaGeneralizedKirkwoodForce to common platform * Converting AmoebaMultipoleForce and AmoebaGeneralizedKirkwoodForce to common platform * Creating OpenCL version of AmoebaMultipoleForce and AmoebaGeneralizedKirkwoodForce * Creating OpenCL version of AmoebaMultipoleForce and AmoebaGeneralizedKirkwoodForce * Creating OpenCL version of AmoebaMultipoleForce and AmoebaGeneralizedKirkwoodForce * Converted arrays from real3 to real * Bug fix to OpenCL AmoebaGeneralizedKirkwoodForce * Fixes for AMD GPUs * Began converting HippoNonbondedForce to common platform * Continuing to convert HippoNonbondedForce to common platform * Continuing to convert HippoNonbondedForce to common platform * Working on unifying PME kernels * Fixed error on devices without 64 bit atomics * Unified PME kernels * Converted HippoNonbondedForce to common platform * Creating OpenCL implementation of HippoNonbondedForce * Continuing OpenCL implementation of HippoNonbondedForce * Mostly finished OpenCL implementation of HippoNonbondedForce * Eliminated three component vector types in host code * Fix errors on CPU OpenCL * Skip double precision tests for AMOEBA on OpenCL * Bug fixes * Bug fixes * Fixed compilation error
-
- 18 Feb, 2021 1 commit
-
-
Peter Eastman authored
-
- 25 Sep, 2020 1 commit
-
-
peastman authored
-
- 10 Sep, 2020 1 commit
-
-
peastman authored
-
- 20 Aug, 2020 1 commit
-
-
peastman authored
* Fixed range overflow with very large numbers of atoms * More fixes to overflow with large numbers of atoms * Fix test failures
-
- 08 Jan, 2020 1 commit
-
-
peastman authored
* Began creating common compute framework to unify code between CUDA and OpenCL * Began OpenCL implementation of common compute framework * Common implementation of CMMotionRemover * CUDA implementation of common compute interface * Converted HarmonicBondForce to common compute API * Converted standard bonded forces to common compute API * Converted ExpressionUtilities to common compute API * Created ComputeParameterSet * Converted custom bonded forces to common compute API * Converted CustomCentroidBondForce to common compute API * Converted CustomManyParticleForce to common compute API * Moved lots of duplicate code from CudaContext and OpenCLContext to ComputeContext * Converted GayBerneForce to common compute API * Removed obsolete kernels * Converted verlet integrators to common compute API * Converted Langevin and Brownian integrators to common compute API * Converted CustomIntegrator to common compute API * Converted CustomNonbondedForce to common compute API * Removed uses of a deprecated API * Fixed failing test cases * Converted GBSAOBCForce to common compute API * Began converting CustomGBForce to common compute API * Finished converting CustomGBForce to common compute API * Merged duplicated code in CudaIntegrationUtilities and OpenCLIntegrationUtilities * Converted RMSDForce and AndersenThermostat to common compute API * Converted CustomHbondForce to common compute API * Merged scripts for encoding kernel sources * Converted Drude plugin to common compute API * Fixed errors in CMake scripts * Attempt at fixing errors on Windows * Added discussion of common compute API to developer guide * Added Windows export macro for common classes * Fixed error in CMMotionRemover * Updated Travis to newer Ubuntu version * Fixed errors on CPU OpenCL * Fixed Windows linking errors * Added missing pragma for 32 bit atomics * Replaced long long with mm_long * More fixes to Windows linking * Bug fix
-
- 09 Apr, 2019 1 commit
-
-
peastman authored
* Created API for HIPPO force field * Beginning of reference implementation of HIPPO * Continuing reference implementation of HIPPO * Continuing reference implementation of HIPPO * Continuing reference implementation of HIPPO * Continuing reference implementation of HIPPO * Continuing reference implementation of HIPPO * Continuing reference implementation of HIPPO * Continuing reference implementation of HIPPO * Completed reference of HIPPO with no cutoff * Beginning cutoffs/PME for reference implementation of HIPPO * Continuing PME for reference implementation of HIPPO * Continuing PME for reference implementation of HIPPO * Continuing PME for reference implementation of HIPPO * Completed reference implementation of HIPPO * Cleanup and optimization to HIPPO reference * Further cleanup to HIPPO * Combined direct space interactions into a single loop * Compute direct space interactions in quasi-internal frame * Beginning of CUDA implementation of HIPPO * Continuing CUDA implementation of HIPPO * Continuing CUDA implementation of HIPPO * Continuing CUDA implementation of HIPPO * Continuing CUDA implementation of HIPPO * Continuing CUDA implementation of HIPPO * Continuing CUDA implementation of HIPPO * Continuing CUDA implementation of HIPPO * Continuing CUDA implementation of HIPPO * Continuing CUDA implementation of HIPPO * Continuing CUDA implementation of HIPPO * Continuing CUDA implementation of HIPPO * Continuing CUDA implementation of HIPPO * Continuing CUDA implementation of HIPPO * Finished CUDA implementation of HIPPO * More features and test cases for HippoNonbondedForce * Serialization and Python API for HippoNonbondedForce * Fixed sign error in computing forces
-
- 16 Mar, 2018 1 commit
-
-
Peter Eastman authored
-
- 15 Mar, 2018 1 commit
-
-
peastman authored
-
- 12 Mar, 2018 1 commit
-
-
Peter Eastman authored
-
- 12 Feb, 2018 1 commit
-
-
Peter Eastman authored
-
- 02 Dec, 2016 1 commit
-
-
Peter Eastman authored
-
- 30 Nov, 2016 1 commit
-
-
Peter Eastman authored
-
- 13 Oct, 2016 1 commit
-
-
Peter Eastman authored
-
- 14 Sep, 2016 1 commit
-
-
Peter Eastman authored
-
- 27 Jul, 2016 1 commit
-
-
Peter Eastman authored
-
- 25 May, 2016 1 commit
-
-
Peter Eastman authored
-
- 21 Sep, 2015 1 commit
-
-
Peter Eastman authored
-
- 27 Aug, 2015 1 commit
-
-
peastman authored
-
- 10 Aug, 2015 1 commit
-
-
Peter Eastman authored
-
- 03 Aug, 2015 1 commit
-
-
peastman authored
-
- 07 Jul, 2015 1 commit
-
-
Peter Eastman authored
-
- 10 Mar, 2015 1 commit
-
-
peastman authored
-
- 06 Mar, 2015 2 commits
- 05 Jan, 2015 1 commit
-
-
Peter Eastman authored
-
- 04 Nov, 2014 1 commit
-
-
peastman authored
-
- 08 Jul, 2013 1 commit
-
-
peastman authored
-