- 23 Sep, 2024 1 commit
-
-
Anton Gorenko authored
* PME_ORDER threads process one atom;
* PME_ORDER threads access consecutive addresses;
* No need to permute z indices with zindexTable;
* finishSpreadCharge is needed only with fixed point charge spreading;
-
- 05 Sep, 2024 3 commits
-
-
Anton Gorenko authored
Use cuCtxPushCurrent() and cuCtxPopCurrent() for selecting CUDA context https://github.com/openmm/openmm/pull/3258
Fixed uninitialized memory access https://github.com/openmm/openmm/issues/3392 https://github.com/openmm/openmm/pull/3399
Fixed potential invalid memory access See https://github.com/openmm/openmm/pull/3428
Improved temperature reporting for Drude particles https://github.com/openmm/openmm/pull/3486 https://github.com/openmm/openmm/commit/a5e42f5
Fixed race condition with multiple GPUs https://github.com/openmm/openmm/commit/6fb1c8a41edff980862750bc086f6a204eb50941
Use blocking sync when creating events https://github.com/openmm/openmm/commit/fe21d5ee4f14673a4ea38b7244991772a64ceec2
Very minor optimizations https://github.com/openmm/openmm/commit/109f6b2535da4e0c0dd88007d6ca06b4add2ce81
Use PocketFFT https://github.com/openmm/openmm/commit/1dac981a63300a2a53a7925f570995914f7163ed
Improved logic for deciding when to reorder atoms https://github.com/openmm/openmm/commit/48664a1f1a4490a4dabc277757545ac070e7b898
Ensure valid atom order after loading a checkpoint https://github.com/openmm/openmm/commit/a056d5a3754e193105409afa12c9f0c9a2d972a2
Improve performance running on multiple GPUs https://github.com/openmm/openmm/commit/0c82c2647de98da5c6dab7bf7a7b8b19705aadc0
Fixed errors when running on multiple GPUs https://github.com/openmm/openmm/commit/ed9df876d43c037c08d4762721e73e5caae086d9
Optimized reducing energy https://github.com/openmm/openmm/commit/2975f44
-
Anton Gorenko authored
* VkFFT-based 3D FFT;
* Caching of compiled VkFFT kernels;
* Extend FFT tests with more sizes.
-
Anton Gorenko authored
* Compile with -munsafe-fp-atomics to enable fast hardware f32 atomic add on global memory on pre-MI100 GPUs;
* Use fixed point charge spreading on other GPUs, otherwise float atomic add will be compiled as a slow CAS loop;
* Tune block sizes, use executeKernelFlat;
* Tune launch bounds of PME grid-related kernels: force the compiler to use all registers by limiting max waves per EU to 1.
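The fixed point charge spreading mentioned above can be illustrated with a minimal Python sketch. It is not the HIP kernel: the scale factor `FIXED_SCALE`, the helper names, and the 8-cell grid are assumptions made for illustration. The idea shown is that each contribution is converted to an integer before accumulation, so the result does not depend on the order in which threads add their values, and a final pass (analogous in spirit to finishSpreadCharge) converts the grid back to floating point.

```python
import random

# Hypothetical scale factor: 2**32 fixed-point steps per unit of charge.
FIXED_SCALE = 1 << 32

def to_fixed(value):
    """Convert a floating-point contribution to a fixed-point integer."""
    return int(round(value * FIXED_SCALE))

def spread_fixed(contributions, grid_size):
    """Accumulate (grid index, charge) pairs into an integer grid."""
    grid = [0] * grid_size
    for index, charge in contributions:
        grid[index] += to_fixed(charge)  # integer add: exact and associative
    return grid

def finish_spread(grid):
    """Convert the accumulated integer grid back to floating point."""
    return [cell / FIXED_SCALE for cell in grid]

random.seed(0)
contributions = [(random.randrange(8), random.uniform(-1.0, 1.0))
                 for _ in range(1000)]
shuffled = contributions[:]
random.shuffle(shuffled)

grid_a = spread_fixed(contributions, 8)
grid_b = spread_fixed(shuffled, 8)   # same contributions, different order
charges = finish_spread(grid_a)
```

Because integer addition is associative, `grid_a` and `grid_b` are identical regardless of accumulation order, which is not guaranteed when floating-point values are added directly.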
-
- 01 Sep, 2024 2 commits
-
-
Anton Gorenko authored
* Remove setting of link libraries, include and link dirs and compile flags for each target, instead let CMake deal with them by linking the main library to hip::host hiprtc::hiprtc hip::hipfft;
* Fix: custom command without ADD_CUSTOM_TARGET and ADD_DEPENDENCIES is executed for both static and shared targets;
* Remove IF(APPLE) parts.
-
Anton Gorenko authored
Fix segfault in HipCalcHippoNonbondedForceKernel
HipSort was created using a reference to a temporary. Adding a `HipContext& cu` field to HipCalcHippoNonbondedForceKernel fixes the issue.
-
- 28 Mar, 2024 1 commit
-
-
Peter Eastman authored
-
- 18 Aug, 2023 1 commit
-
-
bdenhollander authored
* Amoeba minor cleanup
  - Fix variable name in string
  - Remove odd space between variable and period that is inconsistently styled
* Replaces random tabs with spaces in ATM Force
-
- 13 Aug, 2023 1 commit
-
-
Peter Eastman authored
-
- 20 Jul, 2023 1 commit
-
-
Peter Eastman authored
* Always use nvrtc for compilation
* Install nvrtc on CI
* Workaround for compiler error
* Set empty values for deprecated properties
-
- 12 Dec, 2022 1 commit
-
-
Peter Eastman authored
-
- 17 Aug, 2022 1 commit
-
-
Peter Eastman authored
-
- 22 Jul, 2022 1 commit
-
-
Adel Johar authored
* Support kernel files with extensions of any length (like .hip)
* Do not allow replacing symbols in single-line comments
* Add OPENMM_BUILD_COMMON CMake option
  It allows building and installing common platform files even if the CUDA or OpenCL platforms are not built. This is required for the HIP platform (openmm-hip) if ROCm OpenCL packages are not installed.
* Add an option for Python wrapper to install into user packages
  OPENMM_PYTHON_USER_INSTALL is OFF by default.
* Support FFT backends in Amoeba plugin
  The HIP platform supports FFT backends; this commit moves findLegalFFTDimension to ComputeContext, so platforms can have their own implementations.
* Compatibility for common platform w/ new HIP platform
* Do not use volatile with private and local AtomData parameters on HIP
  The generated code is not optimal; for example, the compiler generates flat_load instructions instead of ds_read.
* Tune launch bounds for PME grid-related kernels and add workaround for RDNA
  Force the compiler to use all registers for gridSpreadCharge and gridInterpolateForce by limiting max waves per EU to 1 on CDNA GPUs; RDNA GPUs work better without it.
* Optimize atom data structs in GBSA and Amoeba on HIP
  Manually rearrange fields, add padding and force alignment for faster access to shared memory: ds_read and ds_write may be slower if addresses are not aligned to 16 bytes.

Co-authored-by: Anton Gorenko <anton@streamhpc.com>
Co-authored-by: Nick Curtis <nicholas.curtis@amd.com>
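The commit above moves findLegalFFTDimension to ComputeContext so that each platform can supply its own rule. As a rough sketch of what such a function typically does, the Python below returns the smallest grid size not below a requested minimum whose prime factors all come from an allowed set. The function name in snake case, the `allowed_factors` parameter, and the default set (2, 3, 5, 7) are assumptions for illustration, not the signature or values used by any particular platform.

```python
def find_legal_fft_dimension(minimum, allowed_factors=(2, 3, 5, 7)):
    """Return the smallest n >= minimum whose prime factors are all allowed.

    allowed_factors is a hypothetical, backend-specific set of radices.
    """
    if minimum < 1:
        return 1
    candidate = minimum
    while True:
        remaining = candidate
        # Divide out every allowed factor as many times as possible.
        for factor in allowed_factors:
            while remaining % factor == 0:
                remaining //= factor
        # If nothing is left, the candidate factors completely.
        if remaining == 1:
            return candidate
        candidate += 1
```

A backend that supports only power-of-two sizes could call this with `allowed_factors=(2,)`, which is the kind of per-platform variation the refactoring makes possible.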
-
- 30 Jun, 2022 1 commit
-
-
Peter Eastman authored
* Use PocketFFT instead of FFTW
* Minor cleanup
* Use PocketFFT instead of fftpack for reference platform
* Remove FFTW as a dependency
* Converted a test case to use PocketFFT
* Fixed an incorrect comment
-
- 07 Mar, 2022 1 commit
-
-
Anton Gorenko authored
This allows the HIP platform to use a faster float-to-int64 conversion.
-
- 27 Jan, 2022 1 commit
-
-
Peter Eastman authored
* Fixed potential invalid memory access
* Fixed exception
-
- 04 Oct, 2021 1 commit
-
-
Peter Eastman authored
* Use cuCtxPushCurrent() and cuCtxPopCurrent() for selecting CUDA context
* Fixed errors in amoeba code
* Fixed more errors in context selection
-
- 22 May, 2021 1 commit
-
-
Peter Eastman authored
* Began converting AMOEBA to common platform
* Beginning of OpenCL platform for AMOEBA
* Converted AmoebaVdwForce to common platform
* Cleaned up reference AMOEBA tests
* Began converting AmoebaMultipoleForce to common platform
* Continue converting AmoebaMultipoleForce to common platform
* Bug fixes
* Bug fix
* Continue converting AmoebaMultipoleForce to common platform
* Converting AmoebaMultipoleForce and AmoebaGeneralizedKirkwoodForce to common platform
* Converting AmoebaMultipoleForce and AmoebaGeneralizedKirkwoodForce to common platform
* Creating OpenCL version of AmoebaMultipoleForce and AmoebaGeneralizedKirkwoodForce
* Creating OpenCL version of AmoebaMultipoleForce and AmoebaGeneralizedKirkwoodForce
* Creating OpenCL version of AmoebaMultipoleForce and AmoebaGeneralizedKirkwoodForce
* Converted arrays from real3 to real
* Bug fix to OpenCL AmoebaGeneralizedKirkwoodForce
* Fixes for AMD GPUs
* Began converting HippoNonbondedForce to common platform
* Continuing to convert HippoNonbondedForce to common platform
* Continuing to convert HippoNonbondedForce to common platform
* Working on unifying PME kernels
* Fixed error on devices without 64 bit atomics
* Unified PME kernels
* Converted HippoNonbondedForce to common platform
* Creating OpenCL implementation of HippoNonbondedForce
* Continuing OpenCL implementation of HippoNonbondedForce
* Mostly finished OpenCL implementation of HippoNonbondedForce
* Eliminated three component vector types in host code
* Fix errors on CPU OpenCL
* Skip double precision tests for AMOEBA on OpenCL
* Bug fixes
* Bug fixes
* Fixed compilation error
-
- 16 Mar, 2021 1 commit
-
-
Peter Eastman authored
* Cleanup to CUDA AmoebaMultipoleForce
* Deleted obsolete SOR code
-
- 04 Mar, 2021 1 commit
-
-
Peter Eastman authored
* Replaced several AMOEBA bonded forces with custom forces
* Deleted obsolete AMOEBA forces
* Replaced AmoebaPiTorsionForce with custom force
-
- 09 Feb, 2021 1 commit
-
-
Peter Eastman authored
-
- 20 Aug, 2020 1 commit
-
-
peastman authored
* Fixed range overflow with very large numbers of atoms
* More fixes to overflow with large numbers of atoms
* Fix test failures
-
- 18 Aug, 2020 1 commit
-
-
peastman authored
* Updated to latest values for physical constants
* Updated documentation on physical constants
* Updated Python unit definitions
* Fixed constants in test case
* Added a comment
-
- 20 Jul, 2020 1 commit
-
-
Peter Eastman authored
-
- 16 Jul, 2020 1 commit
-
-
Peter Eastman authored
-
- 01 Jul, 2020 3 commits
-
-
Peter Eastman authored
-
Peter Eastman authored
-
Peter Eastman authored
-
- 30 Jun, 2020 1 commit
-
-
Peter Eastman authored
-
- 29 Jun, 2020 1 commit
-
-
Peter Eastman authored
-
- 27 May, 2020 1 commit
-
-
peastman authored
-
- 08 Jan, 2020 1 commit
-
-
peastman authored
* Began creating common compute framework to unify code between CUDA and OpenCL
* Began OpenCL implementation of common compute framework
* Common implementation of CMMotionRemover
* CUDA implementation of common compute interface
* Converted HarmonicBondForce to common compute API
* Converted standard bonded forces to common compute API
* Converted ExpressionUtilities to common compute API
* Created ComputeParameterSet
* Converted custom bonded forces to common compute API
* Converted CustomCentroidBondForce to common compute API
* Converted CustomManyParticleForce to common compute API
* Moved lots of duplicate code from CudaContext and OpenCLContext to ComputeContext
* Converted GayBerneForce to common compute API
* Removed obsolete kernels
* Converted verlet integrators to common compute API
* Converted Langevin and Brownian integrators to common compute API
* Converted CustomIntegrator to common compute API
* Converted CustomNonbondedForce to common compute API
* Removed uses of a deprecated API
* Fixed failing test cases
* Converted GBSAOBCForce to common compute API
* Began converting CustomGBForce to common compute API
* Finished converting CustomGBForce to common compute API
* Merged duplicated code in CudaIntegrationUtilities and OpenCLIntegrationUtilities
* Converted RMSDForce and AndersenThermostat to common compute API
* Converted CustomHbondForce to common compute API
* Merged scripts for encoding kernel sources
* Converted Drude plugin to common compute API
* Fixed errors in CMake scripts
* Attempt at fixing errors on Windows
* Added discussion of common compute API to developer guide
* Added Windows export macro for common classes
* Fixed error in CMMotionRemover
* Updated Travis to newer Ubuntu version
* Fixed errors on CPU OpenCL
* Fixed Windows linking errors
* Added missing pragma for 32 bit atomics
* Replaced long long with mm_long
* More fixes to Windows linking
* Bug fix
-
- 29 Oct, 2019 1 commit
-
-
Frazer Leslie Clews authored
-
- 12 Sep, 2019 1 commit
-
-
Peter Eastman authored
-
- 11 Sep, 2019 2 commits
-
-
peastman authored
-
Chengwen Liu authored
-
- 10 Sep, 2019 3 commits
-
-
Chengwen Liu authored
-
Chengwen Liu authored
-
Chengwen Liu authored
-
- 23 Aug, 2019 1 commit
-
-
Chengwen Liu authored
-