Unverified Commit adfd84c2 authored by Evan Pretti's avatar Evan Pretti Committed by GitHub
Browse files

Add LCPO method (#5130)

* Basic LCPO support

* Add basic test for LCPO from a prmtop file

* API for LCPOForce

* Started LCPO reference implementation

* Finished reference forces & test cases

* Use other test for finite difference since grid might have discontinuous forces

* Reference platform formatting

* Initial implementation of CPU platform

* Bugfixes

* More vectorization and improve neighbor list query speed

* Parallelize part of neighbor search

* Check box size for LCPO with periodic boundary conditions

* Fixes for updating parameters in context

* GBSAOBCForce doesn't use first & last indices for updates, so no need for this optimization here

* Changes to neighbor checking and optimization

* Fixes and minor changes

* Add global surface tension parameter

* Only process half of the pairs in the neighbor list

* Remove unnecessary checks

* Initial version of common platform implementation

* Asynchronously download neighbor list size

* Debugging

* Do pair precomputation in copyPairsToNeighborList

* Recompute interactions instead of scanning neighbor list in inner loop

* Condense position array before computations

* Also make neighbor count download asynchronous on device

* Fixes for kernel launching

* Topology-based LCPO parameter assignment

* Fixes, and use test system for LCPO with nucleic acids

* Always raise instead of warn when LCPO parameters can't be assigned

* Use Amber convention for phosphates
parent ccb83f1d
......@@ -31,6 +31,7 @@ These classes implement forces that are widely used in biomolecular simulation.
generated/GayBerneForce
generated/HarmonicAngleForce
generated/HarmonicBondForce
generated/LCPOForce
generated/NonbondedForce
generated/PeriodicTorsionForce
generated/RBTorsionForce
......
......@@ -739,6 +739,19 @@ The screening parameter can be calculated as
where :math:`I` is the ionic strength in moles/liter, :math:`\epsilon` is the solvent
dielectric constant, and :math:`T` is the temperature in Kelvin.
When one of these implicit solvent force field files is included, you can
specify the :code:`sasaMethod` parameter to select a method for calculating the
solvent-accessible surface area (SASA), e.g.:
::
system = forcefield.createSystem(sasaMethod='LCPO')
Supported options are :code:`'ACE'` for the ACE approximation
:cite:`Schaefer1998`\ :cite:`Ponder` (which is also the default if
:code:`sasaMethod` is not given), :code:`'LCPO'` for the LCPO approximation
:cite:`Weiser1999`, and :code:`None` to disable calculation of the surface area
term entirely.
AMOEBA
------
......@@ -982,6 +995,16 @@ Debye-Huckel screening parameter\ :cite:`Srinivasan1999`:
system = prmtop.createSystem(implicitSolvent=OBC2, implicitSolventKappa=1.0/nanometer)
By default, OpenMM uses the ACE approximation :cite:`Schaefer1998`\ :cite:`Ponder`
to the solvent-accessible surface area, but you can also use the LCPO
approximation :cite:`Weiser1999`:
::
system = prmtop.createSystem(implicitSolvent=OBC2, sasaMethod='LCPO')
This will be slower but more accurate than ACE. Specifying :code:`None` for the
:code:`sasaMethod` parameter will disable calculation of the surface area term
entirely.
Nonbonded Interactions
======================
......
......@@ -713,6 +713,18 @@
doi = {10.1021/acs.jpcb.7b02320},
}
@article{Weiser1999,
author = {Weiser, Jörg and Shenkin, Peter S. and Still, W. Clark},
title = {Approximate Atomic Surfaces from Linear Combinations of Pairwise Overlaps (LCPO)},
journal = {Journal of Computational Chemistry},
volume = {20},
number = {2},
pages = {217-230},
year = {1999},
type = {Journal Article},
doi = {10.1002/(SICI)1096-987X(19990130)20:2<217::AID-JCC4>3.0.CO;2-A}
}
@article{Wennberg2015
author = {Wennberg, Christian L. and Murtola, Teemu and Páll, Szilárd and Abraham, Mark J. and Hess, Berk and Lindahl, Erik},
title = {Direct-Space Corrections Enable Fast and Accurate {Lorentz–Berthelot} Combination Rule {Lennard-Jones} Lattice Summation},
......
......@@ -462,6 +462,38 @@ where :math:`r_i` is the atomic radius of particle *i*\ , :math:`r_i` is
its atomic radius, and :math:`r_\mathit{solvent}` is the solvent radius, which is taken
to be 0.14 nm. The default value for the energy scale :math:`E_{SA}` is 2.25936 kJ/mol/nm\ :sup:`2`\ .
LCPOForce
*********
LCPOForce implements the LCPO (linear combinations of pairwise overlaps) method
for estimating solvent-accessible surface areas :cite:`Weiser1999`. This force
only implements the functional form of the three-body potential used in the LCPO
method, and does not automatically assign standard LCPO parameters; this is
handled in the application layer.
The LCPO approximation estimates the surface area contribution to the energy as
a sum of terms for each atom:
.. math::
E=\gamma\sum_iA_i
where
.. math::
A_i=P_{1,i}S_i+P_{2,i}\sum_{j\in N_i}A_{ij}+P_{3,i}\sum_{\substack{j,k\in N_i\\k\in N_j}}A_{jk}+P_{4,i}\sum_{j\in N_i}A_{ij}\sum_{\substack{k\in N_i\\k\in N_j}}A_{jk}
In this expression, :math:`N_i` refers to the set of all neighbors of atom
:math:`i` (not including :math:`i`), :math:`P_{1,i}` through :math:`P_{4,i}` are
LCPO coefficients specific to atom :math:`i`,
.. math::
S_i=4\pi r_i^2
where :math:`r_i` is the radius of atom :math:`i`, and
.. math::
A_{ij}=2\pi r_i\left(r_i-\frac{d_{ij}}{2}-\frac{r_i^2-r_j^2}{2d_{ij}}\right)
ConstantPotentialForce
**********************
......
......@@ -54,6 +54,7 @@
#include "openmm/HarmonicAngleForce.h"
#include "openmm/HarmonicBondForce.h"
#include "openmm/KernelImpl.h"
#include "openmm/LCPOForce.h"
#include "openmm/MonteCarloBarostat.h"
#include "openmm/OrientationRestraintForce.h"
#include "openmm/PeriodicTorsionForce.h"
......@@ -1055,6 +1056,41 @@ public:
virtual void copyParametersToContext(ContextImpl& context, const GayBerneForce& force) = 0;
};
/**
* This kernel is invoked by LCPOForce to calculate the forces acting on the system and the energy of the system.
*/
class CalcLCPOForceKernel : public KernelImpl {
public:
static std::string Name() {
return "CalcLCPOForce";
}
CalcLCPOForceKernel(std::string name, const Platform& platform) : KernelImpl(name, platform) {
}
/**
* Initialize the kernel.
*
* @param system the System this kernel will be applied to
* @param force the LCPOForce this kernel will be used for
*/
virtual void initialize(const System& system, const LCPOForce& force) = 0;
/**
* Execute the kernel to calculate the forces and/or energy.
*
* @param context the context in which to execute this kernel
* @param includeForces true if forces should be calculated
* @param includeEnergy true if the energy should be calculated
* @return the potential energy due to the force
*/
virtual double execute(ContextImpl& context, bool includeForces, bool includeEnergy) = 0;
/**
* Copy changed parameters over to a context.
*
* @param context the context to copy parameters to
* @param force the LCPOForce to copy the parameters from
*/
virtual void copyParametersToContext(ContextImpl& context, const LCPOForce& force) = 0;
};
/**
* This kernel is invoked by CustomCVForce to calculate the forces acting on the system and the energy of the system.
*/
......
......@@ -58,6 +58,7 @@
#include "openmm/Integrator.h"
#include "openmm/LangevinIntegrator.h"
#include "openmm/LangevinMiddleIntegrator.h"
#include "openmm/LCPOForce.h"
#include "openmm/LocalEnergyMinimizer.h"
#include "openmm/MonteCarloAnisotropicBarostat.h"
#include "openmm/MonteCarloBarostat.h"
......
#ifndef OPENMM_LCPOFORCE_H_
#define OPENMM_LCPOFORCE_H_
/* -------------------------------------------------------------------------- *
* OpenMM *
* -------------------------------------------------------------------------- *
* This is part of the OpenMM molecular simulation toolkit. *
* See https://openmm.org/development. *
* *
* Portions copyright (c) 2025 Stanford University and the Authors. *
* Authors: Peter Eastman, Evan Pretti *
* Contributors: *
* *
* Permission is hereby granted, free of charge, to any person obtaining a *
* copy of this software and associated documentation files (the "Software"), *
* to deal in the Software without restriction, including without limitation *
* the rights to use, copy, modify, merge, publish, distribute, sublicense, *
* and/or sell copies of the Software, and to permit persons to whom the *
* Software is furnished to do so, subject to the following conditions: *
* *
* The above copyright notice and this permission notice shall be included in *
* all copies or substantial portions of the Software. *
* *
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR *
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, *
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL *
* THE AUTHORS, CONTRIBUTORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, *
* DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR *
* OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE *
* USE OR OTHER DEALINGS IN THE SOFTWARE. *
* -------------------------------------------------------------------------- */
#include "Force.h"
#include "internal/windowsExport.h"
#include <vector>
namespace OpenMM {
/**
* This class implements the functional form of the potential used in the LCPO
* (linear combinations of pairwise overlaps) method for estimating
* solvent-accessible surface areas. Specifically, it implements the three-body
* potential described in Weiser, Shenkin, and Still, J. Comput. Chem. 20, 217
* (1999).
*
* To use this class, create an LCPOForce object, then call addParticle() once
* for each particle in the System to define its parameters. The number of
* particles for which you define parameters must be exactly equal to the number
* of particles in the System, or else an exception will be thrown when you try
* to create a Context. After a particle has been added, you can modify its
* parameters by calling setParticleParameters(). This will have no effect on
* Contexts that already exist unless you call updateParametersInContext().
*
* To exclude a particle from the force entirely, set its radius to zero.
*/
class OPENMM_EXPORT LCPOForce : public Force {
public:
/**
* Create an LCPOForce.
*/
LCPOForce();
/**
* Set whether this force should apply periodic boundary conditions when calculating distances between particles.
*/
void setUsesPeriodicBoundaryConditions(bool periodic);
/**
* Get the surface tension used to calculate the energy from the surface area.
*
* @return the surface tension, measured in kJ/mol/nm^2
*/
double getSurfaceTension() const;
/**
* Set the surface tension used to calculate the energy from the surface area.
*
* @param surfaceTension the surface tension, measured in kJ/mol/nm^2
*/
void setSurfaceTension(double surfaceTension);
/**
* Get the number of particles for which force field parameters have been defined.
*/
int getNumParticles() const;
/**
* Add parameters for a particle. This should be called once for each particle in the System.
* When it is called for the i'th time, it specifies the parameters for the i'th particle.
*
* @param radius the radius of the particle for the LCPO method, including the solvent probe radius if applicable, measured in nm
* @param p1 the value of the LCPO parameter P1 for the particle
* @param p2 the value of the LCPO parameter P2 for the particle
* @param p3 the value of the LCPO parameter P3 for the particle
* @param p4 the value of the LCPO parameter P4 for the particle, measured in nm^-2
* @return the index of the particle that was added
*/
int addParticle(double radius, double p1, double p2, double p3, double p4);
/**
* Get the nonbonded force parameters for a particle.
*
* @param index the index of the particle for which to get parameters
* @param[out] radius the radius of the particle for the LCPO method, including the solvent probe radius if applicable, measured in nm
* @param[out] p1 the value of the LCPO parameter P1 for the particle
* @param[out] p2 the value of the LCPO parameter P2 for the particle
* @param[out] p3 the value of the LCPO parameter P3 for the particle
* @param[out] p4 the value of the LCPO parameter P4 for the particle, measured in nm^-2
*/
void getParticleParameters(int index, double& radius, double& p1, double& p2, double& p3, double& p4) const;
/**
* Set the nonbonded force parameters for a particle.
*
* @param index the index of the particle for which to set parameters
* @param radius the radius of the particle for the LCPO method, including the solvent probe radius if applicable, measured in nm
* @param p1 the value of the LCPO parameter P1 for the particle
* @param p2 the value of the LCPO parameter P2 for the particle
* @param p3 the value of the LCPO parameter P3 for the particle
* @param p4 the value of the LCPO parameter P4 for the particle, measured in nm^-2
*/
void setParticleParameters(int index, double radius, double p1, double p2, double p3, double p4);
/**
* Update the parameters in a Context to match those stored in this Force
* object. This method provides an efficient method to update certain
* parameters in an existing Context without needing to reinitialize it.
* Simply call setParticleParameters() to modify this object's parameters,
* then call updateParametersInContext() to copy them over to the Context.
* Only the surface tension and the values of particle parameters can be
* updated by this method; other changes made will not be reflected in the
* Context after calling it.
*/
void updateParametersInContext(Context& context);
/**
* Returns whether or not this force makes use of periodic boundary conditions.
*
* @returns true if force uses PBC and false otherwise
*/
bool usesPeriodicBoundaryConditions() const;
protected:
ForceImpl* createImpl() const;
private:
class ParticleInfo;
bool usePeriodic;
double surfaceTension;
std::vector<ParticleInfo> particles;
};
/**
* This is an internal class used to record information about a particle.
* @private
*/
class LCPOForce::ParticleInfo {
public:
double radius, p1, p2, p3, p4;
ParticleInfo(double radius, double p1, double p2, double p3, double p4) : radius(radius), p1(p1), p2(p2), p3(p3), p4(p4) {
}
};
} // namespace OpenMM
#endif /*OPENMM_LCPOFORCE_H_*/
#ifndef OPENMM_LCPOFORCEIMPL_H_
#define OPENMM_LCPOFORCEIMPL_H_
/* -------------------------------------------------------------------------- *
* OpenMM *
* -------------------------------------------------------------------------- *
* This is part of the OpenMM molecular simulation toolkit. *
* See https://openmm.org/development. *
* *
* Portions copyright (c) 2025 Stanford University and the Authors. *
* Authors: Peter Eastman, Evan Pretti *
* Contributors: *
* *
* Permission is hereby granted, free of charge, to any person obtaining a *
* copy of this software and associated documentation files (the "Software"), *
* to deal in the Software without restriction, including without limitation *
* the rights to use, copy, modify, merge, publish, distribute, sublicense, *
* and/or sell copies of the Software, and to permit persons to whom the *
* Software is furnished to do so, subject to the following conditions: *
* *
* The above copyright notice and this permission notice shall be included in *
* all copies or substantial portions of the Software. *
* *
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR *
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, *
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL *
* THE AUTHORS, CONTRIBUTORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, *
* DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR *
* OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE *
* USE OR OTHER DEALINGS IN THE SOFTWARE. *
* -------------------------------------------------------------------------- */
#include "ForceImpl.h"
#include "openmm/Kernel.h"
#include "openmm/LCPOForce.h"
namespace OpenMM {
class OPENMM_EXPORT LCPOForceImpl : public ForceImpl {
public:
LCPOForceImpl(const LCPOForce& owner);
void initialize(ContextImpl& context);
const LCPOForce& getOwner() const {
return owner;
}
void updateContextState(ContextImpl& context, bool& forcesInvalid) {
}
double calcForcesAndEnergy(ContextImpl& context, bool includeForces, bool includeEnergy, int groups);
std::map<std::string, double> getDefaultParameters() {
return {}; // This force field doesn't define any parameters.
}
std::vector<std::string> getKernelNames();
void updateParametersInContext(ContextImpl& context);
private:
const LCPOForce& owner;
Kernel kernel;
};
} // namespace OpenMM
#endif /*OPENMM_LCPOFORCEIMPL_H_*/
/* -------------------------------------------------------------------------- *
* OpenMM *
* -------------------------------------------------------------------------- *
* This is part of the OpenMM molecular simulation toolkit. *
* See https://openmm.org/development. *
* *
* Portions copyright (c) 2025 Stanford University and the Authors. *
* Authors: Peter Eastman, Evan Pretti *
* Contributors: *
* *
* Permission is hereby granted, free of charge, to any person obtaining a *
* copy of this software and associated documentation files (the "Software"), *
* to deal in the Software without restriction, including without limitation *
* the rights to use, copy, modify, merge, publish, distribute, sublicense, *
* and/or sell copies of the Software, and to permit persons to whom the *
* Software is furnished to do so, subject to the following conditions: *
* *
* The above copyright notice and this permission notice shall be included in *
* all copies or substantial portions of the Software. *
* *
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR *
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, *
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL *
* THE AUTHORS, CONTRIBUTORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, *
* DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR *
* OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE *
* USE OR OTHER DEALINGS IN THE SOFTWARE. *
* -------------------------------------------------------------------------- */
#include "openmm/Force.h"
#include "openmm/LCPOForce.h"
#include "openmm/internal/AssertionUtilities.h"
#include "openmm/internal/LCPOForceImpl.h"
using namespace OpenMM;
using namespace std;
LCPOForce::LCPOForce() : usePeriodic(false), surfaceTension(1.0) {
}
void LCPOForce::setUsesPeriodicBoundaryConditions(bool periodic) {
usePeriodic = periodic;
}
double LCPOForce::getSurfaceTension() const {
return surfaceTension;
}
void LCPOForce::setSurfaceTension(double surfaceTension) {
this->surfaceTension = surfaceTension;
}
int LCPOForce::getNumParticles() const {
return particles.size();
}
int LCPOForce::addParticle(double radius, double p1, double p2, double p3, double p4) {
particles.push_back(ParticleInfo(radius, p1, p2, p3, p4));
return particles.size() - 1;
}
void LCPOForce::getParticleParameters(int index, double& radius, double& p1, double& p2, double& p3, double& p4) const {
ASSERT_VALID_INDEX(index, particles);
const ParticleInfo& info = particles[index];
radius = info.radius;
p1 = info.p1;
p2 = info.p2;
p3 = info.p3;
p4 = info.p4;
}
void LCPOForce::setParticleParameters(int index, double radius, double p1, double p2, double p3, double p4) {
ASSERT_VALID_INDEX(index, particles);
ParticleInfo& info = particles[index];
info.radius = radius;
info.p1 = p1;
info.p2 = p2;
info.p3 = p3;
info.p4 = p4;
}
void LCPOForce::updateParametersInContext(Context& context) {
dynamic_cast<LCPOForceImpl&>(getImplInContext(context)).updateParametersInContext(getContextImpl(context));
}
bool LCPOForce::usesPeriodicBoundaryConditions() const {
return usePeriodic;
}
ForceImpl* LCPOForce::createImpl() const {
return new LCPOForceImpl(*this);
}
/* -------------------------------------------------------------------------- *
* OpenMM *
* -------------------------------------------------------------------------- *
* This is part of the OpenMM molecular simulation toolkit. *
* See https://openmm.org/development. *
* *
* Portions copyright (c) 2025 Stanford University and the Authors. *
* Authors: Peter Eastman, Evan Pretti *
* Contributors: *
* *
* Permission is hereby granted, free of charge, to any person obtaining a *
* copy of this software and associated documentation files (the "Software"), *
* to deal in the Software without restriction, including without limitation *
* the rights to use, copy, modify, merge, publish, distribute, sublicense, *
* and/or sell copies of the Software, and to permit persons to whom the *
* Software is furnished to do so, subject to the following conditions: *
* *
* The above copyright notice and this permission notice shall be included in *
* all copies or substantial portions of the Software. *
* *
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR *
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, *
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL *
* THE AUTHORS, CONTRIBUTORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, *
* DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR *
* OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE *
* USE OR OTHER DEALINGS IN THE SOFTWARE. *
* -------------------------------------------------------------------------- */
#include "openmm/internal/ContextImpl.h"
#include "openmm/internal/LCPOForceImpl.h"
#include "openmm/kernels.h"
using namespace OpenMM;
using namespace std;
LCPOForceImpl::LCPOForceImpl(const LCPOForce& owner) : owner(owner) {
forceGroup = owner.getForceGroup();
}
void LCPOForceImpl::initialize(ContextImpl& context) {
kernel = context.getPlatform().createKernel(CalcLCPOForceKernel::Name(), context);
const System& system = context.getSystem();
if (owner.getNumParticles() != system.getNumParticles()) {
throw OpenMMException("LCPOForce must have exactly as many particles as the System it belongs to.");
}
for (int i = 0; i < owner.getNumParticles(); i++) {
double radius, p1, p2, p3, p4;
owner.getParticleParameters(i, radius, p1, p2, p3, p4);
if (radius < 0.0) {
throw OpenMMException("LCPOForce: radius for a particle cannot be negative");
}
}
kernel.getAs<CalcLCPOForceKernel>().initialize(context.getSystem(), owner);
}
double LCPOForceImpl::calcForcesAndEnergy(ContextImpl& context, bool includeForces, bool includeEnergy, int groups) {
if ((groups & (1 << forceGroup)) != 0) {
return kernel.getAs<CalcLCPOForceKernel>().execute(context, includeForces, includeEnergy);
}
return 0.0;
}
std::vector<std::string> LCPOForceImpl::getKernelNames() {
return {CalcLCPOForceKernel::Name()};
}
void LCPOForceImpl::updateParametersInContext(ContextImpl& context) {
kernel.getAs<CalcLCPOForceKernel>().copyParametersToContext(context, owner);
context.systemChanged();
}
......@@ -857,6 +857,53 @@ private:
ComputeKernel copyStateKernel, copyForcesKernel, addForcesKernel;
};
/**
* This kernel is invoked by LCPOForce to calculate the forces acting on the system.
*/
class CommonCalcLCPOForceKernel : public CalcLCPOForceKernel {
public:
CommonCalcLCPOForceKernel(std::string name, const Platform& platform, ComputeContext& cc) : CalcLCPOForceKernel(name, platform),
cc(cc), hasInitializedKernel(false) {
}
/**
* Initialize the kernel.
*
* @param system the System this kernel will be applied to
* @param force the LCPOForce this kernel will be used for
*/
void initialize(const System& system, const LCPOForce& force);
/**
* Execute the kernel to calculate the forces and/or energy.
*
* @param context the context in which to execute this kernel
* @param includeForces true if forces should be calculated
* @param includeEnergy true if the energy should be calculated
* @return the potential energy due to the force
*/
double execute(ContextImpl& context, bool includeForces, bool includeEnergy);
/**
* Copy changed parameters over to a context.
*
* @param context the context to copy parameters to
* @param force the LCPOForce to copy the parameters from
*/
void copyParametersToContext(ContextImpl& context, const LCPOForce& force);
private:
class ForceInfo;
ComputeContext& cc;
ForceInfo* info;
bool hasInitializedKernel, usePeriodic, doInteraction;
int numActiveParticles, numBlocks, maxNeighborPairs, paddedNumActiveParticles;
int maxThreadBlockSize, numForceThreadBlocks, forceThreadBlockSize, findNeighborsThreadBlockSize;
double oneBodyEnergy, cutoffSquared;
ComputeArray activeParticles, parameters;
ComputeArray condensedPos, blockCenter, blockBoundingBox;
ComputeArray numNeighborPairs, numNeighborsForAtom, neighborStartIndex, neighborPairs, neighbors, neighborData;
ComputeKernel condensePosKernel, findBlockBoundsKernel, findNeighborsKernel, computeNeighborStartIndicesKernel, copyPairsToNeighborListKernel, computeInteractionKernel;
ComputeEvent downloadStartEvent, downloadFinishEvent;
ComputeQueue downloadQueue;
};
/**
* This kernel is invoked by VerletIntegrator to take one time step.
*/
......
......@@ -353,7 +353,11 @@ void CommonCalcCustomManyParticleForceKernel::initialize(const System& system, c
replacements["COMPUTE_INTERACTION"] = compute.str();
replacements["NUM_CANDIDATE_COMBINATIONS"] = numCombinations.str();
replacements["FIND_ATOMS_FOR_COMBINATION_INDEX"] = atomsForCombination.str();
replacements["IS_VALID_COMBINATION"] = isValidCombination.str();
string isValidCombinationStr = isValidCombination.str();
if (isValidCombinationStr.empty()) {
isValidCombinationStr = "true";
}
replacements["IS_VALID_COMBINATION"] = isValidCombinationStr;
replacements["VERIFY_CUTOFF"] = verifyCutoff.str();
replacements["VERIFY_EXCLUSIONS"] = verifyExclusions.str();
replacements["PERMUTE_ATOMS"] = permute.str();
......
This diff is collapsed.
typedef struct {
real4 data12;
real4 data21;
} NeighborData;
/**
* Load the position of an atom from global memory.
*/
inline DEVICE real3 loadCondensedPos(GLOBAL const real* RESTRICT condensedPos, int atom) {
int offset = 3 * atom;
return make_real3(condensedPos[offset], condensedPos[offset + 1], condensedPos[offset + 2]);
}
/**
* Record the force on an atom to global memory.
*/
inline DEVICE void storeForce(int atom, real3 force, GLOBAL mm_ulong* RESTRICT forceBuffers) {
ATOMIC_ADD(&forceBuffers[atom], (mm_ulong) realToFixedPoint(force.x));
ATOMIC_ADD(&forceBuffers[atom + PADDED_NUM_ATOMS], (mm_ulong) realToFixedPoint(force.y));
ATOMIC_ADD(&forceBuffers[atom + 2 * PADDED_NUM_ATOMS], (mm_ulong) realToFixedPoint(force.z));
}
/**
* Compute the difference between two vectors, taking periodic boundary
* conditions into account and setting the fourth component to the squared
* magnitude.
*/
inline DEVICE real4 delta(real3 vec1, real3 vec2, real4 periodicBoxSize, real4 invPeriodicBoxSize,
real4 periodicBoxVecX, real4 periodicBoxVecY, real4 periodicBoxVecZ) {
real4 result = make_real4(vec1.x - vec2.x, vec1.y - vec2.y, vec1.z - vec2.z, 0);
#ifdef USE_PERIODIC
APPLY_PERIODIC_TO_DELTA(result)
#endif
result.w = result.x * result.x + result.y * result.y + result.z * result.z;
return result;
}
/**
* Copies positions of active particles to a separate array for performance.
*/
KERNEL void condensePos(GLOBAL const int* RESTRICT activeParticles, GLOBAL const real4* RESTRICT posq, GLOBAL real* RESTRICT condensedPos) {
for (int i = GLOBAL_ID; i < NUM_ACTIVE; i += GLOBAL_SIZE) {
real3 pos = trimTo3(posq[activeParticles[i]]);
int iOffset = 3 * i;
condensedPos[iOffset] = pos.x;
condensedPos[iOffset + 1] = pos.y;
condensedPos[iOffset + 2] = pos.z;
}
}
/**
* Find a bounding box for the atoms in each block.
*/
KERNEL void findBlockBounds(real4 periodicBoxSize, real4 invPeriodicBoxSize,
real4 periodicBoxVecX, real4 periodicBoxVecY, real4 periodicBoxVecZ,
GLOBAL const real* RESTRICT condensedPos, GLOBAL real4* RESTRICT blockCenter,
GLOBAL real4* RESTRICT blockBoundingBox, GLOBAL int* RESTRICT numNeighborPairs) {
// Each thread (index) processes one block of WARP_SIZE atoms (from base
// through min(base + WARP_SIZE, NUM_ACTIVE) - 1) at a time.
int index = GLOBAL_ID;
int base = index * WARP_SIZE;
while (base < NUM_ACTIVE) {
real3 pos3 = loadCondensedPos(condensedPos, base);
real4 pos = make_real4(pos3.x, pos3.y, pos3.z, 0);
#ifdef USE_PERIODIC
APPLY_PERIODIC_TO_POS(pos)
#endif
real4 minPos = pos;
real4 maxPos = pos;
int last = min(base + WARP_SIZE, NUM_ACTIVE);
for (int i = base + 1; i < last; i++) {
pos3 = loadCondensedPos(condensedPos, i);
pos = make_real4(pos3.x, pos3.y, pos3.z, 0);
#ifdef USE_PERIODIC
real4 center = 0.5f * (maxPos + minPos);
APPLY_PERIODIC_TO_POS_WITH_CENTER(pos, center)
#endif
minPos = make_real4(min(minPos.x, pos.x), min(minPos.y, pos.y), min(minPos.z, pos.z), 0);
maxPos = make_real4(max(maxPos.x, pos.x), max(maxPos.y, pos.y), max(maxPos.z, pos.z), 0);
}
real4 blockSize = 0.5f * (maxPos - minPos);
blockBoundingBox[index] = blockSize;
blockCenter[index] = 0.5f * (maxPos + minPos);
// All threads can process GLOBAL_SIZE blocks together; if there are
// more blocks, advance to the next block for this thread.
index += GLOBAL_SIZE;
base = index * WARP_SIZE;
}
// Reset numNeighborPairs for the findNeighbors kernel.
if (GLOBAL_ID == 0) {
*numNeighborPairs = 0;
}
}
/**
* Find a list of neighbors for each atom.
*/
KERNEL void findNeighbors(real4 periodicBoxSize, real4 invPeriodicBoxSize,
real4 periodicBoxVecX, real4 periodicBoxVecY, real4 periodicBoxVecZ,
GLOBAL const real* RESTRICT condensedPos, GLOBAL const real4* RESTRICT parameters,
GLOBAL const real4* RESTRICT blockCenter, GLOBAL const real4* RESTRICT blockBoundingBox,
GLOBAL int* RESTRICT numNeighborPairs, GLOBAL int* RESTRICT numNeighborsForAtom,
GLOBAL int2* RESTRICT neighborPairs, real cutoffSquared, int maxNeighborPairs) {
// parameters: each real4 holds [radius, p2, p3, p4].
// posrCache: each real4 holds [x, y, z, radius].
LOCAL real4 posrCache[FIND_NEIGHBORS_THREAD_BLOCK_SIZE];
int indexInWarp = LOCAL_ID % WARP_SIZE;
int warpStart = LOCAL_ID - indexInWarp;
#if !(defined(__CUDA_ARCH__) || defined(USE_HIP))
LOCAL bool includeBlockFlags[FIND_NEIGHBORS_THREAD_BLOCK_SIZE];
#endif
// Each thread will search for the neighbors of one atom at a time. Note
// that all threads in a warp are processing atoms from the same block.
for (int active1 = GLOBAL_ID; active1 < PADDED_NUM_ACTIVE; active1 += GLOBAL_SIZE) {
int numNeighborsForAtom1 = 0;
real3 pos1 = loadCondensedPos(condensedPos, active1);
real radius1 = parameters[active1].x;
int block1 = active1 / WARP_SIZE;
real4 blockCenter1 = blockCenter[block1];
real4 blockSize1 = blockBoundingBox[block1];
// Loop over atom blocks to search for neighbors. The threads in a warp
// compare block1 against 32 other blocks in parallel.
for (int block2Base = block1; block2Base < NUM_BLOCKS; block2Base += WARP_SIZE) {
int block2 = block2Base + indexInWarp;
bool includeBlock2 = (block2 < NUM_BLOCKS);
if (includeBlock2) {
real4 blockCenter2 = blockCenter[block2];
real4 blockSize2 = blockBoundingBox[block2];
real4 blockDelta = blockCenter1 - blockCenter2;
#ifdef USE_PERIODIC
APPLY_PERIODIC_TO_DELTA(blockDelta)
#endif
blockDelta.x = max((real) 0, fabs(blockDelta.x) - blockSize1.x - blockSize2.x);
blockDelta.y = max((real) 0, fabs(blockDelta.y) - blockSize1.y - blockSize2.y);
blockDelta.z = max((real) 0, fabs(blockDelta.z) - blockSize1.z - blockSize2.z);
includeBlock2 &= (blockDelta.x * blockDelta.x + blockDelta.y * blockDelta.y + blockDelta.z * blockDelta.z < cutoffSquared);
}
// Each thread now loops over any blocks we identified as
// potentially containing neighbors of atoms in block1, and checks
// to see if they are actually neighbors of atom1.
#if defined(__CUDA_ARCH__) || defined(USE_HIP)
int includeBlockFlags = BALLOT(includeBlock2);
while (includeBlockFlags != 0) {
int i = __ffs(includeBlockFlags) - 1;
includeBlockFlags &= includeBlockFlags - 1;
{
#else
SYNC_WARPS;
includeBlockFlags[LOCAL_ID] = includeBlock2;
SYNC_WARPS;
for (int i = 0; i < WARP_SIZE; i++) {
if (includeBlockFlags[warpStart + i]) {
#endif
int block2 = block2Base + i;
int start = block2 * WARP_SIZE;
int active = start + indexInWarp;
int included[WARP_SIZE];
int numIncluded = 0;
// All threads in a warp will compare atoms with atoms in
// block2, so load their parameters into shared memory.
SYNC_WARPS;
real3 pos = loadCondensedPos(condensedPos, active);
posrCache[LOCAL_ID] = make_real4(pos.x, pos.y, pos.z, parameters[active].x);
SYNC_WARPS;
if (active1 < NUM_ACTIVE) {
for (int j = 0; j < WARP_SIZE; j++) {
int active2 = start + j;
real4 posr2 = posrCache[warpStart + j];
real3 pos2 = trimTo3(posr2);
real radius2 = posr2.w;
// Decide whether to include this atom pair in the neighbor list.
real4 delta12 = delta(pos2, pos1, periodicBoxSize, invPeriodicBoxSize, periodicBoxVecX, periodicBoxVecY, periodicBoxVecZ);
real pairCutoff = radius1 + radius2;
if (active2 > active1 && active2 < NUM_ACTIVE && delta12.w < pairCutoff * pairCutoff) {
int includedIndex = numIncluded++;
included[includedIndex] = active2;
}
}
}
// Store any neighbors of this atom that were found.
if (numIncluded) {
int baseIndex = ATOMIC_ADD(numNeighborPairs, numIncluded);
if (baseIndex + numIncluded <= maxNeighborPairs) {
for (int j = 0; j < numIncluded; j++) {
neighborPairs[baseIndex + j] = make_int2(active1, included[j]);
}
}
numNeighborsForAtom1 += numIncluded;
}
}
}
}
numNeighborsForAtom[active1] = numNeighborsForAtom1;
SYNC_WARPS;
}
}
/**
* Sum the neighbor counts to compute the start position of each atom. This
* kernel is executed as a single thread block group.
*/
KERNEL void computeNeighborStartIndices(GLOBAL int* RESTRICT numNeighborPairsPointer,
GLOBAL int* RESTRICT numNeighborsForAtom, GLOBAL int* RESTRICT neighborStartIndex,
int maxNeighborPairs) {
LOCAL unsigned int posBuffer[THREAD_BLOCK_SIZE];
if (*numNeighborPairsPointer > maxNeighborPairs) {
return; // There wasn't enough memory for the neighbor list.
}
unsigned int globalOffset = 0;
for (unsigned int startAtom = 0; startAtom < NUM_ACTIVE; startAtom += LOCAL_SIZE) {
// Load the neighbor counts into local memory.
unsigned int globalIndex = startAtom + LOCAL_ID;
posBuffer[LOCAL_ID] = (globalIndex < NUM_ACTIVE ? numNeighborsForAtom[globalIndex] : 0);
SYNC_THREADS;
// Perform a parallel prefix sum.
for (unsigned int step = 1; step < LOCAL_SIZE; step *= 2) {
unsigned int add = (LOCAL_ID >= step ? posBuffer[LOCAL_ID - step] : 0);
SYNC_THREADS;
posBuffer[LOCAL_ID] += add;
SYNC_THREADS;
}
// Write the results back to global memory.
if (globalIndex < NUM_ACTIVE) {
neighborStartIndex[globalIndex + 1] = posBuffer[LOCAL_ID] + globalOffset;
// Clear this so the next kernel can use it as a counter.
numNeighborsForAtom[globalIndex] = 0;
}
globalOffset += posBuffer[LOCAL_SIZE - 1];
SYNC_THREADS;
}
if (LOCAL_ID == 0) {
neighborStartIndex[0] = 0;
}
}
/**
* Assemble the final neighbor list.
*/
KERNEL void copyPairsToNeighborList(real4 periodicBoxSize, real4 invPeriodicBoxSize,
real4 periodicBoxVecX, real4 periodicBoxVecY, real4 periodicBoxVecZ,
GLOBAL const real* RESTRICT condensedPos, GLOBAL const real4* RESTRICT parameters,
GLOBAL int* RESTRICT numNeighborPairsPointer, GLOBAL int* RESTRICT numNeighborsForAtom,
GLOBAL const int* RESTRICT neighborStartIndex, GLOBAL const int2* RESTRICT neighborPairs,
GLOBAL int2* RESTRICT neighbors, GLOBAL NeighborData* RESTRICT neighborData, int maxNeighborPairs) {
int numNeighborPairs = *numNeighborPairsPointer;
if (numNeighborPairs > maxNeighborPairs) {
return; // There wasn't enough memory for the neighbor list.
}
for (unsigned int pairIndex = GLOBAL_ID; pairIndex < numNeighborPairs; pairIndex += GLOBAL_SIZE) {
int2 pair = neighborPairs[pairIndex];
int offset = ATOMIC_ADD(numNeighborsForAtom + pair.x, 1);
int storeIndex = neighborStartIndex[pair.x] + offset;
// Store the original index so we can get back to (i, j) pairs.
neighbors[storeIndex] = make_int2(pair.y, pairIndex);
// Precompute data for this pair to be looked up later.
real3 iPos = loadCondensedPos(condensedPos, pair.x);
real3 jPos = loadCondensedPos(condensedPos, pair.y);
real4 ijDelta = delta(jPos, iPos, periodicBoxSize, invPeriodicBoxSize, periodicBoxVecX, periodicBoxVecY, periodicBoxVecZ);
real iRadius = parameters[pair.x].x;
real jRadius = parameters[pair.y].x;
real iRadiusPi = PI * iRadius;
real jRadiusPi = PI * jRadius;
real r = SQRT(ijDelta.w);
real rRecip = (real) 1 / r;
real deltaRadiusR = (iRadius * iRadius - jRadius * jRadius) * rRecip;
real deltaRadiusRSq = deltaRadiusR * rRecip;
real3 forceDir = trimTo3(ijDelta) * rRecip;
real3 iForceDir = iRadiusPi * ((real) -1 + deltaRadiusRSq) * forceDir;
real3 jForceDir = jRadiusPi * ((real) -1 - deltaRadiusRSq) * forceDir;
neighborData[storeIndex].data12 = make_real4(iForceDir.x, iForceDir.y, iForceDir.z, iRadiusPi * ((real) 2 * iRadius - r - deltaRadiusR));
neighborData[storeIndex].data21 = make_real4(jForceDir.x, jForceDir.y, jForceDir.z, jRadiusPi * ((real) 2 * jRadius - r + deltaRadiusR));
}
}
/**
* Compute the LCPO interaction.
*/
KERNEL void computeInteraction(
real4 periodicBoxSize, real4 invPeriodicBoxSize,
real4 periodicBoxVecX, real4 periodicBoxVecY, real4 periodicBoxVecZ,
GLOBAL const real* RESTRICT condensedPos, GLOBAL mm_ulong* RESTRICT forceBuffers,
GLOBAL mixed* RESTRICT energyBuffer, GLOBAL const int* RESTRICT activeParticles,
GLOBAL const real4* RESTRICT parameters, GLOBAL int* RESTRICT numNeighborPairsPointer,
GLOBAL const int* RESTRICT neighborStartIndex, GLOBAL const int2* RESTRICT neighborPairs,
GLOBAL int2* RESTRICT neighbors, GLOBAL NeighborData* RESTRICT neighborData, int maxNeighborPairs) {
int numNeighborPairs = *numNeighborPairsPointer;
if (numNeighborPairs > maxNeighborPairs) {
return; // There wasn't enough memory for the neighbor list.
}
mixed energy = 0;
for (unsigned int pairIndex = GLOBAL_ID; pairIndex < numNeighborPairs; pairIndex += GLOBAL_SIZE) {
int2 pair = neighbors[pairIndex];
int i = neighborPairs[pair.y].x;
int j = pair.x;
real3 iPos = loadCondensedPos(condensedPos, i);
real4 iParams = parameters[i];
real4 jParams = parameters[j];
real3 iForce = make_real3(0);
real3 jForce = make_real3(0);
NeighborData ijData = neighborData[pairIndex];
real4 Aij = ijData.data12;
real4 Aji = ijData.data21;
real4 ijForce2 = iParams.y * Aij + jParams.y * Aji;
real3 ijForce2_3 = trimTo3(ijForce2);
iForce += ijForce2_3;
jForce -= ijForce2_3;
energy += ijForce2.w;
real iRadius = iParams.x;
real iRadiusPi = PI * iRadius;
int jStart = neighborStartIndex[j];
int jEnd = neighborStartIndex[j + 1];
for (int jNeighbor = jStart; jNeighbor < jEnd; jNeighbor++) {
int k = neighbors[jNeighbor].x;
// It is faster to recompute the i-k interaction distance and
// parameters to decide whether or not to include k than to try to
// look up its index in the neighbor list of i.
real3 kPos = loadCondensedPos(condensedPos, k);
real4 kParams = parameters[k];
real kRadius = kParams.x;
real kRadiusPi = PI * kRadius;
real4 ikDelta = delta(kPos, iPos, periodicBoxSize, invPeriodicBoxSize, periodicBoxVecX, periodicBoxVecY, periodicBoxVecZ);
real pairCutoff = iRadius + kRadius;
if (ikDelta.w >= pairCutoff * pairCutoff) {
continue;
}
real r = SQRT(ikDelta.w);
real rRecip = (real) 1 / r;
real deltaRadiusR = (iRadius * iRadius - kRadius * kRadius) * rRecip;
real deltaRadiusRSq = deltaRadiusR * rRecip;
real3 forceDir = trimTo3(ikDelta) * rRecip;
real3 iForceDir = iRadiusPi * ((real) -1 + deltaRadiusRSq) * forceDir;
real3 jForceDir = kRadiusPi * ((real) -1 - deltaRadiusRSq) * forceDir;
real4 Aik = make_real4(iForceDir.x, iForceDir.y, iForceDir.z, iRadiusPi * ((real) 2 * iRadius - r - deltaRadiusR));
real4 Aki = make_real4(jForceDir.x, jForceDir.y, jForceDir.z, kRadiusPi * ((real) 2 * kRadius - r + deltaRadiusR));
NeighborData jkData = neighborData[jNeighbor];
real4 Ajk = jkData.data12;
real4 Akj = jkData.data21;
real4 ijkForce34 = (iParams.z + iParams.w * Aij.w) * Ajk + (iParams.z + iParams.w * Aik.w) * Akj;
real4 jikForce34 = (jParams.z + jParams.w * Aji.w) * Aik + (jParams.z + jParams.w * Ajk.w) * Aki;
real4 kijForce34 = (kParams.z + kParams.w * Aki.w) * Aij + (kParams.z + kParams.w * Akj.w) * Aji;
real3 ijkForce34_3 = trimTo3(ijkForce34);
real3 jikForce34_3 = trimTo3(jikForce34);
real3 kijForce34_3 = trimTo3(kijForce34);
real3 ijkForce4 = iParams.w * trimTo3(Aij) * Ajk.w;
real3 ikjForce4 = iParams.w * trimTo3(Aik) * Akj.w;
real3 jikForce4 = jParams.w * trimTo3(Aji) * Aik.w;
real3 jkiForce4 = jParams.w * trimTo3(Ajk) * Aki.w;
real3 kijForce4 = kParams.w * trimTo3(Aki) * Aij.w;
real3 kjiForce4 = kParams.w * trimTo3(Akj) * Aji.w;
energy += ijkForce34.w + jikForce34.w + kijForce34.w;
iForce += jikForce34_3 + kijForce34_3 + ijkForce4 + ikjForce4 + jikForce4 + kijForce4;
jForce += ijkForce34_3 - kijForce34_3 + jkiForce4 - jikForce4 - ijkForce4 + kjiForce4;
real3 kForce = -(ijkForce34_3 + jikForce34_3 + kijForce4 + kjiForce4 + ikjForce4 + jkiForce4);
storeForce(activeParticles[k], kForce, forceBuffers);
}
storeForce(activeParticles[i], iForce, forceBuffers);
storeForce(activeParticles[j], jForce, forceBuffers);
}
energyBuffer[GLOBAL_ID] += energy;
}
......@@ -38,6 +38,7 @@
#include "CpuGayBerneForce.h"
#include "CpuGBSAOBCForce.h"
#include "CpuLangevinMiddleDynamics.h"
#include "CpuLCPOForce.h"
#include "CpuNeighborList.h"
#include "CpuNonbondedForce.h"
#include "CpuPlatform.h"
......@@ -60,7 +61,7 @@ public:
CpuCalcForcesAndEnergyKernel(std::string name, const Platform& platform, CpuPlatform::PlatformData& data, ContextImpl& context);
/**
* Initialize the kernel.
*
*
* @param system the System this kernel will be applied to
*/
void initialize(const System& system);
......@@ -106,13 +107,13 @@ public:
}
/**
* Create a checkpoint recording the current state of the Context.
*
*
* @param stream an output stream the checkpoint data should be written to
*/
void createCheckpoint(ContextImpl& context, std::ostream& stream);
/**
* Load a checkpoint that was written by createCheckpoint().
*
*
* @param stream an input stream the checkpoint data should be read from
*/
void loadCheckpoint(ContextImpl& context, std::istream& stream);
......@@ -130,7 +131,7 @@ public:
}
/**
* Initialize the kernel.
*
*
* @param system the System this kernel will be applied to
* @param force the HarmonicAngleForce this kernel will be used for
*/
......@@ -170,7 +171,7 @@ public:
}
/**
* Initialize the kernel.
*
*
* @param system the System this kernel will be applied to
* @param force the PeriodicTorsionForce this kernel will be used for
*/
......@@ -212,7 +213,7 @@ public:
}
/**
* Initialize the kernel.
*
*
* @param system the System this kernel will be applied to
* @param force the RBTorsionForce this kernel will be used for
*/
......@@ -460,7 +461,7 @@ public:
~CpuCalcGBSAOBCForceKernel();
/**
* Initialize the kernel.
*
*
* @param system the System this kernel will be applied to
* @param force the GBSAOBCForce this kernel will be used for
*/
......@@ -617,6 +618,47 @@ private:
CpuGayBerneForce* ixn;
};
/**
* This kernel is invoked by LCPOForce to calculate the forces acting on the system and the energy of the system.
*/
class CpuCalcLCPOForceKernel : public CalcLCPOForceKernel {
public:
CpuCalcLCPOForceKernel(std::string name, const Platform& platform, CpuPlatform::PlatformData& data) : CalcLCPOForceKernel(name, platform),
data(data), ixn(NULL) {
}
~CpuCalcLCPOForceKernel();
/**
* Initialize the kernel.
*
* @param system the System this kernel will be applied to
* @param force the LCPOForce this kernel will be used for
*/
void initialize(const System& system, const LCPOForce& force);
/**
* Execute the kernel to calculate the forces and/or energy.
*
* @param context the context in which to execute this kernel
* @param includeForces true if forces should be calculated
* @param includeEnergy true if the energy should be calculated
* @return the potential energy due to the force
*/
double execute(ContextImpl& context, bool includeForces, bool includeEnergy);
/**
* Copy changed parameters over to a context.
*
* @param context the context to copy parameters to
* @param force the LCPOForce to copy the parameters from
*/
void copyParametersToContext(ContextImpl& context, const LCPOForce& force);
private:
CpuPlatform::PlatformData& data;
CpuLCPOForce* ixn;
bool doInteraction;
double oneBodyEnergy;
std::vector<int> activeParticles;
std::vector<fvec4> parameters;
};
/**
* This kernel is invoked by LangevinMiddleIntegrator to take one time step.
*/
......@@ -628,21 +670,21 @@ public:
~CpuIntegrateLangevinMiddleStepKernel();
/**
* Initialize the kernel, setting up the particle masses.
*
*
* @param system the System this kernel will be applied to
* @param integrator the LangevinMiddleIntegrator this kernel will be used for
*/
void initialize(const System& system, const LangevinMiddleIntegrator& integrator);
/**
* Execute the kernel.
*
*
* @param context the context in which to execute this kernel
* @param integrator the LangevinMiddleIntegrator this kernel is being used for
*/
void execute(ContextImpl& context, const LangevinMiddleIntegrator& integrator);
/**
* Compute the kinetic energy.
*
*
* @param context the context in which to execute this kernel
* @param integrator the LangevinMiddleIntegrator this kernel is being used for
*/
......
#ifndef OPENMM_CPULCPOFORCE_H_
#define OPENMM_CPULCPOFORCE_H_
/* -------------------------------------------------------------------------- *
* OpenMM *
* -------------------------------------------------------------------------- *
* This is part of the OpenMM molecular simulation toolkit. *
* See https://openmm.org/development. *
* *
* Portions copyright (c) 2025 Stanford University and the Authors. *
* Authors: Evan Pretti *
* Contributors: *
* *
* Permission is hereby granted, free of charge, to any person obtaining a *
* copy of this software and associated documentation files (the "Software"), *
* to deal in the Software without restriction, including without limitation *
* the rights to use, copy, modify, merge, publish, distribute, sublicense, *
* and/or sell copies of the Software, and to permit persons to whom the *
* Software is furnished to do so, subject to the following conditions: *
* *
* The above copyright notice and this permission notice shall be included in *
* all copies or substantial portions of the Software. *
* *
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR *
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, *
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL *
* THE AUTHORS, CONTRIBUTORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, *
* DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR *
* OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE *
* USE OR OTHER DEALINGS IN THE SOFTWARE. *
* -------------------------------------------------------------------------- */
#include <map>
#include <vector>
#include "CpuNeighborList.h"
#include "openmm/internal/vectorize.h"
namespace OpenMM {
/**
* Performs energy, force, and charge derivative calculations for the CPU kernel
* for LCPOForce.
*/
class CpuLCPOForce {
private:
/**
* Holds data about a neighbor pair (overlap areas for each sphere in the
* last component of each fvec4 following their derivative vectors).
*/
struct NeighborData {
fvec4 ijData;
fvec4 jiData;
};
/**
* Holds information from a thread's request to record a neighbor.
*/
struct NeighborInfo {
int i, j;
NeighborData data;
};
/**
* Keeps track of neighbors for CpuLCPOForce.
*/
class Neighbors {
private:
int numParticles;
int maxNumNeighbors;
int maxNumNeighborsFound;
std::vector<int> numNeighbors;
std::vector<int> indices;
std::vector<NeighborData> data;
public:
/**
* Creates an empty CpuLCPOForce::Neighbors.
*
* @param numParticles the number of particles to track (this will be numActiveParticles of the parent CpuLCPOForce)
* @param maxNumNeighbors an initial guess for the maximum number of neighbors per particle
*/
Neighbors(int numParticles, int maxNumNeighborsGuess = 0);
/**
* Clears all neighbor data.
*/
void clear();
/**
* Records that two particles are neighbors of each other. This may
* fail silently if there is not enough room for the neighbor list.
*
* @param info neighbor information recorded by a worker thread.
*/
void insert(const NeighborInfo& info);
/**
* Retrieves the neighbors of a particle.
*
* @param i the index of the particle
* @param iNumNeighbors the number of neighbors of the particle
* @param iIndices pointer to the start of the indices of the neighbors
* @param iData pointer to the start of the data associated with the neighbors
*/
void getNeighbors(int i, int& iNumNeighbors, const int*& iIndices, const NeighborData*& iData) const;
/**
* Tests whether a particle is a neighbor of another particle.
*
* @param i the index of the first particle
* @param j the index of the second particle
* @param[out] data data for the pair (only set if they are neighbors)
* @return whether or not the second particle is a neighbor of the first
*/
bool isNeighbor(int i, int j, NeighborData& data);
/**
* Checks whether or not the neighbor list overflowed (and one more more
* insertions failed) since the list was last cleared.
*
* @param maxNumNeighborsNeeded the maximum number of neighbors per particle needed to avoid overflowing
*/
bool didOverflow(int& maxNumNeighborsNeeded) {
maxNumNeighborsNeeded = maxNumNeighborsFound;
return maxNumNeighborsFound > maxNumNeighbors;
}
};
public:
/**
* Creates a CpuLCPOForce.
*
* @param threads thread pool for computations
* @param activeParticles system particle indices for each active particle
* @param parameters parameters (radius, P2, P3, and P4) for each active particle
* @param usePeriodic whether or not to use periodic boundary conditions
*/
CpuLCPOForce(ThreadPool& threads, const std::vector<int>& activeParticles, const std::vector<fvec4>& parameters, bool usePeriodic);
/**
* Indicates that the radii of some particles may have changed and that the
* maximum required cutoff distance should be updated.
*/
void updateRadii();
/**
* Computes energies and forces.
*
* @param boxVectors periodic box vectors
* @param posq particle positions
* @param threadForce thread force accumulators
* @param includeForces whether or not to calculate interaction forces
* @param includeEnergy whether or not to calculate interaction energies
* @param energy energy accumulator
*/
void execute(Vec3* boxVectors, AlignedArray<float>& posq, std::vector<AlignedArray<float> >& threadForce, bool includeForces, bool includeEnergy, double& energy);
private:
/**
* Thread worker for computing energies and forces.
*/
void threadExecute(ThreadPool& threads, int threadIndex);
/**
* Helper for processing a block of the neighbor list.
*/
template<bool USE_PERIODIC, bool IS_TRICLINIC>
void processNeighborListBlock(int blockIndex, std::vector<NeighborInfo>& threadNeighborInfo);
private:
static const int RadiusIndex = 0;
static const int P2Index = 1;
static const int P3Index = 2;
static const int P4Index = 3;
int numActiveParticles;
ThreadPool& threads;
const std::vector<int>& activeParticles;
const std::vector<fvec4>& parameters;
bool usePeriodic, isTriclinic;
CpuNeighborList neighborList;
std::vector<std::set<int> > exclusions;
Neighbors neighbors;
float cutoff;
Vec3* boxVectors;
fvec4 boxSize, recipBoxSize, boxVec4[3];
float* posq;
std::vector<double> threadEnergy;
std::vector<AlignedArray<float> >* threadForce;
std::vector<std::vector<NeighborInfo> > threadNeighbors;
std::atomic<int> atomicCounter;
};
} // namespace OpenMM
#endif // OPENMM_CPULCPOFORCE_H_
\ No newline at end of file
......@@ -56,9 +56,10 @@ public:
* @param usePeriodic whether to apply periodic boundary conditions
* @param maxDistance the neighbor list will contain all pairs that are within this distance of each other
* @param threads used for parallelization
* @param indices numAtoms indices into the atomLocations array for indirect lookup
*/
void computeNeighborList(int numAtoms, const AlignedArray<float>& atomLocations, const std::vector<std::set<int> >& exclusions,
const Vec3* periodicBoxVectors, bool usePeriodic, float maxDistance, ThreadPool& threads);
const Vec3* periodicBoxVectors, bool usePeriodic, float maxDistance, ThreadPool& threads, const std::vector<int>* indices = NULL);
/**
* Build a dense neighbor list, in which every atom interacts with every other (except exclusions), regardless of distance.
*
......@@ -83,11 +84,16 @@ public:
const std::vector<BlockExclusionMask>& getBlockExclusions(int blockIndex) const;
private:
template<bool USE_INDICES>
void computeNeighborList(int numAtoms, const AlignedArray<float>& atomLocations, const std::vector<std::set<int> >& exclusions,
const Vec3* periodicBoxVectors, bool usePeriodic, float maxDistance, ThreadPool& threads);
/**
* This routine contains the code executed by each thread.
*/
template<bool USE_INDICES>
void threadComputeNeighborList(ThreadPool& threads, int threadIndex);
void runThread(int index);
private:
int blockSize;
std::vector<int> sortedAtoms;
......@@ -100,6 +106,7 @@ private:
Voxels* voxels;
const std::vector<std::set<int> >* exclusions;
const float* atomLocations;
const int* indices;
Vec3 periodicBoxVectors[3];
int numAtoms;
bool usePeriodic, dense;
......
......@@ -62,6 +62,8 @@ KernelImpl* CpuKernelFactory::createKernelImpl(std::string name, const Platform&
return new CpuCalcCustomGBForceKernel(name, platform, data);
if (name == CalcGayBerneForceKernel::Name())
return new CpuCalcGayBerneForceKernel(name, platform, data);
if (name == CalcLCPOForceKernel::Name())
return new CpuCalcLCPOForceKernel(name, platform, data);
if (name == IntegrateLangevinMiddleStepKernel::Name())
return new CpuIntegrateLangevinMiddleStepKernel(name, platform, data);
throw OpenMMException((std::string("Tried to create kernel with illegal kernel name '") + name + "'").c_str());
......
......@@ -1775,6 +1775,81 @@ void CpuCalcGayBerneForceKernel::copyParametersToContext(ContextImpl& context, c
ixn = new CpuGayBerneForce(force);
}
CpuCalcLCPOForceKernel::~CpuCalcLCPOForceKernel() {
if (ixn != NULL) {
delete ixn;
}
}
void CpuCalcLCPOForceKernel::initialize(const System& system, const LCPOForce& force) {
doInteraction = false;
oneBodyEnergy = 0.0;
double surfaceTension = force.getSurfaceTension();
for (int i = 0; i < force.getNumParticles(); i++) {
double radius, p1, p2, p3, p4;
force.getParticleParameters(i, radius, p1, p2, p3, p4);
p1 *= surfaceTension;
p2 *= surfaceTension;
p3 *= surfaceTension;
p4 *= surfaceTension;
oneBodyEnergy += 4.0 * PI_M * p1 * radius * radius;
if (radius != 0.0) {
activeParticles.push_back(i);
parameters.push_back(fvec4(radius, p2, p3, p4));
if (p2 != 0.0 || p3 != 0.0 || p4 != 0.0) {
doInteraction = true;
}
}
}
bool usePeriodic = force.usesPeriodicBoundaryConditions();
ixn = new CpuLCPOForce(data.threads, activeParticles, parameters, usePeriodic);
data.isPeriodic |= usePeriodic;
}
double CpuCalcLCPOForceKernel::execute(ContextImpl& context, bool includeForces, bool includeEnergy) {
double energy = oneBodyEnergy;
if (doInteraction) {
ixn->execute(extractBoxVectors(context), data.posq, data.threadForce, includeForces, includeEnergy, energy);
}
return energy;
}
void CpuCalcLCPOForceKernel::copyParametersToContext(ContextImpl& context, const LCPOForce& force) {
doInteraction = false;
oneBodyEnergy = 0.0;
double surfaceTension = force.getSurfaceTension();
int activeIndex = 0;
for (int i = 0; i < force.getNumParticles(); i++) {
double radius, p1, p2, p3, p4;
force.getParticleParameters(i, radius, p1, p2, p3, p4);
p1 *= surfaceTension;
p2 *= surfaceTension;
p3 *= surfaceTension;
p4 *= surfaceTension;
oneBodyEnergy += 4.0 * PI_M * p1 * radius * radius;
if (radius != 0.0) {
if (activeIndex < activeParticles.size()) {
activeParticles[activeIndex] = i;
parameters[activeIndex] = fvec4(radius, p2, p3, p4);
}
activeIndex++;
if (p2 != 0.0 || p3 != 0.0 || p4 != 0.0) {
doInteraction = true;
}
}
}
if (activeIndex != activeParticles.size()) {
throw OpenMMException("updateParametersInContext: The number of non-excluded particles for LCPO has changed");
}
ixn->updateRadii();
}
CpuIntegrateLangevinMiddleStepKernel::~CpuIntegrateLangevinMiddleStepKernel() {
if (dynamics)
delete dynamics;
......
/* -------------------------------------------------------------------------- *
* OpenMM *
* -------------------------------------------------------------------------- *
* This is part of the OpenMM molecular simulation toolkit. *
* See https://openmm.org/development. *
* *
* Portions copyright (c) 2025 Stanford University and the Authors. *
* Authors: Evan Pretti *
* Contributors: *
* *
* Permission is hereby granted, free of charge, to any person obtaining a *
* copy of this software and associated documentation files (the "Software"), *
* to deal in the Software without restriction, including without limitation *
* the rights to use, copy, modify, merge, publish, distribute, sublicense, *
* and/or sell copies of the Software, and to permit persons to whom the *
* Software is furnished to do so, subject to the following conditions: *
* *
* The above copyright notice and this permission notice shall be included in *
* all copies or substantial portions of the Software. *
* *
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR *
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, *
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL *
* THE AUTHORS, CONTRIBUTORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, *
* DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR *
* OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE *
* USE OR OTHER DEALINGS IN THE SOFTWARE. *
* -------------------------------------------------------------------------- */
#include "CpuLCPOForce.h"
#include "SimTKOpenMMRealType.h"
#include "openmm/OpenMMException.h"
using namespace std;
using namespace OpenMM;
CpuLCPOForce::Neighbors::Neighbors(int numParticles, int maxNumNeighborsGuess) :
numParticles(numParticles), maxNumNeighborsFound(0), numNeighbors(numParticles) {
// Round up to the nearest multiple of 4 so that neighbor searches can be vectorized.
maxNumNeighbors = (maxNumNeighborsGuess + 3) & -4;
int numItems = numParticles * maxNumNeighbors;
indices.assign(numItems, -1);
data.resize(numItems);
}
void CpuLCPOForce::Neighbors::clear() {
maxNumNeighborsFound = 0;
numNeighbors.assign(numParticles, 0);
indices.assign(numParticles * maxNumNeighbors, -1);
}
void CpuLCPOForce::Neighbors::insert(const NeighborInfo& info) {
int& iNumNeighbors = numNeighbors[info.i];
int k = iNumNeighbors++;
if (k < maxNumNeighbors) {
k += info.i * maxNumNeighbors;
indices[k] = info.j;
this->data[k] = info.data;
}
maxNumNeighborsFound = max(maxNumNeighborsFound, iNumNeighbors);
}
void CpuLCPOForce::Neighbors::getNeighbors(int i, int& iNumNeighbors, const int*& iIndices, const NeighborData*& iData) const {
iNumNeighbors = numNeighbors[i];
iIndices = &indices[i * maxNumNeighbors];
iData = &data[i * maxNumNeighbors];
}
bool CpuLCPOForce::Neighbors::isNeighbor(int i, int j, NeighborData& data) {
int start = i * maxNumNeighbors;
int stop = start + ((numNeighbors[i] + 3) & -4);
ivec4 j4 = ivec4(j);
for (int k = start; k < stop; k += 4) {
ivec4 mask = (ivec4(&indices[k]) == j4);
if (any(mask)) {
// Exactly one mask element should be set to -1 and the others to 0.
// Use this to extract the appropriate index and retrieve the data.
mask *= ivec4(0, 1, 2, 3);
data = this->data[k - (mask[1] + mask[2] + mask[3])];
return true;
}
}
return false;
}
CpuLCPOForce::CpuLCPOForce(ThreadPool& threads, const vector<int>& activeParticles, const vector<fvec4>& parameters, bool usePeriodic) :
numActiveParticles(activeParticles.size()), threads(threads), activeParticles(activeParticles), parameters(parameters),
usePeriodic(usePeriodic), neighborList(4), exclusions(numActiveParticles), neighbors(numActiveParticles) {
for (int i = 0; i < numActiveParticles; i++) {
exclusions[i].insert(i);
}
updateRadii();
}
void CpuLCPOForce::updateRadii() {
cutoff = 0.0f;
for (int i = 0; i < numActiveParticles; i++) {
cutoff = max(cutoff, parameters[i][RadiusIndex]);
}
cutoff *= 2.0f;
}
void CpuLCPOForce::execute(Vec3* boxVectors, AlignedArray<float>& posq, vector<AlignedArray<float> >& threadForce, bool includeForces, bool includeEnergy, double& energy) {
if (!numActiveParticles) {
return;
}
int numThreads = threads.getNumThreads();
this->posq = &posq[0];
threadEnergy.resize(numThreads);
this->threadForce = &threadForce;
threadNeighbors.resize(numThreads);
isTriclinic = false;
if (usePeriodic) {
double minAllowedSize = 1.999999 * cutoff;
if (boxVectors[0][0] < minAllowedSize || boxVectors[1][1] < minAllowedSize || boxVectors[2][2] < minAllowedSize) {
throw OpenMMException("The periodic box size is less than twice the required cutoff for LCPO.");
}
this->boxVectors = boxVectors;
boxSize = fvec4(boxVectors[0][0], boxVectors[1][1], boxVectors[2][2], 0.0f);
recipBoxSize = fvec4(1.0 / boxVectors[0][0], 1.0 / boxVectors[1][1], 1.0 / boxVectors[2][2], 0.0f);
boxVec4[0] = fvec4(boxVectors[0][0], boxVectors[0][1], boxVectors[0][2], 0.0f);
boxVec4[1] = fvec4(boxVectors[1][0], boxVectors[1][1], boxVectors[1][2], 0.0f);
boxVec4[2] = fvec4(boxVectors[2][0], boxVectors[2][1], boxVectors[2][2], 0.0f);
isTriclinic = (boxVectors[0][1] != 0.0 || boxVectors[0][2] != 0.0 ||
boxVectors[1][0] != 0.0 || boxVectors[1][2] != 0.0 ||
boxVectors[2][0] != 0.0 || boxVectors[2][1] != 0.0);
}
neighborList.computeNeighborList(numActiveParticles, posq, exclusions, boxVectors, usePeriodic, cutoff, threads, &activeParticles);
// Process neighbors from the neighbor list.
atomicCounter = 0;
threads.execute([&] (ThreadPool& threads, int threadIndex) { threadExecute(threads, threadIndex); });
threads.waitForThreads();
// Compile all of the neighbor information from the threads into a single collection.
neighbors.clear();
while (true) {
for (auto threadNeighborInfo : threadNeighbors) {
for (NeighborInfo neighborInfo : threadNeighborInfo) {
neighbors.insert(neighborInfo);
}
}
int maxNumNeighborsNeeded;
if (neighbors.didOverflow(maxNumNeighborsNeeded)) {
neighbors = Neighbors(numActiveParticles, (int) (maxNumNeighborsNeeded * 1.1));
}
else {
break;
}
}
// Accumulate energies and forces.
atomicCounter = 0;
threads.resumeThreads();
threads.waitForThreads();
if (includeEnergy) {
for (int i = 0; i < numThreads; i++) {
energy += threadEnergy[i];
}
}
}
void CpuLCPOForce::threadExecute(ThreadPool& threads, int threadIndex) {
int numThreads = threads.getNumThreads();
// Process neighbors from the neighbor list.
vector<NeighborInfo>& threadNeighborInfo = threadNeighbors[threadIndex];
threadNeighborInfo.clear();
while (true) {
int blockIndex = atomicCounter++;
if (blockIndex >= neighborList.getNumBlocks()) {
break;
}
if (usePeriodic) {
if (isTriclinic) {
processNeighborListBlock<true, true>(blockIndex, threadNeighborInfo);
}
else {
processNeighborListBlock<true, false>(blockIndex, threadNeighborInfo);
}
}
else {
processNeighborListBlock<false, false>(blockIndex, threadNeighborInfo);
}
}
threads.syncThreads();
// Accumulate energies and forces.
int groupSize = max(1, numActiveParticles / (10 * numThreads));
threadEnergy[threadIndex] = 0.0;
float* forces = &(*threadForce)[threadIndex][0];
while (true) {
int start = atomicCounter.fetch_add(groupSize);
if (start >= numActiveParticles) {
break;
}
int end = min(start + groupSize, numActiveParticles);
for (int i = start; i < end; i++) {
float p2i = parameters[i][P2Index];
float p3i = parameters[i][P3Index];
float p4i = parameters[i][P4Index];
float energy = 0.0f;
fvec4 iForce(0.0f);
int iNumNeighbors;
const int* iIndices;
const NeighborData* iData;
neighbors.getNeighbors(i, iNumNeighbors, iIndices, iData);
for (int jNeighbor = 0; jNeighbor < iNumNeighbors; jNeighbor++) {
int j = iIndices[jNeighbor];
fvec4 Aij = iData[jNeighbor].ijData;
fvec4 Aji = iData[jNeighbor].jiData;
float p2j = parameters[j][P2Index];
float p3j = parameters[j][P3Index];
float p4j = parameters[j][P4Index];
fvec4 jForce(0.0f);
// Two-body term.
fvec4 ijForce2 = p2i * Aij + p2j * Aji;
iForce += ijForce2;
jForce -= ijForce2;
energy += ijForce2[3];
// Three-body term: includes all pairs (j, k) of neighbors of i that are also neighbors of each other.
int jNumNeighbors;
const int* jIndices;
const NeighborData* jData;
neighbors.getNeighbors(j, jNumNeighbors, jIndices, jData);
for (int kNeighbor = 0; kNeighbor < jNumNeighbors; kNeighbor++) {
int k = jIndices[kNeighbor];
NeighborData ikNeighborData;
if (!neighbors.isNeighbor(i, k, ikNeighborData)) {
continue;
}
fvec4 Aik = ikNeighborData.ijData;
fvec4 Aki = ikNeighborData.jiData;
fvec4 Ajk = jData[kNeighbor].ijData;
fvec4 Akj = jData[kNeighbor].jiData;
float p3k = parameters[k][P3Index];
float p4k = parameters[k][P4Index];
fvec4 ijkForce34 = (p3i + p4i * Aij[3]) * Ajk + (p3i + p4i * Aik[3]) * Akj;
fvec4 jikForce34 = (p3j + p4j * Aji[3]) * Aik + (p3j + p4j * Ajk[3]) * Aki;
fvec4 kijForce34 = (p3k + p4k * Aki[3]) * Aij + (p3k + p4k * Akj[3]) * Aji;
fvec4 ijkForce4 = p4i * Aij * Ajk[3];
fvec4 ikjForce4 = p4i * Aik * Akj[3];
fvec4 jikForce4 = p4j * Aji * Aik[3];
fvec4 jkiForce4 = p4j * Ajk * Aki[3];
fvec4 kijForce4 = p4k * Aki * Aij[3];
fvec4 kjiForce4 = p4k * Akj * Aji[3];
energy += ijkForce34[3] + jikForce34[3] + kijForce34[3];
iForce += jikForce34 + kijForce34 + ijkForce4 + ikjForce4 + jikForce4 + kijForce4;
jForce += ijkForce34 - kijForce34 + jkiForce4 - jikForce4 - ijkForce4 + kjiForce4;
fvec4 kForce = ijkForce34 + jikForce34 + kijForce4 + kjiForce4 + ikjForce4 + jkiForce4;
float * kForcePointer = &forces[4 * activeParticles[k]];
(fvec4(kForcePointer) - kForce).store(kForcePointer);
}
float * jForcePointer = &forces[4 * activeParticles[j]];
(fvec4(jForcePointer) + jForce).store(jForcePointer);
}
threadEnergy[threadIndex] += energy;
float * iForcePointer = &forces[4 * activeParticles[i]];
(fvec4(iForcePointer) + iForce).store(iForcePointer);
}
}
}
template<bool USE_PERIODIC, bool IS_TRICLINIC>
void CpuLCPOForce::processNeighborListBlock(int blockIndex, vector<NeighborInfo>& threadNeighborInfo) {
const int* iBlock = &neighborList.getSortedAtoms()[4 * blockIndex];
fvec4 iBlockPosq[4] {
fvec4(posq + 4 * activeParticles[iBlock[0]]),
fvec4(posq + 4 * activeParticles[iBlock[1]]),
fvec4(posq + 4 * activeParticles[iBlock[2]]),
fvec4(posq + 4 * activeParticles[iBlock[3]])
};
fvec4 iBlockX, iBlockY, iBlockZ, iBlockQ;
transpose(iBlockPosq, iBlockX, iBlockY, iBlockZ, iBlockQ);
fvec4 iBlockRadius(parameters[iBlock[0]][RadiusIndex], parameters[iBlock[1]][RadiusIndex], parameters[iBlock[2]][RadiusIndex], parameters[iBlock[3]][RadiusIndex]);
CpuNeighborList::NeighborIterator iterator = neighborList.getNeighborIterator(blockIndex);
while (iterator.next()) {
int j = iterator.getNeighbor();
fvec4 jPos(posq + 4 * activeParticles[j]);
// Only include particles close enough to each other.
fvec4 deltaX = jPos[0] - iBlockX;
fvec4 deltaY = jPos[1] - iBlockY;
fvec4 deltaZ = jPos[2] - iBlockZ;
if (USE_PERIODIC) {
if (IS_TRICLINIC) {
fvec4 scale3 = round(deltaZ * recipBoxSize[2]);
deltaX -= scale3 * boxVec4[2][0];
deltaY -= scale3 * boxVec4[2][1];
deltaZ -= scale3 * boxVec4[2][2];
fvec4 scale2 = round(deltaY * recipBoxSize[1]);
deltaX -= scale2 * boxVec4[1][0];
deltaY -= scale2 * boxVec4[1][1];
fvec4 scale1 = round(deltaX * recipBoxSize[0]);
deltaX -= scale1 * boxVec4[0][0];
}
else {
deltaX -= boxSize[0] * round(deltaX * recipBoxSize[0]);
deltaY -= boxSize[1] * round(deltaY * recipBoxSize[1]);
deltaZ -= boxSize[2] * round(deltaZ * recipBoxSize[2]);
}
}
fvec4 r2 = deltaX * deltaX + deltaY * deltaY + deltaZ * deltaZ;
float jRadius = parameters[j][RadiusIndex];
fvec4 rCut = iBlockRadius + jRadius;
ivec4 include = blendZero(r2 < rCut * rCut, fvec4::expandBitsToMask(~iterator.getExclusions()));
if (!any(include)) {
continue;
}
fvec4 r = sqrt(r2);
fvec4 rRecip = 1.0f / r;
deltaX *= rRecip;
deltaY *= rRecip;
deltaZ *= rRecip;
// Precompute and store the buried areas and their derivatives.
fvec4 iBlockRadiusPi = PI_M * iBlockRadius;
float jRadiusPi = PI_M * jRadius;
fvec4 deltaRadiusR = (iBlockRadius * iBlockRadius - jRadius * jRadius) * rRecip;
fvec4 deltaRadiusRSq = deltaRadiusR * rRecip;
fvec4 iBlockForce = iBlockRadiusPi * (-1.0f + deltaRadiusRSq);
fvec4 jForce = jRadiusPi * (-1.0f - deltaRadiusRSq);
fvec4 ijData0 = iBlockForce * deltaX;
fvec4 ijData1 = iBlockForce * deltaY;
fvec4 ijData2 = iBlockForce * deltaZ;
fvec4 ijData3 = iBlockRadiusPi * (2.0f * iBlockRadius - r - deltaRadiusR);
fvec4 jiData0 = jForce * deltaX;
fvec4 jiData1 = jForce * deltaY;
fvec4 jiData2 = jForce * deltaZ;
fvec4 jiData3 = jRadiusPi * (2.0f * jRadius - r + deltaRadiusR);
transpose(ijData0, ijData1, ijData2, ijData3);
transpose(jiData0, jiData1, jiData2, jiData3);
if (include[0]) threadNeighborInfo.push_back({iBlock[0], j, ijData0, jiData0});
if (include[1]) threadNeighborInfo.push_back({iBlock[1], j, ijData1, jiData1});
if (include[2]) threadNeighborInfo.push_back({iBlock[2], j, ijData2, jiData2});
if (include[3]) threadNeighborInfo.push_back({iBlock[3], j, ijData3, jiData3});
}
}
Markdown is supported
0% or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment