DPDIntegrator (#4799)

* Created DPDIntegrator class * Reference implementation of DPDIntegrator * Build neighbor list for DPDIntegrator * Minor fixes * Documentation for DPDIntegrator * Python API for DPDIntegrator * Preliminary OpenCL implementation of DPDIntegrator * Enable USE_PERIODIC * Use updated positions in DPD thermostat * Working on neighbor list for OpenCL DPDIntegrator * ReorderListener for particle types * Serialization for DPDIntegrator * CUDA implementation of DPDIntegrator * HIP implementation of DPDIntegrator * Fixed compile error in Python wrapper * Fixed compile error in wrappers * Fixed uninitialized memory in reference neighbor list * Added DPDIntegrator to C++ API docs * Fixed incorrect launch size * Fixed nan in DPD random number generator * Minor optimizations * Improved load balancing * Fixed an indexing error * Neighbor list uses the maximum cutoff of any force * Fixed HIP compilation error * Fixed access to invalid memory * Added test case for diffusion coefficient * Try to debug segfaults on CI * Debugging * Debugging * Debugging * Debugging * Debugging * Debugging * Possible fix * Debugging * Debugging * Debugging * Use correct block size on CPU OpenCL * Workaround for bug in Intel's OpenCL for CPUs * Removed an unnecessary define * Removed debugging code * Include Dart * More Intel workarounds * Workaround for error in NVIDIA OpenCL

DPDIntegrator (#4799)
* Created DPDIntegrator class * Reference implementation of DPDIntegrator * Build neighbor list for DPDIntegrator * Minor fixes * Documentation for DPDIntegrator * Python API for DPDIntegrator * Preliminary OpenCL implementation of DPDIntegrator * Enable USE_PERIODIC * Use updated positions in DPD thermostat * Working on neighbor list for OpenCL DPDIntegrator * ReorderListener for particle types * Serialization for DPDIntegrator * CUDA implementation of DPDIntegrator * HIP implementation of DPDIntegrator * Fixed compile error in Python wrapper * Fixed compile error in wrappers * Fixed uninitialized memory in reference neighbor list * Added DPDIntegrator to C++ API docs * Fixed incorrect launch size * Fixed nan in DPD random number generator * Minor optimizations * Improved load balancing * Fixed an indexing error * Neighbor list uses the maximum cutoff of any force * Fixed HIP compilation error * Fixed access to invalid memory * Added test case for diffusion coefficient * Try to debug segfaults on CI * Debugging * Debugging * Debugging * Debugging * Debugging * Debugging * Possible fix * Debugging * Debugging * Debugging * Use correct block size on CPU OpenCL * Workaround for bug in Intel's OpenCL for CPUs * Removed an unnecessary define * Removed debugging code * Include Dart * More Intel workarounds * Workaround for error in NVIDIA OpenCL
de180e4e · Peter Eastman · GitHub · 1ece5c28 · de180e4e · de180e4e
Unverified Commit de180e4e authored Apr 14, 2025 by Peter Eastman Committed by GitHub Apr 14, 2025
20 changed files
--- a/docs-source/api-c++/integrators.rst
+++ b/docs-source/api-c++/integrators.rst
@@ -57,8 +57,19 @@ The RPMD integrator implements Ring Polymer MD.
    generated/RPMDIntegrator
-Customizing ``Integrator``
+Dissipative Particle Dynamics integrators
-==========================
+=========================================
+The DPD integrator implements Dissipative Particle Dynamics.
+.. toctree::
+    :maxdepth: 2
+    generated/DPDIntegrator
+Custom integrators
+==================
 These classes facilitate customisation of the integrator. ``CustomIntegrator``
 allows a wide variety of integration algorithms to be implemented efficiently

--- a/docs-source/usersguide/references.bib
+++ b/docs-source/usersguide/references.bib
@@ -125,6 +125,16 @@
   type = {Journal Article}
 }
+@article{Espanol1995
+   author = {Español, P. and Warren, P.},
+   title = {Statistical Mechanics of Dissipative Particle Dynamics},
+   journal = {Europhysics Letters},
+   volume = {30},
+   pages = {191-196},
+   year = {1995},
+   type = {Journal Article}
+}
 @article{Essmann1995
   author = {Essmann, Ulrich and Perera, Lalith and Berkowitz, Max L. and Darden, Tom and Lee, Hsing and Pedersen, Lee G.},
   title = {A smooth particle mesh {Ewald} method},

--- a/docs-source/usersguide/theory/04_integrators.rst
+++ b/docs-source/usersguide/theory/04_integrators.rst
@@ -52,34 +52,11 @@ the Langevin equation of motion:
 where :math:`\mathbf{v}_i` is the velocity of particle *i*\ , :math:`\mathbf{f}_i` is
 the force acting on it, :math:`m_i` is its mass, :math:`\gamma` is the friction
 coefficient, and :math:`\mathbf{R}_i` is an uncorrelated random force whose
-components are chosen from a normal distribution with mean zero and unit variance. *T* is the temperature of
+components are chosen from a normal distribution with mean zero and variance
-the heat bath.
+:math:`2 \gamma k_B T`. *T* is the temperature of the heat bath.
-The integration is done using the Langevin leap-frog method. :cite:`Izaguirre2010`
+The integration is done using the LFMiddle discretization. :cite:`Zhang2019` In each step,
-In each step, the positions and velocities are updated as follows:
+the positions and velocities are updated as follows:
-.. math::
-   \mathbf{v}_{i}(t+\Delta t/2)=\mathbf{v}_{i}(t-\Delta t/2)\alpha+\mathbf{f}_{i}(t)(1-\alpha)/(\gamma{m}_{i}) + \sqrt{{k}_{B}T(1-\alpha^2)/{m}_{i}}R
-.. math::
-   \mathbf{r}_{i}(t+\Delta t)=\mathbf{r}_{i}(t)+\mathbf{v}_{i}(t+\Delta t/2)\Delta t
-where :math:`k` is Boltzmann's constant, :math:`T` is the temperature,
-:math:`\gamma` is the friction coefficient, :math:`R` is a normally distributed
-random number, and :math:`\alpha=\exp(-\gamma\Delta t)`.
-The same comments about the offset between positions and velocities apply to
-this integrator as to VerletIntegrator.
-LangevinMiddleIntegrator
-************************
-This integrator is similar to LangevinIntegerator, but it instead uses the LFMiddle
-discretization. :cite:`Zhang2019` In each step, the positions and velocities
-are updated as follows:
 .. math::
@@ -98,17 +75,17 @@ are updated as follows:
   \mathbf{r}_{i}(t+\Delta t) = \mathbf{r}_{i}(t+\Delta t/2) + \mathbf{v'}_{i}(t+\Delta t/2)\Delta t/2
-This tends to produce more accurate sampling of configurational properties (such
+where :math:`k` is Boltzmann's constant, :math:`T` is the temperature,
-as free energies), but less accurate sampling of kinetic properties.  Because
+:math:`\gamma` is the friction coefficient, :math:`R` is a normally distributed
-configurational properties are much more important than kinetic ones in most
+random number, and :math:`\alpha=\exp(-\gamma\Delta t)`.
-simulations, this integrator is generally preferred over LangevinIntegrator.  It
-often allows one to use a larger time step while still maintaining similar or
-better accuracy.
-One disadvantage of this integrator is that it requires applying constraints
+The same comments about the offset between positions and velocities apply to
-twice per time step, compared to only once for LangevinIntegrator.  This
+this integrator as to VerletIntegrator.
-can make it slightly slower for systems that involve constraints.  However, this
-usually is more than compensated by allowing you to use a larger time step.
+LangevinMiddleIntegrator
+************************
+This integrator is identical to LangevinIntegerator.  Separate classes exist only for historical reasons.
 .. _nosehoover-integrators-theory:
@@ -288,3 +265,32 @@ CustomIntegrator is a very powerful tool, and this description only gives a
 vague idea of the scope of its capabilities.  For full details and examples,
 consult the API documentation.
+DPDIntegrator
+*************
+This implements dissipative particle dynamics (DPD). :cite:`Espanol1995`  It is
+similar to a Langevin integrator, but instead of applying friction and noise to
+the Cartesian coordinates of particles, it applies them to inter-particle
+displacements.  This allows it to conserve momentum and efficiently model the
+hydrodynamics of coarse grained models.
+The system evolves according to the equation of motion
+.. math::
+   m_i\frac{d\mathbf{v}_i}{dt}=\mathbf{f}_i + \sum_{j \neq i } \mathbf{e}_{ij} [ -\gamma \omega^2(r_{ij})(\mathbf{e}_{ij} \cdot \mathbf{v}_{ij}) + \sqrt{2 \gamma k_BT} \omega(r_{ij}) R_{ij} ]
+where :math:`\mathbf{v}_i` is the velocity of particle *i*\ , :math:`\mathbf{f}_i` is
+the force acting on it, :math:`m_i` is its mass, :math:`r_{ij}` is the distance
+between particles *i* and *j*\ , :math:`\mathbf{e}_{ij}` is a unit vector pointing from
+particle *i* to particle *j*\ , :math:`\gamma` is the friction coefficient, and
+:math:`\mathbf{R}_{ij} = \mathbf{R}_{ji}` is an uncorrelated random force whose
+components are chosen from a normal distribution with mean zero and unit variance.
+*T* is the temperature of the heat bath.  The friction and noise are limited to
+pairs closer than a cutoff distance :math:`r_c` using the function
+:math:`\omega(r) = 1-r/r_c`.  The friction coefficient :math:`\gamma` and cutoff
+distance :math:`r_c` may be constants, or alternatively they can vary depending
+on the types of the particles.
+The integration is done using the same LFMiddle discretization used for
+LangevinIntegrator. :cite:`Zhang2019`
--- a/olla/include/openmm/kernels.h
+++ b/olla/include/openmm/kernels.h
@@ -9,7 +9,7 @@
 * Biological Structures at Stanford, funded under the NIH Roadmap for        *
 * Medical Research, grant U54 GM072970. See https://simtk.org.               *
 *                                                                            *
- * Portions copyright (c) 2008-2024 Stanford University and the Authors.      *
+ * Portions copyright (c) 2008-2025 Stanford University and the Authors.      *
 * Authors: Peter Eastman                                                     *
 * Contributors:                                                              *
 *                                                                            *
@@ -49,6 +49,7 @@
 #include "openmm/CustomNonbondedForce.h"
 #include "openmm/CustomManyParticleForce.h"
 #include "openmm/CustomTorsionForce.h"
+#include "openmm/DPDIntegrator.h"
 #include "openmm/GayBerneForce.h"
 #include "openmm/GBSAOBCForce.h"
 #include "openmm/HarmonicAngleForce.h"
@@ -1401,6 +1402,39 @@ public:
    virtual void setPerDofVariable(ContextImpl& context, int variable, const std::vector<Vec3>& values) = 0;
 };
+/**
+ * This kernel is invoked by DPDIntegrator to take one time step.
+ */
+class IntegrateDPDStepKernel : public KernelImpl {
+public:
+    static std::string Name() {
+        return "IntegrateDPDStep";
+    }
+    IntegrateDPDStepKernel(std::string name, const Platform& platform) : KernelImpl(name, platform) {
+    }
+    /**
+     * Initialize the kernel.
+     * 
+     * @param system     the System this kernel will be applied to
+     * @param integrator the DPDIntegrator this kernel will be used for
+     */
+    virtual void initialize(const System& system, const DPDIntegrator& integrator) = 0;
+    /**
+     * Execute the kernel.
+     * 
+     * @param context    the context in which to execute this kernel
+     * @param integrator the DPDIntegrator this kernel is being used for
+     */
+    virtual void execute(ContextImpl& context, const DPDIntegrator& integrator) = 0;
+    /**
+     * Compute the kinetic energy.
+     * 
+     * @param context    the context in which to execute this kernel
+     * @param integrator the DPDIntegrator this kernel is being used for
+     */
+    virtual double computeKineticEnergy(ContextImpl& context, const DPDIntegrator& integrator) = 0;
+};
 /**
 * This kernel is invoked by AndersenThermostat at the start of each time step to adjust the particle velocities.
 */

--- a/openmmapi/include/OpenMM.h
+++ b/openmmapi/include/OpenMM.h
@@ -9,7 +9,7 @@
 * Biological Structures at Stanford, funded under the NIH Roadmap for        *
 * Medical Research, grant U54 GM072970. See https://simtk.org.               *
 *                                                                            *
- * Portions copyright (c) 2009-2021 Stanford University and the Authors.      *
+ * Portions copyright (c) 2009-2025 Stanford University and the Authors.      *
 * Authors: Peter Eastman                                                     *
 * Contributors:                                                              *
 *                                                                            *
@@ -49,6 +49,7 @@
 #include "openmm/CustomIntegrator.h"
 #include "openmm/CustomManyParticleForce.h"
 #include "openmm/CustomNonbondedForce.h"
+#include "openmm/DPDIntegrator.h"
 #include "openmm/Force.h"
 #include "openmm/GayBerneForce.h"
 #include "openmm/GBSAOBCForce.h"

--- a/openmmapi/include/openmm/DPDIntegrator.h
+++ b/openmmapi/include/openmm/DPDIntegrator.h
+#ifndef OPENMM_DPDINTEGRATOR_H_
+#define OPENMM_DPDINTEGRATOR_H_
+/* -------------------------------------------------------------------------- *
+ *                                   OpenMM                                   *
+ * -------------------------------------------------------------------------- *
+ * This is part of the OpenMM molecular simulation toolkit originating from   *
+ * Simbios, the NIH National Center for Physics-Based Simulation of           *
+ * Biological Structures at Stanford, funded under the NIH Roadmap for        *
+ * Medical Research, grant U54 GM072970. See https://simtk.org.               *
+ *                                                                            *
+ * Portions copyright (c) 2025 Stanford University and the Authors.           *
+ * Authors: Peter Eastman                                                     *
+ * Contributors:                                                              *
+ *                                                                            *
+ * Permission is hereby granted, free of charge, to any person obtaining a    *
+ * copy of this software and associated documentation files (the "Software"), *
+ * to deal in the Software without restriction, including without limitation  *
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,   *
+ * and/or sell copies of the Software, and to permit persons to whom the      *
+ * Software is furnished to do so, subject to the following conditions:       *
+ *                                                                            *
+ * The above copyright notice and this permission notice shall be included in *
+ * all copies or substantial portions of the Software.                        *
+ *                                                                            *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR *
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,   *
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL    *
+ * THE AUTHORS, CONTRIBUTORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM,    *
+ * DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR      *
+ * OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE  *
+ * USE OR OTHER DEALINGS IN THE SOFTWARE.                                     *
+ * -------------------------------------------------------------------------- */
+#include "Integrator.h"
+#include "openmm/Kernel.h"
+#include "internal/windowsExport.h"
+#include <map>
+#include <utility>
+#include <vector>
+namespace OpenMM {
+/**
+ * This integrator implements dissipative particle dynamics (DPD).  It is similar to a
+ * LangevinIntegrator, but instead off applying the friction and noise forces to the
+ * Cartesian coordinates of particles, they are applied to inter-particle distances.
+ * 
+ * Only particles within a cutoff distance apply friction to each other.  Most often
+ * a single cutoff distance and friction coefficient are used for all particle pairs.
+ * In some cases you may want to use different values for the interactions between
+ * different types of particles.  In that case you can call setParticleType() to set
+ * the types of particles and addTypePair() to set the parameters to use for their
+ * interactions.
+ * 
+ * Particle types, type pairs, and default parameters must be set before you create
+ * a Context.  Changing them will have no effect on an existing Context unless you
+ * reinitialize it.
+ * 
+ * This integrator can be applied either to periodic or non-periodic systems.  It
+ * applies periodic boundary conditions if system.usesPeriodicBoundaryConditions()
+ * returns true, which happens if any force in the system uses periodic boundary
+ * conditions.
+ */
+class OPENMM_EXPORT DPDIntegrator : public Integrator {
+public:
+    /**
+     * Create a DPDIntegrator.  All particles default to having type 0.
+     * 
+     * @param temperature      the temperature of the heat bath (in Kelvin)
+     * @param defaultFriction  the default friction coefficient (in inverse picoseconds).  This value is
+     *                         used for interactions whose parameters have not been set with addTypePair().
+     * @param defaultCutoff    the default cutoff distance (in nanometers).  This value is
+     *                         used for interactions whose parameters have not been set with addTypePair().
+     * @param stepSize         the step size with which to integrate the system (in picoseconds)
+     */
+    DPDIntegrator(double temperature, double defaultFriction, double defaultCutoff, double stepSize);
+    /**
+     * Get the temperature of the heat bath (in Kelvin).
+     *
+     * @return the temperature of the heat bath, measured in Kelvin
+     */
+    double getTemperature() const {
+        return temperature;
+    }
+    /**
+     * Set the temperature of the heat bath (in Kelvin).
+     *
+     * @param temp    the temperature of the heat bath, measured in Kelvin
+     */
+    void setTemperature(double temp);
+    /**
+     * Get the default friction coefficient for interactions that have not been specified with
+     * addTypePair (in inverse ps).
+     */
+    double getDefaultFriction() const {
+        return defaultFriction;
+    }
+    /**
+     * Set the default friction coefficient for interactions that have not been specified with
+     * addTypePair (in inverse ps).
+     */
+    void setDefaultFriction(double friction);
+    /**
+     * Get the default cutoff distance for interactions that have not been specified with
+     * addTypePair (in nm).
+     */
+    double getDefaultCutoff() const {
+        return defaultCutoff;
+    }
+    /**
+     * Set the default cutoff distance for interactions that have not been specified with
+     * addTypePair (in nm).
+     */
+    void setDefaultCutoff(double cutoff);
+    /**
+     * Get the type of a particle.  This is an arbitrary integer.  All particles initially
+     * default to type 0.
+     */
+    int getParticleType(int index) const;
+    /**
+     * Set the type of a particle.  This is an arbitrary integer.  All particles initially
+     * default to type 0.
+     */
+    void setParticleType(int index, int type);
+    /**
+     * Get a map whose keys are particle indices and whose values are particle types.  This
+     * contains only the particles that have been specifically set with setParticleType().
+     * All others have a default type of 0.
+     */
+    const std::map<int, int>& getParticleTypes() const;
+    /**
+     * Get the number of type pairs that have been added to the integrator.
+     */
+    int getNumTypePairs() const {
+        return pairs.size();
+    }
+    /**
+     * Add a type pair.  This overrides the default friction and cutoff distance for interactions
+     * between particles of two particular types.
+     * 
+     * @param type1     the first particle type
+     * @param type2     the second particle type
+     * @param friction  the friction for interactions between particles of these two types
+     * @param cutoff    the cutoff distance for interactions between particles of these two types
+     * @return the index of the type pair that was just added.
+     */
+    int addTypePair(int type1, int type2, double friction, double cutoff);
+    /**
+     * Get the parameters of a type pair.  This overrides the default friction and cutoff distance
+     * for interactions between particles of two particular types.
+     * 
+     * @param pairIndex      the index of the type pair
+     * @param[out] type1     the index of the first particle type
+     * @param[out] type2     the index of the second particle type
+     * @param[out] friction  the friction for interactions between particles of these two types
+     * @param[out] cutoff    the cutoff distance for interactions between particles of these two types
+     */
+    void getTypePairParameters(int pairIndex, int& type1, int& type2, double& friction, double& cutoff) const;
+    /**
+     * Set the parameters of a type pair.  This overrides the default friction and cutoff distance
+     * for interactions between particles of two particular types.
+     * 
+     * @param pairIndex the index of the type pair
+     * @param type1     the index of the first particle type
+     * @param type2     the index of the second particle type
+     * @param friction  the friction for interactions between particles of these two types
+     * @param cutoff    the cutoff distance for interactions between particles of these two types
+     */
+    void setTypePairParameters(int pairIndex, int type1, int type2, double friction, double cutoff);
+    /**
+     * Get the random number seed.  See setRandomNumberSeed() for details.
+     */
+    int getRandomNumberSeed() const {
+        return randomNumberSeed;
+    }
+    /**
+     * Set the random number seed.  The precise meaning of this parameter is undefined, and is left up
+     * to each Platform to interpret in an appropriate way.  It is guaranteed that if two simulations
+     * are run with different random number seeds, the sequence of random forces will be different.  On
+     * the other hand, no guarantees are made about the behavior of simulations that use the same seed.
+     * In particular, Platforms are permitted to use non-deterministic algorithms which produce different
+     * results on successive runs, even if those runs were initialized identically.
+     *
+     * If seed is set to 0 (which is the default value assigned), a unique seed is chosen when a Context
+     * is created from this Integrator. This is done to ensure that each Context receives unique random seeds
+     * without you needing to set them explicitly.
+     */
+    void setRandomNumberSeed(int seed) {
+        randomNumberSeed = seed;
+    }
+    /**
+     * Advance a simulation through time by taking a series of time steps.
+     * 
+     * @param steps   the number of time steps to take
+     */
+    void step(int steps);
+protected:
+    /**
+     * This will be called by the Context when it is created.  It informs the Integrator
+     * of what context it will be integrating, and gives it a chance to do any necessary initialization.
+     * It will also get called again if the application calls reinitialize() on the Context.
+     */
+    void initialize(ContextImpl& context);
+    /**
+     * This will be called by the Context when it is destroyed to let the Integrator do any necessary
+     * cleanup.  It will also get called again if the application calls reinitialize() on the Context.
+     */
+    void cleanup();
+    /**
+     * Get the names of all Kernels used by this Integrator.
+     */
+    std::vector<std::string> getKernelNames();
+    /**
+     * Compute the kinetic energy of the system at the current time.
+     */
+    double computeKineticEnergy();
+    /**
+     * Computing kinetic energy for this integrator does not require forces.
+     */
+    bool kineticEnergyRequiresForce() const;
+    /**
+     * Get the time interval by which velocities are offset from positions.  This is used to
+     * adjust velocities when setVelocitiesToTemperature() is called on a Context.
+     */
+    double getVelocityTimeOffset() const {
+        return getStepSize()/2;
+    }
+private:
+    class TypePairInfo;
+    double temperature, defaultFriction, defaultCutoff;
+    int randomNumberSeed;
+    std::map<int, int> particleType;
+    std::vector<TypePairInfo> pairs;
+    Kernel kernel;
+};
+/**
+ * This is an internal class used to record information about a type pair.
+ * @private
+ */
+class DPDIntegrator::TypePairInfo {
+public:
+    int type1, type2;
+    double friction, cutoff;
+    TypePairInfo() : type1(-1), type2(-1), friction(1.0), cutoff(0.0) {
+    }
+    TypePairInfo(int type1, int type2, double friction, double cutoff) :
+        type1(type1), type2(type2), friction(friction), cutoff(cutoff) {
+    }
+};
+} // namespace OpenMM
+#endif /*OPENMM_DPDINTEGRATOR_H_*/
--- a/openmmapi/include/openmm/internal/DPDIntegratorUtilities.h
+++ b/openmmapi/include/openmm/internal/DPDIntegratorUtilities.h
+#ifndef OPENMM_DPDINTEGRATORUTILITIES_H_
+#define OPENMM_DPDINTEGRATORUTILITIES_H_
+/* -------------------------------------------------------------------------- *
+ *                                   OpenMM                                   *
+ * -------------------------------------------------------------------------- *
+ * This is part of the OpenMM molecular simulation toolkit originating from   *
+ * Simbios, the NIH National Center for Physics-Based Simulation of           *
+ * Biological Structures at Stanford, funded under the NIH Roadmap for        *
+ * Medical Research, grant U54 GM072970. See https://simtk.org.               *
+ *                                                                            *
+ * Portions copyright (c) 2025 Stanford University and the Authors.           *
+ * Authors: Peter Eastman                                                     *
+ * Contributors:                                                              *
+ *                                                                            *
+ * Permission is hereby granted, free of charge, to any person obtaining a    *
+ * copy of this software and associated documentation files (the "Software"), *
+ * to deal in the Software without restriction, including without limitation  *
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,   *
+ * and/or sell copies of the Software, and to permit persons to whom the      *
+ * Software is furnished to do so, subject to the following conditions:       *
+ *                                                                            *
+ * The above copyright notice and this permission notice shall be included in *
+ * all copies or substantial portions of the Software.                        *
+ *                                                                            *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR *
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,   *
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL    *
+ * THE AUTHORS, CONTRIBUTORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM,    *
+ * DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR      *
+ * OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE  *
+ * USE OR OTHER DEALINGS IN THE SOFTWARE.                                     *
+ * -------------------------------------------------------------------------- */
+#include "openmm/DPDIntegrator.h"
+#include <vector>
+namespace OpenMM {
+/**
+ * This class defines a set of utility functions that are useful in implementing DPDIntegrator.
+ */
+class OPENMM_EXPORT DPDIntegratorUtilities {
+public:
+    /**
+     * Create tables of the parameters to use for each pair of types.
+     * 
+     * @param integrator         the integrator to analyze
+     * @param numParticles       the number of particles in the System
+     * @param[out] numTypes      the number of unique particle types
+     * @param[out] particleType  the type of each particle, represented as an index between 0 and numTypes
+     * @param[out] friction      element [i][j] is the friction to use between particles of type i and j
+     * @param[out] cutoff        element [i][j] is the cutoff to use between particles of type i and j
+     * @param[out] maxCutoff     the maximum cutoff distance between any pair of particles
+     */
+    static void createTypeTables(const DPDIntegrator& integrator, int numParticles, int& numTypes, std::vector<int>& particleType,
+            std::vector<std::vector<double> >& friction, std::vector<std::vector<double> >& cutoff, double& maxCutoff);
+};
+} // namespace OpenMM
+#endif /*OPENMM_DPDINTEGRATORUTILITIES_H_*/
--- a/openmmapi/src/DPDIntegrator.cpp
+++ b/openmmapi/src/DPDIntegrator.cpp
+/* -------------------------------------------------------------------------- *
+ *                                   OpenMM                                   *
+ * -------------------------------------------------------------------------- *
+ * This is part of the OpenMM molecular simulation toolkit originating from   *
+ * Simbios, the NIH National Center for Physics-Based Simulation of           *
+ * Biological Structures at Stanford, funded under the NIH Roadmap for        *
+ * Medical Research, grant U54 GM072970. See https://simtk.org.               *
+ *                                                                            *
+ * Portions copyright (c) 2025 Stanford University and the Authors.           *
+ * Authors: Peter Eastman                                                     *
+ * Contributors:                                                              *
+ *                                                                            *
+ * Permission is hereby granted, free of charge, to any person obtaining a    *
+ * copy of this software and associated documentation files (the "Software"), *
+ * to deal in the Software without restriction, including without limitation  *
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,   *
+ * and/or sell copies of the Software, and to permit persons to whom the      *
+ * Software is furnished to do so, subject to the following conditions:       *
+ *                                                                            *
+ * The above copyright notice and this permission notice shall be included in *
+ * all copies or substantial portions of the Software.                        *
+ *                                                                            *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR *
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,   *
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL    *
+ * THE AUTHORS, CONTRIBUTORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM,    *
+ * DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR      *
+ * OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE  *
+ * USE OR OTHER DEALINGS IN THE SOFTWARE.                                     *
+ * -------------------------------------------------------------------------- */
+#include "openmm/DPDIntegrator.h"
+#include "openmm/Context.h"
+#include "openmm/OpenMMException.h"
+#include "openmm/internal/ContextImpl.h"
+#include "openmm/kernels.h"
+#include <string>
+using namespace OpenMM;
+using namespace std;
+DPDIntegrator::DPDIntegrator(double temperature, double defaultFriction, double defaultCutoff, double stepSize) {
+    setTemperature(temperature);
+    setDefaultFriction(defaultFriction);
+    setDefaultCutoff(defaultCutoff);
+    setStepSize(stepSize);
+    setConstraintTolerance(1e-5);
+    setRandomNumberSeed(0);
+}
+void DPDIntegrator::initialize(ContextImpl& contextRef) {
+    if (owner != NULL && &contextRef.getOwner() != owner)
+        throw OpenMMException("This Integrator is already bound to a context");
+    context = &contextRef;
+    owner = &contextRef.getOwner();
+    kernel = context->getPlatform().createKernel(IntegrateDPDStepKernel::Name(), contextRef);
+    kernel.getAs<IntegrateDPDStepKernel>().initialize(contextRef.getSystem(), *this);
+}
+void DPDIntegrator::setTemperature(double temp) {
+    if (temp < 0)
+        throw OpenMMException("Temperature cannot be negative");
+    temperature = temp;
+}
+void DPDIntegrator::setDefaultFriction(double friction) {
+    if (friction < 0)
+        throw OpenMMException("Friction cannot be negative");
+    defaultFriction = friction;
+}
+void DPDIntegrator::setDefaultCutoff(double cutoff) {
+    if (cutoff <= 0)
+        throw OpenMMException("Cutoff must be positive");
+    defaultCutoff = cutoff;
+}
+int DPDIntegrator::getParticleType(int index) const {
+    if (particleType.find(index) == particleType.end())
+        return 0;
+    return particleType.at(index);
+}
+void DPDIntegrator::setParticleType(int index, int type) {
+    particleType[index] = type;
+}
+const map<int, int>& DPDIntegrator::getParticleTypes() const {
+    return particleType;
+}
+int DPDIntegrator::addTypePair(int type1, int type2, double friction, double cutoff) {
+    pairs.push_back(TypePairInfo(type1, type2, friction, cutoff));
+    return pairs.size()-1;
+}
+void DPDIntegrator::getTypePairParameters(int pairIndex, int& type1, int& type2, double& friction, double& cutoff) const {
+    ASSERT_VALID_INDEX(pairIndex, pairs);
+    type1 = pairs[pairIndex].type1;
+    type2 = pairs[pairIndex].type2;
+    friction = pairs[pairIndex].friction;
+    cutoff = pairs[pairIndex].cutoff;
+}
+void DPDIntegrator::setTypePairParameters(int pairIndex, int type1, int type2, double friction, double cutoff) {
+    ASSERT_VALID_INDEX(pairIndex, pairs);
+    pairs[pairIndex].type1 = type1;
+    pairs[pairIndex].type2 = type2;
+    pairs[pairIndex].friction = friction;
+    pairs[pairIndex].cutoff = cutoff;
+}
+void DPDIntegrator::cleanup() {
+    kernel = Kernel();
+}
+vector<string> DPDIntegrator::getKernelNames() {
+    std::vector<std::string> names;
+    names.push_back(IntegrateDPDStepKernel::Name());
+    return names;
+}
+double DPDIntegrator::computeKineticEnergy() {
+    return kernel.getAs<IntegrateDPDStepKernel>().computeKineticEnergy(*context, *this);
+}
+bool DPDIntegrator::kineticEnergyRequiresForce() const {
+    return false;
+}
+void DPDIntegrator::step(int steps) {
+    if (context == NULL)
+        throw OpenMMException("This Integrator is not bound to a context!");
+    for (int i = 0; i < steps; ++i) {
+        context->updateContextState();
+        context->calcForcesAndEnergy(true, false, getIntegrationForceGroups());
+        kernel.getAs<IntegrateDPDStepKernel>().execute(*context, *this);
+    }
+}
--- a/openmmapi/src/DPDIntegratorUtilities.cpp
+++ b/openmmapi/src/DPDIntegratorUtilities.cpp
+/* -------------------------------------------------------------------------- *
+ *                                   OpenMM                                   *
+ * -------------------------------------------------------------------------- *
+ * This is part of the OpenMM molecular simulation toolkit originating from   *
+ * Simbios, the NIH National Center for Physics-Based Simulation of           *
+ * Biological Structures at Stanford, funded under the NIH Roadmap for        *
+ * Medical Research, grant U54 GM072970. See https://simtk.org.               *
+ *                                                                            *
+ * Portions copyright (c) 2025 Stanford University and the Authors.           *
+ * Authors: Peter Eastman                                                     *
+ * Contributors:                                                              *
+ *                                                                            *
+ * Permission is hereby granted, free of charge, to any person obtaining a    *
+ * copy of this software and associated documentation files (the "Software"), *
+ * to deal in the Software without restriction, including without limitation  *
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,   *
+ * and/or sell copies of the Software, and to permit persons to whom the      *
+ * Software is furnished to do so, subject to the following conditions:       *
+ *                                                                            *
+ * The above copyright notice and this permission notice shall be included in *
+ * all copies or substantial portions of the Software.                        *
+ *                                                                            *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR *
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,   *
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL    *
+ * THE AUTHORS, CONTRIBUTORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM,    *
+ * DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR      *
+ * OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE  *
+ * USE OR OTHER DEALINGS IN THE SOFTWARE.                                     *
+ * -------------------------------------------------------------------------- */
+#include "openmm/internal/DPDIntegratorUtilities.h"
+#include "openmm/OpenMMException.h"
+#include <map>
+#include <cstdio>
+using namespace OpenMM;
+using namespace std;
+void DPDIntegratorUtilities::createTypeTables(const DPDIntegrator& integrator, int numParticles, int& numTypes, vector<int>& particleType,
+        vector<vector<double> >& friction, vector<vector<double> >& cutoff, double& maxCutoff) {
+    // Identify all unique types and record the types of particles.
+    map<int, int> typeMap;
+    particleType.resize(numParticles);
+    for (int i = 0; i < numParticles; i++) {
+        int type = integrator.getParticleType(i);
+        auto mapping = typeMap.find(type);
+        if (mapping == typeMap.end()) {
+            particleType[i] = typeMap.size();
+            typeMap[type] = particleType[i];
+        }
+        else
+            particleType[i] = mapping->second;
+    }
+    numTypes = typeMap.size();
+    // Build tables of friction and cutoff.
+    friction.clear();
+    cutoff.clear();
+    friction.resize(numTypes, vector<double>(numTypes, integrator.getDefaultFriction()));
+    cutoff.resize(numTypes, vector<double>(numTypes, integrator.getDefaultCutoff()));
+    map<int, int> inverseMap;
+    for (auto mapping : typeMap)
+        inverseMap[mapping.second] = mapping.first;
+    maxCutoff = integrator.getDefaultCutoff();
+    for (int i = 0; i < integrator.getNumTypePairs(); i++) {
+        int t1, t2;
+        double f, c;
+        integrator.getTypePairParameters(i, t1, t2, f, c);
+        if (f < 0.0)
+            throw OpenMMException("DPDIntegrator: friction cannot be negative");
+        if (c <= 0.0)
+            throw OpenMMException("DPDIntegrator: cutoff must be positive");
+        int type1 = inverseMap[t1];
+        int type2 = inverseMap[t2];
+        friction[type1][type2] = f;
+        friction[type2][type1] = f;
+        cutoff[type1][type2] = c;
+        cutoff[type2][type1] = c;
+        if (c > maxCutoff)
+            maxCutoff = c;
+    }
+}
--- a/platforms/common/include/openmm/common/CommonKernels.h
+++ b/platforms/common/include/openmm/common/CommonKernels.h
@@ -1057,6 +1057,45 @@ private:
    double prevTemp, prevFriction, prevErrorTol;
 };
+/**
+ * This kernel is invoked by DPDIntegrator to take one time step.
+ */
+class CommonIntegrateDPDStepKernel : public IntegrateDPDStepKernel {
+public:
+    CommonIntegrateDPDStepKernel(std::string name, const Platform& platform, ComputeContext& cc) : IntegrateDPDStepKernel(name, platform), cc(cc),
+            hasInitializedKernels(false) {
+    }
+    /**
+     * Initialize the kernel.
+     * 
+     * @param system     the System this kernel will be applied to
+     * @param integrator the DPDIntegrator this kernel will be used for
+     */
+    void initialize(const System& system, const DPDIntegrator& integrator);
+    /**
+     * Execute the kernel.
+     * 
+     * @param context    the context in which to execute this kernel
+     * @param integrator the DPDIntegrator this kernel is being used for
+     */
+    void execute(ContextImpl& context, const DPDIntegrator& integrator);
+    /**
+     * Compute the kinetic energy.
+     * 
+     * @param context    the context in which to execute this kernel
+     * @param integrator the DPDIntegrator this kernel is being used for
+     */
+    double computeKineticEnergy(ContextImpl& context, const DPDIntegrator& integrator);
+private:
+    class ReorderListener;
+    ComputeContext& cc;
+    bool hasInitializedKernels;
+    int numTypes, randomSeed, blockSize;
+    double maxCutoff;
+    ComputeArray particleType, pairParams, oldDelta, velDelta, tileCounter;
+    ComputeKernel kernel1, kernel2, kernel3, kernel4;
+};
 /**
 * This kernel is invoked to remove center of mass motion from the system.
 */

--- a/platforms/common/src/CommonKernels.cpp
+++ b/platforms/common/src/CommonKernels.cpp
@@ -34,6 +34,8 @@
 #include "openmm/internal/ContextImpl.h"
 #include "openmm/internal/CustomCentroidBondForceImpl.h"
 #include "openmm/internal/CustomCompoundBondForceImpl.h"
+#include "openmm/internal/DPDIntegratorUtilities.h"
+#include "openmm/internal/OSRngSeed.h"
 #include "openmm/internal/ThreadPool.h"
 #include "openmm/internal/timer.h"
 #include "CommonKernelSources.h"
@@ -3293,6 +3295,182 @@ double CommonIntegrateVariableLangevinStepKernel::computeKineticEnergy(ContextIm
    return cc.getIntegrationUtilities().computeKineticEnergy(0.5*integrator.getStepSize());
 }
+class CommonIntegrateDPDStepKernel::ReorderListener : public ComputeContext::ReorderListener {
+public:
+    ReorderListener(ComputeContext& cc, vector<int>& particleTypes, ComputeArray& particleTypeArray) :
+            cc(cc), particleTypes(particleTypes), particleTypeArray(particleTypeArray) {
+    }
+    void execute() {
+        // Reorder particleTypes to reflect the new atom order.
+        vector<int> sortedTypes(particleTypes.size());
+        const vector<int>& order = cc.getAtomIndex();
+        for (int i = 0; i < particleTypes.size(); i++)
+            sortedTypes[i] = particleTypes[order[i]];
+        particleTypeArray.upload(sortedTypes);
+    }
+private:
+    ComputeContext& cc;
+    ComputeArray& particleTypeArray;
+    vector<int> particleTypes;
+};
+void CommonIntegrateDPDStepKernel::initialize(const System& system, const DPDIntegrator& integrator) {
+    // Record information about the integrator.
+    vector<int> particleTypeVec;
+    vector<vector<double> > frictionTable, cutoffTable;
+    DPDIntegratorUtilities::createTypeTables(integrator, system.getNumParticles(), numTypes, particleTypeVec, frictionTable, cutoffTable, maxCutoff);
+    while (particleTypeVec.size() < cc.getPaddedNumAtoms())
+        particleTypeVec.push_back(0);
+    // We want the NonbondedUtilities to build a neighbor list.  Add an empty interaction
+    // with a nonexistant force group to it.
+    cc.getNonbondedUtilities().addInteraction(true, system.usesPeriodicBoundaryConditions(), false, maxCutoff, vector<vector<int> >(), "", 32);
+    // Create the arrays.
+    cc.initializeContexts();
+    ContextSelector selector(cc);
+    if (cc.getUseDoublePrecision() || cc.getUseMixedPrecision())
+        oldDelta.initialize<mm_double4>(cc, cc.getPaddedNumAtoms(), "oldDelta");
+    else
+        oldDelta.initialize<mm_float4>(cc, cc.getPaddedNumAtoms(), "oldDelta");
+    particleType.initialize<int>(cc, cc.getPaddedNumAtoms(), "dpdParticleType");
+    pairParams.initialize<mm_float2>(cc, numTypes*numTypes, "dpdPairParams");
+    velDelta.initialize<long long>(cc, 3*cc.getPaddedNumAtoms(), "velDelta");
+    tileCounter.initialize<int>(cc, 1, "tileCounter");
+    particleType.upload(particleTypeVec);
+    vector<mm_float2> pairParamsVec(numTypes*numTypes);
+    for (int i = 0; i < numTypes; i++)
+        for (int j = 0; j < numTypes; j++)
+            pairParamsVec[i+j*numTypes] = mm_float2(frictionTable[i][j], cutoffTable[i][j]);
+    pairParams.upload(pairParamsVec);
+    cc.addAutoclearBuffer(velDelta);
+    cc.addReorderListener(new ReorderListener(cc, particleTypeVec, particleType));
+    randomSeed = integrator.getRandomNumberSeed();
+    if (randomSeed == 0)
+        randomSeed = osrngseed(); // A seed of 0 means use a unique one
+}
+void CommonIntegrateDPDStepKernel::execute(ContextImpl& context, const DPDIntegrator& integrator) {
+    ContextSelector selector(cc);
+    IntegrationUtilities& integration = cc.getIntegrationUtilities();
+    NonbondedUtilities& nb = cc.getNonbondedUtilities();
+    int numAtoms = cc.getNumAtoms();
+    int paddedNumAtoms = cc.getPaddedNumAtoms();
+    if (!hasInitializedKernels) {
+        map<string, string> defines;
+        defines["M_PI"] = cc.doubleToString(M_PI);
+        defines["MAX_CUTOFF"] = cc.doubleToString(maxCutoff);
+        defines["TILE_SIZE"] = cc.intToString(ComputeContext::TileSize);
+        blockSize = max(32, cc.getNonbondedUtilities().getForceThreadBlockSize());
+        defines["WORK_GROUP_SIZE"] = cc.intToString(blockSize);
+        if (context.getSystem().usesPeriodicBoundaryConditions())
+            defines["USE_PERIODIC"] = "1";
+        if (cc.getIsCPU())
+            defines["INTEL_WORKAROUND"] = "1";
+        try {
+            // Workaround for errors in NVIDIA OpenCL.
+            string platformName = context.getPlatform().getPropertyValue(context.getOwner(), "OpenCLPlatformName");
+            if (platformName.rfind("NVIDIA", 0) == 0)
+                defines["NVIDIA_WORKAROUND"] = "1";
+        }
+        catch (...) {
+            // This isn't the OpenCL platform.
+        }
+        ComputeProgram program = cc.compileProgram(CommonKernelSources::dpd, defines);
+        kernel1 = program->createKernel("integrateDPDPart1");
+        kernel2 = program->createKernel("integrateDPDPart2");
+        kernel3 = program->createKernel("integrateDPDPart3");
+        kernel4 = program->createKernel("integrateDPDPart4");
+        hasInitializedKernels = true;
+        kernel1->addArg(numAtoms);
+        kernel1->addArg(paddedNumAtoms);
+        kernel1->addArg(cc.getVelm());
+        kernel1->addArg(cc.getLongForceBuffer());
+        kernel1->addArg(integration.getStepSize());
+        kernel1->addArg(tileCounter);
+        kernel2->addArg(numAtoms);
+        kernel2->addArg(paddedNumAtoms);
+        kernel2->addArg(cc.getPosq());
+        kernel2->addArg(cc.getVelm());
+        kernel2->addArg(velDelta);
+        kernel2->addArg(integration.getStepSize());
+        kernel2->addArg(particleType);
+        kernel2->addArg(numTypes);
+        kernel2->addArg(pairParams);
+        for (int i = 0; i < 7; i++)
+            kernel2->addArg(); // Random seed, kT, and box vectors will be set just before it is executed.
+        kernel2->addArg(nb.getExclusionTiles());
+        kernel2->addArg((int) nb.getExclusionTiles().getSize());
+        kernel2->addArg(nb.getInteractingTiles());
+        kernel2->addArg(nb.getInteractionCount());
+        kernel2->addArg(nb.getBlockCenters());
+        kernel2->addArg(nb.getBlockBoundingBoxes());
+        kernel2->addArg(nb.getInteractingAtoms());
+        kernel2->addArg(tileCounter);
+        if (cc.getUseMixedPrecision())
+            kernel2->addArg(cc.getPosqCorrection());
+        kernel3->addArg(numAtoms);
+        kernel3->addArg(paddedNumAtoms);
+        kernel3->addArg(cc.getVelm());
+        kernel3->addArg(velDelta);
+        kernel3->addArg(integration.getPosDelta());
+        kernel3->addArg(oldDelta);
+        kernel3->addArg(integration.getStepSize());
+        kernel4->addArg(numAtoms);
+        kernel4->addArg(cc.getPosq());
+        kernel4->addArg(cc.getVelm());
+        kernel4->addArg(integration.getPosDelta());
+        kernel4->addArg(oldDelta);
+        kernel4->addArg(integration.getStepSize());
+        if (cc.getUseMixedPrecision())
+            kernel4->addArg(cc.getPosqCorrection());
+    }
+    // Perform the integration.
+    vector<int> p;
+    particleType.download(p);
+    double stepSize = integrator.getStepSize();
+    cc.getIntegrationUtilities().setNextStepSize(stepSize);
+    kernel1->execute(numAtoms);
+    particleType.download(p);
+    integration.applyVelocityConstraints(integrator.getConstraintTolerance());
+    particleType.download(p);
+    kernel2->setArg(9, randomSeed+cc.getStepCount());
+    kernel2->setArg(10, (float) (BOLTZ*integrator.getTemperature()));
+    setPeriodicBoxArgs(cc, kernel2, 11);
+    particleType.download(p);
+    kernel2->execute(2*nb.getNumForceThreadBlocks()*blockSize, blockSize);
+    particleType.download(p);
+    kernel3->execute(numAtoms);
+    particleType.download(p);
+    integration.applyConstraints(integrator.getConstraintTolerance());
+    particleType.download(p);
+    kernel4->execute(numAtoms);
+    particleType.download(p);
+    integration.computeVirtualSites();
+    // Update the time and step count.
+    cc.setTime(cc.getTime()+stepSize);
+    cc.setStepCount(cc.getStepCount()+1);
+    cc.reorderAtoms();
+    // Reduce UI lag.
+    flushPeriodically(cc);
+    particleType.download(p);
+}
+double CommonIntegrateDPDStepKernel::computeKineticEnergy(ContextImpl& context, const DPDIntegrator& integrator) {
+    return cc.getIntegrationUtilities().computeKineticEnergy(0.0);
+}
 void CommonRemoveCMMotionKernel::initialize(const System& system, const CMMotionRemover& force) {
    ContextSelector selector(cc);
    frequency = force.getFrequency();

--- a/platforms/common/src/kernels/dpd.cc
+++ b/platforms/common/src/kernels/dpd.cc
+typedef struct {
+    mm_ulong state, increment;
+    float next;
+    bool nextIsValid;
+} RandomState;
+/**
+ * Generate a random number with PCH-XSH-RR.
+ */
+DEVICE unsigned int getRandomInt(RandomState* random) {
+    unsigned int xs = ((random->state>>18)^random->state)>>27;
+    unsigned int rot = random->state>>59;
+    random->state = random->state*6364136223846793005ULL + random->increment;
+    return (xs>>rot) | (xs<<((-rot)&31));
+}
+/**
+ * Generate a normally distributed random number with Box-Muller.
+ */
+DEVICE float getRandomNormal(RandomState* random) {
+    if (random->nextIsValid) {
+        random->nextIsValid = false;
+        return random->next;
+    }
+    float scale = 1/(float) 0x100000000;
+    float x = scale*max(getRandomInt(random), 1u);
+    float y = scale*getRandomInt(random);
+    float multiplier = SQRT(-2*LOG(x));
+    float angle = 2*M_PI*y;
+    random->next = multiplier*COS(angle);
+    random->nextIsValid = true;
+    return multiplier*SIN(angle);
+}
+/**
+ * Perform the first part of integration: velocity step.
+ */
+KERNEL void integrateDPDPart1(int numAtoms, int paddedNumAtoms, GLOBAL mixed4* RESTRICT velm, GLOBAL const mm_long* RESTRICT force,
+        GLOBAL const mixed2* RESTRICT dt, GLOBAL int* RESTRICT tileCounter) {
+    mixed fscale = dt[0].y/(mixed) 0x100000000;
+    for (int index = GLOBAL_ID; index < numAtoms; index += GLOBAL_SIZE) {
+        mixed4 velocity = velm[index];
+        if (velocity.w != 0.0) {
+            velocity.x += fscale*velocity.w*force[index];
+            velocity.y += fscale*velocity.w*force[index+paddedNumAtoms];
+            velocity.z += fscale*velocity.w*force[index+paddedNumAtoms*2];
+            velm[index] = velocity;
+        }
+    }
+    if (GLOBAL_ID == 0)
+        *tileCounter = 0;
+}
+/**
+ * Load the position of a particle.
+ */
+inline DEVICE mixed3 loadPos(GLOBAL const real4* RESTRICT posq, GLOBAL const real4* RESTRICT posqCorrection, int index) {
+#ifdef USE_MIXED_PRECISION
+    real4 pos1 = posq[index];
+    real4 pos2 = posqCorrection[index];
+    return make_mixed3(pos1.x+(mixed)pos2.x, pos1.y+(mixed)pos2.y, pos1.z+(mixed)pos2.z);
+#else
+    return trimTo3(posq[index]);
+#endif
+}
+/**
+ * Compute the friction and noise for a pair of particles.
+ */
+DEVICE void processPair(int i, int j, real3 delta, mixed4 vel1, mixed4 vel2, int type1, int type2, int paddedNumAtoms,
+        GLOBAL mm_ulong* RESTRICT velDelta, mixed dt, float kT, int numTypes, GLOBAL const float2* RESTRICT params, RandomState* random,
+        real4 periodicBoxSize, real4 invPeriodicBoxSize, real4 periodicBoxVecX, real4 periodicBoxVecY, real4 periodicBoxVecZ) {
+    if (vel1.w == 0 && vel2.w == 0)
+        return;
+    float2 pairParams = params[type1+type2*numTypes];
+    float friction = pairParams.x;
+    float cutoff = pairParams.y;
+    real r2 = delta.x*delta.x + delta.y*delta.y + delta.z*delta.z;
+    real invR = RSQRT(r2);
+    real r = r2*invR;
+    if (r >= cutoff)
+        return;
+    real mass1 = (vel1.w == 0 ? 0 : 1/vel1.w);
+    real mass2 = (vel2.w == 0 ? 0 : 1/vel2.w);
+    real m = mass1*mass2/(mass1+mass2);
+    real omega = 1-(r/cutoff);
+    real vscale = EXP(-dt*2*friction*omega*omega);
+    real noisescale = SQRT(1-vscale*vscale);
+    real3 dir = delta*invR;
+    real3 v = make_real3(vel2.x-vel1.x, vel2.y-vel1.y, vel2.z-vel1.z);
+    real dv = (1-vscale)*dot(v, dir) + noisescale*SQRT(kT/m)*getRandomNormal(random);
+    ATOMIC_ADD(&velDelta[i], (mm_ulong) realToFixedPoint(m*vel1.w*dv*dir.x));
+    ATOMIC_ADD(&velDelta[i+paddedNumAtoms], (mm_ulong) realToFixedPoint(m*vel1.w*dv*dir.y));
+    ATOMIC_ADD(&velDelta[i+2*paddedNumAtoms], (mm_ulong) realToFixedPoint(m*vel1.w*dv*dir.z));
+    ATOMIC_ADD(&velDelta[j], (mm_ulong) realToFixedPoint(-m*vel2.w*dv*dir.x));
+    ATOMIC_ADD(&velDelta[j+paddedNumAtoms], (mm_ulong) realToFixedPoint(-m*vel2.w*dv*dir.y));
+    ATOMIC_ADD(&velDelta[j+2*paddedNumAtoms], (mm_ulong) realToFixedPoint(-m*vel2.w*dv*dir.z));
+}
+/**
+ * Perform the second part of integration: position half step, then interact with heat bath.
+ */
+KERNEL void integrateDPDPart2(int numAtoms, int paddedNumAtoms, GLOBAL const real4* RESTRICT posq, GLOBAL const mixed4* RESTRICT velm, GLOBAL mm_long* RESTRICT velDelta,
+        GLOBAL const mixed2* RESTRICT dt, GLOBAL const int* particleType, int numTypes, GLOBAL const float2* RESTRICT params, mm_long seed,
+        float kT, real4 periodicBoxSize, real4 invPeriodicBoxSize, real4 periodicBoxVecX, real4 periodicBoxVecY, real4 periodicBoxVecZ,
+        GLOBAL const int2* RESTRICT exclusionTiles, int numExclusionTiles, GLOBAL const int* RESTRICT tiles, GLOBAL const unsigned int* RESTRICT interactionCount,
+        GLOBAL const real4* RESTRICT blockCenter, GLOBAL const real4* RESTRICT blockSize, GLOBAL const int* RESTRICT interactingAtoms, GLOBAL int* RESTRICT tileCounter
+#ifdef USE_MIXED_PRECISION
+        , GLOBAL const real4* RESTRICT posqCorrection
+#endif
+        ) {
+    const int totalWarps = GLOBAL_SIZE/TILE_SIZE;
+    const int warp = GLOBAL_ID/TILE_SIZE;
+    const int tgx = LOCAL_ID & (TILE_SIZE-1);
+    const int tbx = LOCAL_ID - tgx;
+    mixed halfdt = 0.5f*dt[0].y;
+    LOCAL mixed3 localPos[WORK_GROUP_SIZE];
+    LOCAL mixed4 localVel[WORK_GROUP_SIZE];
+    LOCAL volatile int localType[WORK_GROUP_SIZE];
+#ifndef USE_MIXED_PRECISION
+    GLOBAL real4* posqCorrection = 0;
+#endif
+    // Initialize the random generator for this thread.  The seed is incremented each step,
+    // and the stream ID is the global thread index.  Skipping a variable number of values
+    // also seems to be necessary to decorrelate the streams.
+    RandomState random;
+    random.state = 0;
+    random.increment = (GLOBAL_ID<<1) | 1;
+    random.nextIsValid = false;
+    getRandomInt(&random);
+    random.state += seed;
+    getRandomInt(&random);
+    for (int i = 0; i < LOCAL_ID%16; i++)
+        getRandomInt(&random);
+    // First loop: process fixed tiles (ones that contain exclusions).
+    for (int tile = warp; tile < numExclusionTiles; tile += totalWarps) {
+        const int2 tileIndices = exclusionTiles[tile];
+        const unsigned int x = tileIndices.x;
+        const unsigned int y = tileIndices.y;
+        int atom1 = x*TILE_SIZE + tgx;
+        mixed4 vel1 = velm[atom1];
+        mixed3 pos1 = loadPos(posq, posqCorrection, atom1);
+        if (vel1.w != 0)
+            pos1 += halfdt*trimTo3(vel1);
+        int type1 = particleType[atom1];
+        if (x == y) {
+            localVel[LOCAL_ID] = vel1;
+            localPos[LOCAL_ID] = pos1;
+            localType[LOCAL_ID] = type1;
+        }
+        else {
+            int atom2 = y*TILE_SIZE + tgx;
+            mixed4 vel2 = velm[atom2];
+            mixed3 pos2 = loadPos(posq, posqCorrection, atom2);
+            if (vel2.w != 0)
+                pos2 += halfdt*trimTo3(vel2);
+            localVel[LOCAL_ID] = vel2;
+            localPos[LOCAL_ID] = pos2;
+            localType[LOCAL_ID] = particleType[atom2];
+        }
+        SYNC_WARPS;
+        if (atom1 < numAtoms) {
+            for (int i = 0; i < TILE_SIZE; i++) {
+#ifdef INTEL_WORKAROUND
+                // Workaround for bug in Intel's OpenCL for CPUs.
+                MEM_FENCE;
+#endif
+                int atom2 = y*TILE_SIZE+i;
+                if ((x != y || atom1 < atom2) && atom2 < numAtoms) {
+                    mixed3 pos2 = localPos[tbx+i];
+                    real3 delta = make_real3(pos2.x-pos1.x, pos2.y-pos1.y, pos2.z-pos1.z);
+#ifdef USE_PERIODIC
+                    APPLY_PERIODIC_TO_DELTA(delta)
+#endif
+                    processPair(atom1, atom2, delta, vel1, localVel[tbx+i], type1, localType[tbx+i],
+                                paddedNumAtoms, (GLOBAL mm_ulong*) velDelta, dt[0].y, kT, numTypes, params, &random,
+                                periodicBoxSize, invPeriodicBoxSize, periodicBoxVecX, periodicBoxVecY, periodicBoxVecZ);
+                }
+            }
+        }
+#ifdef NVIDIA_WORKAROUND
+        SYNC_THREADS;
+#else
+        SYNC_WARPS;
+#endif
+    }
+    // Second loop: process tiles from the neighbor list.
+    unsigned int numTiles = interactionCount[0];
+    LOCAL int atomIndices[WORK_GROUP_SIZE];
+    LOCAL int nextTile[WORK_GROUP_SIZE/TILE_SIZE];
+    for (int tile = warp; tile < numTiles; tile += totalWarps) {
+        if (tgx == 0)
+            nextTile[tbx/TILE_SIZE] = ATOMIC_ADD(tileCounter, 1);
+        SYNC_WARPS;
+        int tileIndex = nextTile[tbx/TILE_SIZE];
+        int x = tiles[tileIndex];
+        real4 blockSizeX = blockSize[x];
+        bool singlePeriodicCopy = (0.5f*periodicBoxSize.x-blockSizeX.x >= MAX_CUTOFF &&
+                                   0.5f*periodicBoxSize.y-blockSizeX.y >= MAX_CUTOFF &&
+                                   0.5f*periodicBoxSize.z-blockSizeX.z >= MAX_CUTOFF);
+        int atom1 = x*TILE_SIZE + tgx;
+        mixed4 vel1 = velm[atom1];
+        mixed3 pos1 = loadPos(posq, posqCorrection, atom1);
+        if (vel1.w != 0)
+            pos1 += halfdt*trimTo3(vel1);
+        int type1 = particleType[atom1];
+        int atom2 = interactingAtoms[tileIndex*TILE_SIZE+tgx];
+        atomIndices[LOCAL_ID] = atom2;
+        mixed4 vel2 = velm[atom2];
+        mixed3 pos2 = loadPos(posq, posqCorrection, atom2);
+        if (vel2.w != 0)
+            pos2 += halfdt*trimTo3(vel2);
+        localVel[LOCAL_ID] = vel2;
+        localPos[LOCAL_ID] = pos2;
+        localType[LOCAL_ID] = particleType[atom2];
+#ifdef USE_PERIODIC
+        if (singlePeriodicCopy) {
+            // The box is small enough that we can just translate all the atoms into a single periodic
+            // box, then skip having to apply periodic boundary conditions later.
+            real4 blockCenterX = blockCenter[x];
+            APPLY_PERIODIC_TO_POS_WITH_CENTER(pos1, blockCenterX)
+            APPLY_PERIODIC_TO_POS_WITH_CENTER(localPos[LOCAL_ID], blockCenterX)
+            SYNC_WARPS;
+            if (atom1 < numAtoms) {
+                for (int i = 0; i < TILE_SIZE; i++) {
+#ifdef INTEL_WORKAROUND
+                    // Workaround for bug in Intel's OpenCL for CPUs.
+                    MEM_FENCE;
+#endif
+                    int atom2 = atomIndices[tbx+i];
+                    if (atom2 < numAtoms) {
+                        pos2 = localPos[tbx+i];
+                        real3 delta = make_real3(pos2.x-pos1.x, pos2.y-pos1.y, pos2.z-pos1.z);
+                        processPair(atom1, atom2, delta, vel1, localVel[tbx+i], type1, localType[tbx+i],
+                                    paddedNumAtoms, (GLOBAL mm_ulong*) velDelta, dt[0].y, kT, numTypes, params, &random,
+                                    periodicBoxSize, invPeriodicBoxSize, periodicBoxVecX, periodicBoxVecY, periodicBoxVecZ);
+                    }
+                }
+            }
+        }
+        else
+#endif
+        {
+            // We need to apply periodic boundary conditions separately for each interaction.
+            SYNC_WARPS;
+            if (atom1 < numAtoms) {
+                for (int i = 0; i < TILE_SIZE; i++) {
+#ifdef INTEL_WORKAROUND
+                    // Workaround for bug in Intel's OpenCL for CPUs.
+                    MEM_FENCE;
+#endif
+                    int atom2 = atomIndices[tbx+i];
+                    if (atom2 < numAtoms) {
+                        pos2 = localPos[tbx+i];
+                        real3 delta = make_real3(pos2.x-pos1.x, pos2.y-pos1.y, pos2.z-pos1.z);
+#ifdef USE_PERIODIC
+                        APPLY_PERIODIC_TO_DELTA(delta)
+#endif
+                        processPair(atom1, atom2, delta, vel1, localVel[tbx+i], type1, localType[tbx+i],
+                                    paddedNumAtoms, (GLOBAL mm_ulong*) velDelta, dt[0].y, kT, numTypes, params, &random,
+                                    periodicBoxSize, invPeriodicBoxSize, periodicBoxVecX, periodicBoxVecY, periodicBoxVecZ);
+                    }
+                }
+            }
+        }
+        SYNC_WARPS;
+    }
+}
+/**
+ * Perform the third part of integration: update the velocities and the second position half step.
+ */
+KERNEL void integrateDPDPart3(int numAtoms, int paddedNumAtoms, GLOBAL mixed4* RESTRICT velm, GLOBAL mm_long* RESTRICT velDelta, GLOBAL mixed4* RESTRICT posDelta,
+        GLOBAL mixed4* RESTRICT oldDelta, GLOBAL const mixed2* RESTRICT dt) {
+    mixed halfdt = 0.5f*dt[0].y;
+    mixed deltascale = 1/(mixed) 0x100000000;
+    for (int index = GLOBAL_ID; index < numAtoms; index += GLOBAL_SIZE) {
+        mixed4 velocity = velm[index];
+        if (velocity.w != 0.0) {
+            mixed4 delta = make_mixed4(halfdt*velocity.x, halfdt*velocity.y, halfdt*velocity.z, 0);
+            velocity.x += deltascale*velDelta[index];
+            velocity.y += deltascale*velDelta[index+paddedNumAtoms];
+            velocity.z += deltascale*velDelta[index+paddedNumAtoms*2];
+            velm[index] = velocity;
+            delta += make_mixed4(halfdt*velocity.x, halfdt*velocity.y, halfdt*velocity.z, 0);
+            posDelta[index] = delta;
+            oldDelta[index] = delta;
+        }
+    }
+}
+/**
+ * Perform the fourth part of integration: apply constraint forces to velocities, then record
+ * the constrained positions.
+ */
+KERNEL void integrateDPDPart4(int numAtoms, GLOBAL real4* RESTRICT posq, GLOBAL mixed4* RESTRICT velm,
+         GLOBAL mixed4* RESTRICT posDelta, GLOBAL mixed4* RESTRICT oldDelta, GLOBAL const mixed2* RESTRICT dt
+#ifdef USE_MIXED_PRECISION
+        , GLOBAL real4* RESTRICT posqCorrection
+#endif
+        ) {
+    mixed invDt = 1/dt[0].y;
+    for (int index = GLOBAL_ID; index < numAtoms; index += GLOBAL_SIZE) {
+        mixed4 velocity = velm[index];
+        if (velocity.w != 0.0) {
+            mixed4 delta = posDelta[index];
+            velocity.x += (delta.x-oldDelta[index].x)*invDt;
+            velocity.y += (delta.y-oldDelta[index].y)*invDt;
+            velocity.z += (delta.z-oldDelta[index].z)*invDt;
+            velm[index] = velocity;
+#ifdef USE_MIXED_PRECISION
+            real4 pos1 = posq[index];
+            real4 pos2 = posqCorrection[index];
+            mixed4 pos = make_mixed4(pos1.x+(mixed)pos2.x, pos1.y+(mixed)pos2.y, pos1.z+(mixed)pos2.z, pos1.w);
+#else
+            real4 pos = posq[index];
+#endif
+            pos.x += delta.x;
+            pos.y += delta.y;
+            pos.z += delta.z;
+#ifdef USE_MIXED_PRECISION
+            posq[index] = make_real4((real) pos.x, (real) pos.y, (real) pos.z, (real) pos.w);
+            posqCorrection[index] = make_real4(pos.x-(real) pos.x, pos.y-(real) pos.y, pos.z-(real) pos.z, 0);
+#else
+            posq[index] = pos;
+#endif
+        }
+    }
+}
--- a/platforms/cpu/src/CpuRandom.cpp
+++ b/platforms/cpu/src/CpuRandom.cpp
-/* Portions copyright (c) 2013-2024 Stanford University and Simbios.
+/* Portions copyright (c) 2013-2025 Stanford University and Simbios.
 * Authors: Peter Eastman
 * Contributors: 
 *
@@ -49,7 +49,7 @@ void CpuRandom::initialize(int seed, int numThreads) {
    hasInitialized = true;
    threadRandom.resize(numThreads);
    nextGaussian.resize(numThreads);
-    nextGaussianIsValid.resize(numThreads, false);
+    nextGaussianIsValid.resize(numThreads);
    /* Use a quick and dirty RNG to pick seeds for the real random number generator.
     * A random seed of 0 means pick a unique seed
@@ -62,6 +62,7 @@ void CpuRandom::initialize(int seed, int numThreads) {
        r = (1664525*r + 1013904223) & 0xFFFFFFFF;
        threadRandom[i] = new OpenMM_SFMT::SFMT();
        init_gen_rand(r, *threadRandom[i]);
+        nextGaussianIsValid[i] = false;
    }
 }

--- a/platforms/cuda/include/CudaNonbondedUtilities.h
+++ b/platforms/cuda/include/CudaNonbondedUtilities.h
@@ -9,7 +9,7 @@
 * Biological Structures at Stanford, funded under the NIH Roadmap for        *
 * Medical Research, grant U54 GM072970. See https://simtk.org.               *
 *                                                                            *
- * Portions copyright (c) 2009-2023 Stanford University and the Authors.      *
+ * Portions copyright (c) 2009-2025 Stanford University and the Authors.      *
 * Authors: Peter Eastman                                                     *
 * Contributors:                                                              *
 *                                                                            *
@@ -353,7 +353,7 @@ private:
    std::vector<std::string> energyParameterDerivatives;
    std::map<int, double> groupCutoff;
    std::map<int, std::string> groupKernelSource;
-    double lastCutoff;
+    double maxCutoff;
    bool useCutoff, usePeriodic, anyExclusions, usePadding, useNeighborList, forceRebuildNeighborList, canUsePairList, useLargeBlocks;
    int startTileIndex, startBlockIndex, numBlocks, maxExclusions, numForceThreadBlocks, forceThreadBlockSize, numAtoms, groupFlags, numBlockSizes;
    unsigned int maxTiles, maxSinglePairs, tilesAfterReorder;
@@ -368,7 +368,6 @@ private:
 class CudaNonbondedUtilities::KernelSet {
 public:
    bool hasForces;
-    double cutoffDistance;
    std::string source;
    CUfunction forceKernel, energyKernel, forceEnergyKernel;
    CUfunction findBlockBoundsKernel;

--- a/platforms/cuda/src/CudaKernelFactory.cpp
+++ b/platforms/cuda/src/CudaKernelFactory.cpp
@@ -140,6 +140,8 @@ KernelImpl* CudaKernelFactory::createKernelImpl(std::string name, const Platform
        return new CommonIntegrateVariableLangevinStepKernel(name, platform, cu);
    if (name == IntegrateCustomStepKernel::Name())
        return new CommonIntegrateCustomStepKernel(name, platform, cu);
+    if (name == IntegrateDPDStepKernel::Name())
+        return new CommonIntegrateDPDStepKernel(name, platform, cu);
    if (name == ApplyAndersenThermostatKernel::Name())
        return new CommonApplyAndersenThermostatKernel(name, platform, cu);
    if (name == IntegrateNoseHooverStepKernel::Name())

--- a/platforms/cuda/src/CudaNonbondedUtilities.cpp
+++ b/platforms/cuda/src/CudaNonbondedUtilities.cpp
@@ -6,7 +6,7 @@
 * Biological Structures at Stanford, funded under the NIH Roadmap for        *
 * Medical Research, grant U54 GM072970. See https://simtk.org.               *
 *                                                                            *
- * Portions copyright (c) 2009-2023 Stanford University and the Authors.      *
+ * Portions copyright (c) 2009-2025 Stanford University and the Authors.      *
 * Authors: Peter Eastman                                                     *
 * Contributors:                                                              *
 *                                                                            *
@@ -61,7 +61,7 @@ public:
 };
 CudaNonbondedUtilities::CudaNonbondedUtilities(CudaContext& context) : context(context), useCutoff(false), usePeriodic(false), useNeighborList(false), anyExclusions(false), usePadding(true),
-        blockSorter(NULL), pinnedCountBuffer(NULL), forceRebuildNeighborList(true), lastCutoff(0.0), groupFlags(0), canUsePairList(true), tilesAfterReorder(0) {
+        blockSorter(NULL), pinnedCountBuffer(NULL), forceRebuildNeighborList(true), groupFlags(0), canUsePairList(true), tilesAfterReorder(0) {
    // Decide how many thread blocks to use.
    string errorMessage = "Error initializing nonbonded utilities";
@@ -262,6 +262,7 @@ void CudaNonbondedUtilities::initialize(const System& system) {
    // Create data structures for the neighbor list.
+    maxCutoff = getMaxCutoffDistance();
    if (useCutoff) {
        // Select a size for the arrays that hold the neighbor list.  We have to make a fairly
        // arbitrary guess, but if this turns out to be too small we'll increase it later.
@@ -407,7 +408,7 @@ void CudaNonbondedUtilities::prepareInteractions(int forceGroups) {
    KernelSet& kernels = groupKernels[forceGroups];
    if (useCutoff && usePeriodic) {
        double4 box = context.getPeriodicBoxSize();
-        double minAllowedSize = 1.999999*kernels.cutoffDistance;
+        double minAllowedSize = 1.999999*maxCutoff;
        if (box.x < minAllowedSize || box.y < minAllowedSize || box.z < minAllowedSize)
            throw OpenMMException("The periodic box size has decreased to less than twice the nonbonded cutoff.");
    }
@@ -418,15 +419,12 @@ void CudaNonbondedUtilities::prepareInteractions(int forceGroups) {
    // Compute the neighbor list.
-    if (lastCutoff != kernels.cutoffDistance)
-        forceRebuildNeighborList = true;
    context.executeKernel(kernels.findBlockBoundsKernel, &findBlockBoundsArgs[0], context.getNumAtomBlocks());
    context.executeKernel(kernels.computeSortKeysKernel, &computeSortKeysArgs[0], context.getNumAtomBlocks());
    blockSorter->sort(sortedBlocks);
    context.executeKernel(kernels.sortBoxDataKernel, &sortBoxDataArgs[0], context.getNumAtoms());
    context.executeKernel(kernels.findInteractingBlocksKernel, &findInteractingBlocksArgs[0], context.getNumAtoms(), 256);
    forceRebuildNeighborList = false;
-    lastCutoff = kernels.cutoffDistance;
    interactionCount.download(pinnedCountBuffer, false);
    cuEventRecord(downloadCountEvent, context.getCurrentStream());
 }
@@ -503,25 +501,22 @@ void CudaNonbondedUtilities::setAtomBlockRange(double startFraction, double endF
 void CudaNonbondedUtilities::createKernelsForGroups(int groups) {
    KernelSet kernels;
-    double cutoff = 0.0;
    string source;
    for (int i = 0; i < 32; i++) {
        if ((groups&(1<<i)) != 0) {
-            cutoff = max(cutoff, groupCutoff[i]);
            source += groupKernelSource[i];
        }
    }
    kernels.hasForces = (source.size() > 0);
-    kernels.cutoffDistance = cutoff;
    kernels.source = source;
    kernels.forceKernel = kernels.energyKernel = kernels.forceEnergyKernel = NULL;
    if (useCutoff) {
-        double paddedCutoff = padCutoff(cutoff);
+        double paddedCutoff = padCutoff(maxCutoff);
        map<string, string> defines;
        defines["TILE_SIZE"] = context.intToString(CudaContext::TileSize);
        defines["NUM_BLOCKS"] = context.intToString(context.getNumAtomBlocks());
        defines["NUM_ATOMS"] = context.intToString(context.getNumAtoms());
-        defines["PADDING"] = context.doubleToString(paddedCutoff-cutoff);
+        defines["PADDING"] = context.doubleToString(paddedCutoff-maxCutoff);
        defines["PADDED_CUTOFF"] = context.doubleToString(paddedCutoff);
        defines["PADDED_CUTOFF_SQUARED"] = context.doubleToString(paddedCutoff*paddedCutoff);
        defines["NUM_TILES_WITH_EXCLUSIONS"] = context.intToString(exclusionTiles.getSize());

--- a/platforms/cuda/src/CudaPlatform.cpp
+++ b/platforms/cuda/src/CudaPlatform.cpp
@@ -6,7 +6,7 @@
 * Biological Structures at Stanford, funded under the NIH Roadmap for        *
 * Medical Research, grant U54 GM072970. See https://simtk.org.               *
 *                                                                            *
- * Portions copyright (c) 2008-2024 Stanford University and the Authors.      *
+ * Portions copyright (c) 2008-2025 Stanford University and the Authors.      *
 * Authors: Peter Eastman                                                     *
 * Contributors:                                                              *
 *                                                                            *
@@ -104,6 +104,7 @@ CudaPlatform::CudaPlatform() {
    registerKernelFactory(IntegrateVariableVerletStepKernel::Name(), factory);
    registerKernelFactory(IntegrateVariableLangevinStepKernel::Name(), factory);
    registerKernelFactory(IntegrateCustomStepKernel::Name(), factory);
+    registerKernelFactory(IntegrateDPDStepKernel::Name(), factory);
    registerKernelFactory(ApplyAndersenThermostatKernel::Name(), factory);
    registerKernelFactory(ApplyMonteCarloBarostatKernel::Name(), factory);
    registerKernelFactory(RemoveCMMotionKernel::Name(), factory);

--- a/platforms/cuda/tests/TestCudaDPDIntegrator.cpp
+++ b/platforms/cuda/tests/TestCudaDPDIntegrator.cpp
+/* -------------------------------------------------------------------------- *
+ *                                   OpenMM                                   *
+ * -------------------------------------------------------------------------- *
+ * This is part of the OpenMM molecular simulation toolkit originating from   *
+ * Simbios, the NIH National Center for Physics-Based Simulation of           *
+ * Biological Structures at Stanford, funded under the NIH Roadmap for        *
+ * Medical Research, grant U54 GM072970. See https://simtk.org.               *
+ *                                                                            *
+ * Portions copyright (c) 2025 Stanford University and the Authors.           *
+ * Authors: Peter Eastman                                                     *
+ * Contributors:                                                              *
+ *                                                                            *
+ * Permission is hereby granted, free of charge, to any person obtaining a    *
+ * copy of this software and associated documentation files (the "Software"), *
+ * to deal in the Software without restriction, including without limitation  *
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,   *
+ * and/or sell copies of the Software, and to permit persons to whom the      *
+ * Software is furnished to do so, subject to the following conditions:       *
+ *                                                                            *
+ * The above copyright notice and this permission notice shall be included in *
+ * all copies or substantial portions of the Software.                        *
+ *                                                                            *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR *
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,   *
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL    *
+ * THE AUTHORS, CONTRIBUTORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM,    *
+ * DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR      *
+ * OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE  *
+ * USE OR OTHER DEALINGS IN THE SOFTWARE.                                     *
+ * -------------------------------------------------------------------------- */
+#include "CudaTests.h"
+#include "TestDPDIntegrator.h"
+void runPlatformTests() {
+}
--- a/platforms/hip/include/HipNonbondedUtilities.h
+++ b/platforms/hip/include/HipNonbondedUtilities.h
@@ -9,7 +9,7 @@
 * Biological Structures at Stanford, funded under the NIH Roadmap for        *
 * Medical Research, grant U54 GM072970. See https://simtk.org.               *
 *                                                                            *
- * Portions copyright (c) 2009-2023 Stanford University and the Authors.      *
+ * Portions copyright (c) 2009-2025 Stanford University and the Authors.      *
 * Portions copyright (c) 2020-2023 Advanced Micro Devices, Inc.              *
 * Authors: Peter Eastman, Nicholas Curtis                                    *
 * Contributors:                                                              *
@@ -354,7 +354,7 @@ private:
    std::vector<std::string> energyParameterDerivatives;
    std::map<int, double> groupCutoff;
    std::map<int, std::string> groupKernelSource;
-    double lastCutoff;
+    double maxCutoff;
    bool useCutoff, usePeriodic, anyExclusions, usePadding, useNeighborList, forceRebuildNeighborList, canUsePairList, useLargeBlocks;
    int startTileIndex, startBlockIndex, numBlocks, numTilesInBatch, maxExclusions;
    int numForceThreadBlocks, forceThreadBlockSize, findInteractingBlocksThreadBlockSize, numAtoms, groupFlags;
@@ -370,7 +370,6 @@ private:
 class HipNonbondedUtilities::KernelSet {
 public:
    bool hasForces;
-    double cutoffDistance;
    std::string source;
    hipFunction_t forceKernel, energyKernel, forceEnergyKernel;
    hipFunction_t findBlockBoundsKernel;

--- a/platforms/hip/src/HipKernelFactory.cpp
+++ b/platforms/hip/src/HipKernelFactory.cpp
@@ -139,6 +139,8 @@ KernelImpl* HipKernelFactory::createKernelImpl(std::string name, const Platform&
        return new CommonIntegrateVariableLangevinStepKernel(name, platform, cu);
    if (name == IntegrateCustomStepKernel::Name())
        return new CommonIntegrateCustomStepKernel(name, platform, cu);
+    if (name == IntegrateDPDStepKernel::Name())
+        return new CommonIntegrateDPDStepKernel(name, platform, cu);
    if (name == ApplyAndersenThermostatKernel::Name())
        return new CommonApplyAndersenThermostatKernel(name, platform, cu);
    if (name == IntegrateNoseHooverStepKernel::Name())