Amortize PME atom-grid re-sort for large systems (#5305)
Large systems (>15000 atoms) re-sorted the PME atom grid every step, skipping the step-counter amortization used for smaller systems. On current GPUs the per-step sort is mostly wasted work, so re-sort every 2 steps instead. Smaller systems are unchanged. The sort only changes charge-spread memory locality; results are identical up to floating-point summation order.
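As a rough illustration of the step-counter amortization described above, here is a minimal sketch. It is not the project's code: every name is hypothetical, and the interval for smaller systems (`small_interval`) is a placeholder, since the description does not state it. Only the 15000-atom threshold and the 2-step interval for large systems come from the text.

```python
# Hypothetical sketch of step-counter amortization; not the project's code.
LARGE_SYSTEM_THRESHOLD = 15000   # atom count taken from the description
LARGE_SYSTEM_SORT_INTERVAL = 2   # large systems now re-sort every 2 steps


def should_resort(step: int, num_atoms: int, small_interval: int = 8) -> bool:
    """Return True if the atom grid should be re-sorted on this step.

    small_interval is an assumed placeholder: the description only says
    smaller systems already amortize the sort, not how often they re-sort.
    """
    if num_atoms > LARGE_SYSTEM_THRESHOLD:
        interval = LARGE_SYSTEM_SORT_INTERVAL
    else:
        interval = small_interval
    return step % interval == 0


def count_sorts(num_steps: int, num_atoms: int) -> int:
    """Count how many re-sorts happen over steps 0 .. num_steps - 1."""
    return sum(should_resort(step, num_atoms) for step in range(num_steps))
```

Under these assumptions, a 20000-atom system sorts on half of its steps instead of on every step, while the schedule for a system below the threshold does not depend on the large-system interval.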