    Final HIP Platform implementation for AMD GPUs on ROCm (#3338) · a39fa14a
    Adel Johar authored
    
    
    * Support kernel files with extensions of any length (like .hip)
    
    * Do not replace symbols in single-line comments
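
A minimal sketch of comment-aware symbol replacement, assuming whole-identifier matching. The helper name `replaceOutsideLineComments` is hypothetical and this is not the OpenMM implementation; it also ignores block comments and string literals, which a real implementation may need to handle.

```cpp
#include <cctype>
#include <map>
#include <string>

// Hypothetical helper (not OpenMM code): replaces whole-identifier occurrences
// of each key in `replacements`, but copies single-line (//) comments verbatim.
std::string replaceOutsideLineComments(const std::string& source,
                                       const std::map<std::string, std::string>& replacements) {
    auto isIdentChar = [](char c) {
        return std::isalnum(static_cast<unsigned char>(c)) != 0 || c == '_';
    };
    std::string result;
    std::size_t i = 0;
    while (i < source.size()) {
        if (source.compare(i, 2, "//") == 0) {
            // Copy the comment text up to, but not including, the newline.
            std::size_t end = source.find('\n', i);
            if (end == std::string::npos)
                end = source.size();
            result.append(source, i, end - i);
            i = end;
        }
        else if (isIdentChar(source[i])) {
            // Collect one identifier-like token and look it up as a whole.
            std::size_t end = i;
            while (end < source.size() && isIdentChar(source[end]))
                ++end;
            const std::string token = source.substr(i, end - i);
            const auto it = replacements.find(token);
            result += (it == replacements.end() ? token : it->second);
            i = end;
        }
        else {
            result += source[i++];
        }
    }
    return result;
}
```

With a replacement of `REAL` by `float`, code outside comments is rewritten while the text after `//` on the same line is left as written.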
    
    * Add OPENMM_BUILD_COMMON CMake option
    
    This allows building and installing the common platform files even
    if the CUDA or OpenCL platforms are not built.
    This is required for the HIP platform (openmm-hip) if the ROCm OpenCL
    packages are not installed.
    
    * Add an option for the Python wrapper to install into user packages

    OPENMM_PYTHON_USER_INSTALL is OFF by default.
    
    * Support FFT backends in Amoeba plugin
    
    The HIP platform supports FFT backends, so this commit moves
    findLegalFFTDimension to ComputeContext, allowing each platform to
    provide its own implementation.
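
As an illustration of what a per-platform legal-dimension search can look like, here is a sketch that finds the smallest size not below a minimum whose prime factors all come from a supported set. The function name `findLegalFFTDimensionSketch` and the factor-list parameter are assumptions for illustration; the actual signature and the factors each FFT backend accepts are not stated in this commit message.

```cpp
#include <vector>

// Hypothetical sketch (not the OpenMM signature): returns the smallest
// size >= minimum that factors completely into values from `allowedFactors`.
// Assumes `allowedFactors` is non-empty and every entry is >= 2.
int findLegalFFTDimensionSketch(int minimum, const std::vector<int>& allowedFactors) {
    if (minimum < 1)
        return 1;
    for (int candidate = minimum; ; ++candidate) {
        int remainder = candidate;
        // Divide out every allowed factor as many times as possible.
        for (int factor : allowedFactors)
            while (remainder % factor == 0)
                remainder /= factor;
        // Nothing left over means the candidate uses only allowed factors.
        if (remainder == 1)
            return candidate;
    }
}
```

A platform whose backend handles a different set of radices would pass a different factor list, or override the method entirely, which is the flexibility the move to ComputeContext is meant to provide.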
    
    * Make the common platform compatible with the new HIP platform
    
    * Do not use volatile with private and local AtomData parameters on HIP
    
    With volatile, the generated code is suboptimal: for example, the
    compiler emits flat_load instructions instead of ds_read.
    
    * Tune launch bounds for PME grid-related kernels and add a workaround for RDNA

    Force the compiler to use all registers for gridSpreadCharge and
    gridInterpolateForce by limiting max waves per EU to 1 on CDNA GPUs.
    RDNA GPUs work better without this limit.
    
    * Optimize atom data structs in GBSA and Amoeba on HIP
    
    Manually rearrange fields, add padding, and force alignment to
    make shared memory accesses faster: ds_read and ds_write may
    be slower if addresses are not aligned to 16 bytes.
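
The layout idea can be sketched in plain C++ as follows. The struct `AtomDataSketch` and its field names are hypothetical and do not reproduce the GBSA or Amoeba structs; the point is only that grouping fields into 16-byte chunks, padding the tail, and forcing 16-byte alignment keeps every array element on a 16-byte boundary.

```cpp
#include <cstddef>

// Hypothetical atom data layout (field names are illustrative only).
struct alignas(16) AtomDataSketch {
    float x, y, z, q;     // 16 bytes: position and charge in one aligned chunk
    float fx, fy, fz;     // 12 bytes: accumulated force
    float padding;        //  4 bytes: explicit padding to complete the chunk
};

// Compile-time checks that the layout has the intended properties.
static_assert(alignof(AtomDataSketch) == 16, "struct must be 16-byte aligned");
static_assert(sizeof(AtomDataSketch) == 32, "size must be a multiple of 16 bytes");
static_assert(offsetof(AtomDataSketch, fx) == 16, "second chunk starts on a 16-byte boundary");
```

The static assertions make the layout assumptions fail at compile time if a later field change breaks them.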

    Co-authored-by: Anton Gorenko <anton@streamhpc.com>
    Co-authored-by: Nick Curtis <nicholas.curtis@amd.com>
AmoebaCommonKernels.cpp 176 KB