  1. 05 Sep, 2024 2 commits
    • Use VkFFT in HipFFT3D, remove hipFFT and the builtin FFT · f717ed89
      Anton Gorenko authored
      * VkFFT-based 3D FFT;
      * Caching of compiled VkFFT kernels;
      * Extend FFT tests with more sizes.
    • Optimize PME kernels · a0acfbc9
      Anton Gorenko authored
      * Compile with -munsafe-fp-atomics to enable fast hardware f32 atomic
        add on global memory on pre-MI100 GPUs;
      * Use fixed point charge spreading on other GPUs, where float atomic
        add would otherwise be compiled as a slow CAS loop;
      * Tune block sizes, use executeKernelFlat;
      * Tune launch bounds of PME grid-related kernels: force the compiler to
        use all registers by limiting max waves per EU to 1.
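The fixed point charge spreading mentioned in a0acfbc9 can be illustrated with a minimal Python sketch. This is not the kernel code from the commit: the 2^32 scale factor and the names `SCALE`, `to_fixed`, `from_fixed` and `accumulate_fixed` are assumptions made for illustration. The idea shown is that each contribution is converted to an integer before accumulation, so the sum can use integer atomic adds (which are exact and order-independent) instead of a float atomic add.

```python
import random

# Assumed fixed-point scale (hypothetical; the commit does not state the value).
SCALE = 1 << 32


def to_fixed(value: float) -> int:
    """Convert a real contribution to a scaled integer."""
    return int(round(value * SCALE))


def from_fixed(total: int) -> float:
    """Convert an accumulated fixed-point total back to a real value."""
    return total / SCALE


def accumulate_fixed(contributions) -> int:
    """Sum contributions as integers.

    On a GPU each addition would be an integer atomic add on a grid cell;
    here a plain loop stands in for that. Integer addition is associative,
    so the result does not depend on the order of the contributions.
    """
    total = 0
    for c in contributions:
        total += to_fixed(c)
    return total


if __name__ == "__main__":
    rng = random.Random(0)
    charges = [rng.uniform(-1.0, 1.0) for _ in range(1000)]
    shuffled = charges[:]
    rng.shuffle(shuffled)
    # The fixed-point totals match exactly regardless of ordering.
    print(accumulate_fixed(charges) == accumulate_fixed(shuffled))
```

The trade-off is a bounded quantization error (at most half a unit of 1/SCALE per contribution in this sketch) in exchange for exact, order-independent accumulation.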
  2. 01 Sep, 2024 1 commit