  1. 18 Sep, 2019 1 commit
    • Remove gemm copy and simplify rocblas call (#356) · a0f9b785
      Shucai Xiao authored
      * Remove extra copy in gemm
      
      * combine rocblas gemm call
      
      * clang format
      
      * fix a bug in calling rocblas function
      
      * clang format
      
      * backup of temporary changes
      
      * clang format
      
      * unify the gemm call to avoid multiple gpu implementations
      
      * clang format
      
      * remove unnecessary code
      
      * backup temp changes
      
      * clang format
      
      * fix cppcheck error
      
      * code backup
      
      * clang format
      
      * remove unnecessary synchronization function
      
      * clang format
      
      * fix bugs
      
      * clang format
      
      * more optimization related to gemm
      
      * clang format
      
      * code cleanup
      
      * implementation that achieves better performance
      
      * clang format
      
      * temp changes to try performance
      
      * clang format
      
      * revert to previous commits
      
      * fixed review comments
      
      * clang format
      
      * fix review comments
  2. 16 Sep, 2019 3 commits
    • Add flags to driver to run quantization (#361) · f445d962
      Paul Fultz II authored
      * Add flags to quantize in driver
      
      * Formatting
      
      * Fix compile error
    • Add fusions for sigmoid and tanh (#354) · ef5e7ce0
      kahmed10 authored
      * add tests, fix bug in ternary op
      
      * formatting
      
      * uncomment fusion
    • Refactor reduce ops (#350) · 307c40cd
      Shucai Xiao authored
      * first version of refactoring reduce operators
      
      * clang format
      
      * refactor the gpu implementation of the reduce_mean operator
      
      * clang format
      
      * refactor the gpu implementation of the reduce_sum operator
      
      * fix cppcheck error
      
      * fix cppcheck error
      
      * fix cppcheck error
      
      * fix review comments
      
      * clang format
      
      * fix a Jenkins error
      
      * fixed review comments
      
      * clang format
      
      * fix review comments
      
      * clang format
      
      * fix review comments
      
      * clang format
  3. 04 Sep, 2019 4 commits
  4. 03 Sep, 2019 6 commits
  5. 01 Sep, 2019 2 commits
  6. 31 Aug, 2019 1 commit
  7. 30 Aug, 2019 3 commits
  8. 29 Aug, 2019 5 commits
  9. 28 Aug, 2019 3 commits
  10. 27 Aug, 2019 8 commits
  11. 26 Aug, 2019 4 commits