1. 21 Feb, 2022 1 commit
  2. 07 Feb, 2022 2 commits
  3. 05 Feb, 2022 2 commits
  4. 28 Jan, 2022 1 commit
  5. 27 Jan, 2022 1 commit
  6. 26 Jan, 2022 1 commit
    • Updates · 1cc6c88c
      Paul authored
  7. 21 Jan, 2022 1 commit
  8. 10 Jan, 2022 3 commits
  9. 07 Jan, 2022 2 commits
  10. 06 Jan, 2022 3 commits
  11. 11 Dec, 2021 5 commits
  12. 09 Dec, 2021 1 commit
    • Softmax perf optimization (#1014) · 2e337c7f
      Shucai Xiao authored
      Changed the number of threads per block from 256 to 128.
      Increased the maximum number of blocks in the kernel from 256 to 1M.
      When the softmax axis is the last dimension, removed the index computation, since it is not required in that case.

      With these changes, the softmax op used in the BertSquad model is about 2x faster than on the develop branch.
  13. 08 Dec, 2021 1 commit
  14. 07 Dec, 2021 1 commit
  15. 02 Dec, 2021 1 commit
  16. 01 Dec, 2021 4 commits
  17. 30 Nov, 2021 2 commits
  18. 24 Nov, 2021 3 commits
  19. 18 Nov, 2021 1 commit
  20. 16 Nov, 2021 4 commits
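
The softmax commit (2e337c7f) in the log above changes the launch limits and skips index computation when the reduction axis is the last dimension. The sketch below models that idea in plain Python; it is not the kernel from the commit. The constants 128 and 1M come from the commit message (1M is read here as 1,000,000, which is an assumption), while `launch_config`, `element_offset`, `softmax_flat`, and the one-block-per-row mapping are hypothetical names and choices made only for illustration.

```python
import math

THREADS_PER_BLOCK = 128   # from the commit message (previously 256)
MAX_BLOCKS = 1_000_000    # "1M" in the commit message; exact value assumed


def launch_config(num_rows):
    """Hypothetical launch shape: one block per softmax row, capped."""
    return min(num_rows, MAX_BLOCKS), THREADS_PER_BLOCK


def element_offset(row, j, axis_len, inner):
    """General case: flat offset of element j of a row along the axis.

    `inner` is the product of the dimensions after the softmax axis.
    """
    outer, rem = divmod(row, inner)
    return (outer * axis_len + j) * inner + rem


def softmax_flat(data, axis_len, inner):
    """Softmax over one axis of a flat, row-major buffer."""
    out = [0.0] * len(data)
    num_rows = len(data) // axis_len
    for row in range(num_rows):
        if inner == 1:
            # Last-dimension fast path: each row is contiguous, so no
            # per-element index computation is needed.
            offsets = range(row * axis_len, (row + 1) * axis_len)
        else:
            offsets = [element_offset(row, j, axis_len, inner)
                       for j in range(axis_len)]
        peak = max(data[o] for o in offsets)  # subtract max for stability
        exps = [math.exp(data[o] - peak) for o in offsets]
        total = sum(exps)
        for o, e in zip(offsets, exps):
            out[o] = e / total
    return out
```

For a buffer of shape (2, 3), `softmax_flat(data, 3, 1)` takes the contiguous fast path over the last axis, while `softmax_flat(data, 2, 3)` reduces over the first axis and has to compute a strided offset for every element.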