1. 04 Mar, 2022 5 commits
  2. 03 Mar, 2022 2 commits
  3. 02 Mar, 2022 1 commit
  4. 01 Mar, 2022 4 commits
  5. 28 Feb, 2022 8 commits
  6. 26 Feb, 2022 2 commits
  7. 08 Feb, 2022 3 commits
  8. 31 Jan, 2022 1 commit
  9. 09 Dec, 2021 1 commit
    • Shucai Xiao's avatar
      Softmax perf optimization (#1014) · 2e337c7f
      Shucai Xiao authored
      Changed the number of threads in a block from 256 to 128
      Increased the max number of blocks in the kernel from 256 to 1M.
      For the case that the axis is the last dimension, we removed the computation of index since it is not required.
      
      With these change, we can get about 2x speedup compared to the develop branch for the softmax op used in the BertSquad model.
      2e337c7f
  10. 08 Oct, 2021 1 commit
  11. 01 Oct, 2021 1 commit
    • turneram's avatar
      Add multinomial op (#954) · 0b7672d7
      turneram authored
      
      
      Add multinomial op to onnx parser with ref and GPU implementations.
      
      The onnx parser inserts a literal of shape {batch_size, sample_size} with random values in the range [0, 1) and inserts existing ops to compute the cumulative density function. The multinomial operator multiplies the random values by the sum of the CDF and returns the index of the first element of the CDF that is greater than the result, representing samples randomly drawn from [0, class_size) that follow the log-probability distribution.
      
      Resolves #821
      Co-authored-by: default avatarShucai Xiao <shucai@gmail.com>
      0b7672d7
  12. 27 Sep, 2021 1 commit
  13. 16 Sep, 2021 1 commit
    • Shucai Xiao's avatar
      Loop operator (#853) · a275f590
      Shucai Xiao authored
      
      
      Add Loop operator for opset version 13.
      Notes: 1) Default max iteration number is 10 if no max iteration number is provided
      2) To change the max iter number, a user can set the max_loop_iterations in the onnx_option struct when parsing a model.
      3) The returned shape of the scan output is from the max_loop_iterations even the actual loop num is less than that. This issue also applies to other operators like NonZero and NonMaxSuppression. A issue #948 is created to track this and to be resolved later.
      Co-authored-by: default avatarPaul <pfultz2@yahoo.com>
      Co-authored-by: default avatarmvermeulen <5479696+mvermeulen@users.noreply.github.com>
      a275f590
  14. 02 Sep, 2021 2 commits
  15. 09 Aug, 2021 1 commit
  16. 09 Jul, 2021 1 commit
  17. 08 Jul, 2021 3 commits
  18. 25 Jun, 2021 2 commits