  1. 22 Nov, 2023 1 commit
  2. 17 Nov, 2023 1 commit
  3. 30 Oct, 2023 1 commit
  4. 20 Oct, 2023 1 commit
  5. 06 Oct, 2023 1 commit
  6. 03 Oct, 2023 1 commit
  7. 16 Sep, 2023 1 commit
  8. 13 Aug, 2023 1 commit
  9. 08 Aug, 2023 1 commit
  10. 08 Jun, 2023 1 commit
  11. 19 May, 2023 1 commit
  12. 24 Apr, 2023 1 commit
    • Dynamic shape hip::copy_to_gpu and hip::copy_from_gpu (#1694) · 84acaea0
      Charlie Lin authored
      Updates the hip::copy_to_gpu and hip::copy_from_gpu operators to work with dynamic shapes.

      Allows offload_copy to be used with dynamic batch.

      Changed the assert in select_module because the argument may now be smaller than the buffer: with dynamic batch, offload_copy uses the maximum buffer size.
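The relaxed assert described in the commit above can be illustrated with a host-only sketch. This is not the MIGraphX implementation: `staging_buffer` and `copy_into` are hypothetical names, and a `std::vector<char>` stands in for device memory. It only shows the idea of sizing a buffer for the maximum shape and accepting any argument that is no larger than it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical host-side stand-in for a device buffer that is sized
// for the maximum dynamic shape (assumption, not MIGraphX code).
struct staging_buffer
{
    std::vector<char> storage;  // allocated once, for the largest batch
    std::size_t used_bytes = 0; // bytes occupied by the most recent copy

    explicit staging_buffer(std::size_t max_bytes) : storage(max_bytes) {}
};

// Copy an argument that may be smaller than the buffer.
void copy_into(staging_buffer& dst, const void* src, std::size_t bytes)
{
    // Relaxed check: the argument must fit, but need not fill the buffer.
    assert(bytes <= dst.storage.size());
    std::memcpy(dst.storage.data(), src, bytes);
    dst.used_bytes = bytes;
}
```

For example, a buffer sized for a batch of 4 rows accepts a batch of 2 rows; an equality check on the sizes would reject it.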
  13. 16 Feb, 2023 1 commit
  14. 31 Jan, 2023 1 commit
    • hipRTC fixes (#1531) · 91cc7242
      Umang Yadav authored
      Added a CMake flag for hipRTC: MIGRAPHX_USE_HIPRTC.
      Added stages in Jenkins for hipRTC.
      Fixed some of the pending hipRTC issues.
  15. 23 Sep, 2022 1 commit
  16. 06 Sep, 2022 1 commit
  17. 22 Jun, 2022 1 commit
  18. 11 Apr, 2022 1 commit
  19. 18 Mar, 2022 1 commit
  20. 14 Mar, 2022 1 commit
  21. 02 Mar, 2022 1 commit
  22. 09 Dec, 2021 1 commit
    • Softmax perf optimization (#1014) · 2e337c7f
      Shucai Xiao authored
      Changed the number of threads in a block from 256 to 128.
      Increased the max number of blocks in the kernel from 256 to 1M.
      When the axis is the last dimension, the index computation is skipped because it is not required.

      With these changes, the softmax op used in the BertSquad model is about 2x faster than on the develop branch.
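To illustrate why the last-axis case needs no index computation, here is a host-side reference sketch (not the GPU kernel from this commit). It assumes a row-major tensor flattened into one vector, so each softmax row is a contiguous run of `row_size` elements; the function name `softmax_last_axis` is made up for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Reference sketch: softmax over the last dimension of a row-major tensor.
// Because the reduced axis is innermost, every row is contiguous and can be
// addressed as start + i, with no multi-dimensional index arithmetic.
std::vector<float> softmax_last_axis(const std::vector<float>& data, std::size_t row_size)
{
    assert(row_size > 0 && data.size() % row_size == 0);
    std::vector<float> out(data.size());
    for(std::size_t start = 0; start < data.size(); start += row_size)
    {
        const float* row = data.data() + start;
        // Subtract the row maximum for numerical stability.
        const float max_val = *std::max_element(row, row + row_size);
        float sum = 0.0f;
        for(std::size_t i = 0; i < row_size; ++i)
        {
            out[start + i] = std::exp(row[i] - max_val);
            sum += out[start + i];
        }
        for(std::size_t i = 0; i < row_size; ++i)
            out[start + i] /= sum;
    }
    return out;
}
```

For any other axis, the elements of one softmax group are strided in memory, which is where the per-element index computation would come back in.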
  23. 08 Oct, 2021 1 commit
  24. 01 Oct, 2021 1 commit
    • Add multinomial op (#954) · 0b7672d7
      turneram authored
      
      
      Add multinomial op to onnx parser with ref and GPU implementations.
      
      The onnx parser inserts a literal of shape {batch_size, sample_size} with random values in the range [0, 1) and inserts existing ops to compute the cumulative distribution function (CDF). The multinomial operator multiplies the random values by the sum of the CDF and returns the index of the first element of the CDF that is greater than the result. The returned indices represent samples randomly drawn from [0, class_size) that follow the log-probability distribution.
      
      Resolves #821
      Co-authored-by: Shucai Xiao <shucai@gmail.com>
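The sampling step described in the commit above can be sketched for a single draw as follows. This is a minimal illustration, not the MIGraphX operator: `sample_from_cdf` is a hypothetical helper, and "the sum of the CDF" is interpreted here as the final (total) CDF entry, which is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: draw one class index from an unnormalized,
// non-decreasing CDF, given a uniform random value u in [0, 1).
std::size_t sample_from_cdf(const std::vector<float>& cdf, float u)
{
    assert(!cdf.empty());
    assert(u >= 0.0f && u < 1.0f);
    // Assumption: scale u by the last CDF entry (the total weight),
    // so the CDF does not have to be normalized first.
    const float target = u * cdf.back();
    // Find the first CDF element strictly greater than the scaled value.
    auto it = std::upper_bound(cdf.begin(), cdf.end(), target);
    // Guard against floating-point rounding pushing target up to the total.
    if(it == cdf.end())
        --it;
    return static_cast<std::size_t>(it - cdf.begin());
}
```

With the CDF {1, 3, 6, 10} (weights 1, 2, 3, 4) and u = 0.5, the scaled value is 5 and the first larger entry is 6, so class index 2 is returned. Applying this to every value of the {batch_size, sample_size} random literal would give one index per sample.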
  25. 27 Sep, 2021 1 commit
  26. 16 Sep, 2021 1 commit
    • Loop operator (#853) · a275f590
      Shucai Xiao authored
      
      
      Add Loop operator for opset version 13.
      Notes:
      1) The default max iteration number is 10 if no max iteration number is provided.
      2) To change the max iteration number, a user can set max_loop_iterations in the onnx_option struct when parsing a model.
      3) The returned shape of the scan output is derived from max_loop_iterations even if the actual number of iterations is smaller. This issue also applies to other operators such as NonZero and NonMaxSuppression. Issue #948 was created to track this and will be resolved later.
      Co-authored-by: Paul <pfultz2@yahoo.com>
      Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>
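Notes 1 and 3 of the commit above can be made concrete with a small self-contained model. This is not the MIGraphX Loop implementation: `loop_options`, `loop_result`, and `run_loop` are hypothetical names, and zero-filling the unused tail of the scan output is an assumption made only so the example has defined contents.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical stand-in for the parser option named in the notes.
struct loop_options
{
    std::int64_t max_loop_iterations = 10; // default from note 1
};

struct loop_result
{
    std::int64_t iterations_run;    // how many times the body actually ran
    std::vector<float> scan_output; // always max_loop_iterations elements (note 3)
};

// body(i) returns {value to append to the scan output, keep_going flag}.
loop_result run_loop(const loop_options& opts,
                     const std::function<std::pair<float, bool>(std::int64_t)>& body)
{
    assert(opts.max_loop_iterations >= 0);
    // The scan output is sized up front from the iteration cap
    // (zero-filled here by assumption).
    loop_result r{0, std::vector<float>(static_cast<std::size_t>(opts.max_loop_iterations), 0.0f)};
    for(std::int64_t i = 0; i < opts.max_loop_iterations; ++i)
    {
        auto [value, keep_going] = body(i);
        r.scan_output[static_cast<std::size_t>(i)] = value;
        ++r.iterations_run;
        if(!keep_going)
            break;
    }
    return r;
}
```

A body that stops after three iterations still yields a ten-element scan output under the default cap, which is the shape mismatch that note 3 describes.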
  27. 02 Sep, 2021 2 commits
  28. 09 Aug, 2021 1 commit
  29. 09 Jul, 2021 1 commit
  30. 08 Jul, 2021 3 commits
  31. 25 Jun, 2021 4 commits
  32. 11 Jun, 2021 1 commit
  33. 08 Jun, 2021 1 commit
    • Reverse Op (#846) · 9c54fc4f
      Cagri Eryilmaz authored
      
      
      * init reverseOp branch: ref op + ref test. WIP
      
      * first passing basic test
      
      * cleanup
      
      * additional axis implementation
      
      * additional test
      
      * ref op implementation vec to int for axis
      
      * ref op test change for axis
      
      * initial gpu files and test
      
      * updates to implementation and test
      
      * fixed some issues
      
      * clang format
      
      * cleanup
      
      * formatting
      
      * removing comments
      
      * remove local size, back to default
      
      * update tests: replace with std functions
      
      * multiple axis for reverse op
      
      * fix a build error
      
      * clang format
      
      * more tests
      
      * fix a bug for the reverse device function
      
      * clang format
      
      * fix a bug
      
      * clang format
      
      * ref test updates, multiaxis
      
      * formatting
      Co-authored-by: Shucai Xiao <Shucai.Xiao@amd.com>
      Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>
  34. 03 May, 2021 1 commit