1. 24 Sep, 2022 1 commit
  2. 23 Sep, 2022 1 commit
  3. 21 Sep, 2022 2 commits
  4. 19 Sep, 2022 2 commits
    • Improve layernorm and reductions performance (#1348) · 97a1ed2d
      Paul Fultz II authored
      Compute mean and variance in same reduction
      Set block size to numbers divisible by 32 instead of powers of 2
      Global is also set exactly instead of being divisible by block size
      More exact matching of global/local can help get rid of branching/loops
      Reduce vectors first before doing dpp_reduce
      Explicitly vectorize array operators since the compiler doesn't always vectorize them
      Still uses the old for loop when computing at compile time, since neither reinterpret_cast nor all of the vector types are supported there
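
To illustrate the "mean and variance in same reduction" item above, here is a minimal Python sketch, not the commit's actual GPU kernel. It assumes one common formulation: accumulate the sum of x and the sum of x*x together in a single pass, then derive the variance as E[x^2] - E[x]^2. The function name `mean_and_variance` is hypothetical and chosen only for this example.

```python
from functools import reduce


def mean_and_variance(values):
    """Return (mean, population variance) from a single reduction pass.

    Hypothetical helper for illustration; assumes the variance is
    derived as E[x^2] - E[x]^2 from two sums accumulated together.
    """
    n = len(values)
    if n == 0:
        raise ValueError("need at least one value")
    # One reduction carries a pair: (running sum of x, running sum of x*x).
    total, total_sq = reduce(
        lambda acc, x: (acc[0] + x, acc[1] + x * x),
        values,
        (0.0, 0.0),
    )
    mean = total / n
    variance = total_sq / n - mean * mean
    return mean, variance
```

Fusing the two sums means the input is read once rather than twice, which is the usual motivation for this pattern. The E[x^2] - E[x]^2 form can lose precision when the mean is large relative to the spread, so this sketch should not be read as a statement about how the real kernel handles numerics.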
  5. 16 Sep, 2022 2 commits
  6. 15 Sep, 2022 1 commit
  7. 14 Sep, 2022 4 commits
  8. 13 Sep, 2022 1 commit
    • Use rocblas_gemm_ex for batched gemms with broadcasted B (#1354) · a10a8ef1
      turneram authored
      Improves performance for 4 of the 6 GEMMs used by Hugging Face BERT models with batch_size>1 by using a non-batched rocBLAS call for GEMMs where the B input has a broadcasted batch dimension.
      The four verify tests added reflect the actual configurations used by bert-base-cased, with varied batch sizes.
      
      Also adds a matcher to simplify_reshapes to move multibroadcasts after concats.
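
The idea behind replacing a batched GEMM with a single call can be sketched in plain Python. This is an illustration under assumptions, not the rocBLAS call sequence from the commit: when every batch entry multiplies by the same B, the batch dimension of A can be folded into its rows, so one (batch*M, K) x (K, N) product gives the same result. All function names below are hypothetical.

```python
def matmul(a, b):
    """Plain 2-D matrix product of row-major nested lists."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def batched_gemm(a_batch, b):
    """Reference: one GEMM per batch entry, all sharing the same B."""
    return [matmul(a, b) for a in a_batch]


def folded_gemm(a_batch, b):
    """Fold the batch dimension into the rows of A and run one GEMM."""
    m = len(a_batch[0])
    flat_a = [row for a in a_batch for row in a]  # shape (batch*M, K)
    flat_c = matmul(flat_a, b)                    # shape (batch*M, N)
    # Split the rows back into per-batch (M, N) blocks.
    return [flat_c[i:i + m] for i in range(0, len(flat_c), m)]
```

For example, with `a_batch = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]` and `b = [[1, 2, 3], [4, 5, 6]]`, both functions return the same two 2x3 blocks. The single larger product is what allows a non-batched call to stand in for the batched one.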
  9. 09 Sep, 2022 1 commit
  10. 08 Sep, 2022 2 commits
  11. 07 Sep, 2022 1 commit
  12. 06 Sep, 2022 1 commit
  13. 31 Aug, 2022 1 commit
  14. 29 Aug, 2022 1 commit
  15. 27 Aug, 2022 2 commits
  16. 26 Aug, 2022 1 commit
  17. 24 Aug, 2022 1 commit
  18. 23 Aug, 2022 1 commit
    • Dynamic ref NMS (#1288) · fa3c21fa
      Charlie Lin authored
      Makes the NMS op output a dynamic shape (ONNX spec behavior)
      Allows dynamic input shapes to the NMS op
  19. 21 Aug, 2022 1 commit
    • Update is_supported (#1334) · 79e15ca9
      varunsh authored
      * Update is_supported
      * Return object from is_supported
      * Return by reference in iterator
  20. 19 Aug, 2022 3 commits
  21. 18 Aug, 2022 1 commit
    • pybind updates for torch_migraphx library (#1323) · 8045f7c8
      shivadbhavsar authored
      Add function argument_from_pointer to allow directly passing a migraphx.shape object and a memory address. 
      Expose the is_compiled() method from migraphx::program. 
      Expose the enum types under migraphx::op. 
  22. 17 Aug, 2022 3 commits
  23. 16 Aug, 2022 2 commits
  24. 12 Aug, 2022 2 commits
  25. 11 Aug, 2022 1 commit
  26. 09 Aug, 2022 1 commit