Commits · 063ba0c45de1d20bc3c7595a8a79dea4bd2e3402 · gaoqiong / MIGraphX

19 Apr, 2022 2 commits
- Hacked fixes for pointwise · 063ba0c4
  Paul authored Apr 19, 2022
  
  063ba0c4
- Fix headers · f449cd1d
  Paul authored Apr 19, 2022
  
  f449cd1d
13 Apr, 2022 1 commit
- Fix problem with incomplete types with older clang versions (#1174) · a11ef66a
  Paul Fultz II authored Apr 13, 2022
```
Also adds the PYTHON_DISABLE_VERSIONS CMake variable to disable Python versions.
```
  a11ef66a
12 Apr, 2022 2 commits
- Fix out-of-bounds access when generate uses nonpacked tensors (#1160) · 262ba721
  Paul Fultz II authored Apr 12, 2022
```
Fix out-of-bounds access when generate uses nonpacked tensors, and add some additional asserts for GPU memory.
```
  262ba721
- parallelize the ref implementation of the gemm operator (#1142) · 88b3dd34
  Shucai Xiao authored Apr 12, 2022
```
The ref implementation of the gemm op is sequential. This PR parallelizes the gemm computation in the ref implementation.
```
  88b3dd34
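
As a rough illustration of the idea behind #1142, and not MIGraphX's actual C++ code, a sequential matrix multiply can be parallelized by computing output rows independently. The names `gemm_row` and `parallel_gemm` and the use of Python's `concurrent.futures` are assumptions made for this sketch only.

```python
from concurrent.futures import ThreadPoolExecutor


def gemm_row(a_row, b):
    """Compute one output row: a_row (length k) times b (k x n)."""
    n = len(b[0])
    return [sum(a_row[p] * b[p][j] for p in range(len(b))) for j in range(n)]


def parallel_gemm(a, b, workers=4):
    """Multiply a (m x k) by b (k x n), computing output rows concurrently.

    Each output row depends only on one row of `a` and all of `b`, so the
    rows can be computed independently without any locking.
    """
    if a and len(a[0]) != len(b):
        raise ValueError("inner dimensions must match")
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() preserves input order, so rows come back in the right place.
        return list(pool.map(lambda row: gemm_row(row, b), a))


if __name__ == "__main__":
    print(parallel_gemm([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
```

The sketch only shows how the work is split. In CPython, threads running pure-Python arithmetic will not give a real speedup; a native implementation would distribute the same row-wise work across hardware threads.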
11 Apr, 2022 4 commits

scatter operator refactoring to include reduction (#1124) · 701c2014

bpickrel authored Apr 11, 2022

Change the "scatter" struct and op to a base/child set of three: scatter_none, scatter_add, and scatter_mul, mirroring the ONNX ScatterElements op and its three reduction options. (The ONNX Scatter op is deprecated and is equivalent to scatter_none.)

Provides both a reference op and an update to ONNX parsing. Tests are updated and a new test case is added.

701c2014
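
The three reduction modes described above can be sketched for the 2-D case as follows. This follows the element-wise update rule of ONNX ScatterElements as I understand it; the function name `scatter_elements_2d` and the `REDUCTIONS` table are hypothetical and are not part of MIGraphX.

```python
import copy
import operator

# Reduction modes corresponding to scatter_none, scatter_add and scatter_mul.
REDUCTIONS = {
    "none": lambda old, new: new,  # plain overwrite
    "add": operator.add,           # accumulate by addition
    "mul": operator.mul,           # accumulate by multiplication
}


def scatter_elements_2d(data, indices, updates, axis=0, reduction="none"):
    """Scatter `updates` into a copy of `data` along `axis` (0 or 1).

    For axis 0 the target is out[indices[i][j]][j];
    for axis 1 the target is out[i][indices[i][j]].
    """
    combine = REDUCTIONS[reduction]
    out = copy.deepcopy(data)
    for i, row in enumerate(indices):
        for j, idx in enumerate(row):
            r, c = (idx, j) if axis == 0 else (i, idx)
            out[r][c] = combine(out[r][c], updates[i][j])
    return out


if __name__ == "__main__":
    data = [[1, 2, 3, 4, 5]]
    for mode in ("none", "add", "mul"):
        print(mode, scatter_elements_2d(data, [[1, 3]], [[10, 20]], axis=1, reduction=mode))
```

With `reduction="none"`, repeated indices simply overwrite one another; with `add` or `mul`, every update aimed at the same location contributes to the result.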

fix a bug in create tensor_view with vec data type (#1155) · 3c301efa

Shucai Xiao authored Apr 11, 2022

When creating a tensor_view with a vector data type, the last dimension of the shape should be divided by vec_size.

3c301efa
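
The shape adjustment described above amounts to the following arithmetic. This is a minimal sketch with a hypothetical helper name, `vectorized_lens`; the divisibility check is an assumption of the sketch, not something the commit message states.

```python
def vectorized_lens(lens, vec_size):
    """Return the dimensions of a view whose elements are vec_size-wide vectors.

    Only the last dimension shrinks, because consecutive scalars along that
    dimension are grouped into a single vector element.
    """
    if not lens:
        raise ValueError("shape must have at least one dimension")
    if vec_size <= 0:
        raise ValueError("vec_size must be positive")
    # Assumption for this sketch: the last dimension divides evenly.
    if lens[-1] % vec_size != 0:
        raise ValueError("last dimension must be divisible by vec_size")
    return list(lens[:-1]) + [lens[-1] // vec_size]


if __name__ == "__main__":
    # A {2, 3, 8} tensor of scalars viewed as 2-wide vectors.
    print(vectorized_lens([2, 3, 8], 2))
```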

clang format · 401d0f68
Shucai Xiao authored Apr 11, 2022

401d0f68
backup changes · 992f57ba
Shucai Xiao authored Apr 11, 2022

992f57ba

08 Apr, 2022 1 commit
- Fix comparisons in migraphx::value class (#1146) · 1e0bbd78
  Paul Fultz II authored Apr 08, 2022
```
* Fix comparisons in migraphx::value class
```
  1e0bbd78
06 Apr, 2022 1 commit

Python Binding for the Manual Graph Building (#1143) · c4b6469a

Umang Yadav authored Apr 06, 2022

Adds the following API bindings and tests to Python:

add_return
add_instruction
add_parameter
create_module

c4b6469a

04 Apr, 2022 4 commits
- clang format · 789f86fb
  Shucai Xiao authored Apr 04, 2022
  
  789f86fb
- some additional code cleanup · 8e485cc8
  Shucai Xiao authored Apr 04, 2022
  
  8e485cc8
- clang format · a6477298
  Shucai Xiao authored Apr 04, 2022
  
  a6477298
- refactor of the layernorm code · fe849702
  Shucai Xiao authored Apr 04, 2022
  
  fe849702
31 Mar, 2022 2 commits
- clang format · af110526
  Shucai Xiao authored Mar 31, 2022
  
  af110526
- Change the doc to mention only gpu or ref as targets (#1153) · c59f4079
  Umang Yadav authored Mar 31, 2022
```
Documentation update for valid targets
```
  c59f4079
29 Mar, 2022 5 commits
- clang format · 48b39e06
  Shucai Xiao authored Mar 29, 2022
  
  48b39e06
- simplify the layernorm kernel arguments · de99db23
  Shucai Xiao authored Mar 29, 2022
  
  de99db23
- Refactor runtime compiled kernels to use the same compile_ops pipeline (#1125) · 661046c6
  Paul Fultz II authored Mar 29, 2022
```
This adds the infrastructure to compile everything in parallel, whereas before only pointwise kernels were compiled in parallel. It will also integrate directly with lowering and the gpu-driver. The kernels for pointwise and roialign use this infrastructure. Scatternd does not, since it requires a standard shape.

This also makes it easier to add new runtime compiled kernels in the future.
```
  661046c6
- clang format · 780fffc8
  Shucai Xiao authored Mar 28, 2022
  
  780fffc8
- also rewrite layernorm kernel using half2 datatype · fc48a1d3
  Shucai Xiao authored Mar 28, 2022
  
  fc48a1d3
28 Mar, 2022 7 commits
- clang format · c6700632
  Shucai Xiao authored Mar 28, 2022
  
  c6700632
- half and half2 have the same results · 69c94135
  Shucai Xiao authored Mar 28, 2022
  
  69c94135
- clang format · 580673a0
  Shucai Xiao authored Mar 28, 2022
  
  580673a0
- backup code changes · 80a6ca93
  Shucai Xiao authored Mar 28, 2022
  
  80a6ca93
- layernorm kernel optimization · a5181cd0
  Shucai Xiao authored Mar 28, 2022
  
  a5181cd0
- Use ifdef instead of comment for the auto-generated method declarations for... · 8e4d622f
  Paul Fultz II authored Mar 28, 2022
```
Use ifdef instead of comment for the auto-generated method declarations for type erased classes (#1138)

The formatting of comments seems to be unreadable for larger methods, so instead just generate a struct with the methods in the interface and add a comment if a method is optional. This is wrapped in #ifdef TYPE_ERASED_DECLARATION (assuming this would never be defined) instead of #if 0, so most editors can still provide syntax highlighting (although I think VS Code with clangd will still gray it out, unfortunately).
```
  8e4d622f
- Use ccache for runtime compilation (#1131) · ad056b1f
  Paul Fultz II authored Mar 28, 2022
```
* Use ccache for runtime compilation
```
  ad056b1f
25 Mar, 2022 1 commit
- Improve handling of string literals in value class (#1141) · c73c0dae
  Paul Fultz II authored Mar 25, 2022
```
* Handle string literal in construction
* Improve get_default with vector
```
  c73c0dae
24 Mar, 2022 4 commits
- Add initial experimental custom op (#1109) · 251cdd74
  Paul Fultz II authored Mar 24, 2022
```
This creates a custom op which has name() and compute_shape() methods. 
```
  251cdd74
- clang format · 86a03f28
  Shucai Xiao authored Mar 23, 2022
  
  86a03f28
- remove duplicate definition · d5c2538c
  Shucai Xiao authored Mar 23, 2022
  
  d5c2538c
- parallelize the ref implementation of the gemm operator · 9b19b73f
  Shucai Xiao authored Mar 23, 2022
  
  9b19b73f
22 Mar, 2022 1 commit
- Remove borrowed lifetime from operators that are no longer borrowing their lifetime (#1134) · cd165ebd
  Paul Fultz II authored Mar 22, 2022
```
For operators that use the arg.reshape() method, the lifetime will be extended.
```
  cd165ebd
21 Mar, 2022 1 commit
- Lp normalization op (#1129) · 03225b57
  Charlie Lin authored Mar 21, 2022
```
* LpNormalization ONNX parser
```
  03225b57
18 Mar, 2022 2 commits

Complete GPU implementation of CumSum op (#1094) · 548783c8

turneram authored Mar 18, 2022

Add exclusive and reverse modes to the GPU implementation of prefix_scan_sum, which completes support for the ONNX CumSum op.

548783c8
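
For reference, the exclusive and reverse modes of a cumulative sum can be sketched on a 1-D list as below. This follows the ONNX CumSum semantics and is not the GPU kernel itself; the function name `cumsum` and its keyword arguments are chosen for this sketch.

```python
def cumsum(values, exclusive=False, reverse=False):
    """Prefix sum over a 1-D sequence with ONNX-style CumSum modes.

    exclusive: each output excludes the element at its own position.
    reverse:   the sum runs from the last element towards the first.
    """
    items = list(reversed(values)) if reverse else list(values)
    out = []
    running = 0
    for x in items:
        if exclusive:
            out.append(running)  # sum of the elements before x
            running += x
        else:
            running += x
            out.append(running)  # sum up to and including x
    # Undo the reversal so outputs line up with the input positions.
    return list(reversed(out)) if reverse else out


if __name__ == "__main__":
    data = [1, 2, 3]
    print(cumsum(data))                                # [1, 3, 6]
    print(cumsum(data, exclusive=True))                # [0, 1, 3]
    print(cumsum(data, reverse=True))                  # [6, 5, 3]
    print(cumsum(data, exclusive=True, reverse=True))  # [5, 3, 0]
```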

Make get_context experimental (#1137) · e521fa3f

Paul Fultz II authored Mar 18, 2022

get_context may change in the future (when we support multi-targets), so make it experimental for now.

e521fa3f

15 Mar, 2022 2 commits

Expose APIs for the MIGraphX program (#1093) · 64e79a94

Umang Yadav authored Mar 15, 2022

The API includes the following:
create_module
get_main_module
add_instruction without module args
add_instruction with module args
add_parameter
add_return

64e79a94

Add iterators to kernels tensor_view and fix roialign to work with non-standard shape (#1126) · 31e63991

Paul Fultz II authored Mar 15, 2022

This adds iterators to tensor_view, which allows kernels such as roialign to work with non-standard shapes.

To improve the performance of indexing when using the iterators, the shape class was updated to use integral_constants since the compiler doesn't always fold the const values. An integral_constant will at least enforce that in the AST.

Finally, since index calculations with single integers are improved, I also updated pointwise to use a single index rather than a multi-index. There is about a 4% improvement in some cases.

31e63991
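
To make the terms above concrete, the sketch below shows how a single linear index relates to a multi-index and to a memory offset when a shape has non-standard strides. It is only an illustration of strided indexing in general; the helper names `multi_index` and `offset` are hypothetical and do not reflect how the MIGraphX kernels are written.

```python
def multi_index(linear, lens):
    """Convert a linear element index into a row-major multi-index."""
    idx = []
    for n in reversed(lens):
        idx.append(linear % n)
        linear //= n
    return idx[::-1]


def offset(idx, strides):
    """Memory offset of a multi-index: the dot product with the strides."""
    return sum(i * s for i, s in zip(idx, strides))


if __name__ == "__main__":
    lens = [2, 3]
    standard = [3, 1]    # packed row-major strides
    transposed = [1, 2]  # 2x3 view over a row-major 3x2 buffer
    for name, strides in (("standard", standard), ("transposed", transposed)):
        print(name, [offset(multi_index(i, lens), strides) for i in range(6)])
```

For the standard strides the offset equals the linear index, so no multi-index is needed. For the transposed view, the elements are visited out of memory order, which is the situation a kernel has to handle when its input shape is non-standard.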