Commits · 8398fb196712221173b8e459b0d88945a4cdf1ba · gaoqiong / MIGraphX

25 Jul, 2022 1 commit

varunsh authored Jul 25, 2022

* Add is_supported to the target
* Add get_target_assignments
* Rename assignment to target_assignments
* Add ref target header to test
* Add fpga target
* Make context const in compute

8a30d698

22 Jul, 2022 1 commit
- Improve error reporting in the API (#1274) · c722117d
  Umang Yadav authored Jul 22, 2022
```
The C++ API is not printing the thrown exception string; this improves on that.
```
  c722117d
21 Jul, 2022 1 commit
- Dynamic check_shapes (#1295) · c51cf531
  Charlie Lin authored Jul 21, 2022
```
Dynamic shape handling in shape object
```
  c51cf531
19 Jul, 2022 3 commits

Fix TF parsing for creating literals and Fix name lookups for input params (#1298) · 4d59b7c7

Umang Yadav authored Jul 19, 2022

Bug 1: create_literal was using back_inserter to copy into a vector whose size was already allocated, doubling the size of the literal.
Fix 1: Do not use back_inserter.
Bug 2: An input param to the model can come from an operation that has multiple outputs; in that case the name of the input param contains a ':', e.g. input_1:0.
Fix 2: Look for ':' and take the substring.
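
The two bugs described above can be reproduced in isolation. The following is a minimal sketch, not the actual parser code: the function names are hypothetical, and it assumes the substring kept in Fix 2 is the part before the ':'.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <string>
#include <vector>

// Buggy pattern (Bug 1): the destination is already sized, so back_inserter
// appends after the existing elements and the result is twice as long.
std::vector<float> copy_with_back_inserter(const std::vector<float>& src)
{
    std::vector<float> dst(src.size());
    std::copy(src.begin(), src.end(), std::back_inserter(dst));
    return dst;
}

// Fixed pattern (Fix 1): write into the storage that was already allocated.
std::vector<float> copy_in_place(const std::vector<float>& src)
{
    std::vector<float> dst(src.size());
    std::copy(src.begin(), src.end(), dst.begin());
    return dst;
}

// Fix 2 (assumed direction): keep everything before the first ':' so that
// "input_1:0" becomes "input_1". Names without ':' are returned unchanged.
std::string strip_output_index(const std::string& name)
{
    return name.substr(0, name.find(':'));
}
```

Because `std::string::find` returns `npos` when no ':' is present, `substr(0, npos)` leaves such names untouched.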

4d59b7c7

Dynamic dimension input onnx parser (#1249) · 5a87fcbd

Charlie Lin authored Jul 19, 2022

Depends on #1199

Adds ONNX parser functionality for dynamic input shapes.
Uses options parameter in parse_onnx()

5a87fcbd

Fix op includes (#1308) · 39b307b2

Charlie Lin authored Jul 19, 2022

Changes to operator includes:

* removed some includes that were not used
* included argument.hpp where clang-tidy wanted it

39b307b2

12 Jul, 2022 3 commits
- Reduce header inclusion in op headers (#1271) · ba1b7850
  Paul Fultz II authored Jul 12, 2022
```
Reduce header inclusion in op headers
```
  ba1b7850
- Add tests for C API (#1266) · a7a32a9e
  Paul Fultz II authored Jul 12, 2022
```
This will ensure that migraphx.h can be included from a C compiler, and check that the C API can be called. This includes stdbool.h which is needed when using bool from C.
```
  a7a32a9e
- Use current device when constructing context (#1294) · 68189043
  Paul Fultz II authored Jul 11, 2022
  
  68189043
11 Jul, 2022 4 commits
- Add __restrict__ to jit kernel params (#1300) · 2781ccd8
  turneram authored Jul 11, 2022
  
  2781ccd8
- remove skip_broadcast, and add tolerance to add_0 · f0e84612
  Ted Themistokleous authored Jul 11, 2022
  
  f0e84612
- Revert "Change threshold for has_value tolerance" · ed55d323
  Ted Themistokleous authored Jul 11, 2022
```
This reverts commit 4aeacc17.
```
  ed55d323
- Improve kernel code generation (#1285) · 2bbb50c4
  Paul Fultz II authored Jul 11, 2022
```
* Only run __syncthreads when there is data to preload
* Improve loops
* Add const attribute to improve optimizations
```
  2bbb50c4
08 Jul, 2022 5 commits
- Update perf report to show the number of operators and per operator avg time in summary (#1287) · 05b13c9f
  Paul Fultz II authored Jul 08, 2022
```
Show the number of operators and per operator avg time in summary...

Summary:
gpu::gemm: 8.738ms / 73 = 0.119699ms, 64%
gpu::triadd_layernorm: 0.831381ms / 24 = 0.0346409ms, 7%
```
  05b13c9f
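
The arithmetic behind each summary line (total time divided by operator count) can be sketched as follows. The struct and function names are hypothetical and are not taken from the MIGraphX perf report code; the percentage is assumed to be the operator's share of the whole program's time, which the sample output does not state.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Hypothetical record for one operator type in the summary.
struct op_summary
{
    double total_ms;   // accumulated time over all instances
    std::size_t count; // number of instances of the operator
};

// Per-operator average, e.g. 8.738ms / 73.
double average_ms(const op_summary& s)
{
    assert(s.count > 0);
    return s.total_ms / static_cast<double>(s.count);
}

// Share of the program total, assuming that is what the percentage reports.
double percent_of(const op_summary& s, double program_total_ms)
{
    assert(program_total_ms > 0.0);
    return 100.0 * s.total_ms / program_total_ms;
}
```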
- Remove check for div zero; push this to be detected in a separate pass · fcc84214
  Ted Themistokleous authored Jul 08, 2022
  
  fcc84214
- Add env var to enable debug symbols for gpu kernels (#1284) · adbafc06
  Paul Fultz II authored Jul 08, 2022
```
Improve the assembly dump to track where certain instructions come from.
```
  adbafc06
- Add is_supported and get_target_assignments (#1269) · 8192f37f
  varunsh authored Jul 07, 2022
```
Added is_supported and get_target_assignments methods to the target and program, respectively, to eventually support multi-target compilation and execution.
```
  8192f37f
- Dyn shape update (#1199) · 1c0b2a4a
  Charlie Lin authored Jul 07, 2022
```
Initial sketch for changes to shape to handle dynamic dimensions
```
  1c0b2a4a
07 Jul, 2022 1 commit

Add a step to unsqueeze axis (#1242) · bd503d89

Paul Fultz II authored Jul 07, 2022

Instead of just unsqueezing to an axis of 1, a step can be set to use instead. So instead of unsqueezing {3, 12} to {3, 1, 12}, a step of 2 will unsqueeze to {3, 2, 6}.
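
The shape arithmetic in this description can be sketched as below. `unsqueeze_with_step` is a hypothetical helper written for illustration, not the MIGraphX operator; it assumes the new axis takes the size of the step and the dimension that follows it is divided by the step.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: insert a new dimension of size `step` at `axis`.
// With step == 1 this is a plain unsqueeze; otherwise the dimension that
// currently sits at `axis` is split and must be divisible by `step`.
std::vector<std::size_t> unsqueeze_with_step(std::vector<std::size_t> dims,
                                             std::size_t axis,
                                             std::size_t step = 1)
{
    assert(step > 0);
    assert(axis <= dims.size());
    if(step != 1)
    {
        assert(axis < dims.size() && dims[axis] % step == 0);
        dims[axis] /= step;
    }
    dims.insert(dims.begin() + static_cast<std::ptrdiff_t>(axis), step);
    return dims;
}
```

Called with `{3, 12}`, axis 1 and the default step this gives `{3, 1, 12}`; with a step of 2 it gives `{3, 2, 6}`, matching the example in the commit message.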

bd503d89

06 Jul, 2022 1 commit

Verify load and save (#1265) · f2531606

Paul Fultz II authored Jul 05, 2022

In the verification tests, check that saving and reloading the program yields the same program. This also fixes serialization to always load instructions in the same order. There are also fixes for deconv and quant_conv, which didn't save the solution id and so were broken for serialization.

f2531606

05 Jul, 2022 8 commits
- Add jit softmax (#1243) · 8520e0b8
  Paul Fultz II authored Jul 05, 2022
```
* Add softmax kernel
```
  8520e0b8
- Replace skip_broadcasts with skip_broadcasts_converts · 47dbf164
  Ted Themistokleous authored Jul 05, 2022
```
Use this call to also skip converts when running a simplify_algebra pass over a program.
```
  47dbf164
- fixup! fixup! Add skip_broadcasts to simplify algebra value matchers · cd8c2f8e
  Ted Themistokleous authored Jul 05, 2022
  
  cd8c2f8e
- fixup! Add skip_broadcasts to simplify algebra value matchers · f715ad3f
  Ted Themistokleous authored Jul 05, 2022
  
  f715ad3f
- Horizontally fuse contiguous operators (#1232) · 27e980c4
  Paul Fultz II authored Jul 05, 2022
```
This reorders the transposes across slice to improve horizontal fusion for contiguous. This also improves eliminate_contiguous to remove contiguous better across splits.
```
  27e980c4
- Remove variables for find_zero_div_const · 16e6cd5a
  Ted Themistokleous authored Jul 05, 2022
```
Allows us to avoid the warnings without using the [[maybe_unused]] attribute.
```
  16e6cd5a
- Add skip_broadcasts to simplify algebra value matchers · 48492136
  Ted Themistokleous authored Jul 05, 2022
```
Adds this to handle broadcasted values instead of just scalars
```
  48492136
- Change threshold for has_value tolerance · 4aeacc17
  Ted Themistokleous authored Jul 05, 2022
```
Used to avoid the case where a value such as 1e-12 is erroneously matched as zero, resulting in the call being removed with an incorrect value.
```
  4aeacc17
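
How the tolerance decides whether a small value is matched as zero can be sketched as follows. `approx_equal` is a hypothetical stand-in for the value matcher, and the tolerances used in the usage note are illustrative assumptions, not the thresholds used in MIGraphX.

```cpp
#include <cmath>

// Hypothetical stand-in for a value matcher: true when `x` lies within
// `tolerance` of `target`.
bool approx_equal(double x, double target, double tolerance)
{
    return std::abs(x - target) <= tolerance;
}
```

With an assumed tolerance of 1e-9, `approx_equal(1e-12, 0.0, 1e-9)` is true, so 1e-12 would be treated as zero; with a tighter assumed tolerance of 1e-15 it is false and the value is kept.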
03 Jul, 2022 1 commit

Add mlir fusion (#1251) · ca8a54fe

Paul Fultz II authored Jul 03, 2022

* Add mlir c api

* Formatting

* Create a type attribute

* Formatting

* Parse module

* Formatting

* Add mlir dump function

* Add test case

* Formatting

* Fix tidy issues

* Update mlir version

* Update to newer mlir

* Format

* Move mlir to the gpu and update the test

* Formatting

* Fix bug when appending module

* Format

* Remove old cmake flag

* Update message

* Add return

* Format

* Add mlir_compile

* Format

* Register dialect

* Handle unsigned integers

* Dont provide output for return instruction

* Format

* Add code to insert memrefs

* Format

* Add mlir verification

* Formatting

* Enable pointwise_fusion

* Disable eliminate_data_type

* Set kernel name

* Format

* Fix device name

* Formatting

* Fix output arg

* Format

* Updates

* Update hash

* Add fuse_mlir pass

* Format

* Add fuse mlir

* Format

* Update mlir

* Sort parameter names

* Format

* Reenable disabled passes

* Remove old mlir conv

* Remove asym default padding

* Add more verbose tracing

* Format

* Fix compilation errors

* Format

* Whitelist operators

* Format

* Add namespace

* Format

* Update triple

* Format

* Use func dialect

* Format

* Use func.return

* Format

* Upgrade mlir version

* Add comment

* Handle symmetrical padding

* Format

* Cleanup debug output

* Format

* List failed tests

* Move mlir compile to jit pipeline

* Format

* Update version

* Add source locations

* Format

* Correctly add module

* Format

* Update failed tests

* Fix failures when mlir is disabled

* Format

* Update mlir version

* Check type for fp32

* Format

* Remove failed test

* Update mlir in driver

* Tidy fixes

* Format

* Tidy fixes

* Format

* Fix const

* Remove from requirements

* Fix cmake version

* Fix tidy warning

* Use another ifdef

* Fix tidy

* Other tidy fix

* Format

* Update hash

* Add missing license files

* Format

* Format

* Fix function name

ca8a54fe

01 Jul, 2022 1 commit
- fixup! fixup! fixup! Simplify algebra for 0*x, x*0 and 0/x operations · 3fc986f0
  Ted Themistokleous authored Jul 01, 2022
  
  3fc986f0
30 Jun, 2022 10 commits
- fixup! fixup! Simplify algebra for 0*x, x*0 and 0/x operations · 4b7210e1
  Ted Themistokleous authored Jun 30, 2022
  
  4b7210e1
- fixup! Simplify algebra for 0*x, x*0 and 0/x operations · d6764b7d
  Ted Themistokleous authored Jun 30, 2022
  
  d6764b7d
- fixup! Add support for division by zero errors · 41e3b9f6
  Ted Themistokleous authored Jun 30, 2022
  
  41e3b9f6
- Add support for division by zero errors · c2d4ceb0
  Ted Themistokleous authored Jun 30, 2022
```
Throw an exception when this occurs to indicate our simplification passes resulted in a singularity somewhere. Related to #1236
```
  c2d4ceb0
- Simplify algebra for 0*x, x*0 and 0/x operations · 20d2b5d9
  Ted Themistokleous authored Jun 30, 2022
```
Simplify zero multiplication and divide operations. Added appropriate test cases with returns, replacing the instruction and operand to just return zero.
```
  20d2b5d9
- Simplify x - 0 and 0 - x operations · 93f733c3
  Ted Themistokleous authored Jun 29, 2022
```
Using the unit/neg unit matchers to handle subtraction operations in the same steps. Added unit tests for both cases.
```
  93f733c3
- Combine negative unit args · 62c67b45
  Ted Themistokleous authored Jun 29, 2022
  
  62c67b45
- Simplify algebra of negative division operations · 013e18cf
  Ted Themistokleous authored Jun 28, 2022
```
Part of the changes that go with #1236. Reduces -1 divide operations to a simple negation of the parameter
```
  013e18cf
- Remove addition matcher and roll in with div/mul unit matchers · 86c464ca
  Ted Themistokleous authored Jun 29, 2022
```
Add handling for zero addition operations into the find_unit_ops() matcher functor.
```
  86c464ca
- Simplify Algebra of 0+x & x+0 = x · 8ddb505a
  Ted Themistokleous authored Jun 29, 2022
```
Added test case and code to simplify zero additions between parameters and literals during simplifications. In reference to issue #1236
```
  8ddb505a
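
The algebraic identities named in the simplification commits above (0+x, x+0, x-0, 0-x, 0*x, x*0, 0/x and division by -1) can be sketched on a toy expression tree. This is an illustration only: the `expr` type and `simplify` function are hypothetical and unrelated to the MIGraphX matcher API, and the sketch leaves 0/0 untouched rather than rewriting it.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

struct expr;
using expr_ptr = std::shared_ptr<const expr>;

// Toy expression node (hypothetical, for illustration only).
struct expr
{
    char op;          // 'c' constant, 'v' variable, 'n' negation, or + - * /
    double value;     // used when op == 'c'
    std::string name; // used when op == 'v'
    expr_ptr lhs;     // first operand (the only operand for negation)
    expr_ptr rhs;     // second operand
};

expr_ptr make(char op, double value, std::string name, expr_ptr lhs, expr_ptr rhs)
{
    return std::make_shared<expr>(
        expr{op, value, std::move(name), std::move(lhs), std::move(rhs)});
}
expr_ptr constant(double v) { return make('c', v, "", nullptr, nullptr); }
expr_ptr variable(std::string n) { return make('v', 0.0, std::move(n), nullptr, nullptr); }
expr_ptr negate(expr_ptr e) { return make('n', 0.0, "", std::move(e), nullptr); }
expr_ptr binary(char op, expr_ptr l, expr_ptr r)
{
    assert(l && r);
    return make(op, 0.0, "", std::move(l), std::move(r));
}

bool is_const(const expr_ptr& e, double v) { return e->op == 'c' && e->value == v; }

// Apply a single rewrite at the root; return the input when no rule matches.
expr_ptr simplify(const expr_ptr& e)
{
    switch(e->op)
    {
    case '+':
        if(is_const(e->lhs, 0)) return e->rhs;         // 0 + x = x
        if(is_const(e->rhs, 0)) return e->lhs;         // x + 0 = x
        break;
    case '-':
        if(is_const(e->rhs, 0)) return e->lhs;         // x - 0 = x
        if(is_const(e->lhs, 0)) return negate(e->rhs); // 0 - x = -x
        break;
    case '*':
        if(is_const(e->lhs, 0) || is_const(e->rhs, 0)) return constant(0); // 0*x = x*0 = 0
        break;
    case '/':
        if(is_const(e->rhs, -1)) return negate(e->lhs); // x / -1 = -x
        if(is_const(e->lhs, 0) && !is_const(e->rhs, 0)) return constant(0); // 0 / x = 0
        break;
    default: break;
    }
    return e;
}
```

Each rule returns an existing operand where it can, so `simplify(binary('+', constant(0), x))` hands back the same node as `x` rather than a copy.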