Commits · 85ff4f850fc9c0bb642f09fe80a010781aa94635 · gaoqiong / MIGraphX

06 Jun, 2023 2 commits

re-enable hiprtc (#1812) · 85ff4f85
Umang Yadav authored Jun 06, 2023

85ff4f85

Conditionally enable GeLU approximation (#1810) · c5d0c5b6

Umang Yadav authored Jun 05, 2023

The sigmoid approximation for GeLU was introduced in #1299 for FP16. It is known to give better performance at the cost of lower accuracy; see https://arxiv.org/pdf/1606.08415.pdf

c5d0c5b6
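To make the accuracy trade-off concrete, here is a minimal sketch comparing the erf-based GeLU with the sigmoid form `x * sigmoid(1.702 * x)` given in the linked paper. It is an illustration only: the function names are made up for this example, and whether MIGraphX uses exactly this constant is an assumption.

```python
import math


def gelu_exact(x: float) -> float:
    """GeLU defined through the Gaussian CDF: x * Phi(x)."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def gelu_sigmoid(x: float) -> float:
    """Sigmoid approximation from the paper: x * sigmoid(1.702 * x)."""
    return x / (1.0 + math.exp(-1.702 * x))


# Scan a grid on [-6, 6] and record the largest absolute gap.
grid = [i / 100.0 for i in range(-600, 601)]
max_abs_err = max(abs(gelu_exact(x) - gelu_sigmoid(x)) for x in grid)
print(f"max |exact - sigmoid approx| on [-6, 6]: {max_abs_err:.4f}")
```

The gap is small in absolute terms, but it is not zero, which is consistent with gating the approximation behind a condition rather than applying it unconditionally.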

05 Jun, 2023 1 commit

Test and doc update for shape.from_permutation() (#1742) · 68446f7a

Charlie Lin authored Jun 05, 2023

Changed the doc for find_permutation(shape) to be more clear that it is finding the permutation that would make the shape standard

68446f7a

04 Jun, 2023 1 commit
- default to ROCm 5.5 (#1808) · 5df11e0f
  Igor Mirosavljevic authored Jun 04, 2023
  
  5df11e0f
02 Jun, 2023 1 commit
- replace np.bool with bool as per numpy request (#1640) · 10c42663
  Chris Austen authored Jun 02, 2023
  
  10c42663
01 Jun, 2023 1 commit

Convert Fp16 instance-norm to FP32 temporarily (#1779) · 49b341d3

Umang Yadav authored Jun 01, 2023

Converting to FP32: the FP16 3d-unet model accuracy comes out the same as the FP32 accuracy.

Using the reduce_sum method in FP16: accuracy comes out ~0.9% lower than FP32 while keeping the entire model in FP16.

49b341d3
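One plausible reason a reduction in FP16 loses accuracy is that the running sum itself is rounded to half precision after every add. The sketch below emulates that with Python's `struct` half-precision format; it is not MIGraphX code, the helper names are hypothetical, and the wide accumulator is a 64-bit Python float standing in for FP32.

```python
import struct


def to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def sum_fp16(values):
    """Accumulate with the running total rounded to fp16 after every add."""
    total = 0.0
    for v in values:
        total = to_fp16(total + v)
    return total


def sum_wide(values):
    """Accumulate the same fp16 inputs in a wider type (64-bit float)."""
    total = 0.0
    for v in values:
        total += v
    return total


# 4096 copies of the fp16 value closest to 0.1.
values = [to_fp16(0.1)] * 4096
narrow = sum_fp16(values)
wide = sum_wide(values)
print(f"fp16 accumulator: {narrow}")
print(f"wide accumulator: {wide}")
```

Once the fp16 running total grows large enough, the spacing between representable values exceeds the addend and further adds are rounded away, so the narrow sum falls well short of the wide one. Normalization layers reduce over many elements, which is where this effect would show up.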

31 May, 2023 2 commits
- Check if generate files are different (#1789) · 37711924
  Paul Fultz II authored May 31, 2023
  
  37711924
- Update pass manager to handle multi-target compilation (#1672) · 9473e3a2
  Umang Yadav authored May 31, 2023
```
partially solves #1656
This PR only handles compilation part of multitarget.
```
  9473e3a2
30 May, 2023 2 commits

Improvements to driver output (#1710) · d32ab85b

Paul Fultz II authored May 30, 2023

Use generate_argument instead of generate_literal for python output as generate_literal does not exist
Shorten the names for variables from the main module
Use prefix p_ for parameters
Use shorter variable m for main module in python

d32ab85b

Add option to use type erased matchers to reduce symbol names (#1755) · 55f420fb
Paul Fultz II authored May 30, 2023

55f420fb

29 May, 2023 2 commits
- input parameters cleanup (#1777) · 3c93c314
  Pavle Jacovic authored May 30, 2023
  
  3c93c314
- Ensure CI labels map correctly (#1780) · 3ea6ff7b
  Chris Austen authored May 29, 2023
  
  3ea6ff7b
28 May, 2023 1 commit
- Enable quantizing both int8 and fp16 in the driver (#1757) · 26c1efa5
  Paul Fultz II authored May 28, 2023
```
* Allow quantizing for both int8 and fp16
```
  26c1efa5
25 May, 2023 1 commit
- Update cpp generator to handle inf from float (#1758) · 763dd1da
  Ted Themistokleous authored May 25, 2023
```
Use std::numeric_limits::min/max() functions plus the appropriate value to encode -inf/inf 
```
  763dd1da
24 May, 2023 2 commits
- Change compiler_replace to a class that stores the code objects directly (#1739) · 37f5df20
  Paul Fultz II authored May 24, 2023
```
Enable retrieving the code object to do tuning in the future.
```
  37f5df20
- Update xdlops/rocblas fp32 arch (#1752) · 77042e30
  kahmed10 authored May 24, 2023
```
Refactor supported gfx archs
```
  77042e30
23 May, 2023 2 commits
- Backout fp16 max/min HIP API change (#1771) · 42772fd6
  Umang Yadav authored May 23, 2023
```
back out changes for rocm-5.5
```
  42772fd6
- Readme update (#1733) · 873d0473
  Djordje Petrovic authored May 23, 2023
  
  873d0473
20 May, 2023 1 commit
- Use half HIP APIs to compute max and min (#1764) · 88fb551c
  Umang Yadav authored May 19, 2023
```
* use half hip functions to compute max and min
* add verify test for min and max
```
  88fb551c
19 May, 2023 3 commits
- update to v0.11.0 of rocm-docs-core (#1763) · 0e6ee3f7
  Chris Austen authored May 19, 2023
  
  0e6ee3f7
- Enabling native int32 type support (#1721) · 8d9d5d1c
  Zhuoran Yin authored May 19, 2023
```
Co-authored-by: Paul Fultz II <pfultz2@yahoo.com>
```
  8d9d5d1c
- Docsupdate (#1748) · 3557ce90
  Chris Austen authored May 18, 2023
```
Co-authored-by: Sam Wu <sam.wu2@amd.com>
Co-authored-by: Paul <pfultz2@yahoo.com>
```
  3557ce90
18 May, 2023 1 commit
- Use action to free space which uses apt remove to remove all the dependencies as well (#1756) · c7ca67ff
  Umang Yadav authored May 18, 2023
  
  c7ca67ff
17 May, 2023 2 commits

adjust docker files to support new rocm 5.5 (#1729) · 5e35957b
Chris Austen authored May 17, 2023
```
Move CI to support the rocm5.5 release
```
5e35957b

scalar unsqueeze broadcast support (#1753) · 2140fe19

shivadbhavsar authored May 16, 2023

Adding support for broadcasted scalars to unsqueeze op.

Specifying steps other than 1 is disallowed in this implementation since we want the output to always be a tensor. We can support varying step sizes if we allow a broadcasted scalar output from this op.

2140fe19

11 May, 2023 1 commit
- Update onnxruntime main 5a43828b3d73028bfd33b3856f82698d9ab02cb1 (#1741) · 177e5dbc
  github-actions[bot] authored May 10, 2023
```
Co-authored-by: causten <causten@users.noreply.github.com>
```
  177e5dbc
09 May, 2023 1 commit
- stop docker when failing to install to continue (#1749) · cea16502
  Chris Austen authored May 09, 2023
  
  cea16502
08 May, 2023 2 commits
- Remove workaround for Sin (#1701) · 89f7ac0d
  Umang Yadav authored May 08, 2023
  
  89f7ac0d
- Dynamic batch C++ API example (#1728) · 7cf05301
  Charlie Lin authored May 08, 2023
```
Example of using the C++ API to run an ONNX model with dynamic batch
```
  7cf05301
06 May, 2023 1 commit
- Optimize file space of github runners (#1743) · 2bebf64d
  Chris Austen authored May 06, 2023
```
Remove various files not required for what we use GitHub runners for
```
  2bebf64d
05 May, 2023 3 commits
- Python API update for dynamic batch (#1723) · ccc4b8a4
  Charlie Lin authored May 05, 2023
```
Python API with documentation updates
```
  ccc4b8a4
- [MLIR][5.7] add input fusion support for view ops (#1705) · 4996c6d7
  Manupa Karunaratne authored May 05, 2023
```
Adds support for slice, transpose, contiguous and reshape fusions into input tensors for a fused mlir kernel.
```
  4996c6d7
- add tf supported ops in driver, sort both onnx and tf alphabetically (#1732) · 4fb3fd4a
  kahmed10 authored May 04, 2023
```
add option to print tf supported ops
sort both onnx and tf ops alphabetically
```
  4fb3fd4a
04 May, 2023 2 commits

Rewrite multiplies with dot operator (#1685) · 457703a8

Paul Fultz II authored May 04, 2023

When multiplying either the input or the output across the K dimension, the multiplication can be applied to the constant, which can then be folded with propagate_const.

457703a8
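The algebra behind this rewrite can be checked with a small example: scaling the input along K before a dot product gives the same result as scaling the rows of the constant operand. The sketch below uses plain nested lists and made-up values; it shows only the identity, not the MIGraphX pass itself.

```python
def matmul(a, b):
    """Plain-Python matrix product of an M x K and a K x N nested list."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


x = [[1, 2, 3], [4, 5, 6]]      # runtime input, shape M x K = 2 x 3
w = [[1, 0], [2, 1], [0, 3]]    # constant weights, shape K x N = 3 x 2
s = [2, 3, 5]                   # one scale factor per K index

# Original form: scale the input along K, then take the dot product.
scaled_x = [[x[i][k] * s[k] for k in range(3)] for i in range(2)]
before = matmul(scaled_x, w)

# Rewritten form: move the scale onto the rows of the constant instead.
scaled_w = [[s[k] * w[k][j] for j in range(2)] for k in range(3)]
after = matmul(x, scaled_w)

print(before == after)
```

Because `scaled_w` depends only on constants, it can be computed once at compile time, which removes the multiply from the runtime graph.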

[mlir] Adding quant convolution fusion as anchor op (#1683) · 7f105952

Zhuoran Yin authored May 03, 2023

Exposed the mlir_enabled() call to decide whether the lowering pipeline is enabled
Disabled the rewrite quantization pipeline in mlir compilation
Added quant convolution as an anchor op
Fixed the return type expectations
Added the fallback hip implementation for quantizelinear and dequantizelinear
Will need advice to improve the implementation for quantizelinear

7f105952

03 May, 2023 1 commit

Update C/C++ API for dynamic batch (#1712) · 0ff00ef6

Charlie Lin authored May 02, 2023

Relies on Removed split_single_dyn_dim compile flag #1711
Exposes dynamic_dimension as an opaque object with dynamic_dimensions and optimals
Exposes ONNX dyn_input_dims and default_dyn_dim to run with dynamic batch
Updates api.py to be able to create objects from aggregate initialization (used for dynamic_dimension)
Uses offload copy for now

0ff00ef6

02 May, 2023 1 commit

Handle broadcasts across dot and concat (#1689) · a8ace295

Paul Fultz II authored May 02, 2023

Improves the constant propagation for bert models. Larger batch sizes no longer use such large constants. Also improves the speed of model compilation.

a8ace295

01 May, 2023 2 commits
- Add input parameters for performance-backup (#1719) · 942c135f
  Pavle Jacovic authored May 01, 2023
  
  942c135f
- move to 2.7.0 version (#1724) · 47bdb3f8
  Chris Austen authored May 01, 2023
  
  47bdb3f8
28 Apr, 2023 1 commit
- Removed split_single_dyn_dim compile flag (#1711) · bcc1f64a
  Charlie Lin authored Apr 28, 2023
  
  bcc1f64a