Commits · d208adfc96482e5a31ad9989fdfdc38692e360f8 · gaoqiong / MIGraphX

15 Jun, 2023 2 commits

- use __hmax, __hmin (#1813) · d208adfc
  Umang Yadav authored Jun 15, 2023
  
  d208adfc

- fix parse_instancenorm to create broadcast and multibroadcast instruc… (#1715) · 41ba30d5
  Brian Pickrell authored Jun 15, 2023

  * fix parse_instancenorm to create broadcast and multibroadcast instructions with two dynamic shape arguments instead of one. Their make_op() functions don't support dynamic shapes when called with one input. This caused an error when parsing an ONNX 3D-UNet model.
  * Use add_common_op() to create multibroadcast op.
  * add verification and parsing test for instance_norm with dynamic input. Parse test doesn't pass.
  * fix for test; still doesn't pass
  * another fix for test; still doesn't pass
  * work in progress, instance_norm_dyn_batch_test works but instance_norm_test doesn't
  * fix onnx instancenorm tests to match parser changes. Passes all check tests
  * Updated comments explaining usage of add_common_op()
  * hand-merged conflicts with develop
  * fix instance_norm_half_test after merge
  * add Onnx test instance_norm_dyn_batch_half_test
  * add shape test cases broadcast_1in_dyn_error and multibroadcast_1in_dyn_error_0

  41ba30d5
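
For context on the broadcast ops this commit touches: a multibroadcast expands its inputs to one common shape, which as far as we understand follows NumPy-style broadcasting rules. The sketch below computes such a common shape in plain Python. `common_shape` is a hypothetical helper written for illustration only; it is not MIGraphX code, and it handles static dimensions only, not the dynamic shapes the commit is about.

```python
from itertools import zip_longest

def common_shape(a, b):
    """Return the NumPy-style broadcast of two static shapes (hypothetical helper)."""
    out = []
    # Walk both shapes from the trailing dimension; a missing dimension counts as 1.
    for x, y in zip_longest(reversed(a), reversed(b), fillvalue=1):
        if x != y and 1 not in (x, y):
            raise ValueError(f"cannot broadcast {a} with {b}")
        out.append(max(x, y))
    return tuple(reversed(out))

# A (1, 3, 1, 1) tensor broadcast against a (2, 3, 4, 4) tensor.
print(common_shape((1, 3, 1, 1), (2, 3, 4, 4)))
```

Both inputs are needed to compute the result, which is consistent with the commit's point that the one-input form cannot resolve a dynamic shape.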

14 Jun, 2023 2 commits
- Fix TRACE_EVAL > 1 (#1835) · 5bf067ed
  Umang Yadav authored Jun 14, 2023
```
* add fix for the trace_eval

* Add throw for the debug builds

* Formatting

---------
Co-authored-by: Chris Austen <causten@users.noreply.github.com>
```
  5bf067ed
- Print message from driver if offload copy is set for compiled program (#1802) · aa508e1d
  Umang Yadav authored Jun 14, 2023
  
  aa508e1d
13 Jun, 2023 1 commit
- Fix shape typo in API test (#1787) · 193f105d
  Charlie Lin authored Jun 13, 2023
  
  193f105d
12 Jun, 2023 1 commit
- Enable reshape on nonstandard shapes (#1681) · 0dae73fa
  Paul Fultz II authored Jun 12, 2023
  
  0dae73fa
09 Jun, 2023 3 commits
- Enable hipRTC (#1827) · c900e382
  Chris Austen authored Jun 09, 2023
  
  c900e382
- Fix compile warnings for shadowing variable names (#1825) · dfde6d07
  Umang Yadav authored Jun 09, 2023
  
  dfde6d07
- Add missing specialization for the `nullptr` for the hash function (#1824) · 26aabd2a
  Umang Yadav authored Jun 09, 2023
```
#1791 added a hash function for the value class. It uses the visit function and has specializations for bool_type and the <vector> type, but was missing a specialization for nullptr. The missing nullptr case caused compilation issues on RHEL, SLES and CentOS.
```
  26aabd2a
08 Jun, 2023 2 commits
- Add initial CK integration plus auto-tuning for kernels (#1791) · 25af8710
  Paul Fultz II authored Jun 08, 2023
```
Enable with MIGRAPHX_ENABLE_CK=1 and the --exhaustive-tune flag
```
  25af8710
- disable hipRTC temporarily (#1817) · e5a33aad
  Chris Austen authored Jun 07, 2023
  
  e5a33aad
06 Jun, 2023 2 commits

- re-enable hiprtc (#1812) · 85ff4f85
  Umang Yadav authored Jun 06, 2023
  
  85ff4f85

- Conditionally enable GeLU approximation (#1810) · c5d0c5b6
  Umang Yadav authored Jun 05, 2023

  A sigmoid approximation for GeLU was introduced in #1299 for FP16. The sigmoid approximation is known to give better performance but lower accuracy. https://arxiv.org/pdf/1606.08415.pdf

  c5d0c5b6
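
To make the accuracy trade-off concrete, the sketch below compares the exact GeLU, x·Φ(x), with the sigmoid approximation x·σ(1.702x) given in the linked paper. Whether MIGraphX uses exactly this constant is an assumption; the function names are illustrative and not taken from the repository.

```python
import math

def gelu_exact(x):
    """GELU as x * Phi(x), with Phi the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))

def gelu_sigmoid(x):
    """Sigmoid approximation from the paper: x * sigmoid(1.702 * x)."""
    return x / (1.0 + math.exp(-1.702 * x))

# Largest absolute gap between the two on a grid over [-6, 6].
xs = [i / 100.0 for i in range(-600, 601)]
max_err = max(abs(gelu_exact(x) - gelu_sigmoid(x)) for x in xs)
print(f"max |exact - sigmoid approx| = {max_err:.4f}")
```

The gap is small in absolute terms but not zero, which is presumably why the approximation is enabled conditionally rather than always.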

05 Jun, 2023 1 commit

- Test and doc update for shape.from_permutation() (#1742) · 68446f7a
  Charlie Lin authored Jun 05, 2023

  Changed the doc for find_permutation(shape) to make it clearer that it finds the permutation that would make the shape standard.

  68446f7a
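
A minimal sketch of what "the permutation that would make the shape standard" means, assuming it amounts to ordering the axes by decreasing stride. `find_permutation_sketch` and `reorder` are hypothetical names for illustration, and the tie-breaking on length is an assumption rather than the library's documented behavior.

```python
def find_permutation_sketch(lens, strides):
    """Order axes by decreasing stride (ties broken by decreasing length)."""
    axes = range(len(lens))
    return sorted(axes, key=lambda i: (strides[i], lens[i]), reverse=True)

def reorder(values, perm):
    """Apply a permutation of axes to a list of per-axis values."""
    return [values[i] for i in perm]

# A standard [2, 3, 4] shape (strides [12, 4, 1]) viewed with its axes transposed.
lens, strides = [4, 2, 3], [1, 12, 4]
perm = find_permutation_sketch(lens, strides)
print(perm, reorder(lens, perm), reorder(strides, perm))
```

Applying the returned permutation to the transposed view recovers lengths and strides in standard (row-major, packed) order.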

04 Jun, 2023 1 commit
- default to ROCm 5.5 (#1808) · 5df11e0f
  Igor Mirosavljevic authored Jun 04, 2023
  
  5df11e0f
02 Jun, 2023 1 commit
- replace np.bool with bool as per numpy request (#1640) · 10c42663
  Chris Austen authored Jun 02, 2023
  
  10c42663
01 Jun, 2023 1 commit

- Convert Fp16 instance-norm to FP32 temporarily (#1779) · 49b341d3
  Umang Yadav authored Jun 01, 2023

  Converting to FP32: the FP16 3D-UNet model's accuracy comes out the same as the FP32 accuracy.

  Using the reduce_sum method in FP16: accuracy comes out ~0.9% lower than FP32 while keeping the entire model in FP16.

  49b341d3
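
The following sketch illustrates why a reduction accumulated in half precision can lose accuracy compared with one accumulated in single precision. It is a deliberately extreme case (a naive sequential sum in pure Python) and not a model of MIGraphX's reduce_sum kernel, so the error here is far larger than the ~0.9% reported above. The helpers `to_half`, `to_single` and `mean` are hypothetical.

```python
import struct

def to_half(x):
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("e", struct.pack("e", x))[0]

def to_single(x):
    """Round a Python float to the nearest IEEE 754 single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def mean(values, round_fn):
    """Running-sum mean where every partial sum is rounded by round_fn."""
    total = 0.0
    for v in values:
        total = round_fn(total + v)
    return round_fn(total / len(values))

values = [to_half(0.1)] * 4096  # inputs are stored in half precision either way
mean_fp16 = mean(values, to_half)
mean_fp32 = mean(values, to_single)
print(f"fp16 accumulation: {mean_fp16:.5f}, fp32 accumulation: {mean_fp32:.5f}")
```

Once the half-precision running sum grows large enough, adding a small value no longer changes it, so the mean drifts away from 0.1. Accumulating in single precision avoids this for the same inputs.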

31 May, 2023 2 commits
- Check if generate files are different (#1789) · 37711924
  Paul Fultz II authored May 31, 2023
  
  37711924
- Update pass manager to handle multi-target compilation (#1672) · 9473e3a2
  Umang Yadav authored May 31, 2023
```
partially solves #1656
This PR only handles compilation part of multitarget.
```
  9473e3a2
30 May, 2023 2 commits

- Improvements to driver output (#1710) · d32ab85b
  Paul Fultz II authored May 30, 2023

  * Use generate_argument instead of generate_literal for python output, as generate_literal doesn't exist
  * Shorten the names for variables from the main module
  * Use prefix p_ for parameters
  * Use shorter variable m for main module in python

  d32ab85b

- Add option to use type erased matchers to reduce symbol names (#1755) · 55f420fb
  Paul Fultz II authored May 30, 2023
  
  55f420fb

29 May, 2023 2 commits
- input parameters cleanup (#1777) · 3c93c314
  Pavle Jacovic authored May 30, 2023
  
  3c93c314
- Ensure CI labels map correctly (#1780) · 3ea6ff7b
  Chris Austen authored May 29, 2023
  
  3ea6ff7b
28 May, 2023 1 commit
- Enable quantizing both int8 and fp16 in the driver (#1757) · 26c1efa5
  Paul Fultz II authored May 28, 2023
```
* Allow quantizing for both int8 and fp16
```
  26c1efa5
25 May, 2023 1 commit
- Update cpp generator to handle inf from float (#1758) · 763dd1da
  Ted Themistokleous authored May 25, 2023
```
Use std::numeric_limits::min/max() functions plus the appropriate value to encode -inf/inf 
```
  763dd1da
24 May, 2023 2 commits
- Change compiler_replace to a class that stores the code objects directly (#1739) · 37f5df20
  Paul Fultz II authored May 24, 2023
```
Enable retrieving the code object to do tuning in the future.
```
  37f5df20
- Update xdlops/rocblas fp32 arch (#1752) · 77042e30
  kahmed10 authored May 24, 2023
```
Refactor supported gfx archs
```
  77042e30
23 May, 2023 2 commits
- Backout fp16 max/min HIP API change (#1771) · 42772fd6
  Umang Yadav authored May 23, 2023
```
back out changes for rocm-5.5
```
  42772fd6
- Readme update (#1733) · 873d0473
  Djordje Petrovic authored May 23, 2023
  
  873d0473
20 May, 2023 1 commit
- Use half HIP APIs to compute max and min (#1764) · 88fb551c
  Umang Yadav authored May 19, 2023
```
* use half hip functions to compute max and min
* add verify test for min and max
```
  88fb551c
19 May, 2023 3 commits
- update to v0.11.0 of rocm-docs-core (#1763) · 0e6ee3f7
  Chris Austen authored May 19, 2023
  
  0e6ee3f7
- Enabling native int32 type support (#1721) · 8d9d5d1c
  Zhuoran Yin authored May 19, 2023
```
Co-authored-by: Paul Fultz II <pfultz2@yahoo.com>
```
  8d9d5d1c
- Docsupdate (#1748) · 3557ce90
  Chris Austen authored May 18, 2023
```
Co-authored-by: Sam Wu <sam.wu2@amd.com>
Co-authored-by: Paul <pfultz2@yahoo.com>
```
  3557ce90
18 May, 2023 1 commit
- Use action to free space which uses apt remove to remove all the dependencies as well (#1756) · c7ca67ff
  Umang Yadav authored May 18, 2023
  
  c7ca67ff
17 May, 2023 2 commits

- adjust docker files to support new rocm 5.5 (#1729) · 5e35957b
  Chris Austen authored May 17, 2023
```
Move CI to support the rocm5.5 release
```
  5e35957b

- scalar unsqueeze broadcast support (#1753) · 2140fe19
  shivadbhavsar authored May 16, 2023

  Adding support for broadcasted scalars to unsqueeze op.

  Specifying steps other than 1 is disallowed in this implementation since we want the output to always be a tensor. We can support varying step sizes if we allow a broadcasted scalar output from this op.

  2140fe19

11 May, 2023 1 commit
- Update onnxruntime main 5a43828b3d73028bfd33b3856f82698d9ab02cb1 (#1741) · 177e5dbc
  github-actions[bot] authored May 10, 2023
```
Co-authored-by: causten <causten@users.noreply.github.com>
```
  177e5dbc
09 May, 2023 1 commit
- stop docker when failing to install to continue (#1749) · cea16502
  Chris Austen authored May 09, 2023
  
  cea16502
08 May, 2023 2 commits
- Remove workaround for Sin (#1701) · 89f7ac0d
  Umang Yadav authored May 08, 2023
  
  89f7ac0d
- Dynamic batch C++ API example (#1728) · 7cf05301
  Charlie Lin authored May 08, 2023
```
Example of using the C++ API to run an ONNX model with dynamic batch
```
  7cf05301