Commits · 01d4ae095731846c8eee4d835610184c51da68a7 · gaoqiong / MIGraphX

09 Jul, 2023 1 commit
- Add MIGraphX version inside MXR file (#1866) · 8428a242
  Umang Yadav authored Jul 09, 2023
  
  8428a242
06 Jul, 2023 1 commit

Enable eval to handle multiple contexts (#1751) · 072fd5cc

Paul Fultz II authored Jul 06, 2023

This is to help enable multi-target execution. We store a vector of targets and contexts. Currently this will only compile a single target; PR #1672 is needed to enable multiple targets.

This will also serialize the targets and contexts.

When using the execution_environment or prog.get_context(), it will always use the context from the first target, assuming this is the "primary" target. However, it's unlikely a user would use execution_environment with a multi-target environment.

072fd5cc

30 Jun, 2023 1 commit
- Remove reduce (#1907) · 46d78a0f
  Umang Yadav authored Jun 30, 2023
  
  46d78a0f
26 Jun, 2023 1 commit
- Print max,min,mean and stddev values for TRACE_EVAL = 2 (#1864) · 84a8f450
  Umang Yadav authored Jun 26, 2023
  
  84a8f450
14 Jun, 2023 1 commit

Fix TRACE_EVAL > 1 (#1835) · 5bf067ed

Umang Yadav authored Jun 14, 2023



* add fix for the trace_eval

* Add throw for the debug builds

* Formatting

---------
Co-authored-by: Chris Austen <causten@users.noreply.github.com>

5bf067ed

31 May, 2023 1 commit
- Update pass manager to handle multi-target compilation (#1672) · 9473e3a2
  Umang Yadav authored May 31, 2023
```
partially solves #1656
This PR only handles compilation part of multitarget.
```
  9473e3a2
30 May, 2023 1 commit

Improvements to driver output (#1710) · d32ab85b

Paul Fultz II authored May 30, 2023

Use generate_argument instead of generate_literal for python output as generate_literal doesn't exist
Shorten the names for variables from the main module
Use prefix p_ for parameters
Use shorter variable m for main module in python

d32ab85b

06 Apr, 2023 1 commit

Driver dynamic batch update (#1652) · adccec52

Charlie Lin authored Apr 06, 2023

Examples:

bin/driver verify /codes/onnx_models/resnet50-v1-7/resnet50-v1-7.onnx --split-single-dyn-dim --batch 3 --dyn-input-dim @data "[{min:1, max:4}, 3, 224, 224]"

bin/driver compile /codes/onnx_models/resnet50-v1-7/resnet50-v1-7.onnx --split-single-dyn-dim --default-dyn-dim "{min:1, max:10}" --output resnet50_batch1-10.mxr

bin/driver perf resnet50_batch1-10.mxr --batch 4

adccec52

28 Feb, 2023 1 commit

Select module op (#1569) · a63ee2e0

Charlie Lin authored Feb 28, 2023

Creates the select_module operator, which selects one of the submodules passed to it to run based on the submodule parameters. A submodule is selected when its parameters have exactly the same static shapes as the arguments passed to select_module.
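
The selection rule described above can be sketched in a few lines of Python. This is only an illustration of shape-based matching, not the operator's actual implementation: the `select_submodule` helper, the submodule names, and the dictionary layout are all hypothetical.

```python
from typing import Dict, List, Tuple

Shape = Tuple[int, ...]


def select_submodule(arg_shapes: List[Shape],
                     submodules: Dict[str, List[Shape]]) -> str:
    """Return the name of the submodule whose parameter shapes
    are exactly equal to the static shapes of the arguments."""
    for name, param_shapes in submodules.items():
        if param_shapes == arg_shapes:
            return name
    raise ValueError(f"no submodule matches shapes {arg_shapes}")


# Hypothetical submodules, one per static batch size.
submodules = {
    "batch_1": [(1, 3, 224, 224)],
    "batch_2": [(2, 3, 224, 224)],
    "batch_4": [(4, 3, 224, 224)],
}

# An argument with batch size 2 selects the batch_2 submodule.
selected = select_submodule([(2, 3, 224, 224)], submodules)
```

In this sketch a shape with no matching submodule raises an error; how the real operator handles that case is not stated in the commit message.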

a63ee2e0

16 Feb, 2023 1 commit

Add flag for tuning in migraphx-driver (#1519) · cc098f4d

Umang Yadav authored Feb 15, 2023

* Add driver flag "--exhaustive-tune" to enable tuning, add support for the same in C/C++ and python API

cc098f4d

02 Feb, 2023 1 commit
- Dynamic gathernd (#1480) · d478675c
  Brian Pickrell authored Feb 02, 2023
```
Dynamic shape support for gathernd op.
```
  d478675c
14 Dec, 2022 1 commit
- Print program as python (#1490) · 56c43445
  Paul Fultz II authored Dec 14, 2022
```
* Print python code
```
  56c43445
04 Oct, 2022 1 commit
- Stream sync Changeset (#1358) · f7d987ba
  Ted Themistokleous authored Oct 04, 2022
```
Stream sync changes and associated API level changes
```
  f7d987ba
06 Sep, 2022 1 commit
- Enable cppcheck rule for 'not', 'or' keywords (#1361) · d37a4df9
  Paul Fultz II authored Sep 06, 2022
```
Using not and or improves readability. The cppcheck rule will help ensure we are doing it consistently.
```
  d37a4df9
21 Aug, 2022 1 commit

Update is_supported (#1334) · 79e15ca9

varunsh authored Aug 21, 2022

* Update is_supported
* Return object from is_supported
* Return by reference in iterator

79e15ca9

04 Aug, 2022 1 commit

Dynamic ref convolution op (#1224) · 67f77ac1

Charlie Lin authored Aug 04, 2022



* Dynamic shape handling in shape object

* rewrite empty lens multibroadcast test

* Shape class changes to handle dynamic
* More throw errors for functions that don't make sense for dynamic shape
* Print output changes
* Serialization changes

* Fixing serialization errors

* Remove const on dyn_dim copy getters

* Dynamic shape tests

* Fix serialize errors

* Add dyn_data struct to avoid ambiguous constructor

* Tidy fix: emplace_back() over for loop

* Tidy fix: use move

* Use std::initializer_list in constructor
Reverts the dyn_data struct change
Should get around the ambiguous braced initialization list error

* avoid typedef

* element_space, min,max,opt _lens change

* formatting

* Comments fix

* dynamic bytes() test

* Serialize and reflect changes

* formatting

* Test the dynamic lens functions

* progress

* Formatting

* Dynamic conv draft progress

* Add operator<< tests for coverage

* Coverage update

* Add to conv dynamic batch test

* Dynamic image size test

* Dynamic weight handling

* Dyn image shape test change, fix dyn weight cond

* Comment update

* Dynamic weights shape test and fix

* Use ternary operator

* Tidy fixes

* Handle dynamic graph input shapes in ONNX parser

* Formatting

* Handle dynamic shape for convolution

* formatting

* cppcheck fixes

* Add onnx test files

* Fix typo

* Disable auto_pad for dynamic input shape

* check_shapes object checks for allowing dynamic shapes

* Fix any_of

* Change to maintain const objectness

* Formatting

* Check shapes allow dynamic

* Refactor compute_shape() call into op.compute()
Allows for per operator differences with handling dynamic shape
Fix operation.hpp change to use the generator

* Comment fix

* Refactor normalize_attributes() calls to use max_lens()

* Comment addition

* Update other normalize_attributes() calls

* Change to using constructor and add tests

* Use const member function

* Add more dynamic shape support

* Add tests for error code coverage

* Fix opt shape bug and add shape tests

* capture all by ref

* Fix typo with img shape calculation

* Add more tests

* dynamic auto pad attempt
Linker error with pad_calc.cpp

* Fix parse dyn auto_pad
Should only need to use dynamic auto pad when the image shape or kernel
shape are dynamic. For a dynamic batch size, the auto pad calculation is
the same.

* Fix linking error

* Fix auto_pad bug
Fixed input tensor with auto_pad setting on

* auto_pad onnx tests

* Fix auto_pad calculation, evaluate in ref_conv
add ref_ops tests

* Add shape tests, fix bugs

* Refactor first two output dynamic len calculation

* Conv MLIR test update

* i64 MLIR test fix

* Fix MLIR test typo
Co-authored-by: Chris Austen <causten@users.noreply.github.com>

67f77ac1

08 Jul, 2022 2 commits

Update perf report to show the number of operators and per operator avg time in summary (#1287) · 05b13c9f

Paul Fultz II authored Jul 08, 2022

Show the number of operators and per operator avg time in summary...

Summary:
gpu::gemm: 8.738ms / 73 = 0.119699ms, 64%
gpu::triadd_layernorm: 0.831381ms / 24 = 0.0346409ms, 7%
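
The per-operator average in each summary line is the total time divided by the number of operators. A minimal Python sketch of that arithmetic follows; the `summary_line` helper is hypothetical, and the percentage column is left out because the overall total it is based on is not shown in the message.

```python
def summary_line(name: str, total_ms: float, count: int) -> str:
    """Format one summary entry: total time, operator count, average time."""
    avg_ms = total_ms / count
    # ':g' keeps six significant digits, like the figures quoted above.
    return f"{name}: {total_ms:g}ms / {count} = {avg_ms:g}ms"


lines = [
    summary_line("gpu::gemm", 8.738, 73),
    summary_line("gpu::triadd_layernorm", 0.831381, 24),
]
print("\n".join(lines))
```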

05b13c9f

Add is_supported and get_target_assignments (#1269) · 8192f37f

varunsh authored Jul 07, 2022

Added is_supported and get_target_assignments methods to the target and program, respectively, to eventually support multi-target compilation and execution.

8192f37f

06 Jul, 2022 1 commit

Verify load and save (#1265) · f2531606

Paul Fultz II authored Jul 05, 2022

In the verification tests, check that saving and reloading the program produces the same program. This also fixes serialization to always load instructions in the same order. There are also fixes for deconv and quant_conv, which didn't save the solution id and were broken for serialization.

f2531606

29 Jun, 2022 1 commit

Update driver models to use json strings (#1244) · ad27d0d6

Paul Fultz II authored Jun 29, 2022

Compiles significantly faster than constructing all the objects. It also reduces recompiles.

ad27d0d6

26 Jun, 2022 1 commit
- Get parent module in the pass manager (#1181) · 3a5c4306
  Paul Fultz II authored Jun 26, 2022
```
* Add function to get a module tree
* Get parent module in the pass manager
```
  3a5c4306
22 Jun, 2022 1 commit
- Update license files (#1248) · e44cecbc
  Ted Themistokleous authored Jun 22, 2022
```
Updated each source file in the repo with the existing license.
```
  e44cecbc
11 Mar, 2022 1 commit

Improve print ins (#1096) · b3b44f5d

Shucai Xiao authored Mar 11, 2022

The module::debug_print(ins) is very slow, which makes trace_eval==1/2 very slow. The reason is that printing an ins involves searching the whole module to get the instruction, then printing it. This change fixes that by calling module::print() to get the names of all instructions of a program, then printing each instruction by getting its name from a hash map.
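
The idea of trading a per-print search for a one-time name map can be sketched in Python. This is an illustration under assumptions, not the MIGraphX code: the `Instruction` class, the helper names, and the `@N` naming scheme are hypothetical.

```python
class Instruction:
    """Hypothetical stand-in for an instruction in a module."""

    def __init__(self, op: str):
        self.op = op


def name_by_search(module, ins) -> str:
    # Slow path: scan the whole module on every print, O(n) per lookup.
    for index, candidate in enumerate(module):
        if candidate is ins:
            return f"@{index}"
    raise KeyError("instruction not in module")


def build_name_map(module) -> dict:
    # Fast path: walk the module once and remember every name.
    return {id(ins): f"@{index}" for index, ins in enumerate(module)}


module = [Instruction(op) for op in ("param", "convolution", "relu", "return")]
names = build_name_map(module)

# Each traced instruction now costs one hash lookup instead of a scan.
traced = [f"{names[id(ins)]} = {ins.op}" for ins in module]
```

Both paths return the same names; only the cost per lookup differs, which matters when every evaluated instruction is printed.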

b3b44f5d

02 Mar, 2022 1 commit
- Clang format ver10 (#1106) · 9852aaef
  bpickrel authored Mar 02, 2022
```
Update the base version of clang-format from 5.0 to 10.0
```
  9852aaef
02 Feb, 2022 1 commit

Update trace_eval to preview the output buffers (#1073) · b20e3d4d

Paul Fultz II authored Feb 02, 2022

Currently, MIGRAPHX_TRACE_EVAL=2 prints out the entire output buffer, but this can produce a lot of output. To make it easier to inspect and debug, MIGRAPHX_TRACE_EVAL=2 now only prints 10 elements from the buffer (the first 5 and last 5) and shows any fp classifications found in the buffer (i.e. NaNs, infinity, etc.). The previous behavior can still be enabled with MIGRAPHX_TRACE_EVAL=3.
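
A rough Python sketch of this kind of preview is shown below. It only illustrates the "first 5, last 5, plus fp classification" idea; the helper names and the output format are assumptions and do not reproduce the actual trace output.

```python
import math


def preview(buffer, edge=5):
    """Show only the first and last `edge` elements of a buffer."""
    if len(buffer) <= 2 * edge:
        shown = [str(x) for x in buffer]
    else:
        head = [str(x) for x in buffer[:edge]]
        tail = [str(x) for x in buffer[-edge:]]
        shown = head + ["..."] + tail
    return ", ".join(shown)


def fp_classes(buffer):
    """Report which special floating-point classes occur in the buffer."""
    found = set()
    for x in buffer:
        if math.isnan(x):
            found.add("nan")
        elif math.isinf(x):
            found.add("inf")
    return sorted(found)


data = [float(i) for i in range(20)]
data[7] = float("nan")  # hidden from the preview, but still classified

print(preview(data))
print(fp_classes(data))
```

The classification scans the whole buffer, so a NaN in the middle is reported even though the preview itself does not show it.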

b20e3d4d

15 Nov, 2021 1 commit

Update driver's perf report to account for batch size (#1000) · 19f65e7e

kahmed10 authored Nov 15, 2021

Currently we have the option of passing in --batch to the driver to change the batch size when the model has a dynamic dim value. We can use this flag to adjust the perf report's rate.

19f65e7e

15 Oct, 2021 1 commit

Enabling rocTX markers for migraphx-driver via roctx knob (#946) · 4a71ec8c

Cagri authored Oct 14, 2021



Added features:
This enables wrapping each migraphx operator with rocTX markers.
It adds a new knob, trace, to the migraphx-driver binary.

Limitation:

rocTX standalone does not output a file; it needs to be used with rocprof. Example command line:

/opt/rocm/bin/rocprof -i ./in.txt --hip-trace --roctx-trace --flush-rate 10ms --timestamp on -d cagri_out --obj-tracking on /opt/rocm/bin/migraphx-driver trace ./resnet50-v2-7.onnx --onnx --gpu
Co-authored-by: Shucai Xiao <shucai@gmail.com>

4a71ec8c

13 Oct, 2021 1 commit

Trace eval segfault (#974) · 337c5ba1

Shucai Xiao authored Oct 13, 2021

When running a model on GPU, migraphx tries to print out content from gpu memory, which causes a segfault. The solution is to copy the gpu memory content back to CPU before the print.

337c5ba1

16 Sep, 2021 1 commit

Loop operator (#853) · a275f590

Shucai Xiao authored Sep 16, 2021

Add Loop operator for opset version 13.
Notes: 1) The default max iteration number is 10 if no max iteration number is provided.
2) To change the max iteration number, a user can set max_loop_iterations in the onnx_option struct when parsing a model.
3) The returned shape of the scan output is derived from max_loop_iterations even if the actual loop count is less than that. This issue also applies to other operators like NonZero and NonMaxSuppression. Issue #948 was created to track this, to be resolved later.
Co-authored-by: Paul <pfultz2@yahoo.com>
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>
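
Note 3 can be illustrated with a small Python sketch in which the scan output is sized from the iteration limit rather than from the number of iterations that actually ran. The `run_loop` helper, the zero padding, and the body signature are assumptions made for illustration; only the default limit of 10 comes from the notes above.

```python
def run_loop(body, max_loop_iterations=10):
    """Run `body` until it asks to stop or the iteration limit is reached.

    The scan output is allocated for max_loop_iterations entries up front,
    so its length does not depend on how many iterations actually ran.
    """
    scan = [0.0] * max_loop_iterations  # padding value is an assumption
    iterations = 0
    for i in range(max_loop_iterations):
        value, keep_going = body(i)
        scan[i] = value
        iterations += 1
        if not keep_going:
            break
    return scan, iterations


# Hypothetical body: emit i squared and stop after the third iteration.
scan, iterations = run_loop(lambda i: (float(i * i), i < 2))
```

Here the loop runs three times, yet `scan` still has ten entries, which mirrors the shape behavior the note describes.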

a275f590

10 Sep, 2021 1 commit
- Assert the shape for compute and compute_shape are the same (#936) · 8b4c69c5
  Paul Fultz II authored Sep 10, 2021
```
Assert shapes don't change
Co-authored-by: Shucai Xiao <Shucai.Xiao@amd.com>
```
  8b4c69c5
07 Sep, 2021 2 commits

qdq for quantization and include subgraph (#891) · b45f7239

Shucai Xiao authored Sep 07, 2021



Add operators, refactor parsers, add rewrite passes, add tests
Add ref implementations
Move broadcasting of scales and zero points to onnx parser
Allow for x and zero_point to have different types in quantizelinear; fix zero_point default type
fp16 and fp8 quantization to include subgraph and parameters
fix unit test to use qdq operators for int8 quantization
Co-authored-by: turneram <alturner@amd.com>

b45f7239

Improve check_context to handle submodules better (#927) · fdaa21ee
Paul Fultz II authored Sep 07, 2021
```
Improve check_context to handle submodules better
Co-authored-by: Shucai Xiao <shucai@gmail.com>
```
fdaa21ee

27 Jun, 2021 1 commit

Inline subgraph (#802) · bc52a8a8

Shucai Xiao authored Jun 27, 2021



* Add definitions for all pointwise operators

* Formatting

* Add cpp generator class

* Formatting

* Move compilation to core

* Formatting

* Add clock to tmp name

* Add dynamic loader

* Formatting

* Add tests for code gen

* Formatting

* Add test for literals

* Formatting

* Use with_char

* Add missing header

* Fix mismerge

* Ignore tidy warning

* Fix gcc 5 errors

* Apply fixits

* Skip signed bitwise of status

* Remove unused parameters

* Explicitly add c++14 flag

* Fix tidy warning

* unify the compute function signature

* clang format

* make another change

* unify the compute function

* clang format

* remove unnecessary code

* more refinement about the operator compute function

* clang format

* add an overload function

* clang format

* add support for axes inputs for squeeze/unsqueeze/reduce_sum

* clang format

* fix build problems

* backup code changes

* clang format

* Add tuple type to shape class

* Formatting

* fix a bug in parsing quantizelinear operator

* clang format

* fix a cppcheck error

* disable different versions of unit tests for different onnx version

* clang format

* upgrade onnx to 1.8

* update onnx to 1.8.1

* disable two more real models

* clang format

* Make data member private

* Formatting

* Add sub arguments

* Formatting

* Turn clang format off

* Disable clang-format

* fix review comments

* fix the function of assign axes in parsing the squeeze operator

* add unit tests and fix a bug

* clang format

* fix review comments

* clang format

* fix a build error

* backup code changes

* clang format

* add more unit tests and add parsing opset version

* clang format

* Improve visiting tuples

* Formatting

* fix cppcheck error

* adding installing the onnx package

* resolve no protobuf compiler

* add an inline subgraph pass

* clang format

* Add more argument tests

* Formatting

* Handle tuple in load

* Formatting

* code backup

* clang format

* Remove .o files

* Add tuple type to api

* Formatting

* fix build errors

* clang format

* code backup

* code backup

* add unit tests for the inline subgraph

* clang format

* refine the inline subgraph and parse if operator

* clang format

* fix cppcheck issue

* clang format

* add unit test for inline subgraph pass

* clang format

* fix format issue

* remove the context from the if operator

* clang format

* simplify the compute functions

* Fix tidy warnings

* fix cppcheck error

* clang format

* fix cppcheck error

* Fix tidy warnings

* fix a cppcheck error

* clang format

* Add a test for share method

* Formatting

* Add a test cpp_type

* add unit tests for more code coverage

* clang format

* add unit tests to have more code coverage

* clang format

* try a comment in jenkins build

* include the install onnx line

* code backup

* reorder the dependencies installed

* refine dockerfile

* fix review comments

* clang format

* remove unnecessary overload function

* fix cppcheck error

* change back the argument test

* Suppress tidy warning

* add the operator get_tuple_elem

* clang format

* add get_tuple_elem to operator include file

* change if to support multiple operation outputs

* clang format

* optimize inline subgraph

* clang format

* code backup

* clang format

* fix bug

* refine unit tests for tuple output of the if operator

* clang format

* refine an instruction replacement code

* add a unit test and sort all the unit tests alphabetically

* fix cppcheck error

* add more unit tests for multiple op outputs

* clang format

* fix cppcheck error

* Update pass manager to get modules after every pass

* more unit test to cover more scenarios

* clang format

* fixed a bug in a unit test

* add more tests

* clang format

* add more unit tests to have more code coverage

* fix a bug in a unit test

* Add program overload for module

* Formatting

* Hash modules for quicker lookup of modules

* Bump file version

* Add methods to remove modules

* Formatting

* add the tuple type to the support list

* Eliminate unused modules

* Formatting

* Fix test errors

* Formatting

* Fix tidy issues

* fix problem related to inline subgraph

* clang format

* fix review comments

* fix review comments

* fix review comments

* fix review comments

* clang format

* fix a unit test

* one more code change

* remove an optimization related to the if operator

* clang format

* fix review comments
Co-authored-by: Paul <pfultz2@yahoo.com>
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

bc52a8a8

09 Jun, 2021 1 commit

Fixes when using C++ debug assertions (#837) · 98486807

Paul Fultz II authored Jun 09, 2021



* Enable libstdc++ debug mode

* Add is_end function

* Compare addresses in a map or set

* Formatting

* Check end

* Fix comparison of instruction_ref

* Formatting

* Some more iterator fixes

* Formatting

* Fix assert

* Fix invalid iterators

* Fix debug print in program

* Remove debug flag for now

* Set correct bool type
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

98486807

02 Jun, 2021 1 commit

Add perf grouping (#841) · 0a9c18a5

Paul Fultz II authored Jun 02, 2021


Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

0a9c18a5

25 May, 2021 1 commit

Print out time when trace_eval is enabled (#836) · c517bda7

Paul Fultz II authored May 25, 2021



* Add timing to trace eval

* Formatting
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

c517bda7

07 May, 2021 1 commit

Update dead_code_elimination to remove unused modules (#820) · 43230d29

Paul Fultz II authored May 07, 2021

* Update pass manager to get modules after every pass

* Add program overload for module

* Formatting

* Hash modules for quicker lookup of modules

* Bump file version

* Add methods to remove modules

* Formatting

* Eliminate unused modules

* Formatting

* Fix test errors

* Formatting

* Fix tidy issues

43230d29

01 May, 2021 2 commits
- Formatting · 40bab788
  Paul authored Apr 30, 2021
  
  40bab788
- Hash modules for quicker lookup of modules · a851f699
  Paul authored Apr 30, 2021
  
  a851f699
22 Apr, 2021 1 commit

Cpu fusions using post_ops (#781) · f7befe50

Paul Fultz II authored Apr 22, 2021



* Add eliminate_data_type pass

* Formatting

* Auto convert quant ops

* Formatting

* Flip the order of decompose

* Compute max size differently

* Formatting

* Clamp values in convert

* Formatting

* Fix loss of precision in reduce

* Formatting

* Fix bugs in reduction

* Fix accumulator type in reference softmax implementation

* Formatting

* Update convert test

* Remove unused variables

* Remove unnecessary quant_dot check

* Formatting

* Add tests

* Formatting

* Remove unused code

* Remove duplicate ops

* Remove blaze dependency

* Use set since shape::type_t is not hashable on gcc 5

* Formatting

* Add dnnl binary op

* Formatting

* Add binary and eltwise

* Formatting

* Add softmax

* Formatting

* Remove unused operators

* Add missing files

* Formatting

* Add lrn

* Formatting

* Add deconvolution

* Formatting

* Change allocate default

* Add reorder

* Formatting

* Add reductions

* Formatting

* Sort lines

* Change literals in another loop

* Add pow operator

* Formatting

* Add pow operator

* Formatting

* Make sure shapes are packed

* Allow broadcasted inputs

* Remove unused operators

* Simplify functions

* Remove softmax

* Add sub and erf functions

* Formatting

* Fix bug

* Formatting

* Improve parallelism

* Formatting

* Allow multiple batch dimensions

* Formatting

* Move literal transforms out of lowering

* Formatting

* Add gather operator

* Sort lines

* Add early exit for carry

* Formatting

* Add missing concat

* Rename macro

* Fix deep nesting

* Formatting

* Fix cppcheck issues

* Remove else

* Move attribute to typedef

* Formatting

* Disable maybe-uninitialized warning since it's broken on gcc

* Add constexpr default constructor

* Formatting

* Fix compiler warnings

* Fix adjust_allocation test

* Add layernorm matcher

* Add gelu_erf matcher

* Formatting

* Add gelu_tanh matcher

* Formatting

* Remove match namespace

* Formatting

* Use matcher instead of string

* Formatting

* Add fusions

* Formatting

* Add post op field

* Formatting

* Make post_ops serializable

* Formatting

* Add eltwise fusions

* Formatting

* Fix null conversions

* Formatting

* Add fuse_ops source files

* Formatting

* Set binary post op index correctly

* Formatting

* Fix serialization bugs

* Check if used once

* Formatting

* Fix error in get_primitive_attr

* Formatting

* Add compile function

* Formatting

* Limit fusions

* Formatting

* Disable with env variable instead of using compile arg

* Formatting

* Fix implicit conversion to bool

* Declare on separate lines

* Formatting

* Fix cppcheck issues

* Fix ICE in pack_join

* Formatting

* Use const ref

* Make enum hashable

* Formatting

* Add explicit this

* Fix merge issues

* Fix dangling ref

* Formatting

* Add test for compile

* Formatting

* Add more value tests

* Formatting
Co-authored-by: Shucai Xiao <shucai@gmail.com>
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

f7befe50