Commits · 350bbea214020f2f9a90129ee6bf6af2b3448d9c · gaoqiong / MIGraphX

14 Sep, 2023 1 commit
- Add chunking of binaries when writing with msgpack (#2068) · f50ba415
  Paul Fultz II authored Sep 14, 2023
  
  f50ba415
08 Sep, 2023 2 commits
- fix tidy · 848a476d
  umang yadav authored Sep 08, 2023
  
  848a476d
- Fix tidy · b888b61d
  umang yadav authored Sep 08, 2023
  
  b888b61d
08 Aug, 2023 1 commit
- Update to Cppcheck 2.11 (#1914) · a359d2c8
  Paul Fultz II authored Aug 08, 2023
  
  a359d2c8
03 Aug, 2023 1 commit
- Add use of implicit deps inside sorting function (#2005) · e46a6a52
  Umang Yadav authored Aug 03, 2023
```
* use implicit deps for the sorting
* use BFS for sorting program
```
  e46a6a52
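A breadth-first topological sort that honors extra dependency edges can be sketched as below. This is only an illustration of the idea named in the commit bullets, not the MIGraphX implementation: the function name `sort_instructions`, the dictionary layout, and the reading of "implicit deps" as additional edges that are not among an instruction's direct inputs are all assumptions.

```python
from collections import deque


def sort_instructions(explicit_deps, implicit_deps):
    """Order nodes so every dependency comes first (Kahn's BFS algorithm).

    Both arguments map a node name to the names it depends on.
    Every dependency is assumed to also appear as a key of explicit_deps.
    """
    nodes = list(explicit_deps)
    # Merge explicit inputs with the (hypothetical) implicit dependencies.
    deps = {n: set(explicit_deps[n]) | set(implicit_deps.get(n, ())) for n in nodes}

    # Reverse edges: which nodes are waiting on each node.
    users = {n: [] for n in nodes}
    for n in nodes:
        for d in deps[n]:
            users[d].append(n)

    pending = {n: len(deps[n]) for n in nodes}
    queue = deque(n for n in nodes if pending[n] == 0)
    order = []
    while queue:
        n = queue.popleft()
        order.append(n)
        for u in users[n]:
            pending[u] -= 1
            if pending[u] == 0:
                queue.append(u)

    if len(order) != len(nodes):
        raise ValueError("dependency cycle detected")
    return order


explicit = {"a": [], "b": [], "cond": ["a"], "out": ["cond"]}
implicit = {"cond": ["b"]}  # "cond" uses "b" only indirectly
print(sort_instructions(explicit, implicit))
```

Without the implicit edge, "b" could legally be placed after "cond"; merging the edge sets before sorting rules that out.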
25 Jul, 2023 1 commit
- Changes from fix_sorting · 95f2cdb9
  umangyadav authored Jul 25, 2023
  
  95f2cdb9
19 Jul, 2023 2 commits
- Use target_id instead of string · b40c020a
  umangyadav authored Jul 19, 2023
  
  b40c020a
- add docstring for the get_target_assignments · 93bcb779
  umangyadav authored Jul 19, 2023
  
  93bcb779
09 Jul, 2023 1 commit
- Add MIGraphX version inside MXR file (#1866) · 8428a242
  Umang Yadav authored Jul 09, 2023
  
  8428a242
06 Jul, 2023 1 commit

Enable eval to handle multiple contexts (#1751) · 072fd5cc

Paul Fultz II authored Jul 06, 2023

This helps enable multi-target execution. We store a vector of targets and contexts. Currently this will only compile a single target; PR #1672 is needed to enable multiple targets.

This will also serialize the targets and contexts.

When using the execution_environment or prog.get_context(), it will always use the context from the first target, assuming this is the "primary" target. It's unlikely, though, that a user would use execution_environment with a multi-target environment.
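The "first target is primary" rule can be pictured with a minimal sketch. The `Program` class, its fields, and the string stand-ins for targets and contexts are hypothetical and only mirror the description above; they are not the MIGraphX API.

```python
class Program:
    """Hypothetical container holding one context per compiled target."""

    def __init__(self, targets, contexts):
        if not targets or len(targets) != len(contexts):
            raise ValueError("need one context per target, and at least one target")
        self.targets = list(targets)
        self.contexts = list(contexts)

    def get_context(self):
        # The first target is treated as the "primary" one.
        return self.contexts[0]


prog = Program(["gpu", "cpu"], ["gpu-context", "cpu-context"])
print(prog.get_context())
```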

072fd5cc

30 Jun, 2023 1 commit
- Remove reduce (#1907) · 46d78a0f
  Umang Yadav authored Jun 30, 2023
  
  46d78a0f
26 Jun, 2023 1 commit
- Print max,min,mean and stddev values for TRACE_EVAL = 2 (#1864) · 84a8f450
  Umang Yadav authored Jun 26, 2023
  
  84a8f450
14 Jun, 2023 1 commit

Fix TRACE_EVAL > 1 (#1835) · 5bf067ed

Umang Yadav authored Jun 14, 2023



* add fix for the trace_eval

* Add throw for the debug builds

* Formatting

---------
Co-authored-by: Chris Austen <causten@users.noreply.github.com>

5bf067ed

31 May, 2023 1 commit
- Update pass manager to handle multi-target compilation (#1672) · 9473e3a2
  Umang Yadav authored May 31, 2023
```
partially solves #1656
This PR only handles compilation part of multitarget.
```
  9473e3a2
30 May, 2023 1 commit

Improvements to driver output (#1710) · d32ab85b

Paul Fultz II authored May 30, 2023

Use generate_argument instead of generate_literal for python output, as generate_literal doesn't exist
Shorten the names for variables from the main module
Use prefix p_ for parameters
Use shorter variable m for main module in python

d32ab85b

06 Apr, 2023 1 commit

Driver dynamic batch update (#1652) · adccec52

Charlie Lin authored Apr 06, 2023

Examples:

bin/driver verify /codes/onnx_models/resnet50-v1-7/resnet50-v1-7.onnx --split-single-dyn-dim --batch 3 --dyn-input-dim @data "[{min:1, max:4}, 3, 224, 224]"

bin/driver compile /codes/onnx_models/resnet50-v1-7/resnet50-v1-7.onnx --split-single-dyn-dim --default-dyn-dim "{min:1, max:10}" --output resnet50_batch1-10.mxr

bin/driver perf resnet50_batch1-10.mxr --batch 4

adccec52

28 Feb, 2023 1 commit

Select module op (#1569) · a63ee2e0

Charlie Lin authored Feb 28, 2023

Creates the select_module operator, which selects one of the submodules passed to it to run based on the submodule parameters. The submodule selected is the one whose parameters have exactly the same static shapes as the arguments to select_module.
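The selection rule described above amounts to an exact shape match. The sketch below is an illustration under assumptions: the function name `select_submodule`, the dictionary layout, and the submodule names are invented here, and the real operator works on MIGraphX modules rather than plain tuples.

```python
def select_submodule(submodules, arg_shapes):
    """Return the name of the submodule whose parameter shapes match exactly.

    submodules: dict mapping a submodule name to a list of static shapes
                (one tuple per parameter).
    arg_shapes: list of static shapes of the arguments actually passed.
    """
    wanted = [tuple(s) for s in arg_shapes]
    for name, param_shapes in submodules.items():
        if [tuple(s) for s in param_shapes] == wanted:
            return name
    raise ValueError(f"no submodule accepts shapes {wanted}")


# One submodule per supported batch size (hypothetical names).
submodules = {
    "batch_1": [(1, 3, 224, 224)],
    "batch_2": [(2, 3, 224, 224)],
    "batch_4": [(4, 3, 224, 224)],
}
print(select_submodule(submodules, [(2, 3, 224, 224)]))
```

A shape that matches no submodule raises an error in this sketch; how the real operator reports that case is not stated in the commit message.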

a63ee2e0

16 Feb, 2023 1 commit

Add flag for tuning in migraphx-driver (#1519) · cc098f4d

Umang Yadav authored Feb 15, 2023

* Add driver flag "--exhaustive-tune" to enable tuning, add support for the same in C/C++ and python API

cc098f4d

02 Feb, 2023 1 commit
- Dynamic gathernd (#1480) · d478675c
  Brian Pickrell authored Feb 02, 2023
```
Dynamic shape support for gathernd op.
```
  d478675c
14 Dec, 2022 1 commit
- Print program as python (#1490) · 56c43445
  Paul Fultz II authored Dec 14, 2022
```
* Print python code
```
  56c43445
04 Oct, 2022 1 commit
- Stream sync Changset (#1358) · f7d987ba
  Ted Themistokleous authored Oct 04, 2022
```
Stream sync changes and associated API level changes
```
  f7d987ba
06 Sep, 2022 1 commit
- Enable cppcheck rule for 'not', 'or' keywords (#1361) · d37a4df9
  Paul Fultz II authored Sep 06, 2022
```
Using not and or improves readability. The cppcheck rule will help ensure we are doing it consistently.
```
  d37a4df9
21 Aug, 2022 1 commit

Update is_supported (#1334) · 79e15ca9

varunsh authored Aug 21, 2022

* Update is_supported
* Return object from is_supported
* Return by reference in iterator

79e15ca9

04 Aug, 2022 1 commit

Dynamic ref convolution op (#1224) · 67f77ac1

Charlie Lin authored Aug 04, 2022



* Dynamic shape handling in shape object

* rewrite empty lens multibroadcast test

* Shape class changes to handle dynamic
* More throw errors for functions that don't make sense for dynamic shape
* Print output changes
* Serialization changes

* Fixing serialization errors

* Remove const on dyn_dim copy getters

* Dynamic shape tests

* Fix serialize errors

* Add dyn_data struct to avoid ambiguous constructor

* Tidy fix: emplace_back() over for loop

* Tidy fix: use move

* Use std::initializer_list in constructor
Reverts the dyn_data struct change
Should get around the ambiguous braced initialization list error

* avoid typedef

* element_space, min,max,opt _lens change

* formatting

* Comments fix

* dynamic bytes() test

* Serialize and reflect changes

* formatting

* Test the dynamic lens functions

* progress

* Formatting

* Dynamic conv draft progress

* Add operator<< tests for coverage

* Coverage update

* Add to conv dynamic batch test

* Dynamic image size test

* Dynamic weight handling

* Dyn image shape test change, fix dyn weight cond

* Comment update

* Dynamic weights shape test and fix

* Use ternary operator

* Tidy fixes

* Handle dynamic graph input shapes in ONNX parser

* Formatting

* Handle dynamic shape for convolution

* formatting

* cppcheck fixes

* Add onnx test files

* Fix typo

* Disable auto_pad for dynamic input shape

* check_shapes object checks for allowing dynamic shapes

* Fix any_of

* Change to maintain const objectness

* Formatting

* Check shapes allow dynamic

* Refactor compute_shape() call into op.compute()
Allows for per operator differences with handling dynamic shape
Fix operation.hpp change to use the generator

* Comment fix

* Refactor normalize_attributes() calls to use max_lens()

* Comment addition

* Update other normalize_attributes() calls

* Change to using constructor and add tests

* Use const member function

* Add more dynamic shape support

* Add tests for error code coverage

* Fix opt shape bug and add shape tests

* capture all by ref

* Fix typo with img shape calculation

* Add more tests

* dynamic auto pad attempt
Linker error with pad_calc.cpp

* Fix parse dyn auto_pad
Should only need to use dynamic auto pad when the image shape or kernel
shape are dynamic. For a dynamic batch size, the auto pad calculation is
the same.

* Fix linking error

* Fix auto_pad bug
Fixed input tensor with auto_pad setting on

* auto_pad onnx tests

* Fix auto_pad calculation, evaluate in ref_conv
add ref_ops tests

* Add shape tests, fix bugs

* Refactor first two output dynamic len calculation

* Conv MLIR test update

* i64 MLIR test fix

* Fix MLIR test typo
Co-authored-by: Chris Austen <causten@users.noreply.github.com>

67f77ac1

08 Jul, 2022 2 commits

Update perf report to show the number of operators and per operator avg time in summary (#1287) · 05b13c9f

Paul Fultz II authored Jul 08, 2022

Show the number of operators and per operator avg time in summary...

Summary:
gpu::gemm: 8.738ms / 73 = 0.119699ms, 64%
gpu::triadd_layernorm: 0.831381ms / 24 = 0.0346409ms, 7%
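The arithmetic behind each summary line (total time, operator count, per-operator average, share of the overall time) can be reproduced with a short sketch. The function name `summarize` and the input layout are assumptions, and the output format only imitates the sample above.

```python
from collections import defaultdict


def summarize(op_times):
    """Build summary lines from (operator name, time in ms) pairs."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for name, ms in op_times:
        totals[name] += ms
        counts[name] += 1

    grand_total = sum(totals.values())
    lines = []
    # Most expensive operators first.
    for name in sorted(totals, key=totals.get, reverse=True):
        total = totals[name]
        count = counts[name]
        avg = total / count
        percent = 100.0 * total / grand_total
        lines.append(f"{name}: {total:g}ms / {count} = {avg:g}ms, {percent:.0f}%")
    return lines


for line in summarize([("gemm", 3.0), ("gemm", 1.0), ("add", 1.0)]):
    print(line)
```

The averages in the sample follow the same division: 8.738 ms over 73 gemm instructions is about 0.119699 ms each.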

05b13c9f

Add is_supported and get_target_assignments (#1269) · 8192f37f

varunsh authored Jul 07, 2022

Added is_supported and get_target_assignments methods to the target and program, respectively, to eventually support multi-target compilation and execution.

8192f37f

06 Jul, 2022 1 commit

Verify load and save (#1265) · f2531606

Paul Fultz II authored Jul 05, 2022

In the verification tests, check that saving and reloading the program yields the same program. This also fixes serialization to always load instructions in the same order. There are also fixes for deconv and quant_conv, which didn't save the solution id and were broken for serialization.

f2531606

29 Jun, 2022 1 commit

Update driver models to use json strings (#1244) · ad27d0d6

Paul Fultz II authored Jun 29, 2022

Compiles significantly faster than constructing all the objects. It also reduces recompiles.

ad27d0d6

26 Jun, 2022 1 commit
- Get parent module in the pass manager (#1181) · 3a5c4306
  Paul Fultz II authored Jun 26, 2022
```
* Add function to get a module tree
* Get parent module in the pass manager
```
  3a5c4306
22 Jun, 2022 1 commit
- Update license files (#1248) · e44cecbc
  Ted Themistokleous authored Jun 22, 2022
```
Updated each source file in the repo with the existing license.
```
  e44cecbc
11 Mar, 2022 1 commit

Improve print ins (#1096) · b3b44f5d

Shucai Xiao authored Mar 11, 2022

module::debug_print(ins) is very slow, which makes trace_eval==1/2 very slow. The reason is that printing an ins involves searching the whole module to find the instruction and then printing it. This change fixes that by calling module::print() to get the names of all instructions of a program, then printing each instruction by looking up its name in a hash map.
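The speedup comes from replacing a per-instruction scan with a one-time lookup table. The sketch below contrasts the two approaches; the `Instruction` class, the `@index` naming, and the function names are illustrative assumptions rather than MIGraphX code.

```python
class Instruction:
    """Hypothetical stand-in for a module instruction."""

    def __init__(self, op):
        self.op = op


def name_by_scan(instructions, ins):
    # Slow path: walk the whole module for every instruction printed.
    for index, candidate in enumerate(instructions):
        if candidate is ins:
            return f"@{index}"
    raise KeyError("instruction not in module")


def build_name_map(instructions):
    # Fast path: one pass builds a hash map keyed by object identity.
    return {id(ins): f"@{index}" for index, ins in enumerate(instructions)}


def format_instruction(ins, names):
    return f"{names[id(ins)]} = {ins.op}"


module = [Instruction("param"), Instruction("convolution"), Instruction("relu")]
names = build_name_map(module)
for ins in module:
    print(format_instruction(ins, names))
```

Printing every instruction with the scan costs quadratic time in the module size, while the map makes each lookup constant time on average.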

b3b44f5d

02 Mar, 2022 1 commit
- Clang format ver10 (#1106) · 9852aaef
  bpickrel authored Mar 02, 2022
```
Update the base version of clang-format from 5.0 to 10.0
```
  9852aaef
02 Feb, 2022 1 commit

Update trace_eval to preview the output buffers (#1073) · b20e3d4d

Paul Fultz II authored Feb 02, 2022

Previously, MIGRAPHX_TRACE_EVAL=2 printed out the entire output buffer, which can produce a lot of output. To make it easier to inspect and debug, MIGRAPHX_TRACE_EVAL=2 now prints only 10 elements from the buffer (the first 5 and last 5) and shows any fp classifications found in the buffer (i.e. NaNs, infinity, etc.). The previous behavior can still be enabled with MIGRAPHX_TRACE_EVAL=3.
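A buffer preview of this kind can be sketched as follows. The function name `preview`, the `edge` parameter, and the choice to report only NaN and infinity are assumptions; the actual output format of MIGraphX is not reproduced here.

```python
import math


def preview(values, edge=5):
    """Return (text, classes): the first and last `edge` elements plus any
    special floating-point classes found anywhere in the buffer."""

    def fmt(v):
        return f"{v:g}"

    if len(values) <= 2 * edge:
        shown = [fmt(v) for v in values]
    else:
        shown = [fmt(v) for v in values[:edge]] + ["..."] + [fmt(v) for v in values[-edge:]]

    classes = set()
    for v in values:  # scan the whole buffer, not only the preview
        if math.isnan(v):
            classes.add("nan")
        elif math.isinf(v):
            classes.add("inf")
    return ", ".join(shown), sorted(classes)


text, classes = preview([float(i) for i in range(12)])
print(text, classes)
```

Scanning the full buffer for special values matters because a NaN in the middle would otherwise be hidden by the truncated preview.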

b20e3d4d

15 Nov, 2021 1 commit

Update driver's perf report to account for batch size (#1000) · 19f65e7e

kahmed10 authored Nov 15, 2021

Currently we have the option of passing in --batch to the driver to change the batch size when the model has a dynamic dim value. We can use this flag to adjust the perf report's rate.
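One plausible way to fold the batch size into a rate is to count samples rather than runs per second. The formula and the function name `rate_per_second` below are assumptions for illustration; the commit message does not spell out the exact calculation.

```python
def rate_per_second(batch, avg_time_ms):
    """Samples processed per second, assuming each run handles `batch` samples
    and takes `avg_time_ms` milliseconds on average."""
    if avg_time_ms <= 0:
        raise ValueError("average time must be positive")
    return batch * 1000.0 / avg_time_ms


print(rate_per_second(batch=4, avg_time_ms=8.0))
```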

19f65e7e

15 Oct, 2021 1 commit

Enabling rocTX markers for migraphx-driver via roctx knob (#946) · 4a71ec8c

Cagri authored Oct 14, 2021



Added features:
This enables wrapping each migraphx operator with rocTX markers.
It adds a new knob, trace, to the migraphx-driver binary.

Limitation:

rocTX standalone does not output a file; it needs to be used with rocprof. Example command line:

/opt/rocm/bin/rocprof -i ./in.txt --hip-trace --roctx-trace --flush-rate 10ms --timestamp on -d cagri_out --obj-tracking on /opt/rocm/bin/migraphx-driver trace ./resnet50-v2-7.onnx --onnx --gpu
Co-authored-by: Shucai Xiao <shucai@gmail.com>

4a71ec8c

13 Oct, 2021 1 commit

Trace eval segfault (#974) · 337c5ba1

Shucai Xiao authored Oct 13, 2021

When running a model on the GPU, MIGraphX tries to print content directly from GPU memory, which causes a segfault. The solution is to copy the GPU memory content back to the CPU before printing.

337c5ba1

16 Sep, 2021 1 commit

Loop operator (#853) · a275f590

Shucai Xiao authored Sep 16, 2021

Add Loop operator for opset version 13.
Notes:
1) The default max iteration number is 10 if no max iteration number is provided.
2) To change the max iteration number, a user can set max_loop_iterations in the onnx_option struct when parsing a model.
3) The returned shape of the scan output comes from max_loop_iterations even if the actual number of loop iterations is smaller. This issue also applies to other operators like NonZero and NonMaxSuppression. Issue #948 was created to track this, to be resolved later.
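The effect described in note 3 can be illustrated with a small sketch: the scan output is allocated for the maximum iteration count up front, so its length stays fixed even when the loop stops early. The function `run_loop`, the body signature, and the zero padding are assumptions made for illustration, not the MIGraphX implementation.

```python
def run_loop(body, state, max_loop_iterations=10):
    """Run `body` until it asks to stop or the iteration cap is reached.

    body(iteration, state) returns (keep_going, new_state, scan_value).
    """
    # Scan output is sized from the cap, not from the actual trip count.
    scan = [0.0] * max_loop_iterations
    iterations = 0
    keep_going = True
    while keep_going and iterations < max_loop_iterations:
        keep_going, state, scan_value = body(iterations, state)
        scan[iterations] = scan_value
        iterations += 1
    return state, scan, iterations


# The body stops after its third iteration.
state, scan, iterations = run_loop(lambda i, s: (i < 2, s + 1, float(s + 1)), 0)
print(iterations, len(scan), scan)
```

Here the loop runs three times, yet the scan output still has ten entries, which is the shape mismatch the note describes.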
Co-authored-by: Paul <pfultz2@yahoo.com>
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

a275f590

10 Sep, 2021 1 commit
- Assert the shape for compute and compute_shape are the same (#936) · 8b4c69c5
  Paul Fultz II authored Sep 10, 2021
```
Assert shapes don't change
Co-authored-by: Shucai Xiao <Shucai.Xiao@amd.com>
```
  8b4c69c5
07 Sep, 2021 2 commits

qdq for quantization and include subgraph (#891) · b45f7239

Shucai Xiao authored Sep 07, 2021



Add operators, refactor parsers, add rewrite passes, add tests
Add ref implementations
Move broadcasting of scales and zero points to onnx parser
Allow for x and zero_point to have different types in quantizelinear; fix zero_point default type
fp16 and fp8 quantization to include subgraph and parameters
fix unit test to use qdq operators for int8 quantization
Co-authored-by: turneram <alturner@amd.com>

b45f7239

Improve check_context to handle submodules better (#927) · fdaa21ee
Paul Fultz II authored Sep 07, 2021
```
Improve check_context to handle submodules better
Co-authored-by: Shucai Xiao <shucai@gmail.com>
```
fdaa21ee