Commits · bc7ac032d4e356c667e3d33bd36d09c51566c09e · gaoqiong / MIGraphX

31 Oct, 2023 1 commit
- partially merge with uif2-rbuild · bc7ac032
  Artur Wojcik authored Oct 31, 2023
  
  bc7ac032
05 Oct, 2023 1 commit
- update uif2-initial with the recent changes · 80eb522f
  Artur Wojcik authored Oct 05, 2023
  
  80eb522f
02 Oct, 2023 2 commits
- temp2 · 5c426ab4
  Artur Wojcik authored Sep 26, 2023
  
  5c426ab4
- cmake_cpu · 2152dc5b
  Artur Wojcik authored Jun 20, 2023
  
  2152dc5b
08 Jul, 2023 1 commit

export API symbols from dynamic libraries (#1892) · c04fbc92

Artur Wojcik authored Jul 08, 2023

Export API symbols for migraphx, migraphx_ref, migraphx_cpu, migraphx_gpu, migraphx_device, migraphx_tf, and migraphx_onnx. There is a separate PR for migraphx_c.

API symbol exporting affects only Windows. It is transparent on Linux.
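As a rough illustration of why symbol exporting matters only on Windows, the sketch below shows a common export-macro pattern. The names `DEMO_BUILDING_LIBRARY`, `DEMO_EXPORT`, and `demo_add` are hypothetical and are not taken from MIGraphX; the project's actual macros and build setup may differ.

```cpp
// Hypothetical sketch of an export macro; all DEMO_* names are illustrative
// assumptions, not MIGraphX identifiers.

// Normally the build system defines this only while compiling the library itself.
#define DEMO_BUILDING_LIBRARY

#if defined(_WIN32)
#  if defined(DEMO_BUILDING_LIBRARY)
     // Building the DLL: make the symbol visible to consumers.
#    define DEMO_EXPORT __declspec(dllexport)
#  else
     // Consuming the DLL: import the symbol from it.
#    define DEMO_EXPORT __declspec(dllimport)
#  endif
#else
   // On Linux, symbols have default visibility unless the build passes
   // -fvisibility=hidden, so the macro can expand to nothing.
#  define DEMO_EXPORT
#endif

// Declaration as it would appear in a public header.
DEMO_EXPORT int demo_add(int a, int b);

// Definition as it would appear in the library's source file.
int demo_add(int a, int b)
{
    return a + b;
}
```

With this pattern, the same header serves both the library and its consumers: on Windows the macro switches between export and import, while on Linux it has no effect.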

c04fbc92

29 Jun, 2023 1 commit
- cmake: move linking to migraphx_all_targets to upper scope (#1890) · bad39242
  Artur Wojcik authored Jun 29, 2023
```
Co-authored-by: Artur Wojcik <artur.wojcik@amd.com>
```
  bad39242
27 Sep, 2022 1 commit
- Add onnx mod operator gpu cpu (#1306) · 40118191
  Ted Themistokleous authored Sep 26, 2022
```
Implement the mod operator for both the CPU and GPU targets.
```
  40118191
22 Jun, 2022 1 commit
- Update license files (#1248) · e44cecbc
  Ted Themistokleous authored Jun 22, 2022
```
Updated each source file in the repo with the existing license.
```
  e44cecbc
19 Oct, 2021 1 commit
- Link with pthreads in core migraphx library since we use threads there (#975) · 4d82d761
  Paul Fultz II authored Oct 19, 2021
```
pthread linking errors on SLES. 
```
  4d82d761
31 Aug, 2021 1 commit

Changes to support both OneDNN and ZenDNN builds (#929) · 0859fe90

kahmed10 authored Aug 31, 2021



* Add preallocate method

* Add preallocate_param pass

* Preallocate buffers on the cpu

* Formatting

* Preallocate on the gpu

* Add missing cpp file

* Formatting

* Add lifetime function

* Formatting

* Improve handling of exceptions in test driver

* Formatting

* Auto print exception

* Formatting

* Fork each test case

* Formatting

* Exclude gcc 5 debug build

* Fix tidy issues

* Add color

* Formatting

* Create driver class

* Formatting

* Customize test_case names

* Formatting

* Report status from forked processes

* Formatting

* Update the verify driver

* Formatting

* Print out failed tests

* Formatting

* Fix tidy issues

* Formatting

* Expect passing

* Improve failure reporting on non-linux systems

* Fix ifdef

* Always allocate

* Fix tidy warning

* Flush code cov

* Formatting

* Fix tidy

* Add const

* Check if weak symbols is linked

* Formatting

* initial progress

* formatting

* Add continue flag

* Formatting

* Set exe name

* Use stringstream and use quotes

* rename vars

* formatting

* more testing

* formatting

* Fix bug when using --continue in the tests

* Formatting

* revert gemm

* revert dot file

* rename var

* update cmakelists and deconv compute
Co-authored-by: Paul <pfultz2@yahoo.com>
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

0859fe90

08 Jul, 2021 1 commit

Preallocate parameters on the CPU and unify preallocations (#840) · 427fc25c

Paul Fultz II authored Jul 08, 2021



* Add preallocate method

* Add preallocate_param pass

* Preallocate buffers on the cpu

* Formatting

* Preallocate on the gpu

* Add missing cpp file

* Formatting

* Add lifetime function

* Formatting

* Always allocate

* Fix tidy warning

* Add const

* Add missing lifetime annotations
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

427fc25c

22 Apr, 2021 1 commit

Cpu fusions using post_ops (#781) · f7befe50

Paul Fultz II authored Apr 22, 2021



* Add eliminate_data_type pass

* Formatting

* Auto convert quant ops

* Formatting

* Flip the order of decompose

* Compute max size differently

* Formatting

* Clamp values in convert

* Formatting

* Fix loss of precision in reduce

* Formatting

* Fix bugs in reduction

* Fix accumulator type in reference softmax implementation

* Formatting

* Update convert test

* Remove unused variables

* Remove unnecessary quant_dot check

* Formatting

* Add tests

* Formatting

* Remove unused code

* Remove duplicate ops

* Remove blaze dependency

* Use set since shape::type_t is not hashable on gcc 5

* Formatting

* Add dnnl binary op

* Formatting

* Add binary and eltwise

* Formatting

* Add softmax

* Formatting

* Remove unused operators

* Add missing files

* Formatting

* Add lrn

* Formatting

* Add deconvolution

* Formatting

* Change allocate default

* Add reorder

* Formatting

* Add reductions

* Formatting

* Sort lines

* Change literals in another loop

* Add pow operator

* Formatting

* Add pow operator

* Formatting

* Make sure shapes are packed

* Allow broadcasted inputs

* Remove unused operators

* Simplify functions

* Remove softmax

* Add sub and erf functions

* Formatting

* Fix bug

* Formatting

* Improve parallelism

* Formatting

* Allow multiple batch dimensions

* Formatting

* Move literal transforms out of lowering

* Formatting

* Add gather operator

* Sort lines

* Add early exit for carry

* Formatting

* Add missing concat

* Rename macro

* Fix deep nesting

* Formatting

* Fix cppcheck issues

* Remove else

* Move attribute to typedef

* Formatting

* Disable maybe-uninitialized warning since it's broken on gcc

* Add constexpr default constructor

* Formatting

* Fix compiler warnings

* Fix adjust_allocation test

* Add layernorm matcher

* Add gelu_erf matcher

* Formatting

* Add gelu_tanh matcher

* Formatting

* Remove match namespace

* Formatting

* Use matcher instead of string

* Formatting

* Add fusions

* Formatting

* Add post op field

* Formatting

* Make post_ops serializable

* Formatting

* Add eltwise fusions

* Formatting

* Fix null conversions

* Formatting

* Add fuse_ops source files

* Formatting

* Set binary post op index correctly

* Formatting

* Fix serialization bugs

* Check if used once

* Formatting

* Fix error in get_primitive_attr

* Formatting

* Add compile function

* Formatting

* Limit fusions

* Formatting

* Disable with env variable instead of using compile arg

* Formatting

* Fix implicit conversion to bool

* Declare on separate lines

* Formatting

* Fix cppcheck issues

* Fix ICE in pack_join

* Formatting

* Use const ref

* Make enum hashable

* Formatting

* Add explicit this

* Fix merge issues

* Fix dangling ref

* Formatting

* Add test for compile

* Formatting

* Add more value tests

* Formatting
Co-authored-by: Shucai Xiao <shucai@gmail.com>
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

f7befe50

25 Mar, 2021 1 commit

Add cpu fusion for gelu and layernorm (#761) · 728d083d

Paul Fultz II authored Mar 25, 2021



* Add eliminate_data_type pass

* Formatting

* Auto convert quant ops

* Formatting

* Flip the order of decompose

* Compute max size differently

* Formatting

* Clamp values in convert

* Formatting

* Fix loss of precision in reduce

* Formatting

* Fix bugs in reduction

* Fix accumulator type in reference softmax implementation

* Formatting

* Update convert test

* Remove unused variables

* Remove unnecessary quant_dot check

* Formatting

* Add tests

* Formatting

* Remove unused code

* Remove duplicate ops

* Remove blaze dependency

* Use set since shape::type_t is not hashable on gcc 5

* Formatting

* Add dnnl binary op

* Formatting

* Add binary and eltwise

* Formatting

* Add softmax

* Formatting

* Remove unused operators

* Add missing files

* Formatting

* Add lrn

* Formatting

* Add deconvolution

* Formatting

* Change allocate default

* Add reorder

* Formatting

* Add reductions

* Formatting

* Sort lines

* Change literals in another loop

* Add pow operator

* Formatting

* Add pow operator

* Formatting

* Make sure shapes are packed

* Allow broadcasted inputs

* Remove unused operators

* Simplify functions

* Remove softmax

* Add sub and erf functions

* Formatting

* Fix bug

* Formatting

* Improve parallelism

* Formatting

* Allow multiple batch dimensions

* Formatting

* Move literal transforms out of lowering

* Formatting

* Add gather operator

* Sort lines

* Add early exit for carry

* Formatting

* Add missing concat

* Rename macro

* Fix deep nesting

* Formatting

* Fix cppcheck issues

* Remove else

* Move attribute to typedef

* Formatting

* Disable maybe-uninitialized warning since it's broken on gcc

* Add constexpr default constructor

* Formatting

* Fix compiler warnings

* Fix adjust_allocation test

* Add layernorm matcher

* Add gelu_erf matcher

* Formatting

* Add gelu_tanh matcher

* Formatting

* Remove match namespace

* Formatting

* Use matcher instead of string

* Formatting

* Add fusions

* Formatting

* Make input a const ref

* Make this explicit for gcc 5
Co-authored-by: Shucai Xiao <shucai@gmail.com>
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

728d083d

26 Feb, 2021 1 commit

Add more supported operators and optimizations for the cpu backend (#746) · a0b570b2

Paul Fultz II authored Feb 26, 2021



* Add eliminate_data_type pass

* Formatting

* Auto convert quant ops

* Formatting

* Flip the order of decompose

* Compute max size differently

* Formatting

* Clamp values in convert

* Formatting

* Fix loss of precision in reduce

* Formatting

* Fix bugs in reduction

* Fix accumulator type in reference softmax implementation

* Formatting

* Update convert test

* Remove unused variables

* Remove unnecessary quant_dot check

* Formatting

* Add tests

* Formatting

* Remove unused code

* Remove duplicate ops

* Remove blaze dependency

* Use set since shape::type_t is not hashable on gcc 5

* Formatting

* Add dnnl binary op

* Formatting

* Add binary and eltwise

* Formatting

* Add softmax

* Formatting

* Remove unused operators

* Add missing files

* Formatting

* Add lrn

* Formatting

* Add deconvolution

* Formatting

* Change allocate default

* Add reorder

* Formatting

* Add reductions

* Formatting

* Sort lines

* Change literals in another loop

* Add pow operator

* Formatting

* Add pow operator

* Formatting

* Make sure shapes are packed

* Allow broadcasted inputs

* Remove unused operators

* Simplify functions

* Remove softmax

* Add sub and erf functions

* Formatting

* Fix bug

* Formatting

* Improve parallelism

* Formatting

* Allow multiple batch dimensions

* Formatting

* Move literal transforms out of lowering

* Formatting

* Add gather operator

* Sort lines

* Add early exit for carry

* Formatting

* Add missing concat

* Rename macro

* Fix deep nesting

* Formatting

* Fix cppcheck issues

* Remove else

* Move attribute to typedef

* Formatting

* Disable maybe-uninitialized warning since it's broken on gcc

* Add constexpr default constructor

* Formatting

* Fix compiler warnings

* Fix adjust_allocation test
Co-authored-by: Shucai Xiao <shucai@gmail.com>
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

a0b570b2

08 Feb, 2021 1 commit

Add a pass to remove unsupported data types (#738) · 3d24a21c

Paul Fultz II authored Feb 07, 2021



* Add eliminate_data_type pass

* Formatting

* Auto convert quant ops

* Formatting

* Flip the order of decompose

* Compute max size differently

* Formatting

* Clamp values in convert

* Formatting

* Fix loss of precision in reduce

* Formatting

* Fix bugs in reduction

* Fix accumulator type in reference softmax implementation

* Formatting

* Update convert test

* Remove unused variables

* Remove unnecessary quant_dot check

* Formatting

* Add tests

* Formatting

* Remove unused code

* Remove duplicate ops

* Remove blaze dependency

* Use set since shape::type_t is not hashable on gcc 5

* Formatting
Co-authored-by: Shucai Xiao <shucai@gmail.com>
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

3d24a21c

08 Jan, 2021 1 commit

Revamp CI infrastructure (#706) · ceb4ca09

Paul Fultz II authored Jan 08, 2021



* Add build and test github workflow

* Fix cget command

* Remove def-requirements.txt

* Add tmate session to debug workflow

* Run tmate session after installing dependencies

* Print date periodically

* Add clang tidy action

* Separate build and run container in two different jobs

* Run bash script

* Remove interactive flag

* Try to mount the files

* Try to use the github workspace

* Without double braces

* Use env variable

* Pipe bash script in

* Run using hip-clang

* Use correct path

* Add verbose

* Remove j flag

* Only run for onnx file to debug

* Manually run clang-tidy

* Remove quiet flag

* Print header file

* Printout environment

* Remove extra defines

* Remove fixits and config flag

* Show ldd

* Add tmate session

* Run onnx protobuf first

* Generate proto for tensorflow

* Update cppcheck version

* Fix some cppcheck issues

* Add const

* Cppcheck fixes

* Formatting

* Fix more cppcheck issues

* Run two jobs

* Cache analysis and run format checking

* Fix yaml issues

* Fix yaml issues

* Fix indentation

* Switch to hip-clang for main docker file

* Use hip-clang in the readme

* Fixes for jenkins

* Use ccache to build

* Combine file

* Set restore keys

* Change stage name

* Build with ccache

* Add missing dependency for ccache

* Build debug with codecov

* Fix workflow syntax

* Fix list

* Use quotes

* Go to correct build path

* Install lcov

* Use sudo

* Echo all commands

* Setup tmate

* Add verbose output

* Build with cmake directly

* Add pthread flag

* Remove python config

* Continue on error

* Use on or off for cmake flag

* Use always upload cache

* Verbose output

* Verbose output from build

* Build one target

* Reduce debug symbols

* Increase garbage collection

* Remove dmesg

* Increase it to 20

* Update rocm cmake version

* Remove jobs from jenkins

* Run on all 3 ubuntus

* Remove gcc 5 jobs

* Don't add flag on 16.04

* Only upload coverage on 18.04

* Don't build for ubuntu 20.04

* Use matrix.os

* Use O2 for hip-clang since lower optimizations are broken

* Use rocm 3.0

* Pass ccache as cmake variable instead of env variable

* Build miopen from source

* Show ccache statistics

* Print log information

* Set compression level

* Use hash dir

* Set hashdir

* Install clang ocl from system

* Up compression level

* Add locale

* Increase cache size to 1G

* Lower compression level to 9

* Remove split dwarf

* Remove Og

* Add back Og

* Separate debug and codecov

* Add missing backslash

* Garbage collect more often

* Add missing locales package

* Use Os

* Install onednn in docker and run tests

* Include target headers in tests

* Increase timeout

* Remove if condition

* Make flag public

* Suppress memory leaks in onednn

* Use equal

* Add gh annotations

* Update rocm-cmake version

* Add ldconfig
Co-authored-by: Shucai Xiao <shucai@gmail.com>

ceb4ca09

14 Dec, 2020 1 commit

Use dnnl for cpu backend (#688) · 406afeb8

Paul Fultz II authored Dec 14, 2020



* Add flag to enable cpu backend

* Make buffers shared

* Enable optimizations

* Add onednn

* Formatting

* Formatting

* Add dnnl header

* Formatting

* Rewrite rnn first

* Formatting

* Call reference implementation

* Formatting

* Make literal data shared

* Formatting

* Add convolution

* Formatting

* Compensate for dilation

* Formatting

* Use name/make_op instead

* Formatting

* Rename gemm header

* Formatting

* Add dnnl convolution/gemm operators

* Formatting

* Add eliminate_contiguous

* Add faster pointwise operators

* Formatting

* Formatting

* Formatting

* Add dnnl op class

* Formatting

* Add add op

* Formatting

* Add concat operator

* Formatting

* Add more ops

* Create descriptor during finalization

* Formatting

* Don't rewrite pooling

* Enable memory coloring

* Formatting

* Add output aliases

* Formatting

* Fix errors

* Formatting

* Convert literals

* Add missing file

* Remove batch_norm

* Formatting

* Use strides

* Formatting

* Add some debug checks

* Formatting

* Fix bug in adjusting shape for gemm

* Formatting

* Fix fallback dot operator

* Zero initialize buffers

* Add support for group convolutions

* Formatting

* Make adjust allocation target independent

* Formatting

* Enable adjust_allocation for gpu/cpu

* Formatting

* Add copy to allocation model

* Formatting

* Add copy operator

* Formatting

* Better handling of output parameters in adjust_allocation

* Formatting

* Build with dnnl

* Make dnnl required

* Fix compile error

* Tidy fixes

* Formatting

* Tidy fixes

* Formatting

* Fix more tidy issues

* Formatting

* Add mul op

* Add mul op

* Set c compiler to clang as well

* Compensate for normalized compute shape

* Formatting

* Fix cppcheck errors

* Formatting

* Add onednn library to hcc

* Guard clang pragmas

* Disable cpu mode for gcc for now

* Leave it enabled for gcc 7

* Fix cppcheck suppression

* Fix compile error on gcc 5

* Remove unused code
Co-authored-by: Shucai Xiao <shucai.xiao@amd.com>
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

406afeb8

04 Nov, 2020 1 commit

Split cpu and reference implementation (#671) · 500d9441

Paul Fultz II authored Nov 04, 2020



* Add all_targets cmake target

* Rename target

* Add ref target

* Rename tests

* Refactor compiler target

* Formatting

* Verify for every target

* Formatting

* Add verify test suite

* Formatting

* Add initial test programs

* Formatting

* Add rnn tests

* Formatting

* Validate gpu

* Formatting

* Remove old gpu tests

* Fix gpu tests

* Fix ref error

* Fix tidy issues

* Formatting

* Tidy fixes

* Fix header in python api

* Rename to ref

* Use ref in verify_onnx

* Fix tidy issue

* Build with verbose on

* Fix typo

* Remove verbose

* rename some cpu prefix to ref
Co-authored-by: Shucai Xiao <Shucai.Xiao@amd.com>

500d9441

17 Nov, 2019 1 commit
- Update version · d10628ee
  Paul authored Nov 16, 2019
  
  d10628ee
04 Sep, 2019 1 commit
- Set SO version · a79beae7
  Paul authored Sep 04, 2019
  
  a79beae7
14 Nov, 2018 1 commit
- Rename to migraphx · 96358e41
  Paul authored Nov 14, 2018
  
  96358e41
09 Nov, 2018 1 commit
- Migraph from export name · 0d73540b
  Paul authored Nov 09, 2018
  
  0d73540b
08 Nov, 2018 1 commit
- Add initial packaging and installation · d1d0a75d
  Paul authored Nov 08, 2018
  
  d1d0a75d
02 Nov, 2018 1 commit

Remove cpu from names (#102) · 0d0778b7

Shucai Xiao authored Nov 02, 2018

* add the slice test example on gpu.

* change the gpu slice test according to comments.

* rename cpu_lowering to lowering, rename cpu_target to target, so consistent with gpu side.

* fix the format of a file CMakeLists.txt.

* Revert "change the gpu slice test according to comments."

This reverts commit 721bbb180d11811dc914d60fd8a1c91926e3f947.

* Revert "add the slice test example on gpu."

This reverts commit 68dabb05adffd429e5e5d10c3a1def2b06489f63.

* fix a format for the file doc/src/reference/targets.rst

0d0778b7

18 Sep, 2018 2 commits
- comment out install · 633741cd
  mei-ye authored Sep 18, 2018
  
  633741cd
- unstage · 37d659c6
  mei-ye authored Sep 18, 2018
  
  37d659c6
13 Sep, 2018 1 commit
- merge with master · b1e097b3
  mei-ye authored Sep 13, 2018
  
  b1e097b3
11 Sep, 2018 1 commit
- merge to master · 029b1e64
  mei-ye authored Sep 11, 2018
  
  029b1e64
01 Sep, 2018 1 commit
- Revert "Memory coloring" · ef99eda3
  Paul Fultz II authored Sep 01, 2018
  
  ef99eda3
31 Aug, 2018 1 commit
- comment out install · 51f1f524
  mei-ye authored Aug 31, 2018
  
  51f1f524
08 Aug, 2018 1 commit
- staging · dd33a45f
  mei-ye authored Aug 08, 2018
  
  dd33a45f
05 Aug, 2018 1 commit
- Link in threads · c8a8d5d8
  Paul authored Aug 04, 2018
  
  c8a8d5d8
04 Aug, 2018 4 commits
- Try to speed up compilation · 5b1e442e
  Paul authored Aug 04, 2018
  
  5b1e442e
- Fix include path · e1e37208
  Paul authored Aug 04, 2018
  
  e1e37208
- Enable threads · 44e6630c
  Paul authored Aug 03, 2018
  
  44e6630c
- Use blaze to compute matrix multiply · cff83eda
  Paul authored Aug 03, 2018
  
  cff83eda
16 Jul, 2018 1 commit
- Refactor cpu lowering · a47f8e4b
  Paul authored Jul 16, 2018
  
  a47f8e4b
02 Jul, 2018 1 commit
- s/rtg/migraph · eea003a5
  Paul authored Jul 02, 2018
  
  eea003a5
21 May, 2018 1 commit
- Add cpu backend · fd26582a
  Paul authored May 21, 2018
  
  fd26582a