Commits · bc52a8a84a7426fa6c408e9f1a8d4471ec79833f · gaoqiong / MIGraphX

27 Jun, 2021 1 commit

Shucai Xiao authored Jun 27, 2021



* Add definitions for all pointwise operators

* Formatting

* Add cpp generator class

* Formatting

* Move compilation to core

* Formatting

* Add clock to tmp name

* Add dynamic loader

* Formatting

* Add tests for code gen

* Formatting

* Add test for literals

* Formatting

* Use with_char

* Add missing header

* Fix mismerge

* Ignore tidy warning

* Fix gcc 5 errors

* Apply fixits

* Skip signed bitwise of status

* Remove unused parameters

* Explicitly add c++14 flag

* Fix tidy warning

* unify the compute function signature

* clang format

* make another change

* unify the compute function

* clang format

* remove unnecessary code

* more refinement of the operator compute function

* clang format

* add an overload function

* clang format

* add support for axes inputs for squeeze/unsqueeze/reduce_sum

* clang format

* fix build problems

* backup code changes

* clang format

* Add tuple type to shape class

* Formatting

* fix a bug in parsing quantizelinear operator

* clang format

* fix a cppcheck error

* disable different versions of unit tests for different onnx version

* clang format

* upgrade onnx to 1.8

* update onnx to 1.8.1

* disable two more real models

* clang format

* Make data member private

* Formatting

* Add sub arguments

* Formatting

* Turn clang format off

* Disable clang-format

* fix review comments

* fix the function of assign axes in parsing the squeeze operator

* add unit tests and fix a bug

* clang format

* fix review comments

* clang format

* fix a build error

* backup code changes

* clang format

* add more unit tests and add parsing opset version

* clang format

* Improve visiting tuples

* Formatting

* fix cppcheck error

* add installation of the onnx package

* resolve no protobuf compiler

* add an inline subgraph pass

* clang format

* Add more argument tests

* Formatting

* Handle tuple in load

* Formatting

* code backup

* clang format

* Remove .o files

* Add tuple type to api

* Formatting

* fix build errors

* clang format

* code backup

* code backup

* add unit tests for the inline subgraph

* clang format

* refine the inline subgraph and parse if operator

* clang format

* fix cppcheck issue

* clang format

* add unit test for inline subgraph pass

* clang format

* fix format issue

* remove the context from the if operator

* clang format

* simplify the compute functions

* Fix tidy warnings

* fix cppcheck error

* clang format

* fix cppcheck error

* Fix tidy warnings

* fix a cppcheck error

* clang format

* Add a test for share method

* Formatting

* Add a test cpp_type

* add unit tests for more code coverage

* clang format

* add unit tests to have more code coverage

* clang format

* try a comment in jenkins build

* include the install onnx line

* code backup

* reorder the dependencies installed

* refine dockerfile

* fix review comments

* clang format

* remove unnecessary overload function

* fix cppcheck error

* change back the argument test

* Suppress tidy warning

* add the operator get_tuple_elem

* clang format

* add get_tuple_elem to operator include file

* change if to support multiple operation outputs

* clang format

* optimize inline subgraph

* clang format

* code backup

* clang format

* fix bug

* refine unit tests for tuple output of the if operator

* clang format

* refine instruction replacement code

* add a unit test and sort all the unit tests alphabetically

* fix cppcheck error

* add more unit tests for multiple op outputs

* clang format

* fix cppcheck error

* Update pass manager to get modules after every pass

* more unit test to cover more scenarios

* clang format

* fixed a bug in a unit test

* add more tests

* clang format

* add more unit tests to have more code coverage

* fix a bug in a unit test

* Add program overload for module

* Formatting

* Hash modules for quicker lookup of modules

* Bump file version

* Add methods to remove modules

* Formatting

* add the tuple type to the support list

* Eliminate unused modules

* Formatting

* Fix test errors

* Formatting

* Fix tidy issues

* fix problem related to inline subgraph

* clang format

* fix review comments

* fix review comments

* fix review comments

* fix review comments

* clang format

* fix a unit test

* one more code change

* remove an optimization related to the if operator

* clang format

* fix review comments
Co-authored-by: Paul <pfultz2@yahoo.com>
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

bc52a8a8

09 Jun, 2021 1 commit

Asym pad refactor (#791) · 9a5e0c06

kahmed10 authored Jun 09, 2021



* alternative impl

* formatting

* add gpu pass to insert pad

* formatting

* update onnx test, still need cleanup

* formatting

* update tf_test

* modify existing tests

* formatting

* remove print

* code cleanup

* formatting

* code cleanup

* formatting

* fix tidy and cppcheck

* remove variable

* add test

* formatting

* add test and address comments

* formatting
Co-authored-by: Shucai Xiao <shucai@gmail.com>
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

9a5e0c06

03 May, 2021 1 commit

Reduce types generated for hip kernels (#814) · 3becd974

Paul Fultz II authored May 03, 2021



* Remove unused data types

* Formatting

* Reduce types generated for hip kernels

* Formatting

* Fix onnx tests

* Formatting
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

3becd974

29 Apr, 2021 1 commit

MLIR MIOpen Dialect integration (phase 1) (#768) (#769) · 56584fa2

SJW authored Apr 29, 2021



* MLIR MIOpen Dialect integration (phase 1) (#768)

* Added Findmlir.cmake (using environment variables to import)

* Added mlir_conv pass to GPU target

  * Apply to any gpu::convolution if supported by MLIR

  * Call MLIR C-API to generate iGEMM kernel with configuration from gpu::convolution

  * Capture binary in dictionary for matching convolutions

  * Build a code_object_op with the binary and execution dimensions

  * Substitute for the gpu::convolution

* Changed the parameters for the code_object to reflect the generated MLIR kernel

* Expanded out MemRefDescriptor fields in param list

* Also updated for MLIR C-API changes

* * fixed global_size calculation

* MLIR MIOpen Dialect integration (phase 1) (#768)

* Added Findmlir.cmake (using environment variables to import)

* Added mlir_conv pass to GPU target

  * Apply to any gpu::convolution if supported by MLIR

  * Call MLIR C-API to generate iGEMM kernel with configuration from gpu::convolution

  * Capture binary in dictionary for matching convolutions

  * Build a code_object_op with the binary and execution dimensions

  * Substitute for the gpu::convolution

* Changed the parameters for the code_object to reflect the generated MLIR kernel

* Expanded out MemRefDescriptor fields in param list

* Also updated for MLIR C-API changes

* * Added command line option: --enable_mlir

* * fixed command line switch

* updated for new MLIR API changes

* * Added cget llvm-project-mlir to import MIIR API libraries into Dockerfile
  * removed cmake Findmlir

* updated for changes in MIIR C-API

* * updated CMakeLists.txt to allow disable of MLIR import

* fixed memory leaks and removed copies

* updated for 5D memrefs

* * formatting

* * fixed review comments

* * fixed merge issues

* hip gcnDeviceName now includes specifiers at the end
  * use major/minor values instead

* * disable MLIR by default

* * removed command-line switch --enable-mlir

* * fix unused when MLIR disabled

* * enable jenkins enable/test MLIR

* * format

* * fixed clang-tidy

* * added new type
Co-authored-by: Paul Fultz II <pfultz2@yahoo.com>
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

56584fa2

26 Feb, 2021 1 commit

Add more supported operators and optimizations for the cpu backend (#746) · a0b570b2

Paul Fultz II authored Feb 26, 2021



* Add eliminate_data_type pass

* Formatting

* Auto convert quant ops

* Formatting

* Flip the order of decompose

* Compute max size differently

* Formatting

* Clamp values in convert

* Formatting

* Fix loss of precision in reduce

* Formatting

* Fix bugs in reduction

* Fix accumulator type in reference softmax implementation

* Formatting

* Update convert test

* Remove unused variables

* Remove unnecessary quant_dot check

* Formatting

* Add tests

* Formatting

* Remove unused code

* Remove duplicate ops

* Remove blaze dependency

* Use set since shape::type_t is not hashable on gcc 5

* Formatting

* Add dnnl binary op

* Formatting

* Add binary and eltwise

* Formatting

* Add softmax

* Formatting

* Remove unused operators

* Add missing files

* Formatting

* Add lrn

* Formatting

* Add deconvolution

* Formatting

* Change allocate default

* Add reorder

* Formatting

* Add reductions

* Formatting

* Sort lines

* Change literals in another loop

* Add pow operator

* Formatting

* Add pow operator

* Formatting

* Make sure shapes are packed

* Allow broadcasted inputs

* Remove unused operators

* Simplify functions

* Remove softmax

* Add sub and erf functions

* Formatting

* Fix bug

* Formatting

* Improve parallelism

* Formatting

* Allow multiple batch dimensions

* Formatting

* Move literal transforms out of lowering

* Formatting

* Add gather operator

* Sort lines

* Add early exit for carry

* Formatting

* Add missing concat

* Rename macro

* Fix deep nesting

* Formatting

* Fix cppcheck issues

* Remove else

* Move attribute to typedef

* Formatting

* Disable maybe-uninitialized warning since it's broken on gcc

* Add constexpr default constructor

* Formatting

* Fix compiler warnings

* Fix adjust_allocation test
Co-authored-by: Shucai Xiao <shucai@gmail.com>
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

a0b570b2

05 Feb, 2021 1 commit

Normalize compute methods (#723) · 549cfe72

Paul Fultz II authored Feb 05, 2021



* Normalize compute functions

* Formatting

* Save normalization flag to the file

* Formatting

* Remove tuned functions

* Formatting

* Use in_index
Co-authored-by: Shucai Xiao <shucai@gmail.com>
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

549cfe72

14 Dec, 2020 1 commit

Use dnnl for cpu backend (#688) · 406afeb8

Paul Fultz II authored Dec 14, 2020



* Add flag to enable cpu backend

* Make buffers shared

* Enable optimizations

* Add onednn

* Formatting

* Formatting

* Add dnnl header

* Formatting

* Rewrite rnn first

* Formatting

* Call reference implementation

* Formatting

* Make literal data shared

* Formatting

* Add convolution

* Formatting

* Compensate for dilation

* Formatting

* Use name/make_op instead

* Formatting

* Rename gemm header

* Formatting

* Add dnnl convolution/gemm operators

* Formatting

* Add eliminate_contiguous

* Add faster pointwise operators

* Formatting

* Formatting

* Formatting

* Add dnnl op class

* Formatting

* Add add op

* Formatting

* Add concat operator

* Formatting

* Add more ops

* Create descriptor during finalization

* Formatting

* Dont rewrite pooling

* Enable memory coloring

* Formatting

* Add output aliases

* Formatting

* Fix errors

* Formatting

* Convert literals

* Add missing file

* Remove batch_norm

* Formatting

* Use strides

* Formatting

* Add some debug checks

* Formatting

* Fix bug in adjusting shape for gemm

* Formatting

* Fix fallback dot operator

* Zero initialize buffers

* Add support for group convolutions

* Formatting

* Make adjust allocation target independent

* Formatting

* Enable adjust_allocation for gpu/cpu

* Formatting

* Add copy to allocation model

* Formatting

* Add copy operator

* Formatting

* Better handling of output parameters in adjust_allocation

* Formatting

* Build with dnnl

* Make dnnl required

* Fix compile error

* Tidy fixes

* Formatting

* Tidy fixes

* Formatting

* Fix more tidy issues

* Formatting

* Add mul op

* Add mul op

* Set c compiler to clang as well

* Compensate for normalized compute shape

* Formatting

* Fix cppcheck errors

* Formatting

* Add onednn library to hcc

* Guard clang pragmas

* Disable cpu mode for gcc for now

* Leave it enabled for gcc 7

* Fix cppcheck suppression

* Fix compile error on gcc 5

* Remove unused code
Co-authored-by: Shucai Xiao <shucai.xiao@amd.com>
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

406afeb8

16 Nov, 2020 1 commit

Normalize ops (#667) · 8443ecd1

Shucai Xiao authored Nov 16, 2020



* add a pass to normalize ops

* clang format

* add unit tests

* clang format

* code backup

* clang format

* code backup

* clang format

* add support for slice in the normalize_op function

* clang format

* add operation method api for whether we need to call normalize_op

* clang format

* fix review comments

* clang format

* rename a function name

* clang format

* change compute_shape to normalize_compute_shape for corresponding operators

* clang format

* remove unnecessary code

* fix various issues

* clang format

* add attributes to operators having axis attributes

* clang format

* fixed jenkins build error

* clang format

* fix a bug related to slice

* clang format

* code backup

* clang format

* code backup

* clang format

* rename a file

* fix cppcheck error

* some code refinement

* clang format

* change attributes to enum

* clang format

* refine the enum

* clang format

* remove unnecessary code

* add unit tests for more code coverage and fixed a bug

* clang format

* remove unnecessary changes

* change normalize_axes to normalize

* clang format

* revert back the changes in broadcast.hpp

* rename normalize_axes to normalize

* fix review comments

* clang format

* Add flag to enable cpu backend

* Make buffers shared

* Enable optimizations

* Formatting

* Try to avoid ambiguous assign in value class

* fixed a build error

* clang format

* add the normalize_ops pass to the ref target

* refactor program to module to normalize_ops pass
Co-authored-by: Paul <pfultz2@yahoo.com>
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

8443ecd1

10 Nov, 2020 1 commit

Add flag to enable cpu backend (#680) · d39e51ed

Paul Fultz II authored Nov 10, 2020

* Add flag to enable cpu backend

* Make buffers shared

* Enable optimizations

* Formatting

* Enable cpu backend for gcc builds

d39e51ed

08 Oct, 2020 1 commit

Add build flag for fast math (#639) · a5065265

kahmed10 authored Oct 08, 2020



* add flag

* formatting

* remove env variable

* fix api expression

* add api test

* add api test

* add op test

* formatting

* fix function name

* fix syntax

* formatting

* modify test

* remove test and update doc

* move test to new file

* formatting

* revert test files

* rewrite check

* New
Co-authored-by: Paul Fultz II <pfultz2@yahoo.com>

a5065265

14 Sep, 2020 1 commit

Some perf improvements to bert (#627) · 9f283810

Paul Fultz II authored Sep 14, 2020



* Fuse gemm in fuse ops

* Formatting

* Add const ref

* Remove assert

* Skip already fused gemms

* Skip already fused gemm

* Formatting

* Use float_equal

* Avoid non-standard shapes for inputs

* Formatting
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

9f283810

10 Sep, 2020 1 commit

Add load/save function for program (#623) · 63c5582a

Paul Fultz II authored Sep 09, 2020



* Add save/load functions

* Formatting

* Add loading and saving to the driver

* Formatting

* Add return

* Serialize the context with the program

* Formatting

* Add python API

* Formatting

* Add c/c++ apis

* Formatting

* Add tests

* Formatting

* Fix tidy error

* Fix python doc

* Restore python code

* Add function name to errors

* Formatting

* Use lvalue for writing

* Serialize context

* Fix convolution and pooling operator for miopen

* Formatting

* Add const ref

* Set target name to gpu

* Add target tests

* Formatting

* Move register target to cpp file

* Fix target test

* Use make_target in driver

* Formatting

* Use make_target for the API

* Formatting

* Add cpu include

* Increase timeout

* Add more tests

* Formatting
Co-authored-by: Shucai Xiao <shucai.xiao@amd.com>
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

63c5582a

13 Aug, 2020 1 commit

integrate onnx backend test suite into migraphx (#574) · d612e976

Shucai Xiao authored Aug 13, 2020



* initial progress

* formatting

* add pooling changes

* formatting

* change eliminate_pad

* formatting

* rename var

* formatting

* update op shape test and compute

* formatting

* revert conv constructor

* formatting

* change initializer

* formatting

* fix tidy

* change quant conv and shape check

* add tests and fixes

* formatting

* fix type

* fix conv test

* formatting

* add pooling and bn tests

* formatting

* add inconsistent attr tests

* fix padding issue

* formatting

* progress on 1d to 2d

* formatting

* change compute and compile functions

* formatting

* fix duplicate

* fix conflict

* fix issue with 1d conv

* formatting

* add check for 3d limit

* rename function

* formatting

* update to MIOpen 2.3

* add support for nd pooling

* formatting

* test miopen 2.4

* change function name

* rename functions

* formatting

* add op_shape test

* add gpu ops tests

* formatting

* initial progress

* formatting

* add pkg-config

* add support for asymmetric padding of averagepool

* clang format

* fix bug for average pooling

* clang format

* fix a bug

* add unit tests for the asymmetric padding of averagepool

* clang format

* change functions

* formatting

* additional code refinement

* clang format

* check existing tests

* formatting

* change to copy_backward

* formatting

* change for loop to transform

* formatting

* add tests

* formatting

* remove comment

* add more tests

* remove an optimization for pooling

* clang format

* add and fix unit tests

* clang format

* update gpu miopen calls

* formatting

* initial progress

* add cpu impl and tests

* formatting

* add NOLINT

* add 3d test

* formatting

* add more op_shape tests

* test diff miopen version

* add submodule onnx

* add pooling shape tests

* fix error msg

* add onnx_test_backend

* reorganize python code

* temp disable test

* fix cppcheck error

* fix cppcheck error

* code backup

* add support device choice

* refine onnx backend test

* revert to miopen 2.4

* fix review comments

* fix review comments

* clang format

* fixed review comments

* clang format

* fix cppcheck error

* copy onnx_backend_test to dest when building

* add testdata folder

* fix bounds

* formatting

* code backup

* code backup

* remove unnecessary file

* fix various bugs

* remove unnecessary changes

* remove unnecessary submodule

* remove unnecessary lines

* fix algorithm

* formatting

* refine onnx backend unit tests

* pin numpy version

* fix build issue

* fixed a filename to be copied

* add the onnx dependency in docker image

* ensure results are copied back correctly

* specify onnx version

* update excluded tests

* remove unnecessary log info

* turn on more unit tests

* restrict onnx backend test to python 3.x

* clang format

* refine retrieving the input parameters

* clang format

* fix program input parameter names

* clang format

* avoid running onnx test in python 2.x

* fix cppcheck error

* fix python2.7 backend unit tests error

* clang format

* resolve the issue of ensuring the data copy is completed

* clang format

* fix review comments

* fix onnx backend unit test error

* another change to make onnx backend test pass

* clang format

* fix onnx backend test error

* clang format

* disable onnx backend test to try

* build try

* update Dockerfile to try onnx backend test

* remove unnecessary code

* fix a bug in copying program

* clang format

* update dockerfile to include onnx

* fix review comments

* add the pytest module to the container

* exclude real models to avoid downloading them

* resolve the sync device for data copy from gpu to cpu

* clang format

* fix review comments

* clang format

* move sync_device after memory_coloring
Co-authored-by: Khalique <15948690+kahmed10@users.noreply.github.com>
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>
Co-authored-by: Paul Fultz II <pfultz2@yahoo.com>

d612e976

10 Jul, 2020 1 commit

Optimize multiply across slices (#568) · e66968a2

Paul Fultz II authored Jul 10, 2020



* Add initial optimization when using a mul over a sliced convolution

* Formatting

* Add more tests

* Formatting

* Convert to an assert

* Check if used once

* Formatting

* Add test with horiz fusion

* Formatting

* Optimize nested slice

* Formatting

* Fix test

* Add const refs

* Remove unnecessary assert
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

e66968a2

20 May, 2020 1 commit

Rnn variable seq lengths (#517) · 90200619

Shucai Xiao authored May 19, 2020



* code backup

* clang format

* fix compiling errors

* clang format

* rename a few files

* rename a few files

* fix variable bugs

* clang format

* add an operator to shift input sequences

* clang format

* fixed a bug

* clang format

* fixed a bug

* clang format

* code backup

* clang format

* code backup

* clang format

* code backup

* clang format

* refine code related lstm operator optimization

* clang format

* fix various bugs

* clang format

* fixed a bug in rewrite_lstm

* clang format

* fixed another bug

* refine two operator names

* clang format

* refine file names

* fix cppcheck error

* clang format

* fix cppcheck error

* clang format

* fix cppcheck error

* fixed review comments

* clang format

* add unit tests

* clang format

* add unit tests

* clang format

* refine unit tests for better coverage

* clang format

* fixed a bug

* fix cppcheck error

* fix review comments

* clang format

* rename two operators according to review comments

* clang format

* fix review comments

* clang format

* fix review comments

* clang format

* fix review comments

* fix a cppcheck error

* clang format

* fix review comments

* clang format
Co-authored-by: Shucai Xiao <scxiao@prj47-rack-99.local.lan>
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

90200619

08 May, 2020 1 commit

Horizontal fusions of gemms and convolutions (#472) · 1a4ff504

Paul Fultz II authored May 08, 2020



* Add decompose pass

* Add decompose test

* Formatting

* Add remap

* Formatting

* Add compute method for dot

* Formatting

* Add finder for horizontal fusion

* Formatting

* Formatting

* Reuse predicate

* Add gemm fusions

* Formatting

* Add some fixes for convolution

* Formatting

* Fix shape tests

* Formatting

* Reuse axis equal

* Add initial split fusion

* Formatting

* Update offset

* Workaround outputs that cant accept nonstandard shapes

* Formatting

* Add check for split concat

* Formatting

* Add missing headers

* Formatting

* Add tests

* Formatting

* Add more testing

* Formatting

* Fix when there are duplicate splits in inputs

* Formatting

* Fix mismatch iterators

* Add tests for dot fusions

* Formatting

* Add test for convolution

* Formatting

* Fix tidy issues

* Add more tests

* Formatting

* Ignore build directory for codecov

* Add test for groups

* Formatting

* Add more tests for groups

* Formatting

* Add test for missing end slice

* Add newline

* Remove unused function

* Add support for when beta is not 1

* Formatting

* Add test for scalar

* Add one more scalar test
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

1a4ff504

15 Nov, 2019 1 commit

Add option to do offload copying automatically (#403) · 81b0ff5d

Paul Fultz II authored Nov 15, 2019

* Add compiler options

* Add copy operators

* Formatting

* Use run_passes in tests

* Formatting

* Use run_pass in schedule test

* Formatting

* Add compile_options to get_passes in target

* Formatting

* Offload copy option

* Formatting

* Copy using pinned memory

* Formatting

* Improve performance of gpu copying

* Formatting

* Dont copy

* Formatting

* Always make an extra copy

* Formatting

* Remove unused write op

* Add missing include

* Remove copy_to_gpu function in python api

* Make offload copy disabled by default on C++

* Formatting

* Fix tidy issues

* Formatting

* Fix namespace

* Fix python tests

* Turn clang format off since it's broken

* Fix compile error on gcc 5

* Remove commented code

81b0ff5d

04 Nov, 2019 1 commit

Fix accuracy issue in resnet50 (#395) · 78c83426

Paul Fultz II authored Nov 04, 2019

* Fix bug in eliminate_concat

* Formatting

* Skip context_free operators

* Formatting

* Fix unit test

* Formatting

78c83426

30 Oct, 2019 1 commit

Enable scheduler for 1 stream (#399) · ca17bcd6

Paul Fultz II authored Oct 30, 2019

* Enable scheduler for 1 stream

* Formatting

* Improve performance of sorting

* Formatting

* Adjust the weight calculation

* Formatting

* Simplify formula

* Formatting

* Avoid division by zero

* Fix scheduler test

* Check for either 1 or 2

* Check for waits when order may change

* Formatting

ca17bcd6

28 Aug, 2019 1 commit
- Fix bug in cse pass · 1290f3ba
  Paul authored Aug 28, 2019
  
  1290f3ba
26 Aug, 2019 2 commits
- clang format · 8fbd2874
  Shucai Xiao authored Aug 26, 2019
  
  8fbd2874
- refine int8 quantization interface · 12ff93ac
  Shucai Xiao authored Aug 26, 2019
  
  12ff93ac
16 Aug, 2019 1 commit
- Replace with literal · b324b9ed
  Paul authored Aug 16, 2019
  
  b324b9ed
15 Aug, 2019 1 commit
- Propagate const for contiguous · a6c9ad80
  Paul authored Aug 14, 2019
  
  a6c9ad80
12 Aug, 2019 1 commit
- Add a pass to transform pooling to reduce_mean · 379ef733
  Paul authored Aug 12, 2019
  
  379ef733
06 Aug, 2019 1 commit
- add a pass for packing int8 op inputs · e60aff63
  Shucai Xiao authored Aug 06, 2019
  
  e60aff63
10 Jul, 2019 1 commit
- Rename batchnorm pass · 5c7bee3a
  Paul authored Jul 10, 2019
  
  5c7bee3a
09 Jul, 2019 1 commit
- Check for adds being used once · db3f1478
  Paul authored Jul 09, 2019
  
  db3f1478
28 Jun, 2019 1 commit
- Change pass ordering to improve performance · 692274e5
  Paul authored Jun 28, 2019
  
  692274e5
10 May, 2019 1 commit
- remove unnecessary files. · 121e750b
  Shucai Xiao authored May 10, 2019
  
  121e750b
09 May, 2019 1 commit
- temp changes. · 03afa098
  Shucai Xiao authored May 09, 2019
  
  03afa098
06 May, 2019 1 commit
- first implementation of calling GPU int8 gemm correctly. · ab768083
  Shucai Xiao authored May 06, 2019
  
  ab768083
17 Apr, 2019 2 commits
- add a pass to adjust gpu memory allocation · bbd97e94
  Shucai Xiao authored Apr 17, 2019
  
  bbd97e94
- Make reshape require standard shape input · b49d8e66
  Paul authored Apr 17, 2019
  
  b49d8e66
16 Apr, 2019 1 commit
- add a pass to resolve the problem that hip_allocation shape is different from instruction output shape. · a0e4cdb6
  Shucai Xiao authored Apr 16, 2019
  
  a0e4cdb6
13 Apr, 2019 1 commit
- Rename const prop pass · 751baff3
  Paul authored Apr 12, 2019
  
  751baff3
28 Mar, 2019 1 commit
- renamed to eliminate_pad, changed symmetric function · faddc14e
  Khalique authored Mar 28, 2019
  
  faddc14e
26 Mar, 2019 2 commits
- Add an env var to enable the scheduler · 458ec149
  Paul authored Mar 26, 2019
  
  458ec149
- initial progress on pad_rewrite, fixes inceptionv3 onnx perf · 9f25ffb7
  Khalique authored Mar 25, 2019
  
  9f25ffb7
19 Mar, 2019 1 commit
- added tests, adjusted pass to eliminate identities only · fd150551
  Khalique authored Mar 19, 2019
  
  fd150551