Commits · c7419a9c561ac87e858406ea510ec6640a9efcdc · gaoqiong / MIGraphX

"vscode:/vscode.git/clone" did not exist on "3fc706744ab32b768111971e8f60680161451c20"

09 Feb, 2022 1 commit
- Enable pointwise fusion by default (#1082) · c7419a9c
  Paul Fultz II authored Feb 09, 2022
```
There is now a MIGRAPHX_DISABLE_POINTWISE_FUSION to disable it
```
  c7419a9c
11 Nov, 2021 1 commit

Conditionally enable pointwise fusion (#992) · 157935ff

Paul Fultz II authored Nov 10, 2021

This enables the pointwise fusions using the MIGRAPHX_ENABLE_POINTWISE_FUSION env variable. Its disabled by default since MIOpen fusions need to be refactored.

This also adds a compile_ops pass to compile the pointwise modules. All tests except test_gpu_fast_math passes with MIGRAPHX_ENABLE_POINTWISE_FUSION=1 set.

157935ff

08 Oct, 2021 1 commit

Remove alpha and beta from `dot` and `quant_dot` (#961) · 21193e87

Umang Yadav authored Oct 08, 2021

Previously dot operator was defined as C = alpha * A . B + beta * C where * is scalar multiplication and . is dot product or matrix multiplication depending on dimension of the inputs.

Aim is to have the definition of dot operator as C = A . B without having alpha or beta.

In order to achieve the same effect as alpha and beta (1) it multiplies the one of the inputs to the dot operator with alpha value. (2) if beta is present then, multiplies the C with beta and then adds into the output from step 1.

21193e87

17 Sep, 2021 2 commits

Revert "Remove alpha and beta attributes from dot operator (#945)" (#957) · 985f58b0
Paul Fultz II authored Sep 17, 2021
```
This reverts commit 9e43cb8b.
```
985f58b0

Remove alpha and beta attributes from dot operator (#945) · 9e43cb8b

Umang Yadav authored Sep 17, 2021

This PR aims to remove alpha and beta attributes from dot operator completely.

Previously dot operator was defined as C = alpha * A . B + beta * C where * is scalar multiplication and . is dot product or matrix multiplication depending on dimension of the inputs.

Aim is to have the definition of dot operator as C = A . B without having alpha or beta.

9e43cb8b

10 Sep, 2021 1 commit
- Assert the shape for compute and compute_shape are the same (#936) · 8b4c69c5
  Paul Fultz II authored Sep 10, 2021
```
Assert shapes dont change
Co-authored-by: Shucai Xiao <Shucai.Xiao@amd.com>
```
  8b4c69c5
18 Aug, 2021 1 commit

Optimize Q/DQ Format Pass (#889) · 0b5f33b6

turneram authored Aug 18, 2021

* Add operators, refactor parsers, add rewrite passes, add tests

* Add ref implementations

* Move broadcasting of scales and zero points to onnx parser

* Allow for x and zero_point to have different types in quantizelinear; fix zero_point default type

* Switch certain variables to int64_t

* Fix overflow in implicit constant conversion

* Remove operators.hpp from includes in tf_test.cpp

* Add conversion for int32 input to quantizelinear and add test case; remove operators.hpp from onnx_test.cpp includes

* Switch dequantizelinear math from int32 to float

* Remove changes to operators.hpp

* Simplify apply_quantizelinear

* Add verify test for int32 data

* Add rewrite_quantization back to CMakeLists

* Add passes to insert qdq after add_bias is applied, replace quant_ops, and remove remaining qdq pairs

* Renaming, refactoring, cleaning up code, adding formal test, and adding passes to targets

* Renaming, review comments, begin adding more specific tests

* Add more specific unit tests

* Fix failing test on CI

* Correct matcher and update qop rewriting, update tests and add more tests

* Update matcher, clean up simplify_qdq, tweak tests

* Add tests, remove pass from CPU target, update dot parameters, clean up simplify_qdq

* Fix correctness bug in ref q/dq implementations; edit gemm parser to make beta always 0.0

* Remove unused variables in onnx gemm tests

0b5f33b6

15 Jul, 2021 1 commit

Quantize linear ops (#843) · 3282e01a

turneram authored Jul 15, 2021

* Add operators, refactor parsers, add rewrite passes, add tests

* Formatting

* Fix cppcheck

* Review comments

* Formatting

* Combine rewrite passes

* Formatting

* Add ref implementations

* Formatting

* Review comments

* Formatting

* Tidy warnings

* Apply review comments

* Formatting

* Fix CI error

* Formatting

* Increase code coverage

* Formatting

* Move broadcasting of scales and zero points to onnx parser

* Formatting

* Allow for x and zero_point to have different types in quantizelinear; fix zero_point default type

* Formatting

* Increase code coverage

* Formatting

* Switch certain variables to int64_t

* Formatting

* Fix overflow in implicit constant conversion

* Formatting

* Increase code coverage

* Formatting

* Remove operators.hpp from includes in tf_test.cpp

* Formatting

* Add conversion for int32 input to quantizelinear and add test case; remove operators.hpp from onnx_test.cpp includes

* Formatting

* Switch dequantizelinear math from int32 to float

* Formatting

* Remove changes to operators.hpp

* Simplify apply_quantizelinear

* Formatting

* Add verify test for int32 data

* Add rewrite_quantization back to CMakeLists

3282e01a

08 Jul, 2021 1 commit

Preallocate parameters on the CPU and unify preallocations (#840) · 427fc25c

Paul Fultz II authored Jul 08, 2021



* Add preallocate method

* Add preallocate_param pass

* Preallocate buffers on the cpu

* Formatting

* Preallocate on the gpu

* Add missing cpp file

* Formatting

* Add lifetime function

* Formatting

* Always allocate

* Fix tidy warning

* Add const

* Add missing lifetime annotations
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

427fc25c

27 Jun, 2021 1 commit

Inline subgraph (#802) · bc52a8a8

Shucai Xiao authored Jun 27, 2021



* Add definitions for all pointwise operators

* Formatting

* Add cpp generator class

* Formatting

* Move compilation to core

* Formatting

* Add clock to tmp name

* Add dynamic loader

* Formatting

* Add tests for code gen

* Formatting

* Add test for literals

* Formatting

* Use with_char

* Add missing header

* Fix mismerge

* Ignore tidy warning

* Fxx gcc 5 errors

* Apply fixits

* Skip signed bitwise of status

* Remove unused parameters

* Explicitly add c++14 flag

* Fix tidy warning

* unify the compute function signature

* clang format

* make another change

* unify the compute function

* clang format

* remove unnecessary code

* more refinement about the operator compute funciton

* clang format

* add an overload function

* clang format

* add support for axes inputs for sequeeze/unsqueeze/reduce_sum

* clang format

* fix build problems

* backup code changes

* clang format

* Add tuple type to shape class

* Formatting

* fix a bug in parsing quantizelinear operator

* clang format

* fix a cppcheck error

* disable different versions of unit tests for different onnx version

* clang format

* upgrade onnx to 1.8

* update onnx to 1.8.1

* disable two more real models

* clang format

* Make data member private

* Formatting

* Add sub arguments

* Formatting

* Trun clang format off

* Disable clang-format

* fix review comments

* fix the function of assign axes in parsing the squeeze operator

* add unit tests and fix a bug

* clang format

* fix review comments

* clang format

* fix a build error

* backup code changes

* clang format

* add more unit tests and add parsing opset version

* clang format

* Improve visiting tuples

* Formatting

* fix cppcheck error

* adding installing the onnx package

* resolve no protobuf compiler

* add an inline subgraph pass

* clang format

* Add more argument tests

* Formatting

* Handle tuple in load

* Formatting

* code backup

* clang format

* Remove .o files

* Add tuple type to api

* Formatting

* fix build errors

* clang format

* code backup

* code backup

* add unit tests for the inline subgraph

* clang format

* refine the inline subgraph and parse if operator

* clang format

* fix cppcheck issue

* clang format

* add unit test for inline subgraph pass

* clang format

* fix format issue

* remove the context from the if operator

* clang format

* simplify the compute functions

* Fix tidy warnings

* fix cppcheck error

* clang format

* fix cppcheck error

* Fix tidy warnings

* fix a cppcheck error

* clang format

* Add a test for share method

* Formatting

* Add a test cpp_type

* add unit tests for more code coverage

* clang format

* add unit tests to have more code coverage

* clang format

* try a comment in jenkins build

* include the install onnnx line

* code backup

* reorder the dependenciesd installed

* refine dockerfile

* fix review comments

* clang format

* remove unnecessary overload function

* fix cppcheck error

* change back the argument test

* Suppress tidy warning

* add the operator get_tuple_elem

* clang format

* add get_tuple_elem to operator include file

* chang if to support multiple operation outputs

* clang format

* optimize inline subgraph

* clang format

* code backup

* clang format

* fix bug

* refine unit tests for tuple output of the if operator

* clang format

* refine a instruction replacement code

* add a unit test and sort all the unit tests alphabetically

* fix cppcheck error

* add more unit tests for multiple op outputs

* clang format

* fix cppcheck error

* Update pass manager to get modules after every pass

* more unit test to cover more scenarios

* clang format

* fixed a bug in a unit test

* add more tests

* clang format

* add more unit tests to have more code coverage

* fix a bug in a unit test

* Add program overload for module

* Formatting

* Hash modules for quicker lookup of modules

* Bump file version

* Add methods to remove modules

* Formatting

* add the tuple type to the support list

* Eliminate unused modules

* Formatting

* Fix test errors

* Foramtting

* Fix tidy issues

* fix problem related to inline subgraph

* clang format

* fix review comments

* fix review comments

* fix review comments

* fix review comments

* clang format

* fix a unit test

* one more code change

* remove an optimization related to the if operator

* clang format

* fix review comments
Co-authored-by: Paul <pfultz2@yahoo.com>
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

bc52a8a8

09 Jun, 2021 1 commit

Asym pad refactor (#791) · 9a5e0c06

kahmed10 authored Jun 09, 2021



* alternative impl

* formatting

* add gpu pass to insert pad

* formatting

* update onnx test, still need cleanup

* formatting

* update tf_test

* modify existing tests

* formatting

* remove print

* code cleanup

* formatting

* code cleanup

* formatting

* fix tidy and cppcheck

* remove variable

* add test

* formatting

* add test and address comments

* formatting
Co-authored-by: Shucai Xiao <shucai@gmail.com>
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

9a5e0c06

03 May, 2021 1 commit

Reduce types generated for hip kernels (#814) · 3becd974

Paul Fultz II authored May 03, 2021



* Remove unused data types

* Formatting

* Reduce types generated for hip kernels

* Formatting

* Fix onnx tests

* Formatting
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

3becd974

29 Apr, 2021 1 commit

MLIR MIOpen Dialect integration (phase 1) (#768) (#769) · 56584fa2

SJW authored Apr 29, 2021



* MLIR MIOpen Dialect integration (phase 1) (#768)

* Added Findmlir.cmake (using environment variables to import)

* Added mlir_conv pass to GPU target

  * Apply to any gpu::convolution if supported by MLIR

  * Call MLIR C-API to generate iGEMM kernel with configuration from gpu::convolution

  * Capture binary in dictionary for matching convolutions

  * Build a code_object_op with the binary and execution dimensions

  * Substitute for the gpu::convolution

* Changed the parameters for the code_object to reflect the generated MLIR kernel

* Expanded out MemRefDescriptor fields in param list

* Also updated for MLIR C-API changes

* * fixed global_size calculation

* MLIR MIOpen Dialect integration (phase 1) (#768)

* Added Findmlir.cmake (using environment variables to import)

* Added mlir_conv pass to GPU target

  * Apply to any gpu::convolution if supported by MLIR

  * Call MLIR C-API to generate iGEMM kernel with configuration from gpu::convolution

  * Capture binary in dictionary for matching convolutions

  * Build a code_object_op with the binary and execution dimensions

  * Substitute for the gpu::convolution

* Changed the parameters for the code_object to reflect the generated MLIR kernel

* Expanded out MemRefDescriptor fields in param list

* Also updated for MLIR C-API changes

* * Added command line option: --enable_mlir

* * fixed command line switch

* updated for new MLIR API changes

* * Added cget llvm-project-mlir to import MIIR API libraries into Dockerfile
  * removed cmake Findmlir

* updated for changes in MIIR C-API

* * updated CMakeLists.txt to allow disable of MLIR import

* fixed memory leaks and removed copies

* updated for 5D memrefs

* * formatting

* * fixed review comments

* * fixed merge issues

* hip gcnDeviceName now includes specifiers at the end
  * use major/minor values instead

* * disable MLIR by default

* * removed command-line switch --enable-mlir

* * fix unused when MLIR disabled

* * enable jenkins enable/test MLIR

* * format

* * fixed clang-tidy

* * added new type
Co-authored-by: Paul Fultz II <pfultz2@yahoo.com>
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

56584fa2

26 Feb, 2021 1 commit

Add more supported operators and optimizations for the cpu backend (#746) · a0b570b2

Paul Fultz II authored Feb 26, 2021



* Add eliminate_data_type pass

* Formatting

* Auto convert quant ops

* Formatting

* Flip the order of decompose

* Compute max size differently

* Formatting

* Clamp values in convert

* Formatting

* Fix loss of precision in reduce

* Formatting

* Fix bugs in reduction

* Fix accumulator type in reference softmax implementation

* Formatting

* Update convert test

* Remove unused variables

* Remove unnecessary quant_dot check

* Formatting

* Add tests

* Formatting

* Remove unused code

* Remove duplicate ops

* Remove blaze dependency

* Use set since shape::type_t is no hashable on gcc 5

* Formatting

* Add dnnl binary op

* Formatting

* Add binary and eltwise

* Formatting

* Add softmax

* Formatting

* Remove unused operators

* Add missing files

* Formatting

* Add lrn

* Formatting

* Add deconvolution

* Formatting

* Change allocate default

* Add reorder

* Formatting

* Add reductions

* Formatting

* Sort lines

* Change literals in another loop

* Add pow operator

* Formatting

* Add pow operator

* Formatting

* Make sure shapes are packed

* Allow broadcasted inputs

* Remove unused operators

* Simplify functions

* Remove softmax

* Add sub and erf functions

* Formatting

* Fix bug

* Formatting

* Improve parallism

* Formatting

* Allow multiple batch dimensions

* Formatting

* Move literal transforms out of lowering

* Formatting

* Add gather operator

* Sort lines

* Add early exit for carry

* Formatting

* Add missing concat

* Rename macro

* Fix deep nesting

* Formatting

* Fix cppcheck issues

* Remov else

* Move attribute to typedef

* Formatting

* Disable maybe-uninitialized warning since its broken on gcc

* Add constexpr default constructor

* Formatting

* Fix compiler warnings

* Fix adjust_allocation test
Co-authored-by: Shucai Xiao <shucai@gmail.com>
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

a0b570b2

05 Feb, 2021 1 commit

Normalize compute methods (#723) · 549cfe72

Paul Fultz II authored Feb 05, 2021



* Normalize compute functions

* Formatting

* Save normalization flag to the file

* Formatting

* Remove tuned functions

* Formatting

* Use in_index
Co-authored-by: Shucai Xiao <shucai@gmail.com>
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

549cfe72

14 Dec, 2020 1 commit

Use dnnl for cpu backend (#688) · 406afeb8

Paul Fultz II authored Dec 14, 2020



* Add flag to enable cpu backend

* Make buffers shared

* Enable optimizations

* Add onednn

* Formatting

* Formatting

* Add dnnl header

* Formatting

* Rewrite rnn first

* Formatting

* Call reference implementation

* Formatting

* Make literal data shared

* Formatting

* Add convolution

* Formatting

* Compensate for dilation

* Formatting

* Use name/make_op instead

* Formatting

* Rename gemm header

* Formatting

* Add dnnl convolution/gemm operators

* Formatting

* Add eliminate_contiguous

* Add faster pointwise operators

* Formatting

* Formatting

* Formatting

* Add dnnl op class

* Formatting

* Add add op

* Formatting

* Add concat operator

* Formatting

* Add more ops

* Create descriptor during finalization

* Formatting

* Dont rewrite pooling

* Enable memory coloring

* Formatting

* Add output aliases

* Formatting

* Fix errors

* Formatting

* Convert literals

* Add missing file

* Remove batch_norm

* Formatting

* Use strides

* Formatting

* Add some debug checks

* Formatting

* Fix big in adjusting shape for gemm

* Formatting

* Fix fallback dot operator

* Zero initialize buffers

* Add suport for group convolutions

* Formatting

* Make adjust allocation target independent

* Formatting

* Enable adjust_allocation for gpu/cpu

* Formatting

* Add copy to allocation model

* Formatting

* Add copy operator

* Formatting

* Better handling of output parameters in adjust_allocation

* Formatting

* Build with dnnl

* Make dnnl required

* Fix compile error

* Tidy fixes

* Formatting

* Tidy fixes

* Formatting

* Fix more tidy issues

* Formatting

* Add mul op

* Add mul op

* Set c compiler to clang as well

* Compensate for normalized compute shape

* Formatting

* Fix cppcheck errors

* Formatting

* Add onednn library to hcc

* Guard clang pragmas

* Disable cpu mode for gcc for now

* Leave it enabled it for gcc 7

* Fix cppcheck suppresion

* Fix compile error on gcc 5

* Remove unused code
Co-authored-by: Shucai Xiao <shucai.xiao@amd.com>
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

406afeb8

16 Nov, 2020 1 commit

Normalize ops (#667) · 8443ecd1

Shucai Xiao authored Nov 16, 2020



* add a pass to normalize ops

* clang format

* add unit tests

* clang format

* code backup

* clang format

* code backup

* clang format

* add support for slice in the normalize_op function

* clang format

* add operation method api for whether we need to call normalize_op

* clang format

* fix review comments

* clang format

* rename a function namejJ

* clang format

* change compute_shape to normalize_compute_shape for corresponding operators

* clang format

* remove unnecessary code

* fix various issues

* clang format

* add attributes to operators having axis attributes

* clang format

* fixed jenkins build error

* clang format

* fix a bug related to slice

* clang format

* code backup

* clang format

* code backup

* clang format

* rename a file

* fix cppcheck error

* some code refinement

* clang format

* change attributes to enum

* clang format

* refine the enum

* clang format

* remove unnecessary code

* add unit tests for more code coverage and fixed a bug

* clang format

* remove unnecessary changes

* change normalize_axes to normalize

* clang format

* revert back the changes in broadcast.hpp

* rename normalize_axes to normalize

* fix review comments

* clang format

* Add flag to enable cpu backend

* Make buffers shared

* Enable optimizations

* Formatting

* Try to avoid ambiguous assign in value class

* fixed a build error

* clang format

* add the normalize_ops pass to the ref target

* refactor program to module to normalize_ops pass
Co-authored-by: Paul <pfultz2@yahoo.com>
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

8443ecd1

10 Nov, 2020 1 commit

Add flag to enable cpu backend (#680) · d39e51ed

Paul Fultz II authored Nov 10, 2020

* Add flag to enable cpu backend

* Make buffers shared

* Enable optimizations

* Formatting

* Enable cpu backend for gcc builds

d39e51ed

08 Oct, 2020 1 commit

Add build flag for fast math (#639) · a5065265

kahmed10 authored Oct 08, 2020



* add flag

* formatting

* remove env variable

* fix api expression

* add api test

* add api test

* add op test

* formatting

* fix function name

* fix syntax

* formatting

* modify test

* remove test and update doc

* move test to new file

* formatting

* revert test files

* rewrite check

* New
Co-authored-by: Paul Fultz II <pfultz2@yahoo.com>

a5065265

14 Sep, 2020 1 commit

Some perf improvements to bert (#627) · 9f283810

Paul Fultz II authored Sep 14, 2020



* Fuse gemm in fuse ops

* Formatting

* Add const ref

* Remove assert

* Skip already fused gemms

* Skip already fused gemm

* Formatting

* Use float_equal

* Avoid non-standard shapes for inputs

* Formatting
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

9f283810

10 Sep, 2020 1 commit

Add load/save function for program (#623) · 63c5582a

Paul Fultz II authored Sep 09, 2020



* Add save/load functions

* Formatting

* Add loading and saving to the driver

* Formatting

* Add return

* Serialize the context with the program

* Formatting

* Add python API

* Formatting

* Add c/c++ apis

* Formatting

* Add tests

* Formatting

* Fix tidy error

* Fix python doc

* Restore python code

* Add function name to errors

* Formatting

* Use lvalue for writing

* Serialize context

* Fix convolution and pooling operator for miopen

* Formatting

* Add const ref

* Set target name to gpu

* Add target tests

* Formatting

* Move register target to cpp file

* Fix target test

* Use make_target in driver

* Formatting

* Use make_target for the API

* Formatting

* Add cpu include

* Increase timeout

* Add more tests

* Formatting
Co-authored-by: Shucai Xiao <shucai.xiao@amd.com>
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

63c5582a

13 Aug, 2020 1 commit

integrate onnx backend test suit to migraphx (#574) · d612e976

Shucai Xiao authored Aug 13, 2020



* initial progress

* formatting

* add pooling changes

* formatting

* change eliminate_pad

* formatting

* rename var

* fomratting

* update op shape test and compute

* formatting

* revert conv constructor

* formatting

* change initializer

* formatting

* fix tidy

* change quant conv and shape check

* add tests and fixes

* formatting

* fix type

* fix conv test

* formatting

* add pooling and bn tests

* formatting

* add inconsistent attr tests

* fix padding issue

* formatting

* progress on 1d to 2d

* formatting

* change compute and compile functions

* formatting

* fix duplicate

* fix conflict

* fix issue with 1d conv

* formatting

* add check for 3d limit

* rename function

* formatting

* update to MIOPen 2.3

* add support for nd pooling

* formatting

* test miopen 2.4

* change function name

* rename functions

* formatting

* add op_shape test

* add gpu ops tests

* formatting

* initial progress

* formatting

* add pkg-config

* add to support asymmetric padding of averagepool

* clang format

* fix bug for average pooling

* clang format

* fix a bug

* add unit tests for the asymmetric padding of averagepool

* clang format

* change functions

* formatting

* additional code refinement

* clang format

* check existing tests

* formatting

* change to copy_backward

* formatting

* change for loop to transform

* formatting

* add tests

* formatting

* remove comment

* add more tests

* remove an optimization for pooling

* clang format

* add and fix unit tests

* clang format

* update gpu miopen calls

* formatting

* initial progress

* add cpu impl and tests

* formatting

* add NOLINT

* add 3d test

* formatting

* add more op_shape tests

* test diff miopen version

* add submodule onnx

* add pooling shape tests

* fix error msg

* add onnx_test_backend

* reorganize python code

* temp disable test

* fix cppcheck error

* fix cppcheck error

* code backup

* add support device choice

* refine onnx backend test

* revert to miopen 2.4

* fix review comments

* fix review comments

* clang format

* fixed review comments

* clang format

* fix cppcheck error

* copy onnx_backend_test to dest when building

* add testdata folder

* fix bounds

* formatting

* code backup

* code backup

* remove unnecessary file

* fix various bugs

* remove unnecessary changes

* remove unnecessary submodule

* remove unnecessary lines

* fix algorithm

* formatting

* refine onnx backend unit tests

* pin numpy version

* fix build issue

* fixed a filename to be copied

* add the onnx dependency in docker image

* ensure results are copied back correctly

* specify onnx version

* update excluded tests

* remove unnecessary log info

* turn on more unit tests

* restrict onnx backend test to python 3.x

* clang format

* refine retrieving the input parameters

* clang format

* fix program input parameter names

* clang format

* avoid running onnx test in python 2.x

* fix cppcheck error

* fix python2.7 backend unit tests error

* clang format

* resolve the issue of ensure data copy to be completed

* clang format

* fix review comments

* fix onnx backend unit test error

* another change to make onnx backend test pass

* clang format

* fix onnx backend test error

* clang format

* disable onnx backend test to try

* build try

* update Dockerfile to try onnx backend test

* remove unnecessary code

* fix a bug in copying program

* clang format

* update dockerfile to include onnx

* fix review comments

* add the pytest module to the container

* exclude real model to avoid to be downloaded

* resolve the sync device for data copy from gpu to cpu

* clang format

* fix review comments

* clang format

* move sync_device after memory_coloring
Co-authored-by: Khalique <15948690+kahmed10@users.noreply.github.com>
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>
Co-authored-by: Paul Fultz II <pfultz2@yahoo.com>

d612e976

10 Jul, 2020 1 commit

Optimize multiply across slices (#568) · e66968a2

Paul Fultz II authored Jul 10, 2020



* Add initial optimization when using a mul over a sliced convolution

* Formatting

* Add more tests

* Formatting

* Convert to an assert

* Check if used once

* Formatting

* Add test with horiz fusion

* Formatting

* Optimize nested slice

* Formatting

* Fix test

* Add const refs

* Remove unnecessary assert
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

e66968a2

20 May, 2020 1 commit

Rnn variable seq lengths (#517) · 90200619

Shucai Xiao authored May 19, 2020



* code backup

* clang format

* fix compiling errors

* clang format

* rename a few files

* rename a few files

* fix variable bugs

* clang format

* add an operator to shift input sequences

* clang format

* fixed a bug

* clang format

* fixed a bug

* clang format

* code backup

* clang format

* code backup

* clang format

* code backup

* clang format

* refine code related lstm operator optimization

* clang format

* fix various bugs

* clang format

* fixed a bug in rewrite_lstm

* clang format

* fixed another bug

* refine two operator names

* clang format

* refine file names

* fix cppcheck error

* clang format

* fix cppcheck error

* clang format

* fix cppcheck error

* fixed review comments

* clang format

* add unit tests

* clang format

* add unit tests

* clang format

* refine unit tests for better coverage

* clang format

* fixed a bug

* fix cppcheck error

* fix review comments

* clang format

* rename two operators according to review comments

* clang format

* fix review comments

* clang format

* fix review comments

* clang format

* fix review comments

* fix a cppcheck error

* clang format

* fix review comments

* clang format
Co-authored-by: Shucai Xiao <scxiao@prj47-rack-99.local.lan>
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

90200619

08 May, 2020 1 commit

Horizontal fusions of gemms and convolutions (#472) · 1a4ff504

Paul Fultz II authored May 08, 2020



* Add decompose pass

* Add decompose test

* Formatting

* Add remap

* Formatting

* Add compute method for dot

* Formatting

* Add finder for horizontal fusion

* Formatting

* Formatting

* Reuse predicate

* Add gemm fusions

* Formatting

* Add some fixes for convolution

* Formatting

* Fix shape tests

* Formatting

* Reuse axis equal

* Add initial split fusion

* Formatting

* Update offset

* Workaround outputs that cant accept nonstandard shapes

* Formatting

* Add check for split concat

* Formatting

* Add missing headers

* Formatting

* Add tests

* Formatting

* Add more testing

* Formatting

* Fix when there is duplicate splits in inputs

* Formatting

* Fix mismatch iterators

* Add tests for dot fusions

* Formatting

* Add test for convolution

* Formatting

* Fix tidy issues

* Add more tests

* Formatting

* Ignore build directory for codecov

* Add test for groups

* Formatting

* Add more tests for groups

* Formatting

* Add test for missing end slice

* Add newline

* Remove unused function

* Add support for when beta is not 1

* Formatting

* Add test for scalar

* Add one more scalar test
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>

1a4ff504

15 Nov, 2019 1 commit

Add option to do offload copying automatically (#403) · 81b0ff5d

Paul Fultz II authored Nov 15, 2019

* Add compiler options

* Add copy operators

* Formatting

* Use run_passes in tests

* Formatting

* Use run_pass in schedule test

* Formatting

* Add compile_options to get_passes in target

* Formatting

* Offload copy option

* Formatting

* Copy using pinned memory

* Formatting

* Improve performance of gpu copying

* Formatting

* Dont copy

* Formatting

* Always make an extra copy

* Formatting

* Remove unused write op

* Add missing include

* Remove copy_to_gpu function in python api

* Make offload copy disabled by default on C++

* Formatting

* Fix tidy issues

* Formatting

* Fix namespace

* Fix python tests

* Turn clang format off since its broken

* Fix compile error on gcc 5

* Remove commented code

81b0ff5d

04 Nov, 2019 1 commit

Fix accuraccy issue in resnet50 (#395) · 78c83426

Paul Fultz II authored Nov 04, 2019

* Fix bug in eliminate_concat

* Formatting

* Skip context_free operators

* Formatting

* Fix unit test

* Formatting

78c83426

30 Oct, 2019 1 commit

Enable scheduler for 1 stream (#399) · ca17bcd6

Paul Fultz II authored Oct 30, 2019

* Enable scheduler for 1 stream

* Formatting

* Improve performance of sorting

* Formatting

* Adjust the weight calculation

* Formatting

* Simplify formula

* Formatting

* Avoid division by zero

* Fix scheduler test

* Check for either 1 or 2

* Check for waits when order may change

* Formatting

ca17bcd6

28 Aug, 2019 1 commit
- Fix bug in cse pass · 1290f3ba
  Paul authored Aug 28, 2019
  
  1290f3ba
26 Aug, 2019 2 commits
- clang format · 8fbd2874
  Shucai Xiao authored Aug 26, 2019
  
  8fbd2874
- refine int8 quantization interface · 12ff93ac
  Shucai Xiao authored Aug 26, 2019
  
  12ff93ac
16 Aug, 2019 1 commit
- Replace with literal · b324b9ed
  Paul authored Aug 16, 2019
  
  b324b9ed
15 Aug, 2019 1 commit
- Propogate const for contiguous · a6c9ad80
  Paul authored Aug 14, 2019
  
  a6c9ad80
12 Aug, 2019 1 commit
- Add a pass to transform pooling to reduce_mean · 379ef733
  Paul authored Aug 12, 2019
  
  379ef733
06 Aug, 2019 1 commit
- add a pass for packing int8 op inputs · e60aff63
  Shucai Xiao authored Aug 06, 2019
  
  e60aff63
10 Jul, 2019 1 commit
- Rename batchnorm pass · 5c7bee3a
  Paul authored Jul 10, 2019
  
  5c7bee3a
09 Jul, 2019 1 commit
- Check for adds beign used once · db3f1478
  Paul authored Jul 09, 2019
  
  db3f1478
28 Jun, 2019 1 commit
- Change pass ordering to improve performance · 692274e5
  Paul authored Jun 28, 2019
  
  692274e5
10 May, 2019 1 commit
- remove unnecessary files. · 121e750b
  Shucai Xiao authored May 10, 2019
  
  121e750b
09 May, 2019 1 commit
- temp changes. · 03afa098
  Shucai Xiao authored May 09, 2019
  
  03afa098