1. 09 Mar, 2022 1 commit
  2. 08 Feb, 2022 1 commit
  3. 10 Jan, 2022 1 commit
  4. 30 Nov, 2021 1 commit
  5. 09 Nov, 2021 1 commit
  6. 08 Oct, 2021 1 commit
    • Remove alpha and beta from `dot` and `quant_dot` (#961) · 21193e87
      Umang Yadav authored
      Previously, the dot operator was defined as C = alpha * A . B + beta * C, where * is scalar multiplication and . is the dot product or matrix multiplication, depending on the dimensions of the inputs.
      
      The aim is to define the dot operator as C = A . B, without alpha or beta.
      
      To achieve the same effect as alpha and beta: (1) one of the inputs to the dot operator is multiplied by the alpha value; (2) if beta is present, C is multiplied by beta and added to the output of step 1.
      21193e87
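The two-step rewrite described in the commit message above can be checked numerically. The following is a minimal sketch in plain Python, not MIGraphX code: the helper names (`matmul`, `scale`, `add`, `dot_with_attributes`, `dot_decomposed`) are hypothetical and chosen only for illustration, and matrices are represented as lists of rows.

```python
def matmul(a, b):
    """Plain matrix product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def scale(m, s):
    """Multiply every element of matrix m by the scalar s."""
    return [[s * x for x in row] for row in m]


def add(m, n):
    """Element-wise sum of two matrices of the same shape."""
    return [[x + y for x, y in zip(r1, r2)] for r1, r2 in zip(m, n)]


def dot_with_attributes(a, b, c, alpha, beta):
    """Old definition: C = alpha * (A . B) + beta * C."""
    return add(scale(matmul(a, b), alpha), scale(c, beta))


def dot_decomposed(a, b, c=None, alpha=1, beta=1):
    """New form: a plain dot plus explicit scaling and addition."""
    # Step 1: fold alpha into one of the inputs, then take a plain dot.
    out = matmul(scale(a, alpha), b)
    # Step 2: only when C is present, scale it by beta and add it in.
    if c is not None:
        out = add(out, scale(c, beta))
    return out


A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
C = [[1, 1], [1, 1]]

old = dot_with_attributes(A, B, C, alpha=2, beta=3)
new = dot_decomposed(A, B, C, alpha=2, beta=3)
print(old == new)  # both forms give the same matrix
```

Integer inputs are used so the comparison is exact; with floating-point data the two forms can differ by rounding, because alpha is applied before the product rather than after it.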
  7. 17 Sep, 2021 2 commits
    • Paul Fultz II authored
      985f58b0
    • Remove alpha and beta attributes from dot operator (#945) · 9e43cb8b
      Umang Yadav authored
      This PR removes the alpha and beta attributes from the dot operator completely.
      
      Previously, the dot operator was defined as C = alpha * A . B + beta * C, where * is scalar multiplication and . is the dot product or matrix multiplication, depending on the dimensions of the inputs.
      
      The aim is to define the dot operator as C = A . B, without alpha or beta.
      
      To achieve the same effect as alpha and beta: (1) one of the inputs to the dot operator is multiplied by the alpha value; (2) if beta is present, C is multiplied by beta and added to the output of step 1.
      9e43cb8b
  8. 09 Jun, 2021 1 commit
    • Asym pad refactor (#791) · 9a5e0c06
      kahmed10 authored
      
      
      * alternative impl
      
      * formatting
      
      * add gpu pass to insert pad
      
      * formatting
      
      * update onnx test, still need cleanup
      
      * formatting
      
      * update tf_test
      
      * modify existing tests
      
      * formatting
      
      * remove print
      
      * code cleanup
      
      * formatting
      
      * code cleanup
      
      * formatting
      
      * fix tidy and cppcheck
      
      * remove variable
      
      * add test
      
      * formatting
      
      * add test and address comments
      
      * formatting
      Co-authored-by: Shucai Xiao <shucai@gmail.com>
      Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>
      9a5e0c06
  9. 25 Mar, 2021 1 commit
    • Add cpu fusion for gelu and layernorm (#761) · 728d083d
      Paul Fultz II authored
      
      
      * Add eliminate_data_type pass
      
      * Formatting
      
      * Auto convert quant ops
      
      * Formatting
      
      * Flip the order of decompose
      
      * Compute max size differently
      
      * Formatting
      
      * Clamp values in convert
      
      * Formatting
      
      * Fix loss of precision in reduce
      
      * Formatting
      
      * Fix bugs in reduction
      
      * Fix accumulator type in reference softmax implementation
      
      * Formatting
      
      * Update convert test
      
      * Remove unused variables
      
      * Remove unnecessary quant_dot check
      
      * Formatting
      
      * Add tests
      
      * Formatting
      
      * Remove unused code
      
      * Remove duplicate ops
      
      * Remove blaze dependency
      
      * Use set since shape::type_t is not hashable on gcc 5
      
      * Formatting
      
      * Add dnnl binary op
      
      * Formatting
      
      * Add binary and eltwise
      
      * Formatting
      
      * Add softmax
      
      * Formatting
      
      * Remove unused operators
      
      * Add missing files
      
      * Formatting
      
      * Add lrn
      
      * Formatting
      
      * Add deconvolution
      
      * Formatting
      
      * Change allocate default
      
      * Add reorder
      
      * Formatting
      
      * Add reductions
      
      * Formatting
      
      * Sort lines
      
      * Change literals in another loop
      
      * Add pow operator
      
      * Formatting
      
      * Add pow operator
      
      * Formatting
      
      * Make sure shapes are packed
      
      * Allow broadcasted inputs
      
      * Remove unused operators
      
      * Simplify functions
      
      * Remove softmax
      
      * Add sub and erf functions
      
      * Formatting
      
      * Fix bug
      
      * Formatting
      
      * Improve parallelism
      
      * Formatting
      
      * Allow multiple batch dimensions
      
      * Formatting
      
      * Move literal transforms out of lowering
      
      * Formatting
      
      * Add gather operator
      
      * Sort lines
      
      * Add early exit for carry
      
      * Formatting
      
      * Add missing concat
      
      * Rename macro
      
      * Fix deep nesting
      
      * Formatting
      
      * Fix cppcheck issues
      
      * Remove else
      
      * Move attribute to typedef
      
      * Formatting
      
      * Disable maybe-uninitialized warning since it's broken on gcc
      
      * Add constexpr default constructor
      
      * Formatting
      
      * Fix compiler warnings
      
      * Fix adjust_allocation test
      
      * Add layernorm matcher
      
      * Add gelu_erf matcher
      
      * Formatting
      
      * Add gelu_tanh matcher
      
      * Formatting
      
      * Remove match namespace
      
      * Formatting
      
      * Use matcher instead of string
      
      * Formatting
      
      * Add fusions
      
      * Formatting
      
      * Make input a const ref
      
      * Make this explicit for gcc 5
      Co-authored-by: Shucai Xiao <shucai@gmail.com>
      Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>
      728d083d
  10. 08 Jan, 2021 1 commit
    • Revamp CI infrastructure (#706) · ceb4ca09
      Paul Fultz II authored
      
      
      * Add build and test github workflow
      
      * Fix cget command
      
      * Remove def-requirements.txt
      
      * Add tmate session to debug workflow
      
      * Run tmate session after installing dependencies
      
      * Print date periodically
      
      * Add clang tidy action
      
      * Separate build and run container in two different jobs
      
      * Run bash script
      
      * Remove interactive flag
      
      * Try to mount the files
      
      * Try to use the github workspace
      
      * Without double braces
      
      * Use env variable
      
      * Pipe bash script in
      
      * Run using hip-clang
      
      * Use correct path
      
      * Add verbose
      
      * Remove j flag
      
      * Only run for onnx file to debug
      
      * Manually run clang-tidy
      
      * Remove quiet flag
      
      * Print header file
      
      * Printout environment
      
      * Remove extra defines
      
      * Remove fixits and config flag
      
      * Show ldd
      
      * Add tmate session
      
      * Run onnx protobuf first
      
      * Generate proto for tensorflow
      
      * Update cppcheck version
      
      * Fix some cppcheck issues
      
      * Add const
      
      * Cppcheck fixes
      
      * Formatting
      
      * Fix more cppcheck issues
      
      * Run two jobs
      
      * Cache analysis and run format checking
      
      * Fix yaml issues
      
      * Fix yaml issues
      
      * Fix indentation
      
      * Switch to hip-clang for main docker file
      
      * Use hip-clang in the readme
      
      * Fixes for jenkins
      
      * Use ccache to build
      
      * Combine file
      
      * Set restore keys
      
      * Change stage name
      
      * Build with ccache
      
      * Add missing dependency for ccache
      
      * Build debug with codecov
      
      * Fix workflow syntax
      
      * Fix list
      
      * Use quotes
      
      * Go to correct build path
      
      * Install lcov
      
      * Use sudo
      
      * Echo all commands
      
      * Setup tmate
      
      * Add verbose output
      
      * Build with cmake directly
      
      * Add pthread flag
      
      * Remove python config
      
      * Continue on error
      
      * Use on or off for cmake flag
      
      * Use always upload cache
      
      * Verbose output
      
      * Verbose output from build
      
      * Build one target
      
      * Reduce debug symbols
      
      * Increase garbage collection
      
      * Remove dmesg
      
      * Increase it to 20
      
      * Update rocm cmake version
      
      * Remove jobs from jenkins
      
      * Run on all 3 ubuntus
      
      * Remove gcc 5 jobs
      
      * Don't add flag on 16.04
      
      * Only upload coverage on 18.04
      
      * Don't build for ubuntu 20.04
      
      * Use matrix.os
      
      * Use O2 for hip-clang since lower optimizations are broken
      
      * Use rocm 3.0
      
      * Pass ccache as cmake variable instead of env variable
      
      * Build miopen from source
      
      * Show ccache statistics
      
      * Print log information
      
      * Set compression level
      
      * Use hash dir
      
      * Set hashdir
      
      * Install clang ocl from system
      
      * Up compression level
      
      * Add locale
      
      * Increase cache size to 1G
      
      * Lower compression level to 9
      
      * Remove split dwarf
      
      * Remove Og
      
      * Add back Og
      
      * Separate debug and codecov
      
      * Add missing backslash
      
      * Garbage collect more often
      
      * Add missing locales package
      
      * Use Os
      
      * Install onednn in docker and run tests
      
      * Include target headers in tests
      
      * Increase timeout
      
      * Remove if condition
      
      * Make flag public
      
      * Suppress memory leaks in onednn
      
      * Use equal
      
      * Add gh annotations
      
      * Update rocm-cmake version
      
      * Add ldconfig
      Co-authored-by: Shucai Xiao <shucai@gmail.com>
      ceb4ca09
  11. 06 Jan, 2021 1 commit
    • Module impl (#678) · c9b86f1c
      Shucai Xiao authored
      
      
      * add an api get_main_module
      
      * clang format
      
      * modify onnx unit test for module
      
      * clang format
      
      * refactor ops unit test with the get_main_module
      
      * clang format
      
      * code backup
      
      * clang format
      
      * refine module c api
      
      * add python api for module
      
      * clang format
      
      * fix a python api issue
      
      * clang format
      
      * fix cppcheck error
      
      * clang format
      
      * refine unit tests changes
      
      * clang format
      
      * code backup
      
      * code backup
      
      * clang format
      
      * defer some changes to later PRs
      
      * change return of get_main_module from ref to pointer
      
      * clang format
      
      * add unit tests for the get_main_module_api
      
      * clang format
      
      * fix cppcheck error
      
      * clang format
      
      * fix cppcheck error
      
      * clang format
      
      * add more unit tests for more code change coverage
      
      * clang format
      
      * fixed a unit test error
      
      * clang format
      
      * fix unit test
      
      * clang format
      
      * code backup
      
      * code change for more code coverage
      
      * change program to module in various passes and matcher
      
      * clang format
      
      * modify the pass API
      
      * code backup
      
      * code backup
      
      * clang format
      
      * code backup
      
      * clang format
      
      * Add option to not generate a destroy method
      
      * Formatting
      
      * fix some review comments
      
      * clang format
      
      * fix review comments
      
      * clang format
      
      * clang format
      
      * code backup
      
      * code backup
      
      * clang format
      
      * fix cppcheck errors
      
      * clang format
      
      * clang format
      
      * fix build errors
      
      * clang format
      
      * modify gpu unit tests to using module
      
      * clang format
      
      * fix cppcheck error
      
      * clang format
      
      * Add flag to enable cpu backend
      
      * Make buffers shared
      
      * Enable optimizations
      
      * Formatting
      
      * fix review comments
      
      * code backup
      
      * clang format
      
      * code backup
      
      * clang format
      
      * fix a bug related to a unit test
      
      * clang format
      
      * clang format
      
      * fix a build error
      
      * remove unnecessary code
      
      * remove unnecessary files
      
      * code backup
      
      * clang format
      
      * remove the compile function from the module class
      
      * clang format
      
      * clang format
      
      * remove the context parameter from the from_value method of the module class
      
      * code refinement
      
      * clang format
      
      * merge changes from develop branch
      
      * clang format
      
      * fix cppcheck error
      
      * clang format
      
      * fix a build error
      
      * fixed a merge error
      
      * fix cppcheck error
      
      * fixed review comments
      
      * clang format
      
      * fix cppcheck error
      
      * fix a cppcheck error
      
      * fix cppcheck error
      
      * fix build error caused by merge
      
      * Add missing has_op function
      
      * Formatting
      
      * merge changes from develop branch
      
      * fix a cppcheck error
      
      * fixed some review comments
      
      * clang format
      
      * remove the begin/end function of the program class
      
      * clang format
      
      * refine code and fix cppcheck error
      
      * clang format
      
      * fix review comments
      
      * clang format
      
      * fix review comments
      
      * clang format
      
      * add unit tests for more code coverage
      
      * clang format
      
      * fix review comments
      
      * clang format
      
      * fix review comments
      
      * clang format
      
      * fix a build error in debug mode
      
      * clang format
      Co-authored-by: Paul <pfultz2@yahoo.com>
      c9b86f1c
  12. 26 Nov, 2020 1 commit
  13. 20 Nov, 2020 1 commit
    • Fuse skip layernorm (#683) · 1bfb147d
      Paul Fultz II authored
      
      
      * Unify the vectorized and non-vectorized path
      
      * Formatting
      
      * Make fusion easily extendable
      
      * Add skip layernorm fusion
      
      * Formatting
      
      * Call correct layernorm function
      
      * Fix compile errors
      
      * Add DCE
      
      * Add test for skip layernorm
      
      * Formatting
      
      * Remove unused typedef
      
      * Formatting
      
      * Fix tidy issues
      
      * Formatting
      Co-authored-by: Shucai Xiao <shucai.xiao@amd.com>
      1bfb147d
  14. 11 Nov, 2020 1 commit
  15. 28 Oct, 2020 1 commit
  16. 08 Oct, 2020 1 commit
    • Add build flag for fast math (#639) · a5065265
      kahmed10 authored
      
      
      * add flag
      
      * formatting
      
      * remove env variable
      
      * fix api expression
      
      * add api test
      
      * add api test
      
      * add op test
      
      * formatting
      
      * fix function name
      
      * fix syntax
      
      * formatting
      
      * modify test
      
      * remove test and update doc
      
      * move test to new file
      
      * formatting
      
      * revert test files
      
      * rewrite check
      
      * New
      Co-authored-by: Paul Fultz II <pfultz2@yahoo.com>
      a5065265
  17. 14 Sep, 2020 1 commit
  18. 25 Aug, 2020 1 commit
    • Improve layernorm performance (#613) · 56b3bf58
      Paul Fultz II authored
      * Use increment instead of division to compute register offset
      
      * Formatting
      
      * Limit layernorm to 1024 elements
      
      * Formatting
      
      * Add verification to driver
      
      * Formatting
      
      * Remove early return
      
      * Use block_size 256
      
      * Vectorize the kernel
      
      * Formatting
      
      * Convert to vector type
      
      * Add layernorm tests
      
      * Formatting
      
      * Formatting
      
      * Refactor layernorm to run both algos
      
      * Formatting
      
      * Fix compile error
      
      * Fix tidy warnings
      
      * Formatting
      
      * Add layernorm function
      
      * Formatting
      56b3bf58
  19. 21 Aug, 2020 1 commit
  20. 19 Aug, 2020 1 commit
  21. 18 Aug, 2020 1 commit
    • Register all operators in migraphx (#604) · e8be8548
      Paul Fultz II authored
      * Register ops for main migraphx
      
      * Formatting
      
      * Register cpu ops
      
      * Formatting
      
      * Show list of operators in the driver
      
      * Formatting
      
      * Simplify register
      
      * Try to register gpu ops
      
      * Fix compiler errors
      
      * Register rest of the gpu operators
      
      * Add some tests
      
      * Formatting
      
      * Fix gcc compiler warnings
      
      * Formatting
      
      * Fix tidy warnings
      
      * Fix compile error
      
      * Use correct op name
      
      * Register layer norm
      
      * Use const ref
      
      * Make run const
      e8be8548
  22. 14 Aug, 2020 1 commit
    • Layernorm onnx support (#599) · 2c5d5fee
      kahmed10 authored
      
      
      * fix pad calc
      
      * bert tf passes correctness
      
      * formatting
      
      * add test
      
      * formatting
      
      * remove comment
      
      * add inline
      
      * formatting
      
      * fix order for literal
      
      * formatting
      
      * test no mul_add
      
      * formatting
      
      * debug layernorm
      
      * debug layernorm
      
      * manual merge
      
      * more progress
      
      * formatting
      
      * remove miopen batchnorm
      
      * remove headers
      
      * Fix compile error with no dpp reductions
      
      * fix indices
      
      * formatting
      
      * change matcher
      
      * formatting
      
      * remove binds
      
      * formatting
      
      * disable tf matcher
      
      * formatting
      
      * use fast div
      
      * formatting
      
      * fix matcher
      
      * formatting
      
      * remove comment
      
      * move find_matches
      
      * add assert
      
      * formatting
      
      * fix deepcode issue
      Co-authored-by: Paul <pfultz2@yahoo.com>
      Co-authored-by: Shucai Xiao <shucai.xiao@amd.com>
      Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>
      2c5d5fee
  23. 12 Aug, 2020 1 commit
  24. 08 Jun, 2020 1 commit
    • Enable read support for n-dimensional ops (#537) · cb722cf9
      kahmed10 authored
      
      
      * initial progress
      
      * formatting
      
      * add pooling changes
      
      * formatting
      
      * change eliminate_pad
      
      * formatting
      
      * rename var
      
      * formatting
      
      * update op shape test and compute
      
      * formatting
      
      * revert conv constructor
      
      * formatting
      
      * change initializer
      
      * formatting
      
      * fix tidy
      
      * change quant conv and shape check
      
      * add tests and fixes
      
      * formatting
      
      * fix type
      
      * fix conv test
      
      * formatting
      
      * add pooling and bn tests
      
      * formatting
      
      * add inconsistent attr tests
      
      * fix padding issue
      
      * formatting
      
      * fix review comments, remove duplicate test
      
      * formatting
      
      * fix variable
      
      * fix assert bug
      
      * fix attr check
      
      * remove std
      Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>
      cb722cf9
  25. 03 Jun, 2020 1 commit
    • Bert fuse slice reshape trans contiguous (#542) · 93be5e2b
      Shucai Xiao authored
      
      
      * fix pad calc
      
      * Add decompose pass
      
      * Add decompose test
      
      * Formatting
      
      * bert tf passes correctness
      
      * formatting
      
      * Add remap
      
      * Formatting
      
      * add test
      
      * formatting
      
      * remove comment
      
      * Add compute method for dot
      
      * Formatting
      
      * add inline
      
      * Add finder for horizontal fusion
      
      * Formatting
      
      * Formatting
      
      * Reuse predicate
      
      * formatting
      
      * fix order for literal
      
      * formatting
      
      * add test for gelu
      
      * formatting
      
      * added add_gelu fusion
      
      * Add gemm fusions
      
      * Formatting
      
      * add files
      
      * formatting
      
      * test no mul_add
      
      * formatting
      
      * progress on div
      
      * formatting
      
      * continue work on pass
      
      * remove layernorm opt
      
      * revert reduce file
      
      * Add some fixes for convolution
      
      * Formatting
      
      * Fix shape tests
      
      * Formatting
      
      * Reuse axis equal
      
      * Add initial split fusion
      
      * Formatting
      
      * Update offset
      
      * Workaround outputs that can't accept nonstandard shapes
      
      * Formatting
      
      * Add check for split concat
      
      * Formatting
      
      * Add missing headers
      
      * Formatting
      
      * Add tests
      
      * Formatting
      
      * add optimization for bert
      
      * code backup for bert optimization
      
      * continue testing
      
      * formatting
      
      * fix matcher
      
      * formatting
      
      * add gelu_fn and tests
      
      * formatting
      
      * fix matcher, remove extra tests
      
      * formatting
      
      * fix matcher
      
      * add missing files
      
      * add find_layernorm
      
      * add add_transpose to cmake file
      
      * code backup for the contiguous fusion
      
      * refine ops fusion
      
      * clang format
      
      * fixed bug in previous optimization
      
      * clang format
      
      * add more optimization
      
      * remove unnecessary code
      
      * refinement of the fusion code
      
      * clang format
      
      * fixed a bug
      
      * add used_once
      
      * formatting
      
      * start on new gelu
      
      * formatting
      
      * add matchers in fuse_ops
      
      * formatting
      
      * add dce to fix add_gelu
      
      * add simplify_rsqrt and test
      
      * formatting
      
      * debugging value for matcher
      
      * formatting
      
      * add more to matchers
      
      * formatting
      
      * fix errors
      
      * remove onnx gen
      
      * add any_arg, change matchers to use either_arg
      
      * formatting
      
      * clang format
      
      * formatting
      
      * add used_once
      
      * formatting
      
      * code cleanup
      
      * clang format
      
      * fixed a bug
      
      * remove unnecessary code
      
      * refine comments
      
      * optimize bert to remove more contiguous
      
      * clang format
      
      * remove unnecessary code
      
      * add unit tests for bert optimization
      
      * clang format
      
      * fix review comments
      
      * clang format
      
      * refine a fusion of reshape and slice
      
      * clang format
      
      * fix cppcheck error
      
      * fix review comments
      
      * add the fusion of slice and transpose
      
      * clang format
      
      * add another optimization to fuse slice and transpose
      
      * clang format
      
      * fix review comments
      
      * clang format
      
      * fix review comments
      
      * clang format
      
      * fix review comments
      Co-authored-by: Khalique <15948690+kahmed10@users.noreply.github.com>
      Co-authored-by: Paul <pfultz2@yahoo.com>
      Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>
      Co-authored-by: Shucai Xiao <scxiao@prj47-rack-99.local.lan>
      93be5e2b
  26. 15 May, 2020 1 commit
    • Add gelu optimization (#521) · 0079028a
      kahmed10 authored
      
      
      * fix pad calc
      
      * bert tf passes correctness
      
      * formatting
      
      * add test
      
      * formatting
      
      * remove comment
      
      * add inline
      
      * formatting
      
      * fix order for literal
      
      * formatting
      
      * add test for gelu
      
      * formatting
      
      * added add_gelu fusion
      
      * add files
      
      * formatting
      
      * remove layernorm opt
      
      * revert reduce file
      
      * add gelu_fn and tests
      
      * formatting
      
      * fix matcher, remove extra tests
      
      * formatting
      
      * fix matcher
      
      * add used_once
      
      * formatting
      
      * start on new gelu
      
      * formatting
      
      * add matchers in fuse_ops
      
      * formatting
      
      * add dce to fix add_gelu
      
      * add simplify_rsqrt and test
      
      * formatting
      
      * debugging value for matcher
      
      * formatting
      
      * add more to matchers
      
      * formatting
      
      * fix errors
      
      * remove onnx gen
      
      * add any_arg, change matchers to use either_arg
      
      * formatting
      
      * formatting
      
      * add used_once
      
      * formatting
      Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>
      0079028a
  27. 08 May, 2020 1 commit
    • Horizontal fusions of gemms and convolutions (#472) · 1a4ff504
      Paul Fultz II authored
      
      
      * Add decompose pass
      
      * Add decompose test
      
      * Formatting
      
      * Add remap
      
      * Formatting
      
      * Add compute method for dot
      
      * Formatting
      
      * Add finder for horizontal fusion
      
      * Formatting
      
      * Formatting
      
      * Reuse predicate
      
      * Add gemm fusions
      
      * Formatting
      
      * Add some fixes for convolution
      
      * Formatting
      
      * Fix shape tests
      
      * Formatting
      
      * Reuse axis equal
      
      * Add initial split fusion
      
      * Formatting
      
      * Update offset
      
      * Workaround outputs that can't accept nonstandard shapes
      
      * Formatting
      
      * Add check for split concat
      
      * Formatting
      
      * Add missing headers
      
      * Formatting
      
      * Add tests
      
      * Formatting
      
      * Add more testing
      
      * Formatting
      
      * Fix when there are duplicate splits in inputs
      
      * Formatting
      
      * Fix mismatch iterators
      
      * Add tests for dot fusions
      
      * Formatting
      
      * Add test for convolution
      
      * Formatting
      
      * Fix tidy issues
      
      * Add more tests
      
      * Formatting
      
      * Ignore build directory for codecov
      
      * Add test for groups
      
      * Formatting
      
      * Add more tests for groups
      
      * Formatting
      
      * Add test for missing end slice
      
      * Add newline
      
      * Remove unused function
      
      * Add support for when beta is not 1
      
      * Formatting
      
      * Add test for scalar
      
      * Add one more scalar test
      Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>
      1a4ff504
  28. 29 Mar, 2020 1 commit
  29. 20 Dec, 2019 1 commit
    • Improve operators for onnxruntime (#405) · 992666e6
      Shucai Xiao authored
      
      
      * improve unsqueeze to support negative axis and parsing scalar
      
      * clang format
      
      * add a test example for the negative axis of unsqueeze
      
      * improve the squeeze operator to support negative axis
      
      * clang format
      
      * fixed a small bug in the lrn implementation
      
      * clang format
      
      * support negative axis in argmax and argmin
      
      * clang format
      
      * improve flatten to support negative axis
      
      * clang format
      
      * change softmax/logsoftmax to support negative axis
      
      * clang format
      
      * improve transpose by adding default perm
      
      * clang format
      
      * add one more dimension for tensor size
      
      * add one more dimension for tensor size
      
      * disable conv ops fusion for non-symmetric cases
      
      * clang format
      
      * fixed review comments
      
      * move computing axis from the device function to the compute function
      
      * clang format
      
      * move computing axis from device function to the operator computing function
      
      * clang format
      Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>
      992666e6
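Several items in the commit above (#405) add negative-axis support to unsqueeze, squeeze, argmax, argmin, flatten, and softmax/logsoftmax. The following is a minimal sketch of the usual normalization step, assuming the common ONNX convention that a negative axis counts back from the last dimension. `normalize_axis` is a hypothetical helper name used for illustration, not a MIGraphX function.

```python
def normalize_axis(axis, rank):
    """Map an axis in [-rank, rank - 1] to its non-negative equivalent.

    Hypothetical helper: assumes a negative axis counts from the last
    dimension, so -1 refers to dimension rank - 1.
    """
    if not -rank <= axis < rank:
        raise ValueError(f"axis {axis} is out of range for rank {rank}")
    # Negative values wrap around once; non-negative values pass through.
    return axis + rank if axis < 0 else axis


# For a rank-4 tensor, -1 and 3 name the same dimension.
print(normalize_axis(-1, 4), normalize_axis(3, 4))
```

Resolving the axis once, before any kernel runs, matches the direction of the "move computing axis from the device function to the compute function" items: the device code then only ever sees a non-negative axis.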
  30. 09 Oct, 2019 1 commit
    • Fix bug in bert accuracy (#385) · a797f890
      Paul Fultz II authored
      * Fix bug in bert accuracy
      
      * Formatting
      
      * add another test
      
      * Fix add and overflow
      
      * Formatting
      
      * Fix bug in shape_for_each
      
      * Use front instead of iterator
      
      * Use result.front()
      
      * Split add_unary files
      
      * Formatting
      
      * Fix incorrect last index
      
      * Remove comment
      
      * Inline function
      
      * Fix carry check
      
      * Fix metadata errors
      
      * Formatting
      
      * Reflow
      
      * Reflow
      a797f890
  31. 04 Oct, 2019 1 commit
    • Add_clip fusion (#370) · 1398bcc1
      kahmed10 authored
      * initial testing of add_clip fusion
      
      * formatting
      
      * clipped relu fusion
      
      * formatting
      
      * remove some executables, add fusion test
      
      * formatting
      
      * remove clipped_relu code
      
      * fix clang-tidy
      
      * revert changes to cmake files
      
      * remove fusion from weight map
      
      * formatting
      
      * fix syntax error
      
      * formatting
      
      * fix syntax error
      
      * fix syntax error
      
      * formatting
      1398bcc1
  32. 19 Sep, 2019 1 commit
  33. 16 Sep, 2019 1 commit
  34. 13 Aug, 2019 3 commits
  35. 01 Aug, 2019 2 commits
  36. 24 Jul, 2019 1 commit