1. 25 Feb, 2021 1 commit
  2. 06 Jan, 2021 1 commit
    • Shucai Xiao's avatar
      Module impl (#678) · c9b86f1c
      Shucai Xiao authored
      
      
      * add an api get_main_module
      
      * clang format
      
      * modify onnx unit test for module
      
      * clang format
      
      * refactor ops unit test with the get_main_module
      
      * clang format
      
      * code backup
      
      * clang format
      
      * refine module c api
      
      * add python api for module
      
      * clang format
      
      * fix a python api issue
      
      * clang format
      
      * fix cppcheck error
      
      * clang format
      
      * refine unit tests changes
      
      * clang format
      
      * code backup
      
      * code backup
      
      * clang format
      
      * defer some changes to later PRs
      
      * change return of get_main_module from ref to pointer
      
      * clang format
      
      * add unit tests for the get_main_module_api
      
      * clang format
      
      * fix cppcheck error
      
      * clang format
      
      * fix cppcheck error
      
      * clang format
      
      * add more unit tests for more code change coverage
      
      * clang format
      
      * fixed a unit test error
      
      * clang format
      
      * fix unit test
      
      * clang format
      
      * code backup
      
      * code change for more code coverage
      
      * change program to module in various passes and matcher
      
      * clang format
      
      * modify the pass API
      
      * code backup
      
      * code backup
      
      * clang format
      
      * code backup
      
      * clang format
      
      * Add option to no generate a destroy method
      
      * Formatting
      
      * fix some review comments
      
      * clang format
      
      * fix review comments
      
      * clang format
      
      * clang format
      
      * code backup
      
      * code backup
      
      * clang format
      
      * fix cppcheck errors
      
      * clang format
      
      * clang format
      
      * fix build errors
      
      * clang format
      
      * modify gpu unit tests to using module
      
      * clang format
      
      * fix cppcheck error
      
      * clang format
      
      * Add flag to enable cpu backend
      
      * Make buffers shared
      
      * Enable optimizations
      
      * Formatting
      
      * fix review comments
      
      * code backup
      
      * clang format
      
      * code backup
      
      * clang format
      
      * fix a bug related to a unit test
      
      * clang format
      
      * clang format
      
      * fix a build error
      
      * remove unnecessary code
      
      * remove unnecessary files
      
      * code backup
      
      * clang format
      
      * remove the compile function from the module class
      
      * clang format
      
      * clang format
      
      * remove the context parameter from the from_value method of the module class
      
      * code refinement
      
      * clang format
      
      * merge changes from develop branch
      
      * clang format
      
      * fix cppcheck error
      
      * clang format
      
      * fix a build error
      
      * fixed a merge error
      
      * fix cppcheck error
      
      * fixed review comments
      
      * clang format
      
      * fix cppcheck error
      
      * fix a cppcheck error
      
      * fix cppcheck error
      
      * fix build error caused by merge
      
      * Add missing has_op function
      
      * Formatting
      
      * merge changes from develop branch
      
      * fix a cppcheck error
      
      * fixed some review comments
      
      * clang format
      
      * remove the begin/end function of the program class
      
      * clang format
      
      * refine code and fix cppcheck error
      
      * clang format
      
      * fix review comments
      
      * clang format
      
      * fix review comments
      
      * clang format
      
      * add unit tests for more code coverage
      
      * clang format
      
      * fix review comments
      
      * clang format
      
      * fix review comments
      
      * clang format
      
      * fix a build error in debug mode
      
      * clang format
      Co-authored-by: default avatarPaul <pfultz2@yahoo.com>
      c9b86f1c
  3. 14 Dec, 2020 1 commit
    • Paul Fultz II's avatar
      Use dnnl for cpu backend (#688) · 406afeb8
      Paul Fultz II authored
      
      
      * Add flag to enable cpu backend
      
      * Make buffers shared
      
      * Enable optimizations
      
      * Add onednn
      
      * Formatting
      
      * Formatting
      
      * Add dnnl header
      
      * Formatting
      
      * Rewrite rnn first
      
      * Formatting
      
      * Call reference implementation
      
      * Formatting
      
      * Make literal data shared
      
      * Formatting
      
      * Add convolution
      
      * Formatting
      
      * Compensate for dilation
      
      * Formatting
      
      * Use name/make_op instead
      
      * Formatting
      
      * Rename gemm header
      
      * Formatting
      
      * Add dnnl convolution/gemm operators
      
      * Formatting
      
      * Add eliminate_contiguous
      
      * Add faster pointwise operators
      
      * Formatting
      
      * Formatting
      
      * Formatting
      
      * Add dnnl op class
      
      * Formatting
      
      * Add add op
      
      * Formatting
      
      * Add concat operator
      
      * Formatting
      
      * Add more ops
      
      * Create descriptor during finalization
      
      * Formatting
      
      * Dont rewrite pooling
      
      * Enable memory coloring
      
      * Formatting
      
      * Add output aliases
      
      * Formatting
      
      * Fix errors
      
      * Formatting
      
      * Convert literals
      
      * Add missing file
      
      * Remove batch_norm
      
      * Formatting
      
      * Use strides
      
      * Formatting
      
      * Add some debug checks
      
      * Formatting
      
      * Fix big in adjusting shape for gemm
      
      * Formatting
      
      * Fix fallback dot operator
      
      * Zero initialize buffers
      
      * Add suport for group convolutions
      
      * Formatting
      
      * Make adjust allocation target independent
      
      * Formatting
      
      * Enable adjust_allocation for gpu/cpu
      
      * Formatting
      
      * Add copy to allocation model
      
      * Formatting
      
      * Add copy operator
      
      * Formatting
      
      * Better handling of output parameters in adjust_allocation
      
      * Formatting
      
      * Build with dnnl
      
      * Make dnnl required
      
      * Fix compile error
      
      * Tidy fixes
      
      * Formatting
      
      * Tidy fixes
      
      * Formatting
      
      * Fix more tidy issues
      
      * Formatting
      
      * Add mul op
      
      * Add mul op
      
      * Set c compiler to clang as well
      
      * Compensate for normalized compute shape
      
      * Formatting
      
      * Fix cppcheck errors
      
      * Formatting
      
      * Add onednn library to hcc
      
      * Guard clang pragmas
      
      * Disable cpu mode for gcc for now
      
      * Leave it enabled it for gcc 7
      
      * Fix cppcheck suppresion
      
      * Fix compile error on gcc 5
      
      * Remove unused code
      Co-authored-by: default avatarShucai Xiao <shucai.xiao@amd.com>
      Co-authored-by: default avatarmvermeulen <5479696+mvermeulen@users.noreply.github.com>
      406afeb8
  4. 11 Nov, 2020 1 commit
  5. 09 Nov, 2020 1 commit
    • Paul Fultz II's avatar
      Add hip compilation (#664) · f71af72a
      Paul Fultz II authored
      
      
      * Add compiler flags
      
      * Add missing include
      
      * Add filesystem header
      
      * Formatting
      
      * Add tmp_dir to run
      
      * Formatting
      
      * Kernel compilation and launching
      
      * Formatting
      
      * Seperate pack_args
      
      * Formatting
      
      * Add alignment tests
      
      * Formatting
      
      * Add compile test
      
      * Formatting
      
      * Complete compile test
      
      * Formatting
      
      * Use is_regular_file free function
      
      * Fix is_regular_file call
      
      * Fix tidy issues
      
      * Fix tidy
      
      * Fix tidy issue
      
      * Print size in read_buffer to debug issue on jenkins
      
      * Add hip flags before src file
      
      * Fix reading output files
      
      * Fix unsued variable warning
      
      * Formatting
      
      * Formatting
      
      * Disable tidy check
      Co-authored-by: default avatarShucai Xiao <shucai.xiao@amd.com>
      Co-authored-by: default avatarmvermeulen <5479696+mvermeulen@users.noreply.github.com>
      f71af72a
  6. 04 Nov, 2020 1 commit
    • Paul Fultz II's avatar
      Split cpu and reference implementation (#671) · 500d9441
      Paul Fultz II authored
      
      
      * Add all_targets cmake target
      
      * Rename target
      
      * Add ref target
      
      * Rename tests
      
      * Refactor compiler target
      
      * Formatting
      
      * Verify for every target
      
      * Formatting
      
      * Add verify test suite
      
      * Formatting
      
      * Add initial test programs
      
      * Formatting
      
      * Add rnn tests
      
      * Formatting
      
      * Validate gpu
      
      * Formatting
      
      * Remove old gpu tests
      
      * Fix gpu tests
      
      * Fix ref error
      
      * Fix tidy issues
      
      * Formatting
      
      * Tidy fixes
      
      * Fix header in python api
      
      * Rename to ref
      
      * Use ref in verify_onnx
      
      * Fix tidy issue
      
      * Build with verbose on
      
      * Fix typo
      
      * Remove verbose
      
      * rename some cpu prefix to ref
      Co-authored-by: default avatarShucai Xiao <Shucai.Xiao@amd.com>
      500d9441
  7. 15 Oct, 2020 1 commit
    • turneram's avatar
      Added greater and less operators (#660) · 48ffbfa5
      turneram authored
      
      
      * Added greater and less operators
      
      * Fixed ops_test.cpp
      
      * Set commutative to false for less, greater
      
      * Refactored parse_equal/less/greater into parse_compare_op
      
      * Removed unnecessary function attributes() from greater.hpp/less.hpp
      
      * Added op_name arguments
      
      * Removed local settings
      
      * Formatting
      
      * Missing comma
      
      * Formatting
      
      * Formatting
      
      * Formatting
      
      * Formatting
      
      * Formatting
      
      * Missing space
      Co-authored-by: default avatarPaul Fultz II <pfultz2@yahoo.com>
      48ffbfa5
  8. 09 Oct, 2020 1 commit
    • Paul Fultz II's avatar
      Add parallel stream analysis (#629) · 1d98fbb4
      Paul Fultz II authored
      * Add intial multi stream analysis
      
      * Formatting
      
      * Add more tests
      
      * Formatting
      
      * Remove comment
      
      * Analyze streams on the gpu
      
      * Formatting
      
      * Fix nstream
      
      * Formatting
      
      * Add test for return
      
      * Formatting
      
      * Make sure return has a stream assignment
      
      * Formatting
      
      * Fix asserts and checks
      
      * Improve error message for out-of-order sequence
      
      * Formatting
      1d98fbb4
  9. 08 Oct, 2020 1 commit
    • kahmed10's avatar
      Add build flag for fast math (#639) · a5065265
      kahmed10 authored
      
      
      * add flag
      
      * formatting
      
      * remove env variable
      
      * fix api expression
      
      * add api test
      
      * add api test
      
      * add op test
      
      * formatting
      
      * fix function name
      
      * fix syntax
      
      * formatting
      
      * modify test
      
      * remove test and update doc
      
      * move test to new file
      
      * formatting
      
      * revert test files
      
      * rewrite check
      
      * New
      Co-authored-by: default avatarPaul Fultz II <pfultz2@yahoo.com>
      a5065265
  10. 30 Sep, 2020 1 commit
    • Paul Fultz II's avatar
      Add hip clang builds to jenkins (#651) · f28a62ea
      Paul Fultz II authored
      * Make global variables const
      
      * Tidy fixes
      
      * Disable some lints
      
      * Formatting
      
      * Fix tidy const
      
      * Formatting
      
      * Add missing const keywords
      
      * Formatting
      
      * More fixes
      
      * Fix remaining tidy issues
      
      * Formatting
      
      * Fix rocblas function call
      
      * Formatting
      
      * Fix nodiscard warnings
      
      * Formatting
      
      * Use named parameters
      
      * Remove overload
      
      * Add overload
      
      * Remove noncps
      
      * Use named param for node
      
      * Add auto register header
      
      * Use named parameters
      
      * Refactor jenkinsfile
      
      * Fix shadow
      
      * Add missing body variable
      
      * Add more const methods
      
      * Add hip-clang docker builds
      
      * Remove comments
      
      * Add clang-format
      
      * Add more const
      
      * Formatting
      
      * Rename stage
      
      * Disable check
      
      * Add another const
      
      * Add python 2 dev packages
      
      * Add sphinx to dockerfile
      f28a62ea
  11. 31 Aug, 2020 1 commit
    • Shucai Xiao's avatar
      Pooling ceil mode (#615) · 9dabe26b
      Shucai Xiao authored
      
      
      * support pooling ceil_mode
      
      * clang format
      
      * add unit test for pooling ceil mode
      
      * clang format
      
      * fix review comments
      
      * clang format
      
      * add more unit tests and fixed a bug in cpu pooling implementation
      
      * clang format
      
      * add one more unit test
      
      * clang format
      
      * fix cppcheck error
      
      * fix cppcheck error
      
      * fix cppcheck error
      
      * fix review comments
      
      * clang format
      
      * remove the padding_mode attribute in pooling
      
      * clang format
      
      * clang format
      
      * fix review comments
      
      * clang format
      
      * fix a cppcheck error
      
      * fix review comments
      Co-authored-by: default avatarmvermeulen <5479696+mvermeulen@users.noreply.github.com>
      9dabe26b
  12. 27 Aug, 2020 2 commits
    • Shucai Xiao's avatar
      Context serialization (#607) · 6e1f9f20
      Shucai Xiao authored
      
      
      * Add initial serialization
      
      * Formatting
      
      * Add unit tests
      
      * Formatting
      
      * Add tests for serialization
      
      * Formatting
      
      * Use or not and
      
      * Add value test
      
      * Formatting
      
      * Add more tests
      
      * Add shape serialization
      
      * Formatting
      
      * Add serializtion for literal and argument
      
      * Formatting
      
      * Add from and to value to operatation
      
      * Formatting
      
      * Serialize empty types
      
      * Formatting
      
      * Tidy fixes
      
      * Formatting
      
      * Fix tidy issues
      
      * Formatting
      
      * Reformat value type macro
      
      * Formatting
      
      * Handle enum types
      
      * Formatting
      
      * Use const ref
      
      * Update
      
      * Add tests for to_value/from_value
      
      * Formatting
      
      * code backup
      
      * clang format
      
      * code backup
      
      * clang format
      
      * code backup
      
      * clang format
      
      * remove the from/to_value method for the generate context struct
      
      * clang format
      
      * code backup
      
      * Dont print literal data in hip_copy_literal
      
      * clang format
      
      * add unit test to have better coverage
      
      * remove unnecessary code
      
      * remove unnecessary code
      
      * fix review comments
      
      * clang format
      
      * fix review comments
      Co-authored-by: default avatarPaul <pfultz2@yahoo.com>
      Co-authored-by: default avatarmvermeulen <5479696+mvermeulen@users.noreply.github.com>
      6e1f9f20
    • Shucai Xiao's avatar
      Bool type and equal operator (#603) · 59b80d4e
      Shucai Xiao authored
      
      
      * add bool type
      
      * code backup
      
      * code backup
      
      * clang format
      
      * fix build warnings
      
      * clang format
      
      * add the equal operator
      
      * add the equal operator
      
      * clang format
      
      * remove unnecessary code
      
      * refine unit tests
      
      * clang format
      
      * fix review comments and a bug
      
      * clang format
      
      * additional changes
      
      * clang format
      
      * fix cppcheck error
      
      * add bool type in c api
      
      * fix cppcheck error
      
      * fix review comments
      
      * fix cppcheck error
      
      * fix a build error related to gcc
      
      * fix cppcheck error
      
      * fix cppcheck error
      
      * added the equal operator to register list
      
      * add parsing boolean type
      
      * clang format
      
      * fix bool type issue for python output
      
      * clang format
      
      * add support for automatic multibroadcast of the equal operator
      
      * additional unit tests for more code coverage
      
      * clang format
      
      * missing an onnx file
      Co-authored-by: default avatarPaul Fultz II <pfultz2@yahoo.com>
      59b80d4e
  13. 25 Aug, 2020 1 commit
    • Paul Fultz II's avatar
      Improve layernorm performance (#613) · 56b3bf58
      Paul Fultz II authored
      * Use increment instead of division to compute register offset
      
      * Formatting
      
      * Limit layernorm to 1024 elements
      
      * Formatting
      
      * Add verification to driver
      
      * Formatting
      
      * Remove early return
      
      * Use block_size 256
      
      * Vectorize the kernel
      
      * Formatting
      
      * Convert to vector type
      
      * Add layernorm tests
      
      * Formatting
      
      * Formatting
      
      * Refactor layernorm to run both algos
      
      * Formatting
      
      * Fix compile error
      
      * Fix tidy warnings
      
      * Formatting
      
      * Add layernorm function
      
      * Formatting
      56b3bf58
  14. 14 Aug, 2020 1 commit
    • kahmed10's avatar
      Layernorm onnx support (#599) · 2c5d5fee
      kahmed10 authored
      
      
      * fix pad calc
      
      * bert tf passes correctness
      
      * formatting
      
      * add test
      
      * formatting
      
      * remove comment
      
      * add inline
      
      * formatting
      
      * fix order for literal
      
      * formatting
      
      * test no mul_add
      
      * formatting
      
      * debug layernorm
      
      * debug layernorm
      
      * manual merge
      
      * more progress
      
      * formatting
      
      * remove miopen batchnorm
      
      * remove headers
      
      * Fix compile error with no dpp reductions
      
      * fix indices
      
      * formatting
      
      * change matcher
      
      * formatting
      
      * remove binds
      
      * formatting
      
      * disable tf matcher
      
      * formatting
      
      * use fast div
      
      * formatting
      
      * fix matcher
      
      * formatting
      
      * remove comment
      
      * move find_matches
      
      * add assert
      
      * formatting
      
      * fix deepcode issue
      Co-authored-by: default avatarPaul <pfultz2@yahoo.com>
      Co-authored-by: default avatarShucai Xiao <shucai.xiao@amd.com>
      Co-authored-by: default avatarmvermeulen <5479696+mvermeulen@users.noreply.github.com>
      2c5d5fee
  15. 21 Jul, 2020 1 commit
  16. 16 Jul, 2020 1 commit
    • kahmed10's avatar
      Nd deconv cpu (#565) · 98ade977
      kahmed10 authored
      
      
      * initial progress
      
      * formatting
      
      * check existing tests
      
      * formatting
      
      * change for loop to transform
      
      * formatting
      
      * add tests
      
      * formatting
      
      * remove comment
      
      * add more tests
      
      * update gpu miopen calls
      
      * formatting
      
      * initial progress
      
      * add cpu impl and tests
      
      * formatting
      
      * add NOLINT
      
      * add 3d test
      
      * formatting
      
      * add more op_shape tests
      
      * fix error msg
      
      * fix bounds
      
      * formatting
      
      * fix algorithm
      
      * formatting
      
      * pin numpy version
      Co-authored-by: default avatarmvermeulen <5479696+mvermeulen@users.noreply.github.com>
      98ade977
  17. 10 Jul, 2020 1 commit
    • Shucai Xiao's avatar
      Gpu batchnorm (#564) · 70ba8213
      Shucai Xiao authored
      
      
      * Initial cpu conv-nd
      
      * Formatting
      
      * Make index signed
      
      * Formatting
      
      * Assert the indices are greater than 0
      
      * Use equal instead of lexicographical_compare
      
      * Formatting
      
      * change the batchnorm cpu implementation to support multiple input dimensions
      
      * clang format
      
      * add unit tests for cpu batch_norm nd implementation
      
      * clang format
      
      * support nd batchnormalization
      
      * clang format
      
      * add rewrite batch_norm unit tests
      
      * clang format
      
      * remove a unit test
      
      * Fix tidy errors
      
      * Formatting
      
      * Handle different types
      
      * Formatting
      
      * Fix nested visits
      
      * Formatting
      
      * Add 3d conv test
      
      * Formatting
      
      * revert unnecessary changes
      
      * remove a print line
      
      * Fix ICE
      
      * Formatting
      
      * fix the per_activation mode of 2d
      
      * clang format
      
      * code clean up
      
      * clang format
      
      * add 1d and 3d gpu unit test
      
      * clang format
      
      * add unit test for rewrite_batchnorm
      
      * clang format
      
      * additional refinement
      
      * fix review comments
      
      * added a unit test to have more code coverage
      Co-authored-by: default avatarPaul <pfultz2@yahoo.com>
      Co-authored-by: default avatarmvermeulen <5479696+mvermeulen@users.noreply.github.com>
      70ba8213
  18. 09 Jul, 2020 1 commit
  19. 08 Jul, 2020 1 commit
    • kahmed10's avatar
      Nd pooling gpu (#551) · d1258e80
      kahmed10 authored
      
      
      * initial progress
      
      * formatting
      
      * add pooling changes
      
      * formatting
      
      * change eliminate_pad
      
      * formatting
      
      * rename var
      
      * fomratting
      
      * update op shape test and compute
      
      * formatting
      
      * revert conv constructor
      
      * formatting
      
      * change initializer
      
      * formatting
      
      * fix tidy
      
      * change quant conv and shape check
      
      * add tests and fixes
      
      * formatting
      
      * fix type
      
      * fix conv test
      
      * formatting
      
      * add pooling and bn tests
      
      * formatting
      
      * add inconsistent attr tests
      
      * fix padding issue
      
      * formatting
      
      * progress on 1d to 2d
      
      * formatting
      
      * change compute and compile functions
      
      * formatting
      
      * fix duplicate
      
      * fix conflict
      
      * fix issue with 1d conv
      
      * formatting
      
      * add check for 3d limit
      
      * rename function
      
      * formatting
      
      * update to MIOPen 2.3
      
      * add support for nd pooling
      
      * formatting
      
      * test miopen 2.4
      
      * change function name
      
      * rename functions
      
      * formatting
      
      * add op_shape test
      
      * add gpu ops tests
      
      * formatting
      
      * add pkg-config
      
      * change functions
      
      * formatting
      
      * change to copy_backward
      
      * formatting
      
      * test diff miopen version
      
      * add pooling shape tests
      
      * temp disable test
      
      * revert to miopen 2.4
      Co-authored-by: default avatarmvermeulen <5479696+mvermeulen@users.noreply.github.com>
      d1258e80
  20. 23 Jun, 2020 1 commit
    • Shucai Xiao's avatar
      Neg operator (#557) · 866cca5b
      Shucai Xiao authored
      * add the neg operator
      
      * clang format
      
      * add missing operator
      
      * fixed a cppcheck error
      
      * change to use the neg operator
      
      * clang format
      866cca5b
  21. 28 May, 2020 1 commit
    • Shucai Xiao's avatar
      Separate gpu unittests to multiple files (#541) · 218e20fc
      Shucai Xiao authored
      
      
      * code backup
      
      * clang format
      
      * fix compiling errors
      
      * clang format
      
      * rename a few files
      
      * rename a few files
      
      * fix variable bugs
      
      * clang format
      
      * add an operator to shift input sequences
      
      * clang format
      
      * fixed a bug
      
      * clang format
      
      * fixed a bug
      
      * clang format
      
      * code backup
      
      * clang format
      
      * code backup
      
      * clang format
      
      * code backup
      
      * clang format
      
      * refine code related lstm operator optimization
      
      * clang format
      
      * fix various bugs
      
      * clang format
      
      * fixed a bug in rewrite_lstm
      
      * clang format
      
      * fixed another bug
      
      * refine two operator names
      
      * clang format
      
      * refine file names
      
      * fix cppcheck error
      
      * clang format
      
      * fix cppcheck error
      
      * clang format
      
      * fix cppcheck error
      
      * fixed review comments
      
      * clang format
      
      * add unit tests
      
      * clang format
      
      * add unit tests
      
      * clang format
      
      * refine unit tests for better coverage
      
      * clang format
      
      * fixed a bug
      
      * fix cppcheck error
      
      * fix review comments
      
      * clang format
      
      * rename two operators according to review comments
      
      * clang format
      
      * fix review comments
      
      * clang format
      
      * fix review comments
      
      * clang format
      
      * fix review comments
      
      * fix a cppcheck error
      
      * clang format
      
      * fix review comments
      
      * clang format
      
      * add an operator to simplify code
      
      * clang format
      
      * clang format
      
      * fixed a bug and add unit tests
      
      * clang format
      
      * add more unit tests
      
      * clang format
      
      * add more unit tests
      
      * clang format
      
      * add more unit tests
      
      * clang format
      
      * refine a unit test
      
      * clang format
      
      * refine a unit test
      
      * add more unit tests and refine some existing tests for the rnn operator improvements
      
      * clang format
      
      * additional changes to simplify code further
      
      * clang format
      
      * refine a test case to refine cppcheck error
      
      * clang format
      
      * fix cppcheck error
      
      * clang format
      
      * separate rnn tests out to reduce file size
      
      * clang format
      
      * code cleanup
      
      * refine unit tests
      
      * fix clang tidy error
      
      * clang format
      Co-authored-by: default avatarShucai Xiao <scxiao@prj47-rack-99.local.lan>
      Co-authored-by: default avatarmvermeulen <5479696+mvermeulen@users.noreply.github.com>
      218e20fc
  22. 26 May, 2020 1 commit
    • Shucai Xiao's avatar
      Add support variable seq lens for the RNN and GRU operators (#535) · d7b8164c
      Shucai Xiao authored
      
      
      * code backup
      
      * clang format
      
      * fix compiling errors
      
      * clang format
      
      * rename a few files
      
      * rename a few files
      
      * fix variable bugs
      
      * clang format
      
      * add an operator to shift input sequences
      
      * clang format
      
      * fixed a bug
      
      * clang format
      
      * fixed a bug
      
      * clang format
      
      * code backup
      
      * clang format
      
      * code backup
      
      * clang format
      
      * code backup
      
      * clang format
      
      * refine code related lstm operator optimization
      
      * clang format
      
      * fix various bugs
      
      * clang format
      
      * fixed a bug in rewrite_lstm
      
      * clang format
      
      * fixed another bug
      
      * refine two operator names
      
      * clang format
      
      * refine file names
      
      * fix cppcheck error
      
      * clang format
      
      * fix cppcheck error
      
      * clang format
      
      * fix cppcheck error
      
      * fixed review comments
      
      * clang format
      
      * add unit tests
      
      * clang format
      
      * add unit tests
      
      * clang format
      
      * refine unit tests for better coverage
      
      * clang format
      
      * fixed a bug
      
      * fix cppcheck error
      
      * fix review comments
      
      * clang format
      
      * rename two operators according to review comments
      
      * clang format
      
      * fix review comments
      
      * clang format
      
      * fix review comments
      
      * clang format
      
      * fix review comments
      
      * fix a cppcheck error
      
      * clang format
      
      * fix review comments
      
      * clang format
      
      * add an operator to simplify code
      
      * clang format
      
      * clang format
      
      * fixed a bug and add unit tests
      
      * clang format
      
      * add more unit tests
      
      * clang format
      
      * add more unit tests
      
      * clang format
      
      * add more unit tests
      
      * clang format
      
      * refine a unit test
      
      * clang format
      
      * refine a unit test
      
      * add more unit tests and refine some existing tests for the rnn operator improvements
      
      * clang format
      
      * additional changes to simplify code further
      
      * clang format
      
      * refine a test case to refine cppcheck error
      
      * clang format
      
      * fix cppcheck error
      
      * clang format
      
      * add more unit tests
      
      * clang format
      Co-authored-by: default avatarShucai Xiao <scxiao@prj47-rack-99.local.lan>
      Co-authored-by: default avatarmvermeulen <5479696+mvermeulen@users.noreply.github.com>
      d7b8164c
  23. 20 May, 2020 1 commit
    • Shucai Xiao's avatar
      Rnn variable seq lengths (#517) · 90200619
      Shucai Xiao authored
      
      
      * code backup
      
      * clang format
      
      * fix compiling errors
      
      * clang format
      
      * rename a few files
      
      * rename a few files
      
      * fix variable bugs
      
      * clang format
      
      * add an operator to shift input sequences
      
      * clang format
      
      * fixed a bug
      
      * clang format
      
      * fixed a bug
      
      * clang format
      
      * code backup
      
      * clang format
      
      * code backup
      
      * clang format
      
      * code backup
      
      * clang format
      
      * refine code related lstm operator optimization
      
      * clang format
      
      * fix various bugs
      
      * clang format
      
      * fixed a bug in rewrite_lstm
      
      * clang format
      
      * fixed another bug
      
      * refine two operator names
      
      * clang format
      
      * refine file names
      
      * fix cppcheck error
      
      * clang format
      
      * fix cppcheck error
      
      * clang format
      
      * fix cppcheck error
      
      * fixed review comments
      
      * clang format
      
      * add unit tests
      
      * clang format
      
      * add unit tests
      
      * clang format
      
      * refine unit tests for better coverage
      
      * clang format
      
      * fixed a bug
      
      * fix cppcheck error
      
      * fix review comments
      
      * clang format
      
      * rename two operators according to review comments
      
      * clang format
      
      * fix review comments
      
      * clang format
      
      * fix review comments
      
      * clang format
      
      * fix review comments
      
      * fix a cppcheck error
      
      * clang format
      
      * fix review comments
      
      * clang format
      Co-authored-by: default avatarShucai Xiao <scxiao@prj47-rack-99.local.lan>
      Co-authored-by: default avatarmvermeulen <5479696+mvermeulen@users.noreply.github.com>
      90200619
  24. 15 May, 2020 1 commit
    • kahmed10's avatar
      Add gelu optimization (#521) · 0079028a
      kahmed10 authored
      
      
      * fix pad calc
      
      * bert tf passes correctness
      
      * formatting
      
      * add test
      
      * formatting
      
      * remove comment
      
      * add inline
      
      * formatting
      
      * fix order for literal
      
      * formatting
      
      * add test for gelu
      
      * formatting
      
      * added add_gelu fusion
      
      * add files
      
      * formatting
      
      * remove layernorm opt
      
      * revert reduce file
      
      * add gelu_fn and tests
      
      * formatting
      
      * fix matcher, remove extra tests
      
      * formatting
      
      * fix matcher
      
      * add used_once
      
      * formatting
      
      * start on new gelu
      
      * formatting
      
      * add matchers in fuse_ops
      
      * formatting
      
      * add dce to fix add_gelu
      
      * add simplify_rsqrt and test
      
      * formatting
      
      * debugging value for matcher
      
      * formatting
      
      * add more to matchers
      
      * formatting
      
      * fix errors
      
      * remove onnx gen
      
      * add any_arg, change matchers to use either_arg
      
      * formatting
      
      * formatting
      
      * add used_once
      
      * formatting
      Co-authored-by: default avatarmvermeulen <5479696+mvermeulen@users.noreply.github.com>
      0079028a
  25. 11 May, 2020 1 commit
  26. 17 Apr, 2020 1 commit
  27. 14 Apr, 2020 1 commit
  28. 08 Apr, 2020 1 commit
  29. 29 Mar, 2020 1 commit
  30. 07 Mar, 2020 1 commit
  31. 06 Mar, 2020 1 commit
    • Shucai Xiao's avatar
      Support multi program outputs (#436) · 5592b921
      Shucai Xiao authored
      
      
      * Add initial api
      
      * Formatting
      
      * Add more api
      
      * Formatting
      
      * Add auto api generation
      
      * Formatting
      
      * Fix some compilation errors
      
      * Change handle struct
      
      * Formatting
      
      * Fix reamining compilation errors
      
      * Formatting
      
      * fixed a bug related to number of outputs
      
      * Simplify using ctype
      
      * Formatting
      
      * Initial c++ generation
      
      * Formatting
      
      * Add C++header
      
      * Formatting
      
      * Add test
      
      * Formatting
      
      * Add initial tests
      
      * Formatting
      
      * Try to fix formatting
      
      * Cleanup formatting
      
      * Formatting
      
      * Fix constructors on the same line
      
      * Fix tests
      
      * Formatting
      
      * Fix tidy issues
      
      * Fix tidy issues
      
      * Fix naming issue
      
      * Add onnx API to parse buffer
      
      * Formatting
      
      * Add arguments api
      
      * Formatting
      
      * Fix verify parameters
      
      * Fix cppcheck issues
      
      * Formatting
      
      * Add method to get output shapes and bytes
      
      * Formatting
      
      * Try formatting
      
      * Formatting
      
      * Improve the test coverage
      
      * Formatting
      
      * Add print method
      
      * Formatting
      
      * Fix cppcheck issue
      
      * Fix package dependency
      
      * code backup for support multiple outputs
      
      * clang format
      
      * change migraphx api to support multiple program outputs
      
      * clang format
      
      * change api implementation
      
      * clang format
      
      * clang format
      
      * fix a build error
      
      * additional changes
      
      * clang format
      
      * change api for correct automatic generation
      
      * clang format
      
      * fix unit test error
      
      * fix unit test error
      
      * fix unit tests error
      
      * support multiple program outputs
      
      * clang format
      
      * remove @ from the add_return name
      
      * Add nolint
      
      * Try fix formatting
      
      * Formatting
      
      * formatting
      
      * formatting
      
      * Fix formatting
      
      * code cleanup
      
      * clang format
      
      * fix cppcheck error
      
      * fix a cppcheck error
      
      * clang format
      
      * fix review comments
      
      * clang format
      
      * fix cppcheck error
      
      * clang format
      
      * record graph output name
      
      * clang format
      
      * refine print the add_return instruction
      
      * clang format
      
      * fix cppcheck error
      
      * clang format
      
      * refine the name of the add_return instruction
      
      * fixed a bug related to workspace
      
      * fixed two small bugs
      
      * clang format
      
      * add more unit tests for multiple program outputs
      
      * clang format
      
      * change an error info
      
      * clang format
      
      * fix cppcheck error
      
      * add unit test for better code coverage
      
      * change to reduce code change
      
      * clang format
      
      * remove storing program output
      
      * fix cppcheck error
      
      * fix review comments
      
      * clang format
      
      * clang format
      
      * remove unnecessary change
      
      * resolve an assert error
      
      * clang format
      
      * change the output name with prefix '#'
      
      * changes in quantization function to support the returns instructin
      
      * clang format
      
      * refine unit tests
      
      * clang format
      
      * refine profiling print out report
      Co-authored-by: default avatarPaul <pfultz2@yahoo.com>
      Co-authored-by: default avatarKhalique <15948690+kahmed10@users.noreply.github.com>
      Co-authored-by: default avatarmvermeulen <5479696+mvermeulen@users.noreply.github.com>
      5592b921
  32. 15 Feb, 2020 1 commit
    • Shucai Xiao's avatar
      Change api to multiple prog outputs (only API change) (#433) · 1b692d0f
      Shucai Xiao authored
      
      
      * Add initial api
      
      * Formatting
      
      * Add more api
      
      * Formatting
      
      * Add auto api generation
      
      * Formatting
      
      * Fix some compilation errors
      
      * Change handle struct
      
      * Formatting
      
      * Fix reamining compilation errors
      
      * Formatting
      
      * Simplify using ctype
      
      * Formatting
      
      * Initial c++ generation
      
      * Formatting
      
      * Add C++header
      
      * Formatting
      
      * Add test
      
      * Formatting
      
      * Add initial tests
      
      * Formatting
      
      * Try to fix formatting
      
      * Cleanup formatting
      
      * Formatting
      
      * Fix constructors on the same line
      
      * Fix tests
      
      * Formatting
      
      * Fix tidy issues
      
      * Fix tidy issues
      
      * Fix naming issue
      
      * Add onnx API to parse buffer
      
      * Formatting
      
      * Add arguments api
      
      * Formatting
      
      * Fix verify parameters
      
      * Fix cppcheck issues
      
      * Formatting
      
      * Add method to get output shapes and bytes
      
      * Formatting
      
      * Try formatting
      
      * Formatting
      
      * Improve the test coverage
      
      * Formatting
      
      * Add print method
      
      * Formatting
      
      * Fix cppcheck issue
      
      * Fix package dependency
      
      * change migraphx api to support multiple program outputs
      
      * clang format
      
      * change api implementation
      
      * clang format
      
      * fix a build error
      
      * change api for correct automatic generation
      
      * clang format
      
      * Add nolint
      
      * Try fix formatting
      
      * Formatting
      
      * formatting
      
      * formatting
      
      * Fix formatting
      
      * code cleanup
      
      * clang format
      
      * fix cppcheck error
      
      * fix review comments
      
      * clang format
      Co-authored-by: default avatarPaul Fultz II <pfultz2@yahoo.com>
      Co-authored-by: default avatarkahmed10 <15948690+kahmed10@users.noreply.github.com>
      1b692d0f
  33. 10 Feb, 2020 1 commit
    • Shucai Xiao's avatar
      Add additional simple operators (MatMulInteger, ConvInteger, Asinh, Acosh, and Atanh (#431) · a023ec19
      Shucai Xiao authored
      
      
      * Add initial api
      
      * Formatting
      
      * Add more api
      
      * Formatting
      
      * add more operators (asinh, acosh, atanh, MatMulInteger, ConvInteger)
      
      * clang format
      
      * add unit tests for new operators
      
      * clang format
      
      * Add auto api generation
      
      * Formatting
      
      * Fix some compilation errors
      
      * Change handle struct
      
      * Formatting
      
      * Fix reamining compilation errors
      
      * Formatting
      
      * Simplify using ctype
      
      * Formatting
      
      * Initial c++ generation
      
      * Formatting
      
      * Add C++header
      
      * Formatting
      
      * Add test
      
      * Formatting
      
      * Add initial tests
      
      * Formatting
      
      * Try to fix formatting
      
      * Cleanup formatting
      
      * Formatting
      
      * Fix constructors on the same line
      
      * Fix tests
      
      * Formatting
      
      * Fix tidy issues
      
      * Fix tidy issues
      
      * Fix naming issue
      
      * Add onnx API to parse buffer
      
      * Formatting
      
      * Add arguments api
      
      * Formatting
      
      * Fix verify parameters
      
      * Fix cppcheck issues
      
      * Formatting
      
      * Add method to get output shapes and bytes
      
      * Formatting
      
      * Try formatting
      
      * Formatting
      
      * Improve the test coverage
      
      * Formatting
      
      * Add print method
      
      * Formatting
      
      * Fix cppcheck issue
      
      * Fix package dependency
      
      * Add nolint
      
      * Try fix formatting
      
      * Formatting
      
      * formatting
      
      * formatting
      
      * Fix formatting
      Co-authored-by: default avatarPaul Fultz II <pfultz2@yahoo.com>
      Co-authored-by: default avatarmvermeulen <5479696+mvermeulen@users.noreply.github.com>
      Co-authored-by: default avatarkahmed10 <15948690+kahmed10@users.noreply.github.com>
      a023ec19
  34. 17 Jan, 2020 1 commit
    • Shucai Xiao's avatar
      Reduce operators (#427) · e320f89f
      Shucai Xiao authored
      * add reduce operators as required by onnxruntime
      
      * clang format
      
      * remove a test since it can cause overflow
      
      * resolve cppcheck error
      
      * clang format
      
      * fix cppcheck error
      
      * clang format
      e320f89f
  35. 20 Dec, 2019 1 commit
    • Shucai Xiao's avatar
      Improve operators for onnxruntime (#405) · 992666e6
      Shucai Xiao authored
      
      
      * improve unsqueeze to support negative axis and parsing scalar
      
      * clang format
      
      * add a test example for the negative axis of unsqueeze
      
      * improve the squeeze operator to support negative axis
      
      * clang format
      
      * fixed a small bug in the lrn implementation
      
      * clang format
      
      * support negative axis in argmax and argmin
      
      * clang format
      
      * improve flatten to support negative axis
      
      * clang format
      
      * change softmax/logsoftmax to support negative axis
      
      * clang format
      
      * improve transpose by adding default perm
      
      * clang format
      
      * add one more dimens for tensor size
      
      * add one more dimens for tensor size
      
      * disable conv ops fusion for non-symmetric cases
      
      * clang format
      
      * fixed review comments
      
      * move computing axis from the device function to the compute function
      
      * clang format
      
      * move computing axis from device function to the operator computing function
      
      * clang format
      Co-authored-by: default avatarmvermeulen <5479696+mvermeulen@users.noreply.github.com>
      992666e6
  36. 18 Nov, 2019 1 commit
    • Shucai Xiao's avatar
      Improve concat gather (#402) · 0045d0b7
      Shucai Xiao authored
      * improve gather implementation to handle negative input indices
      
      * clang format
      
      * clang format
      
      * improve concat to support neg axis input
      
      * clang format
      
      * fix cppcheck error
      
      * clang format
      
      * code cleanup
      
      * clang format
      
      * fix review comments
      
      * clang format
      0045d0b7
  37. 15 Nov, 2019 1 commit
    • Paul Fultz II's avatar
      Add option to do offload copying automatically (#403) · 81b0ff5d
      Paul Fultz II authored
      * Add compiler options
      
      * Add copy operators
      
      * Formatting
      
      * Use run_passes in tests
      
      * Formatting
      
      * Use run_pass in schedule test
      
      * Formatting
      
      * Add compile_options to get_passes in target
      
      * Formatting
      
      * Offload copy option
      
      * Formatting
      
      * Copy using pinned memory
      
      * Formatting
      
      * Improve performance of gpu copying
      
      * Formatting
      
      * Dont copy
      
      * Formatting
      
      * Always make an extra copy
      
      * Formatting
      
      * Remove unused write op
      
      * Add missing include
      
      * Remove copy_to_gpu function in python api
      
      * Make offload copy disabled by default on C++
      
      * Formatting
      
      * Fix tidy issues
      
      * Formatting
      
      * Fix namespace
      
      * Fix python tests
      
      * Turn clang format off since its broken
      
      * Fix compile error on gcc 5
      
      * Remove commented code
      81b0ff5d
  38. 04 Nov, 2019 1 commit
  39. 16 Oct, 2019 1 commit
    • Paul Fultz II's avatar
      Fuse the add of two convolutions (#386) · 65ea9194
      Paul Fultz II authored
      * Fuse convolution adds
      
      * Formatting
      
      * Fuse more 1x1 convs
      
      * Add some tests
      
      * Formatting
      
      * Add test for 1x1
      
      * Add verification for add-conv fusions
      
      * Fix stride calculation
      
      * Formatting
      
      * Add more tests
      
      * Rename tests
      65ea9194