1. 25 Jun, 2021 5 commits
  2. 16 Jun, 2021 1 commit
    • Resize linear mode support (#819) · 4fe71058
      Shucai Xiao authored
      
      
      * backup implementation of resize enhancement
      
      * clang format
      
      * code backup for the resize
      
      * clang format
      
      * fix build error for resize operator
      
      * clang format
      
      * tmp code backup
      
      * clang format
      
      * code backup
      
      * clang format
      
      * remove changes in parse_resize
      
      * remove unnecessary changes
      
      * clang format
      
      * add unit test for the bug
      
      * clang format
      
      * remove print code
      
      * remove a semi-colon
      
      * clang format
      
      * fix a tidy error
      
      * clang format
      
      * add contiguous for nonstd input for the resize operator
      
      * clang format
      
      * code backup
      
      * clang format
      
      * fix build error
      
      * code backup
      
      * clang format
      
      * code backup
      
      * code backup
      
      * clang format
      
      * add unit tests for resize_linear
      
      * clang format
      
      * refine a function name
      
      * clang format
      
      * fix cppcheck error
      
      * clang format
      
      * fix cppcheck error
      
      * fix review comments
      
      * clang format
      
      * backup code changes
      
      * clang format
      
      * add unit tests for resize operator
      
      * clang format
      
      * remove an unused header file
      
      * remove an unused header file
      
      * remove unrelated unit tests
      
      * refine parsing resize inputs
      
      * clang format
      
      * fix cppcheck error
      
      * fix cppcheck error
      
      * remove unnecessary code
      
      * clang format
      
      * fix cppcheck error
      
      * clang format
      
      * fixed a bug
      
      * clang format
      
      * fix review comments
      
      * clang format
      Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>
  3. 15 Jun, 2021 1 commit
    • Int8 gemm support (#811) · 39bc6161
      Shucai Xiao authored
      
      
      * add a flag to indicate int8x4 input format
      
      * clang format
      
      * code backup
      
      * clang format
      
      * code backup
      
      * clang format
      
      * code backup
      
      * clang format
      
      * code backup
      
      * clang format
      
      * code backup
      
      * clang format
      
      * remove log info
      
      * remove unnecessary changes
      
      * fix cppcheck error
      
      * add unit tests to have more code coverage
      
      * clang format
      
      * add debug info
      
      * remove log info
      
      * fix cppcheck error
      
      * clang format
      
      * clang format
      
      * add one more unit tests for more scenarios
      
      * fix cppcheck error
      
      * clang format
      
      * fix review comments
      
      * clang format
      
      * rename p to m
      
      * fix review comments
      
      * refine unit tests
      
      * clang format
      
      * refine unit tests and fixed a bug
      
      * clang format
      
      * fix build error related to rocm4.2
      
      * fix a bug related to alpha and beta
      
      * refine two unit tests related to int8_gemm
      
      * fix cppcheck error
      
      * refine unit test to pass on mi100
      
      * add unit test for packing int8 args
      
      * clang format
      
      * change unit tests back
      
      * disable some unit tests for gpu
      
      * clang format
      
      * refine unit tests to run on mi100
      
      * clang format
      
      * refine unit tests
      
      * refine unit tests
      
      * clang format
      
      * change back a unit test
      Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>
  4. 11 Jun, 2021 1 commit
  5. 10 Jun, 2021 2 commits
    • Update parse slice (#847) · ade97057
      Cagri Eryilmaz authored
      
      
      * init reverseOp branch: ref op + ref test. WIP
      
      * first passing basic test
      
      * cleanup
      
      * additional axis implementation
      
      * additional test
      
      * ref op implementation vec to int for axis
      
      * ref op test change for axis
      
      * initial gpu files and test
      
      * updates to implementation and test
      
      * fixed some issues
      
      * clang format
      
      * cleanup
      
      * formatting
      
      * removing comments
      
      * changes to parse_slice.cpp debug copy
      
      * cleanup + additional axis  for reverse instruction
      
      * formatting
      
      * remove local size, back to default
      
      * update tests: replace with std functions
      
      * multiple axis for reverse op
      
      * fix a build error
      
      * clang format
      
      * changes to parse_slice.cpp debug copy
      
      * cleanup + additional axis  for reverse instruction
      
      * formatting
      
      * axes update to parse slice
      
      * typo
      
      * more tests
      
      * fix a bug for the reverse device function
      
      * clang format
      
      * fix a bug
      
      * clang format
      
      * ref test updates, multiaxis
      
      * formatting
      
      * formatting, cleanup bool op
      
      * casting for tidy warning
      
      * tidy fix
      
      * remove bool, add steps, check only negative axis
      
      * clang-format
      
      * step op for parse slice
      
      * cleanup & format
      
      * missing axis for logsoftmax_nonstd_input_test
      
      * updated onnx file for logsoftmax_nonstd_input_test
      
      * updates to parse slice. tests for slice+reverse, slice+step+reverse
      
      * removing tests for slice+step+reverse as step requires normalization, will move it to other branch. removed related lines and tests
      
      * duplicate test removal
      
      * some refinement of the code
      
      * clang format
      
      * undefined behavior fix
      
      * undef behavior v2
      
      * formatting
      
      * formatting & updates
      
      * change to parse slice
      
      * update to parse_slice for undef/asan + test update
      
      * formatting
      
      * remove header, no if
      
      * assertions + change the loop from axis to steps for logsoftmax test segfault
      Co-authored-by: Shucai Xiao <Shucai.Xiao@amd.com>
      Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>
    • Dont match or bind to global instructions (#826) · c72a047f
      Paul Fultz II authored
      
      
      * Add optional header
      
      * Formatting
      
      * Use optional in the matcher
      
      * Formatting
      
      * Remove program from tests
      
      * Formatting
      
      * Dont bind or match non-local variables
      
      * Formatting
      
      * Fix gcc 5 error
      
      * Format
      Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>
  6. 09 Jun, 2021 2 commits
  7. 08 Jun, 2021 1 commit
    • Reverse Op (#846) · 9c54fc4f
      Cagri Eryilmaz authored
      
      
      * init reverseOp branch: ref op + ref test. WIP
      
      * first passing basic test
      
      * cleanup
      
      * additional axis implementation
      
      * additional test
      
      * ref op implementation vec to int for axis
      
      * ref op test change for axis
      
      * initial gpu files and test
      
      * updates to implementation and test
      
      * fixed some issues
      
      * clang format
      
      * cleanup
      
      * formatting
      
      * removing comments
      
      * remove local size, back to default
      
      * update tests: replace with std functions
      
      * multiple axis for reverse op
      
      * fix a build error
      
      * clang format
      
      * more tests
      
      * fix a bug for the reverse device function
      
      * clang format
      
      * fix a bug
      
      * clang format
      
      * ref test updates, multiaxis
      
      * formatting
      Co-authored-by: Shucai Xiao <Shucai.Xiao@amd.com>
      Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>
  8. 02 Jun, 2021 1 commit
  9. 26 May, 2021 1 commit
    • Step op (#839) · 04065c64
      Shucai Xiao authored
      
      
      * add the operator step
      
      * clang format
      
      * add unit tests
      
      * clang format
      
      * add more unit test for step op
      
      * clang format
      
      * add more unit tests
      
      * clang format
      
      * fix review comments
      
      * clang format
      
      * rename two unit tests
      Co-authored-by: Paul Fultz II <pfultz2@yahoo.com>
  10. 25 May, 2021 1 commit
  11. 24 May, 2021 2 commits
    • Bug split optimization (#817) · b847e868
      Shucai Xiao authored
      
      
      * backup implementation of resize enhancement
      
      * clang format
      
      * code backup for the resize
      
      * clang format
      
      * fix build error for resize operator
      
      * clang format
      
      * tmp code backup
      
      * clang format
      
      * remove changes in parse_resize
      
      * remove unnecessary changes
      
      * clang format
      
      * add unit test for the bug
      
      * clang format
      
      * remove print code
      
      * remove a semi-colon
      
      * clang format
      
      * fix a tidy error
      
      * fix review comments
      
      * clang format
      Co-authored-by: Paul Fultz II <pfultz2@yahoo.com>
      Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>
    • Compute dominators (#525) · 7ab06956
      Paul Fultz II authored
      
      
      * rename merge_from to merge_to
      
      * refine comments
      
      * code backup
      
      * clang format
      
      * The first version that can reduce scratch memory usage
      
      * code backup
      
      * clang format
      
      * code backup
      
      * clang format
      
      * fixed a bug related to removing gemm copy
      
      * clang format
      
      * code backup
      
      * clang format
      
      * fix review comments
      
      * clang format
      
      * fix unit test failure
      
      * code backup
      
      * clang format
      
      * code base for further investigation
      
      * code with both the forward and backward approach to compute the conflict table
      
      * clang format
      
      * clang format
      
      * backup changes
      
      * remove unnecessary file
      
      * remove unnecessary code
      
      * code backup
      
      * clang format
      
      * code backup
      
      * clang format
      
      * fix a bug in the code
      
      * clang format
      
      * code backup
      
      * clang format
      
      * remove unused code
      
      * remove unused code
      
      * rename some functions
      
      * remove print code
      
      * code backup
      
      * add dominator to scheduling
      
      * add dominator algorithm to remove unnecessary conflicts
      
      * Remove comment
      
      * Use erase_if instead
      
      * Formatting
      
      * Code clean up
      
      * Formatting
      
      * Add dominator info class
      
      * Formatting
      
      * Add dom_info
      
      * Formatting
      
      * Add test case and fix some bugs
      
      * Formatting
      
      * Add unit test for scheduler
      
      * Formatting
      
      * Use index map instead of distance
      
      * Formatting
      
      * Add memory coloring test
      
      * Check for conflict in memory coloring
      
      * Formatting
      
      * Use 1 stream by default
      
      * Update to use modules
      
      * Formatting
      
      * Skip live on entry check
      
      * Formatting
      
      * Formatting
      
      * Fix tidy warning
      
      * Fix tidy warning
      
      * Formatting
      
      * Add nolint
      
      * Use C++17 to build everything when using clang
      
      * Remove input names
      
      * Formatting
      
      * Remove input names
      
      * Keep order of params
      
      * Formatting
      Co-authored-by: Shucai Xiao <Shucai.Xiao@amd.com>
      Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>
  12. 23 May, 2021 1 commit
  13. 11 May, 2021 1 commit
  14. 07 May, 2021 1 commit
    • Update dead_code_elimination to remove unused modules (#820) · 43230d29
      Paul Fultz II authored
      * Update pass manager to get modules after every pass
      
      * Add program overload for module
      
      * Formatting
      
      * Hash modules for quicker lookup of modules
      
      * Bump file version
      
      * Add methods to remove modules
      
      * Formatting
      
      * Eliminate unused modules
      
      * Formatting
      
      * Fix test errors
      
      * Formatting
      
      * Fix tidy issues
  15. 06 May, 2021 1 commit
  16. 03 May, 2021 2 commits
  17. 01 May, 2021 2 commits
  18. 29 Apr, 2021 1 commit
    • MLIR MIOpen Dialect integration (phase 1) (#768) (#769) · 56584fa2
      SJW authored
      
      
      * MLIR MIOpen Dialect integration (phase 1) (#768)
      
      * Added Findmlir.cmake (using environment variables to import)
      
      * Added mlir_conv pass to GPU target
      
        * Apply to any gpu::convolution if supported by MLIR
      
        * Call MLIR C-API to generate iGEMM kernel with configuration from gpu::convolution
      
        * Capture binary in dictionary for matching convolutions
      
        * Build a code_object_op with the binary and execution dimensions
      
        * Substitute for the gpu::convolution
      
      * Changed the parameters for the code_object to reflect the generated MLIR kernel
      
      * Expanded out MemRefDescriptor fields in param list
      
      * Also updated for MLIR C-API changes
      
      * * fixed global_size calculation
      
      * MLIR MIOpen Dialect integration (phase 1) (#768)
      
      * Added Findmlir.cmake (using environment variables to import)
      
      * Added mlir_conv pass to GPU target
      
        * Apply to any gpu::convolution if supported by MLIR
      
        * Call MLIR C-API to generate iGEMM kernel with configuration from gpu::convolution
      
        * Capture binary in dictionary for matching convolutions
      
        * Build a code_object_op with the binary and execution dimensions
      
        * Substitute for the gpu::convolution
      
      * Changed the parameters for the code_object to reflect the generated MLIR kernel
      
      * Expanded out MemRefDescriptor fields in param list
      
      * Also updated for MLIR C-API changes
      
      * * Added command line option: --enable_mlir
      
      * * fixed command line switch
      
      * updated for new MLIR API changes
      
      * * Added cget llvm-project-mlir to import MIIR API libraries into Dockerfile
        * removed cmake Findmlir
      
      * updated for changes in MIIR C-API
      
      * * updated CMakeLists.txt to allow disable of MLIR import
      
      * fixed memory leaks and removed copies
      
      * updated for 5D memrefs
      
      * * formatting
      
      * * fixed review comments
      
      * * fixed merge issues
      
      * hip gcnDeviceName now includes specifiers at the end
        * use major/minor values instead
      
      * * disable MLIR by default
      
      * * removed command-line switch --enable-mlir
      
      * * fix unused when MLIR disabled
      
      * * enable jenkins enable/test MLIR
      
      * * format
      
      * * fixed clang-tidy
      
      * * added new type
      Co-authored-by: Paul Fultz II <pfultz2@yahoo.com>
      Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>
  19. 27 Apr, 2021 2 commits
    • Add tuple type to shape (#800) · 66aa4cc8
      Paul Fultz II authored
      
      
      * Add definitions for all pointwise operators
      
      * Formatting
      
      * Add cpp generator class
      
      * Formatting
      
      * Move compilation to core
      
      * Formatting
      
      * Add clock to tmp name
      
      * Add dynamic loader
      
      * Formatting
      
      * Add tests for code gen
      
      * Formatting
      
      * Add test for literals
      
      * Formatting
      
      * Use with_char
      
      * Add missing header
      
      * Fix mismerge
      
      * Ignore tidy warning
      
      * Fix gcc 5 errors
      
      * Apply fixits
      
      * Skip signed bitwise of status
      
      * Remove unused parameters
      
      * Explicitly add c++14 flag
      
      * Fix tidy warning
      
      * Add tuple type to shape class
      
      * Formatting
      
      * Make data member private
      
      * Formatting
      
      * Add sub arguments
      
      * Formatting
      
      * Turn clang format off
      
      * Disable clang-format
      
      * Improve visiting tuples
      
      * Formatting
      
      * Add more argument tests
      
      * Formatting
      
      * Handle tuple in load
      
      * Formatting
      
      * Remove .o files
      
      * Add tuple type to api
      
      * Formatting
      
      * Fix tidy warnings
      
      * Fix tidy warnings
      
      * Add a test for share method
      
      * Formatting
      
      * Add a test cpp_type
      
      * Suppress tidy warning
      Co-authored-by: Shucai Xiao <Shucai.Xiao@amd.com>
    • Paul Fultz II's avatar
      8fcb7409
  20. 26 Apr, 2021 1 commit
    • Prefix scan operator (#797) · e8ae23b1
      turneram authored
      
      
      * Add scan struct; add initial tests; initial algorithm by cases; refactor into one algorithm; clean up code
      
      * Rename; restructure; begin adding additional attributes
      
      * refactor to use shape_for_each; temporarily drop reverse mode
      
      * Add back reverse mode with shape_for_each_reverse; update tests; add axis bounds check
      
      * Begin adding to onnx parser
      
      * Add to onnx parser
      
      * Fix onnx test
      
      * Fix CI warnings
      
      * Update algorithm to use slice+par_for; update gen_onnx; remove .o files; remove redundant axis normalizing
      
      * Add exclusive mode
      
      * Add reverse mode
      
      * Remove .pyc file
      
      * Fix warning
      
      * Remove shape_for_each_reverse; clean up pointer usage for exclusive cases
      
      * Remove unused variable
      
      * Fix onnx test
      
      * Add test case to op_shape_test
      
      * Formatting
      
      * Formatting
      
      * Fix tidy warning
      
      * Formatting
      
      * Formatting
      
      * Formatting
      
      * Increase code coverage
      
      * Formatting
      
      * refine the script for creating the cumsum onnx file
      
      * Alphabetize includes for operators.hpp
      
      * Revise onnx test
      
      * Remove redundant bounds check
      
      * Formatting and style
      
      * Alphabetize tests
      
      * Remove duplicate tests from merge
      
      * Fix tidy warning for sub_test
      Co-authored-by: Shucai Xiao <Shucai.Xiao@amd.com>
      Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>
  21. 23 Apr, 2021 2 commits
    • Onnx 1.8 support (#798) · 658cdab0
      Shucai Xiao authored
      
      
      * add support for axes inputs for squeeze/unsqueeze/reduce_sum
      
      * clang format
      
      * fix build problems
      
      * backup code changes
      
      * clang format
      
      * fix a bug in parsing quantizelinear operator
      
      * clang format
      
      * fix a cppcheck error
      
      * disable different versions of unit tests for different onnx version
      
      * clang format
      
      * upgrade onnx to 1.8
      
      * update onnx to 1.8.1
      
      * disable two more real models
      
      * clang format
      
      * fix review comments
      
      * fix the function of assign axes in parsing the squeeze operator
      
      * add unit tests and fix a bug
      
      * clang format
      
      * fix review comments
      
      * clang format
      
      * fix a build error
      
      * backup code changes
      
      * clang format
      
      * add more unit tests and add parsing opset version
      
      * clang format
      
      * fix cppcheck error
      
      * adding installing the onnx package
      
      * resolve no protobuf compiler
      
      * fix cppcheck error
      
      * add unit tests for more code coverage
      
      * clang format
      
      * try a comment in jenkins build
      
      * include the install onnx line
      
      * code backup
      
      * reorder the dependencies installed
      
      * refine dockerfile
      
      * fix review comments
      
      * clang format
      Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>
    • Optimize resize and where operators (#784) · 17485202
      Shucai Xiao authored
      
      
      * code backup
      
      * clang format
      
      * add a matcher related to the special resize case for optimization
      
      * clang format
      
      * code backup
      
      * clang format
      
      * code backup
      
      * remove unnecessary code
      
      * add optimization for the where op
      
      * clang format
      
      * fix cppcheck error
      
      * add a unit test for optimize resize
      
      * clang format
      
      * remove unnecessary header include
      
      * code backup
      
      * clang format
      
      * add unit tests for optimizing resize
      
      * clang format
      
      * add more unit test for optimizing where op
      
      * clang format
      
      * remove unnecessary code
      
      * add one more optimization to remove contiguous
      
      * clang format
      
      * add a pointwise requirement
      
      * clang format
      
      * fix cppcheck error
      
      * add one more unit test
      
      * fixed a bug
      
      * clang format
      
      * remove unnecessary code
      
      * clang format
      
      * fix a build error
      
      * fix review comments
      
      * clang format
      
      * fix a review comment
      
      * clang format
      
      * code refinement
      
      * clang format
      
      * refine more code
      
      * refine more code
      
      * fix a bug related to reshape_cont optimization
      
      * clang format
      
      * fix a review comment
      
      * removed an unnecessary comment
      
      * refine code according to comments
      
      * clang format
      Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>
  22. 22 Apr, 2021 1 commit
    • Cpu fusions using post_ops (#781) · f7befe50
      Paul Fultz II authored
      
      
      * Add eliminate_data_type pass
      
      * Formatting
      
      * Auto convert quant ops
      
      * Formatting
      
      * Flip the order of decompose
      
      * Compute max size differently
      
      * Formatting
      
      * Clamp values in convert
      
      * Formatting
      
      * Fix loss of precision in reduce
      
      * Formatting
      
      * Fix bugs in reduction
      
      * Fix accumulator type in reference softmax implementation
      
      * Formatting
      
      * Update convert test
      
      * Remove unused variables
      
      * Remove unnecessary quant_dot check
      
      * Formatting
      
      * Add tests
      
      * Formatting
      
      * Remove unused code
      
      * Remove duplicate ops
      
      * Remove blaze dependency
      
      * Use set since shape::type_t is not hashable on gcc 5
      
      * Formatting
      
      * Add dnnl binary op
      
      * Formatting
      
      * Add binary and eltwise
      
      * Formatting
      
      * Add softmax
      
      * Formatting
      
      * Remove unused operators
      
      * Add missing files
      
      * Formatting
      
      * Add lrn
      
      * Formatting
      
      * Add deconvolution
      
      * Formatting
      
      * Change allocate default
      
      * Add reorder
      
      * Formatting
      
      * Add reductions
      
      * Formatting
      
      * Sort lines
      
      * Change literals in another loop
      
      * Add pow operator
      
      * Formatting
      
      * Add pow operator
      
      * Formatting
      
      * Make sure shapes are packed
      
      * Allow broadcasted inputs
      
      * Remove unused operators
      
      * Simplify functions
      
      * Remove softmax
      
      * Add sub and erf functions
      
      * Formatting
      
      * Fix bug
      
      * Formatting
      
      * Improve parallelism
      
      * Formatting
      
      * Allow multiple batch dimensions
      
      * Formatting
      
      * Move literal transforms out of lowering
      
      * Formatting
      
      * Add gather operator
      
      * Sort lines
      
      * Add early exit for carry
      
      * Formatting
      
      * Add missing concat
      
      * Rename macro
      
      * Fix deep nesting
      
      * Formatting
      
      * Fix cppcheck issues
      
      * Remove else
      
      * Move attribute to typedef
      
      * Formatting
      
      * Disable maybe-uninitialized warning since it's broken on gcc
      
      * Add constexpr default constructor
      
      * Formatting
      
      * Fix compiler warnings
      
      * Fix adjust_allocation test
      
      * Add layernorm matcher
      
      * Add gelu_erf matcher
      
      * Formatting
      
      * Add gelu_tanh matcher
      
      * Formatting
      
      * Remove match namespace
      
      * Formatting
      
      * Use matcher instead of string
      
      * Formatting
      
      * Add fusions
      
      * Formatting
      
      * Add post op field
      
      * Formatting
      
      * Make post_ops serializable
      
      * Formatting
      
      * Add eltwise fusions
      
      * Formatting
      
      * Fix null conversions
      
      * Formatting
      
      * Add fuse_ops source files
      
      * Formatting
      
      * Set binary post op index correctly
      
      * Formatting
      
      * Fix serialization bugs
      
      * Check if used once
      
      * Formatting
      
      * Fix error in get_primitive_attr
      
      * Formatting
      
      * Add compile function
      
      * Formatting
      
      * Limit fusions
      
      * Formatting
      
      * Disable with env variable instead of using compile arg
      
      * Formatting
      
      * Fix implicit conversion to bool
      
      * Declare on separate lines
      
      * Formatting
      
      * Fix cppcheck issues
      
      * Fix ICE in pack_join
      
      * Formatting
      
      * Use const ref
      
      * Make enum hashable
      
      * Formatting
      
      * Add explicit this
      
      * Fix merge issues
      
      * Fix dangling ref
      
      * Formatting
      
      * Add test for compile
      
      * Formatting
      
      * Add more value tests
      
      * Formatting
      Co-authored-by: Shucai Xiao <shucai@gmail.com>
      Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>
  23. 21 Apr, 2021 1 commit
  24. 19 Apr, 2021 1 commit
    • Add code generation for pointwise operators (#780) · 35d1bcc2
      Paul Fultz II authored
      * Add definitions for all pointwise operators
      
      * Formatting
      
      * Add cpp generator class
      
      * Formatting
      
      * Move compilation to core
      
      * Formatting
      
      * Add clock to tmp name
      
      * Add dynamic loader
      
      * Formatting
      
      * Add tests for code gen
      
      * Formatting
      
      * Add test for literals
      
      * Formatting
      
      * Use with_char
      
      * Add missing header
      
      * Fix mismerge
      
      * Ignore tidy warning
      
      * Fix gcc 5 errors
      
      * Apply fixits
      
      * Skip signed bitwise of status
      
      * Remove unused parameters
      
      * Explicitly add c++14 flag
      
      * Fix tidy warning
      
      * Remove .o files
  25. 13 Apr, 2021 1 commit
  26. 09 Apr, 2021 1 commit
    • Upgrade docker to rocm 4.1 and drop hcc (#795) · 6d937d80
      Paul Fultz II authored
      * Fix tidy warnings for 4.1
      
      * Formatting
      
      * Upgrade to 4.1 in docker
      
      * Remove hcc build and enable ubsan on clang debug
      
      * Add missing openmp package
      
      * Construct directly
      
      * Construct directly
      
      * Upgrade rocm-cmake version
  27. 08 Apr, 2021 1 commit
  28. 07 Apr, 2021 1 commit
  29. 05 Apr, 2021 1 commit
    • Propagate data layout in the operators (#777) · abe4ec3e
      Paul Fultz II authored
      
      
      * Add method to compute shape with same layout
      
      * Formatting
      
      * Fix permutation with ambiguous layouts
      
      * Formatting
      
      * Propagate layout for pointwise operators
      
      * Formatting
      
      * Propagate layout for more operators
      
      * Formatting
      
      * Sort with lens
      
      * Formatting
      
      * Simplify permutation sorting
      
      * Formatting
      
      * Propagate layout for concat operator
      
      * Formatting
      
      * Use copy
      
      * Formatting
      
      * Remove header
      Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>