- 31 Aug, 2021 1 commit
-
-
Shucai Xiao authored
* fix two asserts for debug build * add unit test for copy parameters * clang format * add a unit test for reorder_dims * change tranpose to always require perm not be empty * clang format * remove an unnecessary line * fix tidy error * fix review comments
-
- 24 Aug, 2021 1 commit
-
-
Umang Yadav authored
* rename broadcast and multibroadcast output_lens attribute to out_lens attribute, and change tests and source code to reflect the same * change the reshape attribute from dims to out_lens * change transpose attribute's name from dims to perm to reflect better meaning * use permutation instead of perm for transpose clang formaating * use dims instead of out_lens for reshape clang formatting
-
- 19 Aug, 2021 1 commit
-
-
Paul Fultz II authored
* Enable warnings when jit compiling * Formatting
-
- 18 Aug, 2021 2 commits
-
-
turneram authored
* Add operators, refactor parsers, add rewrite passes, add tests * Add ref implementations * Move broadcasting of scales and zero points to onnx parser * Allow for x and zero_point to have different types in quantizelinear; fix zero_point default type * Switch certain variables to int64_t * Fix overflow in implicit constant conversion * Remove operators.hpp from includes in tf_test.cpp * Add conversion for int32 input to quantizelinear and add test case; remove operators.hpp from onnx_test.cpp includes * Switch dequantizelinear math from int32 to float * Remove changes to operators.hpp * Simplify apply_quantizelinear * Add verify test for int32 data * Add rewrite_quantization back to CMakeLists * Add passes to insert qdq after add_bias is applied, replace quant_ops, and remove remaining qdq pairs * Renaming, refactoring, cleaning up code, adding formal test, and adding passes to targets * Renaming, review comments, begin adding more specific tests * Add more specific unit tests * Fix failing test on CI * Correct matcher and update qop rewriting, update tests and add more tests * Update matcher, clean up simplify_qdq, tweak tests * Add tests, remove pass from CPU target, update dot parameters, clean up simplify_qdq * Fix correctness bug in ref q/dq implementations; edit gemm parser to make beta always 0.0 * Remove unused variables in onnx gemm tests
-
turneram authored
Co-authored-by:Chris Austen <causten@users.noreply.github.com>
-
- 10 Aug, 2021 1 commit
-
-
Paul Fultz II authored
* Add hiprtc compile option * Add cross compile test * Update error reporting * Add tests for errors and warnings * Fix tidy warning * Add comment to ifdefs * Skip null character at end of log * Assert there is null at the end
-
- 09 Aug, 2021 1 commit
-
-
Cagri Eryilmaz authored
* check for divisor encodable or not, fallback if needed * verify test for retinaface case
-
- 05 Aug, 2021 1 commit
-
-
Paul Fultz II authored
* Add method to compile pointwise * Formatting * Add lambda * Add semicolon * Rename variable * Add driver to run jit kernels * Formatting * Add context * Formatting * Make seperate driver folder * Add more general gpu driver * Formatting * Print out wll time * Formatting * Run multiple times and skip first run * Formatting * Seperate time_op * Run an op for comparison * Formatting * Add debug asserts * Formatting * Change parameer name * Formatting * Fix argument order * Formatting * Add preloading * Formatting * Allow a different data type * Formatting * Pipeline transformations * Formatting * Add vectorization * Formatting * Reduce dims * Formatting * Compile with launch params as constant * Formatting * Make sure buffer can be vecotrized * Formatting * Enable vectorization and preloading * Formatting * Add print header * Formatting * Avoid allocating to large of LDS * Formatting * Add some vec functions to a seperate header * Formatting * Add stride loops * Formatting * Improve the transform pipeline * Formatting * Add const * Fix shape check * Formatting * Just check stride axis is zero * Remove extra finc_vector_axis overload * Simplify some mroe functions * Formatting * Remove some more extra functions * Formatting * Simplify more decltypes * Add another const * Fix test * Get buffer pointer different for older compilers Co-authored-by:
Shucai Xiao <shucai@gmail.com> Co-authored-by:
Chris Austen <causten@users.noreply.github.com>
-
- 15 Jul, 2021 1 commit
-
-
turneram authored
* Add operators, refactor parsers, add rewrite passes, add tests * Formatting * Fix cppcheck * Review comments * Formatting * Combine rewrite passes * Formatting * Add ref implementations * Formatting * Review comments * Formatting * Tidy warnings * Apply review comments * Formatting * Fix CI error * Formatting * Increase code coverage * Formatting * Move broadcasting of scales and zero points to onnx parser * Formatting * Allow for x and zero_point to have different types in quantizelinear; fix zero_point default type * Formatting * Increase code coverage * Formatting * Switch certain variables to int64_t * Formatting * Fix overflow in implicit constant conversion * Formatting * Increase code coverage * Formatting * Remove operators.hpp from includes in tf_test.cpp * Formatting * Add conversion for int32 input to quantizelinear and add test case; remove operators.hpp from onnx_test.cpp includes * Formatting * Switch dequantizelinear math from int32 to float * Formatting * Remove changes to operators.hpp * Simplify apply_quantizelinear * Formatting * Add verify test for int32 data * Add rewrite_quantization back to CMakeLists
-
- 14 Jul, 2021 1 commit
-
-
Paul Fultz II authored
* Unify device_name function * Formatting Co-authored-by:mvermeulen <5479696+mvermeulen@users.noreply.github.com>
-
- 12 Jul, 2021 1 commit
-
-
Shucai Xiao authored
Co-authored-by:mvermeulen <5479696+mvermeulen@users.noreply.github.com>
-
- 09 Jul, 2021 1 commit
-
-
Shucai Xiao authored
-
- 08 Jul, 2021 4 commits
-
-
Paul Fultz II authored
* Add initial scan operator * Formatting * Fix with a working test * Fix bugs * Formatting * Formatting * Simplify * Formatting * Use non-power of 2 for test * Make pointer Co-authored-by:mvermeulen <5479696+mvermeulen@users.noreply.github.com>
-
Paul Fultz II authored
* Add preallocate method * Add preallocate_param pass * Preallocate buffers on the cpu * Formatting * Preallocate on the gpu * Add missing cpp file * Formatting * Add lifetime function * Formatting * Always allocate * Fix tidy warning * Add const * Add missing lifetime annotations Co-authored-by:mvermeulen <5479696+mvermeulen@users.noreply.github.com>
-
Shucai Xiao authored
-
Shucai Xiao authored
-
- 27 Jun, 2021 1 commit
-
-
Shucai Xiao authored
* Add definitions for all pointwise operators * Formatting * Add cpp generator class * Formatting * Move compilation to core * Formatting * Add clock to tmp name * Add dynamic loader * Formatting * Add tests for code gen * Formatting * Add test for literals * Formatting * Use with_char * Add missing header * Fix mismerge * Ignore tidy warning * Fxx gcc 5 errors * Apply fixits * Skip signed bitwise of status * Remove unused parameters * Explicitly add c++14 flag * Fix tidy warning * unify the compute function signature * clang format * make another change * unify the compute function * clang format * remove unnecessary code * more refinement about the operator compute funciton * clang format * add an overload function * clang format * add support for axes inputs for sequeeze/unsqueeze/reduce_sum * clang format * fix build problems * backup code changes * clang format * Add tuple type to shape class * Formatting * fix a bug in parsing quantizelinear operator * clang format * fix a cppcheck error * disable different versions of unit tests for different onnx version * clang format * upgrade onnx to 1.8 * update onnx to 1.8.1 * disable two more real models * clang format * Make data member private * Formatting * Add sub arguments * Formatting * Trun clang format off * Disable clang-format * fix review comments * fix the function of assign axes in parsing the squeeze operator * add unit tests and fix a bug * clang format * fix review comments * clang format * fix a build error * backup code changes * clang format * add more unit tests and add parsing opset version * clang format * Improve visiting tuples * Formatting * fix cppcheck error * adding installing the onnx package * resolve no protobuf compiler * add an inline subgraph pass * clang format * Add more argument tests * Formatting * Handle tuple in load * Formatting * code backup * clang format * Remove .o files * Add tuple type to api * Formatting * fix build errors * clang format * code backup * code backup * add unit tests for the inline subgraph * clang format * refine the inline subgraph and parse if operator * clang format * fix cppcheck issue * clang format * add unit test for inline subgraph pass * clang format * fix format issue * remove the context from the if operator * clang format * simplify the compute functions * Fix tidy warnings * fix cppcheck error * clang format * fix cppcheck error * Fix tidy warnings * fix a cppcheck error * clang format * Add a test for share method * Formatting * Add a test cpp_type * add unit tests for more code coverage * clang format * add unit tests to have more code coverage * clang format * try a comment in jenkins build * include the install onnnx line * code backup * reorder the dependenciesd installed * refine dockerfile * fix review comments * clang format * remove unnecessary overload function * fix cppcheck error * change back the argument test * Suppress tidy warning * add the operator get_tuple_elem * clang format * add get_tuple_elem to operator include file * chang if to support multiple operation outputs * clang format * optimize inline subgraph * clang format * code backup * clang format * fix bug * refine unit tests for tuple output of the if operator * clang format * refine a instruction replacement code * add a unit test and sort all the unit tests alphabetically * fix cppcheck error * add more unit tests for multiple op outputs * clang format * fix cppcheck error * Update pass manager to get modules after every pass * more unit test to cover more scenarios * clang format * fixed a bug in a unit test * add more tests * clang format * add more unit tests to have more code coverage * fix a bug in a unit test * Add program overload for module * Formatting * Hash modules for quicker lookup of modules * Bump file version * Add methods to remove modules * Formatting * add the tuple type to the support list * Eliminate unused modules * Formatting * Fix test errors * Foramtting * Fix tidy issues * fix problem related to inline subgraph * clang format * fix review comments * fix review comments * fix review comments * fix review comments * clang format * fix a unit test * one more code change * remove an optimization related to the if operator * clang format * fix review comments Co-authored-by:
Paul <pfultz2@yahoo.com> Co-authored-by:
mvermeulen <5479696+mvermeulen@users.noreply.github.com>
-
- 25 Jun, 2021 4 commits
-
-
Shucai Xiao authored
-
Shucai Xiao authored
-
Shucai Xiao authored
-
Shucai Xiao authored
-
- 15 Jun, 2021 1 commit
-
-
Shucai Xiao authored
* add a flag to indicate int8x4 input format * clang format * code backup * clang format * code backup * clang format * code backup * clang format * code backup * clang format * code backup * clang format * remove log info * remove unnecessary changes * fix cppcheck error * add unit tests to have more code coverage * clang format * add debug info * remove log info * fix cppcheck error * clang format * clang format * add one more unit tests for more scenarios * fix cppcheck error * clang format * fix review comments * clang format * rename p to m * fix review comments * refine unit tests * clang format * refine unit tests and fixed a bug * clang format * fix build error related to rocm4.2 * fix a bug related to alpha and beta * refine two unit tests related to int8_gemm * fix cppcheck error * refine unit test to pass on mi100 * add unit test for packing int8 args * clang format * change unit tests back * disable some unit tests for gpu * clang format * refine unit tests to run on mi100 * clang format * refine unit tests * refine unit tests * clang format * change back a unit test Co-authored-by:mvermeulen <5479696+mvermeulen@users.noreply.github.com>
-
- 11 Jun, 2021 1 commit
-
-
Paul Fultz II authored
Co-authored-by:mvermeulen <5479696+mvermeulen@users.noreply.github.com>
-
- 09 Jun, 2021 1 commit
-
-
kahmed10 authored
* alternative impl * formatting * add gpu pass to insert pad * formatting * update onnx test, still need cleanup * formatting * update tf_test * modify existing tests * formatting * remove print * code cleanup * formatting * code cleanup * formatting * fix tidy and cppcheck * remove variable * add test * formatting * add test and address comments * formatting Co-authored-by:
Shucai Xiao <shucai@gmail.com> Co-authored-by:
mvermeulen <5479696+mvermeulen@users.noreply.github.com>
-
- 08 Jun, 2021 1 commit
-
-
Cagri Eryilmaz authored
* init reverseOp branch: ref op + ref test. WIP * first passing basic test * cleanup * additional axis implementation * additional test * ref op implementation vec to int for axis * ref op test change for axis * initial gpu files and test * updates to implementation and test * fixed some issues * clang format * cleanup * formatting * removing comments * remove local size, back to default * update tests: replace with std functions * multiple axis for reverse op * fix a build error * clang format * more tests * fix a bug for the reverse device function * clang format * fix a bug * clang format * ref test updates, multiaxis * formatting Co-authored-by:
Shucai Xiao <Shucai.Xiao@amd.com> Co-authored-by:
mvermeulen <5479696+mvermeulen@users.noreply.github.com>
-
- 06 May, 2021 1 commit
-
-
Paul Fultz II authored
* Use hipStreamSynchronize instead of device sync * Formatting * Suppress FPs * Use sync_stream instead of device * Formatting * Fix python bindings * Formatting
-
- 03 May, 2021 1 commit
-
-
Paul Fultz II authored
* Remove unused data types * Formatting * Reduce types generated for hip kernels * Formatting * Fix onnx tests * Formatting Co-authored-by:mvermeulen <5479696+mvermeulen@users.noreply.github.com>
-
- 29 Apr, 2021 1 commit
-
-
SJW authored
* MLIR MIOpen Dialect integration (phase 1) (#768) * Added Findmlir.cmake (using environment variables to import) * Added mlir_conv pass to GPU target * Apply to any gpu::convolution if supported by MLIR * Call MLIR C-API to generate iGEMM kernel with configuration from gpu::convolution * Capture binary in dictionary for matching convolutions * Build a code_object_op with the binary and execution dimensions * Substitute for the gpu::convolution * Changed the parameters for the code_object to reflect the generated MLIR kernel * Expanded out MemRefDescriptor fields in param list * Also updated for MLIR C-API changes * * fixed global_size calculation * MLIR MIOpen Dialect integration (phase 1) (#768) * Added Findmlir.cmake (using environment variables to import) * Added mlir_conv pass to GPU target * Apply to any gpu::convolution if supported by MLIR * Call MLIR C-API to generate iGEMM kernel with configuration from gpu::convolution * Capture binary in dictionary for matching convolutions * Build a code_object_op with the binary and execution dimensions * Substitute for the gpu::convolution * Changed the parameters for the code_object to reflect the generated MLIR kernel * Expanded out MemRefDescriptor fields in param list * Also updated for MLIR C-API changes * * Added command line option: --enable_mlir * * fixed command line switch * updated for new MLIR API changes * * Added cget llvm-project-mlir to import MIIR API libraries into Dockerfile * removed cmake Findmlir * updated for changes in MIIR C-API * * updated CMakeLists.txt to allow disable of MLIR import * fixed memory leaks and removed copies * updated for 5D memrefs * * formatting * * fixed review comments * * fixed merge issues * hip gcnDeviceName now includes specifiers at the end * use major/minor values instead * * disable MLIR by default * * removed command-line switch --enable-mlir * * fix unused when MLIR disabled * * enable jenkins enable/test MLIR * * format * * fixed clang-tidy * * added new type Co-authored-by:
Paul Fultz II <pfultz2@yahoo.com> Co-authored-by:
mvermeulen <5479696+mvermeulen@users.noreply.github.com>
-
- 27 Apr, 2021 2 commits
-
-
Paul Fultz II authored
* Add definitions for all pointwise operators * Formatting * Add cpp generator class * Formatting * Move compilation to core * Formatting * Add clock to tmp name * Add dynamic loader * Formatting * Add tests for code gen * Formatting * Add test for literals * Formatting * Use with_char * Add missing header * Fix mismerge * Ignore tidy warning * Fxx gcc 5 errors * Apply fixits * Skip signed bitwise of status * Remove unused parameters * Explicitly add c++14 flag * Fix tidy warning * Add tuple type to shape class * Formatting * Make data member private * Formatting * Add sub arguments * Formatting * Trun clang format off * Disable clang-format * Improve visiting tuples * Formatting * Add more argument tests * Formatting * Handle tuple in load * Formatting * Remove .o files * Add tuple type to api * Formatting * Fix tidy warnings * Fix tidy warnings * Add a test for share method * Formatting * Add a test cpp_type * Suppress tidy warning Co-authored-by:Shucai Xiao <Shucai.Xiao@amd.com>
-
Paul Fultz II authored
-
- 19 Apr, 2021 1 commit
-
-
Paul Fultz II authored
* Add definitions for all pointwise operators * Formatting * Add cpp generator class * Formatting * Move compilation to core * Formatting * Add clock to tmp name * Add dynamic loader * Formatting * Add tests for code gen * Formatting * Add test for literals * Formatting * Use with_char * Add missing header * Fix mismerge * Ignore tidy warning * Fxx gcc 5 errors * Apply fixits * Skip signed bitwise of status * Remove unused parameters * Explicitly add c++14 flag * Fix tidy warning * Remove .o files
-
- 09 Apr, 2021 1 commit
-
-
Paul Fultz II authored
* Fix tidy warnings for 4.1 * Formatting * Upgrade to 4.1 in docker * Remove hcc build and enable ubsan on clang debug * Add missing openmp package * Construct directly * Construct directly * Upgrade rocm-cmake version
-
- 08 Apr, 2021 1 commit
-
-
Paul authored
-
- 05 Apr, 2021 1 commit
-
-
Shucai Xiao authored
* code cleanup * clang format * backup code * clang format * remove unnecessary code * clang format * add module print function * code backup * refine the module::print function * refine the module:to_value() function * code backup * backup code changes * code backup * remove to_value and from_value function from the module class * rename a function * rename the if operator * refine the if operator * refine the print function of module and program * code backup * code backup * fix a build warning * fix overload of compute_shape function * code backup * fix unit test error * fix cppcheck error * fix the issue related to the overload of compute_shape * fix review comments * fix cppcheck error * change the return name of if_op to be if * clang format * fix two unit tests * clang format * rename variables * clang format * remove the unused compute_op function * clang format * add lowering of if operator and compute_op function * clang format * add parsing if operator in onnx file * clang format * fix clang tidy format * clang format * add the gpu implementation of the if operator * enhance the validate function and uncomment a unit test * clang format * remove unnecessary code * add sub_module processing in ref passes * clang format * clang format * fix a hang issue related to the valid function * fix an issue in replace_refs * clang format * fix review comments * clang format * fix cppcheck error * clang format * add a unit test for more code coverage * clang format * fix review comments and add test for more code coverage * clang format * fix cppcheck error * clang format * fix cppcheck error * fix a cppcheck error * clang format * backup code * clang format * fix cppcheck error * clang format * some code refinement * clang format * code backup to handle submodules in module compilation * clang format * code backup * clang format * code backup * clang format * fix a bug related to literal id * fix a bug in gpu execution * change the way of compiling a graph * clang format * backup more changes * clang format * refine pass log information * remove unnecessary code * clang format * temp changes backup * clang format * add module name prefix to scratch memory id in hip_memory_allocation * clang format * change to copy the cond input by inserting a copy instruction * clang format * change to use the if output argument as the submodule output so can remove a gpu_copy * clang format * consider submodule in some compile passes * clang format * fix review comments * clang format * fix issues related to scratch memory * clang format * remove unnecessary code * fix cppcheck error * clang format * reslove the implicit dependencies issue related to submodule * clang format * fix cppcheck error * clang format * backup temp changes * clang format * fixed an bug in the has_instruction function * clang format * fix the return value of the gpu implementation of the if operator * fix a bug in the compute_shape function in the gpu implementation * add an if onnx unit test * clang format * add more unit tests * clang format * tmp code backup * clang format * fix a sync problem related to copy cond argument from gpu to cpu * clang format * change the compile offload copy flag setting * clang format * enable copy from cpu to be able to do synchronous copy * clang format * add more unit tests * add more unit tests * add more ref unit tests * clang format * fixed a bug error * tmp code backup * clang format * fixed an onnx verify unit test * add more unit tests * clang format * reverse a change * fix cppcheck error * fix cppcheck error * fix to print all instructions in program execution * clang format * fix bugs related to memory coloring and offload copy to be true * clang format * remove unnecessary include header file * sort test cases in ref_cpu_ops alphabetically * clang format * add a flag to disable cpu target in verification test * change the way to disable some tests * clang format * disable verify unit test of the if operators * add a function call to have more code coverage * fix a build error * fix review comments * fix review comments * clang format * add a api gpu unit test for more code coverage * clang format * change to use instruction.size() as node index * move the calc_implicit_deps function to module class as a member function * clang format * move the offload_copy flag setting to lowering * clang format * assign the module_eval lambda function to a variable to simplify code * clang format * move the compute function from ref/gpu implementation to the main if operator * clang format * fix cpp check error * add a unit test for more code coverage * clang format * add unit test to calculate implicit deps * add a python unit test * clang format * refine a unit test to have more code coverage * clang format * chang the way of wrap up arguments for sub modules * clang format * fix some build errors * code cleanup * refine unit tests to have more code coverage * clang format * refine unit test to have more code coverage * code backup * clang format * add memory coloring test * refine memory coloring unit test * clang format * remove an unnecessary line * remove an unused line * remove an unnecessary parameter in the lambda function * clang format * refine a unit test * remove an unnecessary line * refine unit tests to have more code coverage * clang format * combine two lines * add one more unit test for more code coverage * clang format * add one more unit test * clang format * fix review comments * refine a print out information * fix review comments * clang format * change the sync copy to using a gpu device sync * clang format * remove unnecessary code Co-authored-by:mvermeulen <5479696+mvermeulen@users.noreply.github.com>
-
- 26 Mar, 2021 1 commit
-
-
Paul Fultz II authored
* Add code object op * Formattting * Add more value tests * Formatting * Fix from_value conversion from binary * Formatting * Dont use offload copy * Remove iostream header * Fix compilation errors * Formatting * Rename var * Add missing files * Formatting * Remove duplicate variable * Remove comment * Template the function so sfinae will work * Formatting * Use template specialization since ADL is broken on hcc * Formatting * Annotate the constructor with HD for hcc * Make variable const Co-authored-by:mvermeulen <5479696+mvermeulen@users.noreply.github.com>
-
- 25 Mar, 2021 1 commit
-
-
Paul Fultz II authored
* Add eliminate_data_type pass * Formatting * Auto convert quant ops * Formatting * Flip the order of decompose * Compute max size differently * Formatting * Clamp values in convert * Formatting * Fix loss of precision in reduce * Formatting * Fix bugs in reduction * Fix accumulator type in reference softmax implementation * Formatting * Update convert test * Remove unused variables * Remove unnecessary quant_dot check * Formatting * Add tests * Formatting * Remove unused code * Remove duplicate ops * Remove blaze dependency * Use set since shape::type_t is no hashable on gcc 5 * Formatting * Add dnnl binary op * Formatting * Add binary and eltwise * Formatting * Add softmax * Formatting * Remove unused operators * Add missing files * Formatting * Add lrn * Formatting * Add deconvolution * Formatting * Change allocate default * Add reorder * Formatting * Add reductions * Formatting * Sort lines * Change literals in another loop * Add pow operator * Formatting * Add pow operator * Formatting * Make sure shapes are packed * Allow broadcasted inputs * Remove unused operators * Simplify functions * Remove softmax * Add sub and erf functions * Formatting * Fix bug * Formatting * Improve parallism * Formatting * Allow multiple batch dimensions * Formatting * Move literal transforms out of lowering * Formatting * Add gather operator * Sort lines * Add early exit for carry * Formatting * Add missing concat * Rename macro * Fix deep nesting * Formatting * Fix cppcheck issues * Remov else * Move attribute to typedef * Formatting * Disable maybe-uninitialized warning since its broken on gcc * Add constexpr default constructor * Formatting * Fix compiler warnings * Fix adjust_allocation test * Add layernorm matcher * Add gelu_erf matcher * Formatting * Add gelu_tanh matcher * Formatting * Remove match namespace * Formatting * Use matcher instead of string * Formatting * Add fusions * Formatting * Make input a const ref * Make this explicit for gcc 5 Co-authored-by:
Shucai Xiao <shucai@gmail.com> Co-authored-by:
mvermeulen <5479696+mvermeulen@users.noreply.github.com>
-
- 10 Mar, 2021 1 commit
-
-
Shucai Xiao authored
* fix the flag in rocblas api for int8 data type * used different flag for different rocblas versions * clang format Co-authored-by:mvermeulen <5479696+mvermeulen@users.noreply.github.com>
-
- 05 Mar, 2021 1 commit
-
-
kahmed10 authored
* fix relu6 * add more transposes * add multi output * formatting * add tests * formatting * fix tests * change to_nchw for outputs * add python api * fix cppcheck * remove variable * fix lambda * add multi_output test * add more tests and merge * fix help message * debugging work * fix valid op string * formatting * manual merge * mark function as const Co-authored-by:
mvermeulen <5479696+mvermeulen@users.noreply.github.com> Co-authored-by:
Shucai Xiao <shucai@gmail.com>
-
- 26 Feb, 2021 2 commits
-
-
Cagri Eryilmaz authored
* changes for not operator * changed name of the op from unary_not to not * Added tests for op and onnx parsing * reordering not_test in onnx_test.cpp * not operator -- gpu implementation * added bool test for not operator * Added test and missing links for not operator on GPU * typo fix * adding .onnx test files for not operator * formatting Co-authored-by:
Shucai Xiao <shucai@gmail.com> Co-authored-by:
mvermeulen <5479696+mvermeulen@users.noreply.github.com>
-
Paul Fultz II authored
* Add eliminate_data_type pass * Formatting * Auto convert quant ops * Formatting * Flip the order of decompose * Compute max size differently * Formatting * Clamp values in convert * Formatting * Fix loss of precision in reduce * Formatting * Fix bugs in reduction * Fix accumulator type in reference softmax implementation * Formatting * Update convert test * Remove unused variables * Remove unnecessary quant_dot check * Formatting * Add tests * Formatting * Remove unused code * Remove duplicate ops * Remove blaze dependency * Use set since shape::type_t is no hashable on gcc 5 * Formatting * Add dnnl binary op * Formatting * Add binary and eltwise * Formatting * Add softmax * Formatting * Remove unused operators * Add missing files * Formatting * Add lrn * Formatting * Add deconvolution * Formatting * Change allocate default * Add reorder * Formatting * Add reductions * Formatting * Sort lines * Change literals in another loop * Add pow operator * Formatting * Add pow operator * Formatting * Make sure shapes are packed * Allow broadcasted inputs * Remove unused operators * Simplify functions * Remove softmax * Add sub and erf functions * Formatting * Fix bug * Formatting * Improve parallism * Formatting * Allow multiple batch dimensions * Formatting * Move literal transforms out of lowering * Formatting * Add gather operator * Sort lines * Add early exit for carry * Formatting * Add missing concat * Rename macro * Fix deep nesting * Formatting * Fix cppcheck issues * Remov else * Move attribute to typedef * Formatting * Disable maybe-uninitialized warning since its broken on gcc * Add constexpr default constructor * Formatting * Fix compiler warnings * Fix adjust_allocation test Co-authored-by:
Shucai Xiao <shucai@gmail.com> Co-authored-by:
mvermeulen <5479696+mvermeulen@users.noreply.github.com>
-