- 05 Jul, 2022 6 commits
-
-
Ted Themistokleous authored
Use this call to also skip converts when running a simplify_algebra pass over a program.
-
Ted Themistokleous authored
-
Ted Themistokleous authored
-
Ted Themistokleous authored
Lets us avoid compiler warnings without resorting to the [[maybe_unused]] attribute.
-
Ted Themistokleous authored
Adds this to handle broadcasted values instead of just scalars
-
Ted Themistokleous authored
Used to avoid the case where a value such as 1e-12 is erroneously matched as zero, which resulted in the call being removed and replaced with the incorrect value.
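A minimal sketch of the pitfall this entry describes, written as standalone C++ rather than MIGraphX code. The helper names `is_zero_with_tolerance` and `is_exact_zero` and the tolerance `1e-9` are assumptions made for illustration; the log does not say how the actual matcher compares values or which fix was chosen.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: treats any value within `tol` of zero as zero.
// A tiny but meaningful literal such as 1e-12 is classified as zero here,
// which is the kind of false match the entry above describes.
inline bool is_zero_with_tolerance(double x, double tol = 1e-9)
{
    return std::fabs(x) <= tol;
}

// Hypothetical helper: only an exact 0.0 (or -0.0) counts as zero,
// so small nonzero literals are left alone.
inline bool is_exact_zero(double x)
{
    return x == 0.0;
}
```

With these helpers, `is_zero_with_tolerance(1e-12)` is true while `is_exact_zero(1e-12)` is false, so a rewrite guarded by the exact check would leave an expression such as `x + 1e-12` untouched.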
-
- 01 Jul, 2022 1 commit
-
-
Ted Themistokleous authored
-
- 30 Jun, 2022 16 commits
-
-
Ted Themistokleous authored
-
Ted Themistokleous authored
-
Ted Themistokleous authored
-
Ted Themistokleous authored
Throw an exception when this occurs to indicate that our simplification passes resulted in a singularity somewhere. Related to #1236.
-
Ted Themistokleous authored
Also simplify zero multiplication and divide operations. Added appropriate test cases with returns; the instruction and its operand are replaced so that the result is just zero.
-
Ted Themistokleous authored
Using the unit/neg unit matchers to handle subtraction operations in the same steps. Added unit tests for both cases.
-
Ted Themistokleous authored
-
Ted Themistokleous authored
Part of the changes that go with #1236. Reduces divide-by -1 operations to a simple negation of the parameter.
-
Ted Themistokleous authored
Add handling for zero addition operations into the find_unit_ops() matcher functor.
-
Ted Themistokleous authored
Added test case and code to simplify zero additions between parameters and literals during simplifications. In reference to issue #1236.
-
Ted Themistokleous authored
Simplifies our code for all operations and reuses the original unit tests for the overlapping matcher.
-
Ted Themistokleous authored
Done to satisfy the simplifications specified by #1236. Just replace every parameter divided by 1 with itself. It's assumed that the eliminate_identity() pass will handle generated identity operators in our run_pass().
-
Ted Themistokleous authored
Save a multiply operation by replacing it with a negation of the input parameter x. Suggested improvement via #1236.
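The rewrites described in the neighbouring 30 Jun entries for #1236 (x + 0, x / 1, x * -1, x / -1) can be sketched on a toy expression tree. This is an illustrative standalone example, not the MIGraphX matcher code: the `expr` type, the `node` builder, and the `simplify` function are all hypothetical names, and the sketch works on scalar literals only.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical miniature expression tree; not a MIGraphX type.
enum class kind { param, literal, neg, add, mul, div };

struct expr;
using expr_ptr = std::shared_ptr<expr>;

struct expr
{
    kind k;
    double value; // meaningful only when k == kind::literal
    expr_ptr lhs;
    expr_ptr rhs;
};

// Hypothetical builder for tree nodes.
inline expr_ptr node(kind k, double v = 0.0, expr_ptr l = nullptr, expr_ptr r = nullptr)
{
    return std::make_shared<expr>(expr{k, v, std::move(l), std::move(r)});
}

inline bool is_literal(const expr_ptr& e, double v)
{
    return e && e->k == kind::literal && e->value == v;
}

// Applies a single rewrite at the root and returns the (possibly new) root.
inline expr_ptr simplify(const expr_ptr& e)
{
    switch(e->k)
    {
    case kind::add:
        if(is_literal(e->rhs, 0.0)) return e->lhs; // x + 0 -> x
        if(is_literal(e->lhs, 0.0)) return e->rhs; // 0 + x -> x
        break;
    case kind::mul:
        if(is_literal(e->rhs, -1.0)) return node(kind::neg, 0.0, e->lhs); // x * -1 -> -x
        if(is_literal(e->lhs, -1.0)) return node(kind::neg, 0.0, e->rhs); // -1 * x -> -x
        break;
    case kind::div:
        if(is_literal(e->rhs, 1.0)) return e->lhs;                        // x / 1 -> x
        if(is_literal(e->rhs, -1.0)) return node(kind::neg, 0.0, e->lhs); // x / -1 -> -x
        break;
    default: break;
    }
    return e; // no rule applied
}
```

For example, given `auto x = node(kind::param);`, calling `simplify(node(kind::div, 0.0, x, node(kind::literal, 1.0)))` returns `x` itself, and the `mul` and `div` cases with a literal of -1 return a `neg` node whose operand is `x`.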
-
Ted Themistokleous authored
-
Ted Themistokleous authored
Commit for the day, work in progress as I'm failing one of our unit tests outside of the change
-
Paul Fultz II authored
This is an extension to insert_module_instructions, but instead of just inserting from a module, it can insert a range or a vector of instructions.
-
- 29 Jun, 2022 2 commits
-
-
Charlie Lin authored
Allows PyTorch converted version of SSD-resnet34 to work
-
Paul Fultz II authored
Compiles significantly faster than constructing all the objects. It also reduces recompiles.
-
- 26 Jun, 2022 1 commit
-
-
Paul Fultz II authored
* Add function to get a module tree * Get parent module in the pass manager
-
- 25 Jun, 2022 2 commits
-
-
Brian Pickrell authored
One-line fix to register the op miopen_fusion. This error was causing loading of compiled model files (*.mxr) to fail.
-
Paul Fultz II authored
* Jit contiguous
-
- 24 Jun, 2022 1 commit
-
-
Umang Yadav authored
Adds compute_method for the experimental custom ops. Adds a test for the same using HIP APIs. Depends on #1183. Solves #1101.
-
- 23 Jun, 2022 1 commit
-
-
kahmed10 authored
* remove eliminate workspace * remove sync device and other tags
-
- 22 Jun, 2022 1 commit
-
-
Ted Themistokleous authored
Updated each source file in the repo with the existing license.
-
- 20 Jun, 2022 1 commit
-
-
Zhuoran Yin authored
* Fixing misspelled macro Co-authored-by: Paul Fultz II <pfultz2@yahoo.com>
-
- 17 Jun, 2022 3 commits
-
-
Umang Yadav authored
* remove code for allocation of C param in dot lowering * formatting Co-authored-by:Paul Fultz II <pfultz2@yahoo.com>
-
Ted Themistokleous authored
* [#935] Update tf_parser to have add_common_op() for parse_relu6 Similar to that of the onnx_parser.cpp, add an add_common_op template and functionality to support clip-based operations. This is done so clip operations can be guaranteed to have the same dimensions. * fixup! [#935] Update tf_parser to have add_common_op() for parse_relu6 * fixup! fixup! [#935] Update tf_parser to have add_common_op() for parse_relu6 * fixup! fixup! fixup! [#935] Update tf_parser to have add_common_op() for parse_relu6 * fixup! fixup! fixup! fixup! [#935] Update tf_parser to have add_common_op() for parse_relu6 * Formatting * fixup! Formatting Co-authored-by:
Umang Yadav <29876643+umangyadav@users.noreply.github.com> Co-authored-by:
Paul Fultz II <pfultz2@yahoo.com>
-
kahmed10 authored
* add allocate op header * formatting * add replace_allocate pass * formatting * move output param to remove_allocate pass * formatting * fix bugs in replace_allocate pass * formatting * fix verify if tests * formatting * move if op logic * formatting * cleanup lowering * cleanup lowering * formatting * fix tidy * formatting * fix tidy * add cpu allocate check * formatting * change cpu allocate in pass * formatting * add some tests for replace_allocate pass * formatting * pass by ref * fix run_pass * formatting * update variable name for module * update dce to use contains() and fix tidy * formatting * update cppcheck * add if test * formatting * add if test * rename var to mod_output_names * formatting * remove conditional * update allocate op and tests * formatting * update replace_allocate tests * update create_output_names() and conditional in replace_allocate * formatting * remove extra variable in replace_allocate * update tools script for allocation_model Co-authored-by:
Umang Yadav <29876643+umangyadav@users.noreply.github.com> Co-authored-by:
Chris Austen <causten@users.noreply.github.com> Co-authored-by:
Paul Fultz II <pfultz2@yahoo.com>
-
- 16 Jun, 2022 1 commit
-
-
Charlie Lin authored
* Use custom distance function * Pass module, skip order check if other module * Change other valid() * Remove unnecessary declaration * test multiple module dependency * Refactor to make more clear * Code cleanup * Simplify fix * Test EXPECT Co-authored-by:Paul Fultz II <pfultz2@yahoo.com>
-
- 10 Jun, 2022 1 commit
-
-
Paul Fultz II authored
Consolidate the vectorize and preload Add vectorization to reduction Co-authored-by:kahmed10 <15948690+kahmed10@users.noreply.github.com>
-
- 07 Jun, 2022 1 commit
-
-
Zhuoran Yin authored
prioritizing int8 over int8x4 when it is applicable Amend return to continue in apply loop Adding error handling in case int8x4 compilation failed Co-authored-by:Paul Fultz II <pfultz2@yahoo.com>
-
- 03 Jun, 2022 1 commit
-
-
Paul Fultz II authored
Break up the gpu::code_object print to show the actual kernels... gpu::code_object::add_kernel: 0.646121ms, 5% gpu::code_object::mul_kernel: 0.623822ms, 5% gpu::code_object::add_mul_erf_add_mul_mul_kernel: 0.498902ms, 4% gpu::code_object::mul_add_kernel: 0.478352ms, 4%
-
- 02 Jun, 2022 1 commit
-
-
Paul Fultz II authored
-