- 19 Jul, 2022 2 commits
-
-
Ted Themistokleous authored
Doing this allows things to be universal across all our targets without needing an op for each ref. TODO: this breaks right now. Reusing tests from the previous iteration with some tweaks. Need to get back to this once I have a better train of thought.
-
Ted Themistokleous authored
-
- 18 Jul, 2022 1 commit
-
-
Ted Themistokleous authored
Conversion works, just issues with predicate right now.
-
- 11 Jul, 2022 2 commits
-
-
Ted Themistokleous authored
I may need additional checks for this, or to somehow find the matching division by zero and cause a dangling reference so this gets flagged correctly at compile time. The current attempt inserts a divzero instruction that would later get picked up at the verify stage during compile. Not sure if this is correct in case we run into operator collisions down the road.
-
Ted Themistokleous authored
This reverts commit fcc84214.
-
- 08 Jul, 2022 2 commits
-
-
Ted Themistokleous authored
-
Ted Themistokleous authored
-
- 05 Jul, 2022 1 commit
-
-
Ted Themistokleous authored
Making things more maintainable by splitting the unit tests that shared the baseline program for validation, which resulted in an & case that was a bit more cumbersome to debug.
-
- 30 Jun, 2022 12 commits
-
-
Ted Themistokleous authored
-
Ted Themistokleous authored
-
Ted Themistokleous authored
Throw an exception when this occurs to indicate our simplification passes resulted in a singularity somewhere. Related to #1236.
-
Ted Themistokleous authored
Simplify zero addition, multiplication, and divide operations. Added appropriate test cases with returns, replacing the instruction and operand to just return zero.
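The zero-operand folding and the exception on a singularity described in these entries can be sketched on a toy expression tree. This is only an illustration: the tuple-based IR, `is_zero`, and `simplify_zero` are hypothetical names, not the project's matcher API, and raising `ZeroDivisionError` is an assumed stand-in for whatever exception the pass actually throws.

```python
# Hypothetical mini-IR (not the project's): an expression is a number,
# a parameter name (str), or a tuple (op, lhs, rhs) with op in
# {"add", "mul", "div"}.


def is_zero(e):
    """True only for a literal zero, never for a parameter or sub-tree."""
    return isinstance(e, (int, float)) and e == 0


def simplify_zero(expr):
    """Fold zero operands bottom-up; raise if a divide by literal zero remains."""
    if not isinstance(expr, tuple):
        return expr
    op, lhs, rhs = expr
    lhs, rhs = simplify_zero(lhs), simplify_zero(rhs)
    if op == "div" and is_zero(rhs):
        # Assumed behaviour: report the singularity instead of emitting inf/nan.
        raise ZeroDivisionError("simplification produced a division by zero")
    if op == "add":
        if is_zero(lhs):
            return rhs  # 0 + x -> x
        if is_zero(rhs):
            return lhs  # x + 0 -> x
    if op == "mul" and (is_zero(lhs) or is_zero(rhs)):
        return 0  # x * 0 -> 0
    if op == "div" and is_zero(lhs):
        return 0  # 0 / x -> 0 (the x == 0 case was rejected above)
    return (op, lhs, rhs)
```

Because the tree is simplified bottom-up, a denominator that only becomes zero after folding (for example `y * 0`) is still caught.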
-
Ted Themistokleous authored
Using the unit/neg unit matchers to handle subtraction operations in the same steps. Added unit tests for both cases.
-
Ted Themistokleous authored
Part of the changes that go with #1236. Reduces divide-by-negative-one operations to a simple negation of the parameter.
-
Ted Themistokleous authored
Added test case and code to simplify zero additions between parameters and literals during simplifications. In reference to issue #1236.
-
Ted Themistokleous authored
Done to satisfy simplifications specified by #1236. Just replace every parameter divided by 1 with itself. It's assumed that the eliminate_identity() pass will handle generated identity operators in our run_pass().
-
Ted Themistokleous authored
Save a multiply operation by replacing it with a negation of the input parameter x. Suggested improvement via #1236.
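The unit and negative-unit rewrites from these entries (multiply or divide by 1 returns the operand, multiply or divide by -1 becomes a negation) can be sketched as follows. The tuple representation and the name `simplify_unit` are hypothetical and only stand in for the real matchers.

```python
# Hypothetical mini-IR: a number, a parameter name (str),
# a unary tuple ("neg", operand), or a binary tuple (op, lhs, rhs).


def simplify_unit(expr):
    """Rewrite x*1, 1*x, x/1 to x and x*-1, -1*x, x/-1 to ("neg", x)."""
    if not isinstance(expr, tuple):
        return expr
    if len(expr) == 2:  # unary node such as ("neg", x)
        return (expr[0], simplify_unit(expr[1]))
    op, lhs, rhs = expr
    lhs, rhs = simplify_unit(lhs), simplify_unit(rhs)
    if op == "mul":
        # Multiplication commutes, so check the literal on either side.
        for literal, other in ((lhs, rhs), (rhs, lhs)):
            if literal == 1:
                return other
            if literal == -1:
                return ("neg", other)
    if op == "div":
        # Only the denominator matters here: 1/x is not simplified.
        if rhs == 1:
            return lhs
        if rhs == -1:
            return ("neg", lhs)
    return (op, lhs, rhs)
```

For example, `simplify_unit(("mul", -1, "x"))` yields `("neg", "x")`, trading a multiply for a negation.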
-
Ted Themistokleous authored
The original use case of having a literal 1, instead of any other number, in simplify_mul_add caused the find_unit_mult_const function to optimize away the literal 1, making this test fail on the final check. Switched the constant to a value other than zero or one, and the test now passes.
-
Ted Themistokleous authored
Commit for the day, work in progress as I'm failing one of our unit tests outside of the change
-
Paul Fultz II authored
This is an extension to insert_module_instructions, but instead of just inserting from a module, it can insert a range or a vector of instructions.
-
- 29 Jun, 2022 1 commit
-
-
Charlie Lin authored
Allows PyTorch converted version of SSD-resnet34 to work
-
- 25 Jun, 2022 1 commit
-
-
Paul Fultz II authored
* Jit contiguous
-
- 24 Jun, 2022 2 commits
-
-
Ted Themistokleous authored
Used to determine which files contain a license and are stamped. If any are not, we exit and return an error code that can later be ingested by another script, along with a list of the outstanding files in question. The list of files we should or should not support with licenses is currently baked in, as well as some entries to quickly ignore.
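A check of this kind might look roughly like the sketch below. Everything specific in it is an assumption for illustration: the marker string, the suffix list, the ignored directories, and the function names `find_unstamped` and `main` are not taken from the actual script.

```python
from pathlib import Path

# All three constants are assumptions, standing in for the baked-in lists.
LICENSE_MARKER = "The MIT License (MIT)"
CHECKED_SUFFIXES = {".cpp", ".hpp", ".py", ".cmake"}
IGNORED_DIRS = {".git", "build"}


def find_unstamped(root):
    """Return the checked files under root whose header lacks the marker."""
    missing = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in CHECKED_SUFFIXES:
            continue
        # Skip anything that lives inside an ignored directory.
        if IGNORED_DIRS.intersection(path.relative_to(root).parts):
            continue
        # Only the start of the file is inspected, where a stamp would sit.
        head = path.read_text(errors="ignore")[:2048]
        if LICENSE_MARKER not in head:
            missing.append(path)
    return missing


def main(root="."):
    """Print outstanding files and return a non-zero code if any exist."""
    missing = find_unstamped(root)
    for path in missing:
        print(path)
    return 1 if missing else 0
```

A wrapper script could pass the return value of `main()` to `sys.exit` so that a calling script or CI job can react to the error code.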
-
Umang Yadav authored
Adds compute_method for the experimental custom ops. Adds a test for the same using HIP APIs. Depends on #1183. Solves #1101.
-
- 22 Jun, 2022 1 commit
-
-
Ted Themistokleous authored
Updated each source file in the repo with the existing license.
-
- 17 Jun, 2022 2 commits
-
-
Ted Themistokleous authored
* [#935] Update tf_parser to have add_common_op() for parse_relu6
  Similar to onnx_parser.cpp, add an add_common_op template and functionality to support clip-based operations. This is done so clip operations can be guaranteed to have the same dimensions.
* fixup! [#935] Update tf_parser to have add_common_op() for parse_relu6
* fixup! fixup! [#935] Update tf_parser to have add_common_op() for parse_relu6
* fixup! fixup! fixup! [#935] Update tf_parser to have add_common_op() for parse_relu6
* fixup! fixup! fixup! fixup! [#935] Update tf_parser to have add_common_op() for parse_relu6
* Formatting
* fixup! Formatting
Co-authored-by: Umang Yadav <29876643+umangyadav@users.noreply.github.com>
Co-authored-by: Paul Fultz II <pfultz2@yahoo.com>
-
kahmed10 authored
* add allocate op header
* formatting
* add replace_allocate pass
* formatting
* move output param to remove_allocate pass
* formatting
* fix bugs in replace_allocate pass
* formatting
* fix verify if tests
* formatting
* move if op logic
* formatting
* cleanup lowering
* cleanup lowering
* formatting
* fix tidy
* formatting
* fix tidy
* add cpu allocate check
* formatting
* change cpu allocate in pass
* formatting
* add some tests for replace_allocate pass
* formatting
* pass by ref
* fix run_pass
* formatting
* update variable name for module
* update dce to use contains() and fix tidy
* formatting
* update cppcheck
* add if test
* formatting
* add if test
* rename var to mod_output_names
* formatting
* remove conditional
* update allocate op and tests
* formatting
* update replace_allocate tests
* update create_output_names() and conditional in replace_allocate
* formatting
* remove extra variable in replace_allocate
* update tools script for allocation_model
Co-authored-by: Umang Yadav <29876643+umangyadav@users.noreply.github.com>
Co-authored-by: Chris Austen <causten@users.noreply.github.com>
Co-authored-by: Paul Fultz II <pfultz2@yahoo.com>
-
- 16 Jun, 2022 1 commit
-
-
Charlie Lin authored
* Use custom distance function
* Pass module, skip order check if other module
* Change other valid()
* Remove unnecessary declaration
* test multiple module dependency
* Refactor to make more clear
* Code cleanup
* Simplify fix
* Test EXPECT
Co-authored-by: Paul Fultz II <pfultz2@yahoo.com>
-
- 07 Jun, 2022 1 commit
-
-
Zhuoran Yin authored
Prioritizing int8 over int8x4 when it is applicable.
Amend return to continue in apply loop.
Adding error handling in case int8x4 compilation failed.
Co-authored-by: Paul Fultz II <pfultz2@yahoo.com>
-
- 02 Jun, 2022 1 commit
-
-
Paul Fultz II authored
-
- 26 May, 2022 1 commit
-
-
Paul Fultz II authored
* Upgrade to cppcheck 2.8
-
- 24 May, 2022 2 commits
-
-
Paul Fultz II authored
* Improve applicable batched gemms for bert
-
shivadbhavsar authored
As described in #1196, the ONNX mean parser does not work correctly for integral types. This update fixes the issue by handling integral types separately, where summation is performed before division. Additional test cases have also been added for handling integral types.
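Why the order of summation and division matters for integral types can be seen in a small sketch. It assumes that the general path divides each input by the input count before accumulating, which the entry implies but does not spell out; both function names are hypothetical, and nested lists stand in for tensors.

```python
def mean_divide_first(inputs):
    """Element-wise mean where each input is divided by n before summing."""
    n = len(inputs)
    # Integer division discards each remainder, so the errors accumulate.
    return [sum(x // n for x in column) for column in zip(*inputs)]


def mean_sum_first(inputs):
    """Element-wise mean where inputs are summed first, then divided once."""
    n = len(inputs)
    return [sum(column) // n for column in zip(*inputs)]


# Three integer "tensors" of two elements each (non-negative, so Python's
# floor division matches C++ truncating division here).
tensors = [[1, 2], [2, 3], [3, 4]]
print(mean_divide_first(tensors))  # [1, 2]
print(mean_sum_first(tensors))     # [2, 3]
```

The true means are 2 and 3; dividing first loses a fractional part per input, while summing first truncates only once.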
-
- 11 May, 2022 1 commit
-
-
Paul Fultz II authored
Fuse layernorm and added triadd_layernorm fusion. This is a prep performance booster
-
- 10 May, 2022 1 commit
-
-
Umang Yadav authored
Expose add_literal method in C/C++ api
-
- 06 May, 2022 1 commit
-
-
Paul Fultz II authored
Add compile tests for gpu math functions
-
- 03 May, 2022 1 commit
-
-
Paul Fultz II authored
Helps avoid dangling references. This also deprecates the constructors that didn't take a lifetime annotation, since the lifetime is ambiguous.
-
- 29 Apr, 2022 1 commit
-
-
turneram authored
Add ref and gpu implementations for ONNX op GatherND. Resolves #1032.
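For readers unfamiliar with the operator, here is a minimal sketch of GatherND semantics for the simplest case: `batch_dims=0` and a two-dimensional index tensor, on nested lists. It is not the ref or gpu implementation from this commit, and `gather_nd` is a hypothetical name.

```python
def gather_nd(data, indices):
    """For each index tuple, walk into data one dimension per entry.

    If an index tuple is shorter than the rank of data, the result for
    that tuple is the remaining slice rather than a scalar.
    """
    out = []
    for index in indices:
        item = data
        for i in index:
            item = item[i]
        out.append(item)
    return out


data = [[0, 1], [2, 3]]
print(gather_nd(data, [[0, 0], [1, 1]]))  # [0, 3]
print(gather_nd(data, [[1], [0]]))        # [[2, 3], [0, 1]]
```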
-
- 26 Apr, 2022 1 commit
-
-
Umang Yadav authored
* expose get_queue method
-
- 23 Apr, 2022 1 commit
-
-
Charlie Lin authored
Implements the ReverseSequence ONNX operator as a parser. This parser can only handle a constant sequence_lens input. This is the same as what is handled for TensorRT as far as I can tell. We could handle a variable sequence_lens input; that would require ref and GPU implementations of the operator. The ONNX backend tests are disabled because this does not handle variable sequence_lens.
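The operator's behaviour with a constant `sequence_lens` can be sketched as below. This is an illustration only: it fixes the layout to `[batch][time]` on nested lists, and `reverse_sequence` is a hypothetical name rather than the parser code.

```python
def reverse_sequence(batch, sequence_lens):
    """Reverse the first sequence_lens[i] time steps of row i; keep the rest."""
    return [row[:n][::-1] + row[n:] for row, n in zip(batch, sequence_lens)]


batch = [[0, 1, 2, 3],
         [4, 5, 6, 7]]
print(reverse_sequence(batch, [2, 4]))  # [[1, 0, 2, 3], [7, 6, 5, 4]]
```

Because the lengths are known when the model is parsed, the slice boundaries are fixed up front, which is what lets a parser-only approach avoid dedicated ref and GPU kernels.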
-