- 04 Aug, 2022 1 commit
-
-
Charlie Lin authored
* Dynamic shape handling in shape object
* rewrite empty lens multibroadcast test
* Shape class changes to handle dynamic
* More throw errors for functions that don't make sense for dynamic shape
* Print output changes
* Serialization changes
* Fixing serialization errors
* Remove const on dyn_dim copy getters
* Dynamic shape tests
* Fix serialize errors
* Add dyn_data struct to avoid ambiguous constructor
* Tidy fix: emplace_back() over for loop
* Tidy fix: use move
* Use std::initializer_list in constructor. Reverts the dyn_data struct change; should get around the ambiguous braced initialization list error
* avoid typedef
* element_space, min/max/opt _lens change (see the sketch after this list)
* formatting
* Comments fix
* dynamic bytes() test
* Serialize and reflect changes
* formatting
* Test the dynamic lens functions
* progress
* Formatting
* Dynamic conv draft progress
* Add operator<< tests for coverage
* Coverage update
* Add to conv dynamic batch test
* Dynamic image size test
* Dynamic weight handling
* Dyn image shape test change, fix dyn weight cond
* Comment update
* Dynamic weights shape test and fix
* Use ternary operator
* Tidy fixes
* Handle dynamic graph input shapes in ONNX parser
* Formatting
* Handle dynamic shape for convolution
* formatting
* cppcheck fixes
* Add onnx test files
* Fix typo
* Disable auto_pad for dynamic input shape
* check_shapes object checks for allowing dynamic shapes
* Fix any_of
* Change to maintain const objectness
* Formatting
* Check shapes allow dynamic
* Refactor compute_shape() call into op.compute(). Allows for per-operator differences in handling dynamic shape; fix operation.hpp change to use the generator
* Comment fix
* Refactor normalize_attributes() calls to use max_lens()
* Comment addition
* Update other normalize_attributes() calls
* Change to using constructor and add tests
* Use const member function
* Add more dynamic shape support
* Add tests for error code coverage
* Fix opt shape bug and add shape tests
* capture all by ref
* Fix typo with img shape calculation
* Add more tests
* dynamic auto pad attempt; linker error with pad_calc.cpp
* Fix parse dyn auto_pad. Should only need to use dynamic auto pad when the image shape or kernel shape is dynamic; for a dynamic batch size, the auto pad calculation is the same
* Fix linking error
* Fix auto_pad bug: fixed input tensor with auto_pad setting on
* auto_pad onnx tests
* Fix auto_pad calculation, evaluate in ref_conv, add ref_ops tests
* Add shape tests, fix bugs
* Refactor first two output dynamic len calculation
* Conv MLIR test update
* i64 MLIR test fix
* Fix MLIR test typo

Co-authored-by: Chris Austen <causten@users.noreply.github.com>
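For readers unfamiliar with the dynamic-shape terminology used in this list (dyn_dim, min/max/opt lens, element_space), here is a minimal, standalone sketch of how a dynamic dimension can be modeled. The struct and function names are assumptions made for illustration only and are not the actual MIGraphX types.

```cpp
// Illustrative sketch only: an approximation of a dynamic dimension with
// min/max/optimal values, as described in the commit message above.
#include <cstddef>
#include <iostream>
#include <vector>

struct dyn_dim
{
    std::size_t min;
    std::size_t max;
    std::size_t opt; // preferred value for optimization, 0 if unspecified

    bool is_fixed() const { return min == max; }
};

// Upper bound on the number of elements, computed from the max of each dimension.
std::size_t element_space(const std::vector<dyn_dim>& dims)
{
    std::size_t n = 1;
    for(const auto& d : dims)
        n *= d.max;
    return n;
}

int main()
{
    // e.g. a dynamic batch between 1 and 16 with fixed 3x224x224 images
    std::vector<dyn_dim> dims = {{1, 16, 8}, {3, 3, 3}, {224, 224, 224}, {224, 224, 224}};
    std::cout << "max element space: " << element_space(dims) << "\n";
}
```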
-
- 02 Aug, 2022 2 commits
-
-
Paul Fultz II authored
* Improve type printing in driver
* Improve the error message when a command is given in the incorrect order
* Add spell checking of arguments (see the sketch below)
* Add validations and required checking
* Add required arguments and groups
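Since the list mentions spell checking of arguments, here is a generic sketch of the usual approach: suggest the closest known option by edit distance. The names (edit_distance, suggest) and option strings are made up for illustration; this is not the driver's actual code.

```cpp
// Illustrative only: suggest the closest known flag via Levenshtein distance.
#include <algorithm>
#include <iostream>
#include <string>
#include <vector>

std::size_t edit_distance(const std::string& a, const std::string& b)
{
    std::vector<std::size_t> prev(b.size() + 1), cur(b.size() + 1);
    for(std::size_t j = 0; j <= b.size(); j++)
        prev[j] = j;
    for(std::size_t i = 1; i <= a.size(); i++)
    {
        cur[0] = i;
        for(std::size_t j = 1; j <= b.size(); j++)
        {
            std::size_t cost = (a[i - 1] == b[j - 1]) ? 0 : 1;
            cur[j] = std::min({prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + cost});
        }
        std::swap(prev, cur);
    }
    return prev[b.size()];
}

std::string suggest(const std::string& arg, const std::vector<std::string>& known)
{
    return *std::min_element(known.begin(), known.end(), [&](const auto& x, const auto& y) {
        return edit_distance(arg, x) < edit_distance(arg, y);
    });
}

int main()
{
    std::vector<std::string> known = {"--onnx", "--tf", "--fill1", "--optimize"};
    std::cout << "Unknown argument '--onxx'. Did you mean '" << suggest("--onxx", known) << "'?\n";
}
```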
-
jungpark-mlir authored
-
- 29 Jul, 2022 1 commit
-
-
Umang Yadav authored
Currently, when copying a host buffer to the device, the host buffer pointer is first registered/mapped into the device's address space. If the host buffer was allocated with hipHostMalloc, it is already implicitly registered in the device's address space and does not need to be registered again. This PR adds a check for that case.
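A minimal sketch of this kind of check, assuming the usual HIP behavior that hipPointerGetAttributes succeeds for pointers HIP already tracks (such as hipHostMalloc allocations). It is illustrative only, not the exact code added by this PR; the helper names are made up.

```cpp
// Only register a host pointer with the device if HIP does not already
// track it (e.g. it came from plain malloc rather than hipHostMalloc).
#include <cstddef>
#include <stdexcept>
#include <hip/hip_runtime.h>

bool already_registered(const void* ptr)
{
    hipPointerAttribute_t attr{};
    // hipPointerGetAttributes succeeds for pointers HIP already knows about,
    // such as hipHostMalloc allocations or previously registered buffers.
    return hipPointerGetAttributes(&attr, ptr) == hipSuccess;
}

void ensure_registered(void* host_ptr, std::size_t bytes)
{
    if(already_registered(host_ptr))
        return; // implicitly registered already, nothing to do
    if(hipHostRegister(host_ptr, bytes, hipHostRegisterDefault) != hipSuccess)
        throw std::runtime_error("failed to register host memory");
}
```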
-
- 27 Jul, 2022 2 commits
-
-
Ted Themistokleous authored
Gives better clarity as to which argument is throwing an error, especially in cases with nested IF statements in the network.
-
Umang Yadav authored
The instancenorm parser always creates a literal of type float, which fails the type check when creating binary ops if the model is fp16.
-
- 25 Jul, 2022 2 commits
-
-
Ted Themistokleous authored
* Add in changes for the ONNX Mod operator. Initial implementation of the operator and test cases for integer and floating-point based types. Floating-point types need to use fmod from the standard library; half_float::half is thankfully specced to use the existing std::fmod() call, judging by the half.hpp implementation. fmod_flag should mirror the ONNX fmod attribute; right now, using a floating-point type without setting that flag to true on the user side will result in an exception. Ref ticket #1283
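As a standalone illustration of the integer-vs-floating-point distinction described above (not the actual operator implementation, and ignoring ONNX's sign conventions): integral types can use the built-in % operator, while floating-point types fall back to std::fmod.

```cpp
// Illustrative only: % for integral types, std::fmod for floating point.
#include <cmath>
#include <iostream>
#include <type_traits>

template <class T>
T mod_value(T a, T b)
{
    if constexpr(std::is_integral<T>{})
        return a % b;
    else
        return std::fmod(a, b);
}

int main()
{
    std::cout << mod_value(7, 3) << "\n";     // 1
    std::cout << mod_value(7.5, 3.0) << "\n"; // 1.5
}
```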
-
varunsh authored
* Add is_supported to the target
* Add get_target_assignments
* Rename assignment to target_assignments
* Add ref target header to test
* Add fpga target
* Make context const in compute
-
- 22 Jul, 2022 1 commit
-
-
Umang Yadav authored
The C++ API was not printing the thrown exception string; this improves on that.
-
- 21 Jul, 2022 1 commit
-
-
Charlie Lin authored
Dynamic shape handling in shape object
-
- 19 Jul, 2022 3 commits
-
-
Umang Yadav authored
Bug 1: create_literal was using back_inserter to copy into a vector with an already-allocated size, doubling the size of the literal.
Fix 1: do not use back_inserter.
Bug 2: An input parameter to the model can come from an operation that has multiple outputs; in that case the name of the input parameter contains a ':', e.g. input_1:0.
Fix 2: Look for ':' and take the substring before it.
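A self-contained sketch reproducing Bug 1 and both fixes in plain C++ (illustrative only, not the create_literal code itself):

```cpp
// Bug 1: back_inserter appends on top of an already-sized vector, doubling it.
// Fix 1: copy to begin() (or construct from iterators) instead.
// Fix 2: strip the output index from parameter names like "input_1:0".
#include <algorithm>
#include <cassert>
#include <iterator>
#include <string>
#include <vector>

int main()
{
    std::vector<float> src = {1, 2, 3, 4};

    std::vector<float> buggy(src.size());
    std::copy(src.begin(), src.end(), std::back_inserter(buggy));
    assert(buggy.size() == 2 * src.size()); // double the intended size

    std::vector<float> fixed(src.size());
    std::copy(src.begin(), src.end(), fixed.begin());
    assert(fixed.size() == src.size());

    std::string name  = "input_1:0";
    std::string param = name.substr(0, name.find(':')); // "input_1"
    assert(param == "input_1");
}
```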
-
Charlie Lin authored
Depends on #1199. Adds ONNX parser functionality for dynamic input shapes, using the options parameter of parse_onnx().
-
Charlie Lin authored
Changes to operator includes: removed some includes that were not used and included argument.hpp where clang-tidy wanted it.
-
- 12 Jul, 2022 3 commits
-
-
Paul Fultz II authored
Reduce header inclusion in op headers
-
Paul Fultz II authored
This will ensure that migraphx.h can be included from a C compiler, and check that the C API can be called. This includes stdbool.h which is needed when using bool from C.
-
Paul Fultz II authored
-
- 11 Jul, 2022 2 commits
-
-
turneram authored
-
Paul Fultz II authored
* Only run __syncthreads when there is data to preload
* Improve loops
* Add const attribute to improve optimizations
-
- 08 Jul, 2022 4 commits
-
-
Paul Fultz II authored
Show the number of operators and the per-operator average time in the summary, e.g.:
Summary:
gpu::gemm: 8.738ms / 73 = 0.119699ms, 64%
gpu::triadd_layernorm: 0.831381ms / 24 = 0.0346409ms, 7%
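A rough sketch of how such a summary can be aggregated: per-operator total, count, average, and percent of overall time. The names and numbers are made up for illustration; this is not the actual perf-report code.

```cpp
// Aggregate per-instruction timings into a per-operator summary.
#include <iostream>
#include <map>
#include <string>
#include <vector>

int main()
{
    // (operator name, time in ms) for each timed instruction
    std::vector<std::pair<std::string, double>> samples = {
        {"gpu::gemm", 0.12}, {"gpu::gemm", 0.11}, {"gpu::triadd_layernorm", 0.03}};

    std::map<std::string, std::pair<double, int>> totals; // name -> (total ms, count)
    double overall = 0;
    for(const auto& [name, t] : samples)
    {
        totals[name].first += t;
        totals[name].second += 1;
        overall += t;
    }

    std::cout << "Summary:\n";
    for(const auto& [name, tc] : totals)
        std::cout << name << ": " << tc.first << "ms / " << tc.second << " = "
                  << tc.first / tc.second << "ms, " << (100 * tc.first / overall) << "%\n";
}
```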
-
Paul Fultz II authored
Improve the assembly dump to track where certain instructions come from.
-
varunsh authored
Added is_supported and get_target_assignments methods to the target and program, respectively, to eventually support multi-target compilation and execution.
-
Charlie Lin authored
Initial sketch for changes to shape to handle dynamic dimensions
-
- 07 Jul, 2022 1 commit
-
-
Paul Fultz II authored
Instead of always unsqueezing to an axis of size 1, a step can be set. For example, instead of unsqueezing {3, 12} to {3, 1, 12}, a step of 2 will unsqueeze it to {3, 2, 6}.
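A small sketch of the dimension arithmetic, assuming the semantics implied by the example (the inserted axis takes the value of the step and the following dimension is divided by it). It is illustrative only, not the operator's actual implementation.

```cpp
// Compute the unsqueezed dimensions for a given axis and step.
#include <cassert>
#include <cstddef>
#include <vector>

std::vector<std::size_t>
unsqueeze_dims(std::vector<std::size_t> dims, std::size_t axis, std::size_t step)
{
    assert(axis < dims.size());
    assert(dims[axis] % step == 0);
    dims[axis] /= step;                         // remaining length after the split
    dims.insert(dims.begin() + axis, step);     // inserted axis takes the step value
    return dims;
}

int main()
{
    // {3, 12} with step 1 -> {3, 1, 12}; with step 2 -> {3, 2, 6}
    assert((unsqueeze_dims({3, 12}, 1, 1) == std::vector<std::size_t>{3, 1, 12}));
    assert((unsqueeze_dims({3, 12}, 1, 2) == std::vector<std::size_t>{3, 2, 6}));
}
```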
-
- 06 Jul, 2022 1 commit
-
-
Paul Fultz II authored
* In the verification tests, check that saving and reloading the program yields the same program. This also fixes serialization to always load instructions in the same order. There are also fixes for deconv and quant_conv, which did not save the solution id and were broken for serialization.
-
- 05 Jul, 2022 2 commits
-
-
Paul Fultz II authored
* Add softmax kernel
-
Paul Fultz II authored
This reorders the transposes across slice to improve horizontal fusion for contiguous. It also improves eliminate_contiguous to better remove contiguous across splits.
-
- 03 Jul, 2022 1 commit
-
-
Paul Fultz II authored
* Add mlir c api
* Formatting
* Create a type attribute
* Formatting
* Parse module
* Formatting
* Add mlir dump function
* Add test case
* Formatting
* Fix tidy issues
* Update mlir version
* Update to newer mlir
* Format
* Move mlir to the gpu and update the test
* Formatting
* Fix bug when appending module
* Format
* Remove old cmake flag
* Update message
* Add return
* Format
* Add mlir_compile
* Format
* Register dialect
* Handle unsigned integers
* Don't provide output for return instruction
* Format
* Add code to insert memrefs
* Format
* Add mlir verification
* Formatting
* Enable pointwise_fusion
* Disable eliminate_data_type
* Set kernel name
* Format
* Fix device name
* Formatting
* Fix output arg
* Format
* Updates
* Update hash
* Add fuse_mlir pass
* Format
* Add fuse mlir
* Format
* Update mlir
* Sort parameter names
* Format
* Reenable disabled passes
* Remove old mlir conv
* Remove asym default padding
* Add more verbose tracing
* Format
* Fix compilation errors
* Format
* Whitelist operators
* Format
* Add namespace
* Format
* Update triple
* Format
* Use func dialect
* Format
* Use func.return
* Format
* Upgrade mlir version
* Add comment
* Handle symmetrical padding
* Format
* Cleanup debug output
* Format
* List failed tests
* Move mlir compile to jit pipeline
* Format
* Update version
* Add source locations
* Format
* Correctly add module
* Format
* Update failed tests
* Fix failures when mlir is disabled
* Format
* Update mlir version
* Check type for fp32
* Format
* Remove failed test
* Update mlir in driver
* Tidy fixes
* Format
* Tidy fixes
* Format
* Fix const
* Remove from requirements
* Fix cmake version
* Fix tidy warning
* Use another ifdef
* Fix tidy
* Other tidy fix
* Format
* Update hash
* Add missing license files
* Format
* Format
* Fix function name
-
- 30 Jun, 2022 1 commit
-
-
Paul Fultz II authored
This is an extension to insert_module_instructions, but instead of just inserting from a module, it can insert a range or a vector of instructions.
-
- 29 Jun, 2022 2 commits
-
-
Charlie Lin authored
Allows the PyTorch-converted version of SSD-resnet34 to work.
-
Paul Fultz II authored
Compiles significantly faster than constructing all the objects. It also reduces recompiles.
-
- 26 Jun, 2022 1 commit
-
-
Paul Fultz II authored
* Add function to get a module tree
* Get parent module in the pass manager
-
- 25 Jun, 2022 2 commits
-
-
Brian Pickrell authored
One-line fix to register the op miopen_fusion. This error was causing loading of compiled model files (*.mxr) to fail.
-
Paul Fultz II authored
* Jit contiguous
-
- 24 Jun, 2022 1 commit
-
-
Umang Yadav authored
Adds compute_method for the experimental custom ops, along with a test for it using HIP APIs. Depends on #1183. Solves #1101.
-
- 23 Jun, 2022 1 commit
-
-
kahmed10 authored
* remove eliminate workspace
* remove sync device and other tags
-
- 22 Jun, 2022 1 commit
-
-
Ted Themistokleous authored
Updated each source file in the repo with the existing license.
-
- 20 Jun, 2022 1 commit
-
-
Zhuoran Yin authored
* Fixing misspelled macro

Co-authored-by: Paul Fultz II <pfultz2@yahoo.com>
-
- 17 Jun, 2022 3 commits
-
-
Umang Yadav authored
* remove code for allocation of C param in dot lowering
* formatting

Co-authored-by: Paul Fultz II <pfultz2@yahoo.com>
-
Ted Themistokleous authored
* [#935] Update tf_parser to have add_common_op() for parse_relu6. Similar to the onnx_parser.cpp, add an add_common_op template and functionality to support clip-based operations. This is done so clip operations can be guaranteed to have the same dimensions.
* fixup! [#935] Update tf_parser to have add_common_op() for parse_relu6
* fixup! fixup! [#935] Update tf_parser to have add_common_op() for parse_relu6
* fixup! fixup! fixup! [#935] Update tf_parser to have add_common_op() for parse_relu6
* fixup! fixup! fixup! fixup! [#935] Update tf_parser to have add_common_op() for parse_relu6
* Formatting
* fixup! Formatting

Co-authored-by: Umang Yadav <29876643+umangyadav@users.noreply.github.com>
Co-authored-by: Paul Fultz II <pfultz2@yahoo.com>
-
kahmed10 authored
* add allocate op header
* formatting
* add replace_allocate pass
* formatting
* move output param to remove_allocate pass
* formatting
* fix bugs in replace_allocate pass
* formatting
* fix verify if tests
* formatting
* move if op logic
* formatting
* cleanup lowering
* cleanup lowering
* formatting
* fix tidy
* formatting
* fix tidy
* add cpu allocate check
* formatting
* change cpu allocate in pass
* formatting
* add some tests for replace_allocate pass
* formatting
* pass by ref
* fix run_pass
* formatting
* update variable name for module
* update dce to use contains() and fix tidy
* formatting
* update cppcheck
* add if test
* formatting
* add if test
* rename var to mod_output_names
* formatting
* remove conditional
* update allocate op and tests
* formatting
* update replace_allocate tests
* update create_output_names() and conditional in replace_allocate
* formatting
* remove extra variable in replace_allocate
* update tools script for allocation_model

Co-authored-by: Umang Yadav <29876643+umangyadav@users.noreply.github.com>
Co-authored-by: Chris Austen <causten@users.noreply.github.com>
Co-authored-by: Paul Fultz II <pfultz2@yahoo.com>
-