- 28 Jul, 2022 1 commit
-
-
Ted Themistokleous authored
-
- 27 Jul, 2022 1 commit
-
-
Ted Themistokleous authored
-
- 25 Jul, 2022 1 commit
-
-
varunsh authored
* Add is_supported to the target
* Add get_target_assignments
* Rename assignment to target_assignments
* Add ref target header to test
* Add fpga target
* Make context const in compute
-
- 22 Jul, 2022 4 commits
-
-
Ted Themistokleous authored
-
Ted Themistokleous authored
-
Ted Themistokleous authored
-
Umang Yadav authored
The C++ API was not printing the string of a thrown exception. This improves on it.
-
- 21 Jul, 2022 2 commits
-
-
Ted Themistokleous authored
-
Charlie Lin authored
Dynamic shape handling in shape object
-
- 19 Jul, 2022 6 commits
-
-
Umang Yadav authored
Bug 1: create_literal was using back_inserter to copy into a vector that was already allocated to its full size, doubling the size of the literal.
Fix 1: Do not use back_inserter.
Bug 2: An input param to the model can come from an operation that has multiple outputs; in that case the name of the input param contains a ':', e.g. input_1:0.
Fix 2: Look for ':' and take the substring before it.
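The two bugs above can be reproduced in isolation. The following is a minimal sketch, not the actual create_literal code: the function names copy_with_back_inserter, copy_in_place and strip_output_index are hypothetical and exist only for illustration. It assumes the buggy code sized the destination vector first and then appended through std::back_inserter, and that the fix for the second bug keeps the part of the name before the first ':'.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <string>
#include <vector>

// Buggy pattern (hypothetical reconstruction): dst already holds src.size()
// elements, and back_inserter appends src.size() more after them.
std::vector<float> copy_with_back_inserter(const std::vector<float>& src)
{
    std::vector<float> dst(src.size());
    std::copy(src.begin(), src.end(), std::back_inserter(dst));
    return dst; // dst.size() == 2 * src.size()
}

// Fixed pattern: overwrite the elements that were already allocated.
std::vector<float> copy_in_place(const std::vector<float>& src)
{
    std::vector<float> dst(src.size());
    std::copy(src.begin(), src.end(), dst.begin());
    assert(dst.size() == src.size());
    return dst;
}

// Keep the part of a name before the first ':' (e.g. "input_1:0" -> "input_1").
// If there is no ':', find() returns npos and substr() returns the whole name.
std::string strip_output_index(const std::string& name)
{
    return name.substr(0, name.find(':'));
}
```

Calling copy_with_back_inserter on a three-element vector yields six elements (three value-initialized zeros followed by the copied data), while copy_in_place yields three.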
-
Ted Themistokleous authored
Doing this allows things to be universal across all our targets without needing an op for each ref. TODO: breaks right now. Reusing tests from the previous iteration with some tweaks. Need to get back to this once I have a better train of thought.
-
Ted Themistokleous authored
-
Charlie Lin authored
Depends on #1199. Adds ONNX parser functionality for dynamic input shapes. Uses the options parameter in parse_onnx().
-
Ted Themistokleous authored
-
Charlie Lin authored
Changes to operator includes: removed some includes that were not used, and included argument.hpp where clang-tidy wanted it.
-
- 18 Jul, 2022 1 commit
-
-
Ted Themistokleous authored
Conversion works, just issues with predicate right now.
-
- 12 Jul, 2022 3 commits
-
-
Paul Fultz II authored
Reduce header inclusion in op headers
-
Paul Fultz II authored
This will ensure that migraphx.h can be included from a C compiler, and check that the C API can be called. This includes stdbool.h which is needed when using bool from C.
-
Paul Fultz II authored
-
- 11 Jul, 2022 6 commits
-
-
turneram authored
-
Ted Themistokleous authored
I may need additional checks for this, or to somehow find the matching division by zero and cause a dangling reference so this gets flagged correctly at compile time. The current attempt inserts a divzero instruction that would later get picked up at the verify stage during compile. Not sure if this is correct in case we run into operator collisions down the road.
-
Ted Themistokleous authored
This reverts commit fcc84214.
-
Ted Themistokleous authored
-
Ted Themistokleous authored
This reverts commit 4aeacc17.
-
Paul Fultz II authored
* Only run __syncthreads when there is data to preload
* Improve loops
* Add const attribute to improve optimizations
-
- 08 Jul, 2022 5 commits
-
-
Paul Fultz II authored
Show the number of operators and per operator avg time in summary...
Summary:
gpu::gemm: 8.738ms / 73 = 0.119699ms, 64%
gpu::triadd_layernorm: 0.831381ms / 24 = 0.0346409ms, 7%
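The per-operator figure in the summary is the total time divided by the number of instances of that operator. A minimal sketch of that arithmetic follows; the struct op_summary and the function average_ms are hypothetical names for illustration and are not taken from the profiling code.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical record for one line of the summary.
struct op_summary
{
    std::string name;
    double total_ms;   // total time spent in this operator
    std::size_t count; // number of instances of this operator
};

// Average time per instance, i.e. the "total / count" value in the summary.
double average_ms(const op_summary& s)
{
    assert(s.count > 0);
    return s.total_ms / static_cast<double>(s.count);
}
```

With the numbers from the summary, 8.738 / 73 gives roughly 0.119699 and 0.831381 / 24 gives roughly 0.0346409.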
-
Ted Themistokleous authored
-
Paul Fultz II authored
Improve the assembly dump to track where certain instructions come from.
-
varunsh authored
Added is_supported and get_target_assignments methods to the target and program, respectively, to eventually support multi-target compilation and execution.
-
Charlie Lin authored
Initial sketch for changes to shape to handle dynamic dimensions
-
- 07 Jul, 2022 1 commit
-
-
Paul Fultz II authored
Instead of always unsqueezing to an axis of size 1, a step can be set and used as the size of the new axis. So instead of unsqueezing {3, 12} to {3, 1, 12}, a step of 2 will unsqueeze to {3, 2, 6}.
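The shape arithmetic described above can be sketched as follows. This is an illustration only, not the operator's implementation: the function unsqueeze_with_step is a hypothetical name, and it assumes the new axis takes the step as its size while the dimension that previously sat at that position is divided by the step.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Insert a new axis of size `step` at position `axis` and divide the
// dimension that used to occupy that position by `step`.
// A step of 1 reproduces the plain unsqueeze.
std::vector<std::size_t>
unsqueeze_with_step(std::vector<std::size_t> dims, std::size_t axis, std::size_t step)
{
    assert(axis < dims.size());
    assert(step > 0 && dims[axis] % step == 0);
    dims[axis] /= step;
    dims.insert(dims.begin() + static_cast<std::ptrdiff_t>(axis), step);
    return dims;
}
```

For {3, 12} with axis 1, a step of 1 gives {3, 1, 12} and a step of 2 gives {3, 2, 6}, matching the example in the commit message.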
-
- 06 Jul, 2022 1 commit
-
-
Paul Fultz II authored
In the verification tests, check that saving and reloading the program produces the same program. This also fixes serialization to always load instructions in the same order. There are also fixes for deconv and quant_conv, which didn't save the solution id and were broken for serialization.
-
- 05 Jul, 2022 8 commits
-
-
Paul Fultz II authored
* Add softmax kernel
-
Ted Themistokleous authored
Use this call to also skip converts when running a simplify_algebra pass over a program.
-
Ted Themistokleous authored
-
Ted Themistokleous authored
-
Paul Fultz II authored
This reorders the transposes across slice to improve horizontal fusion for contiguous. This also improves eliminate_contiguous to remove contiguous better across splits.
-
Ted Themistokleous authored
Allows us to avoid throwing warnings by using the [[maybe_unused]] attribute instead.
-
Ted Themistokleous authored
Adds this to handle broadcasted values instead of just scalars
-
Ted Themistokleous authored
Used to avoid the case where 1e-12 is used and is erroneously matched as zero, resulting in the call being removed with the incorrect value.
-