- 06 Apr, 2022 1 commit
-
-
Umang Yadav authored
Adds the following API bindings and tests to Python: add_return, add_instruction, add_parameter, create_module.
-
- 31 Mar, 2022 1 commit
-
-
Umang Yadav authored
Documentation update for valid targets
-
- 29 Mar, 2022 1 commit
-
-
Paul Fultz II authored
This adds the infrastructure to compile everything in parallel; previously only pointwise kernels were compiled in parallel. It will also integrate directly with lowering and the gpu-driver. The kernels for pointwise and roialign use this infrastructure. Scatternd does not, since it requires a standard shape. This also makes it easier to add new runtime-compiled kernels in the future.
-
- 28 Mar, 2022 2 commits
-
-
Paul Fultz II authored
Use ifdef instead of comment for the auto-generated method declarations for type erased classes (#1138). The comment formatting seems to be unreadable for larger methods, so instead just generate a struct with the methods in the interface and add a comment if a method is optional. This is wrapped in #ifdef TYPE_ERASED_DECLARATION (assuming this would never be defined) instead of #if 0, so most editors can still provide syntax highlighting (although I think vscode with clangd will still gray it out, unfortunately).
-
Paul Fultz II authored
* Use ccache for runtime compilation
-
- 25 Mar, 2022 1 commit
-
-
Paul Fultz II authored
* Handle string literal in construction
* Improve get_default with vector
-
- 24 Mar, 2022 1 commit
-
-
Paul Fultz II authored
This creates a custom op which has name() and compute_shape() methods.
-
- 22 Mar, 2022 1 commit
-
-
Paul Fultz II authored
For operators that use the arg.reshape() method, the lifetime will be extended.
-
- 21 Mar, 2022 1 commit
-
-
Charlie Lin authored
* LpNormalization ONNX parser
-
- 18 Mar, 2022 2 commits
-
-
turneram authored
Add exclusive and reverse modes to gpu implementation of prefix_scan_sum, which completes support for ONNX op CumSum
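The two modes can be illustrated with a small CPU reference. This is a minimal sketch of the ONNX CumSum semantics on a 1-D sequence, not the gpu implementation from this commit; the function name `prefix_scan_sum_ref` and its signature are hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical CPU reference for a 1-D prefix sum (not the MIGraphX gpu kernel).
// exclusive: each output excludes its own input element.
// reverse:   the scan runs from the last element towards the first.
std::vector<int> prefix_scan_sum_ref(const std::vector<int>& input, bool exclusive, bool reverse)
{
    std::vector<int> output(input.size());
    int running = 0;
    for(std::size_t i = 0; i < input.size(); i++)
    {
        // Visit elements back to front when reverse is set
        const std::size_t idx = reverse ? input.size() - 1 - i : i;
        if(exclusive)
        {
            // Write the sum of the elements visited so far, then accumulate
            output[idx] = running;
            running += input[idx];
        }
        else
        {
            // Accumulate first so the current element is included
            running += input[idx];
            output[idx] = running;
        }
    }
    return output;
}
```

For the input {1, 2, 3}, the default scan gives {1, 3, 6}, exclusive gives {0, 1, 3}, reverse gives {6, 5, 3}, and both together give {5, 3, 0}.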
-
Paul Fultz II authored
The get_context may change in the future (when we support multi-targets), so make this experimental for now.
-
- 15 Mar, 2022 2 commits
-
-
Umang Yadav authored
The API includes the following: create_module, get_main_module, add_instruction without module args, add_instruction with module args, add_parameter, add_return.
-
Paul Fultz II authored
This adds iterators to tensor_view, which allow kernels to work with non-standard shapes, as for roialign. To improve the performance of indexing when using the iterators, the shape class was updated to use integral_constants, since the compiler doesn't always fold the const values. An integral_constant will at least enforce that in the AST. Finally, since index calculations with single integers are improved, I also updated pointwise to use a single index rather than a multi-index. There is about a 4% improvement in some cases.
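The integral_constant idea can be sketched as follows. This is an illustration under assumptions, not the actual shape class from the commit: `index_constant` and `static_shape` are hypothetical names, and the layout is a plain row-major 2-D shape. Because the lengths are carried in the type, they are constants by construction rather than values the optimizer may or may not fold.

```cpp
#include <cstddef>
#include <type_traits>

// Hypothetical alias: a std::size_t value encoded in a type
template <std::size_t N>
using index_constant = std::integral_constant<std::size_t, N>;

// Hypothetical row-major 2-D shape whose lengths are part of the type
template <std::size_t Rows, std::size_t Cols>
struct static_shape
{
    constexpr index_constant<Rows> rows() const { return {}; }
    constexpr index_constant<Cols> cols() const { return {}; }

    // The element count is itself a compile-time constant
    constexpr index_constant<Rows * Cols> elements() const { return {}; }

    // Linear index of element (i, j); cols() converts implicitly to std::size_t
    constexpr std::size_t index(std::size_t i, std::size_t j) const { return i * cols() + j; }
};
```

With `static_shape<2, 3>`, `index(1, 2)` evaluates to 5, and `elements()` has the type `index_constant<6>`, so it can be used in a `static_assert`.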
-
- 14 Mar, 2022 2 commits
-
-
Shucai Xiao authored
change max number of groups in a kernel to 1B for greater performance
-
Paul Fultz II authored
* Show the operator fields in the driver
-
- 11 Mar, 2022 1 commit
-
-
Shucai Xiao authored
The module::debug_print(ins) is very slow, which makes trace_eval==1/2 very slow. The reason is that printing an ins involves searching the whole module to get the instruction and then printing it. This change fixes that by calling module::print() to get the names of all instructions of a program, and then printing each instruction by getting its name from a hash map.
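The lookup strategy described above can be sketched with toy types. This is a minimal illustration, assuming a module is a linked list of instructions that are named by position; `toy_instruction`, `toy_module`, `build_name_map` and `print_instruction` are hypothetical and are not the MIGraphX classes or functions.

```cpp
#include <cstddef>
#include <list>
#include <string>
#include <unordered_map>

// Hypothetical stand-ins for an instruction and a module
struct toy_instruction
{
    std::string op;
};
using toy_module = std::list<toy_instruction>;
using name_map   = std::unordered_map<const toy_instruction*, std::string>;

// One pass over the module records the name of every instruction
name_map build_name_map(const toy_module& m)
{
    name_map names;
    std::size_t i = 0;
    for(const auto& ins : m)
        names[&ins] = "@" + std::to_string(i++);
    return names;
}

// Printing is now an average constant-time hash lookup
// instead of a search through the whole module
std::string print_instruction(const name_map& names, const toy_instruction& ins)
{
    return names.at(&ins) + " = " + ins.op;
}
```

Building the map costs one pass over the module; after that, printing N instructions needs N lookups rather than N scans of the list.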
-
- 09 Mar, 2022 3 commits
-
-
Charlie Lin authored
Add Celu ONNX operator
-
Paul Fultz II authored
Add python API to construct shape class
-
kahmed10 authored
Add a callable C++ API to migraphx
-
- 08 Mar, 2022 1 commit
-
-
Charlie Lin authored
* Implement size ONNX operator and tests
-
- 07 Mar, 2022 1 commit
-
-
Umang Yadav authored
Use add_common_op for parse_clip. Should fix #1119.
-
- 04 Mar, 2022 2 commits
-
-
Charlie Lin authored
Adds EyeLike ONNX parser and unit tests.
-
bpickrel authored
Changed the pooling values for two structures from strings to specialized enum classes. Many test and operator parsing changes to support this. Introduces one new source file, op_enums.cpp.
-
- 03 Mar, 2022 3 commits
-
-
Paul Fultz II authored
Boost the max number of workgroups for pointwise ops by matching what we are doing in launch.hpp
-
kahmed10 authored
better performance doing it this way
-
turneram authored
Add onnx parser and ref and gpu implementations of ONNX op ScatterND
-
- 02 Mar, 2022 2 commits
-
-
Charlie Lin authored
Implements the IsNaN operator, ref, gpu, and onnx parser.
-
bpickrel authored
Update the base version of clang-format from 5.0 to 10.0
-
- 25 Feb, 2022 3 commits
-
-
Paul Fultz II authored
Add with_type to shape class
-
Paul Fultz II authored
Needed for custom_op so we can generically convert the C type back to the C++ type in the function pointer.
-
Paul Fultz II authored
Wrapped in an any_ptr class so the type can be checked at runtime for a mismatch.
-
- 24 Feb, 2022 1 commit
-
-
Paul Fultz II authored
* Make doc/CMakeLists.txt standalone
* Switch to use rocm-cmake modules for document generation
* Add CONFIGURE_DEPENDS to file(GLOB) so it will update without an explicit cmake run
* Add STRINGS property for build type to make it easier to switch build types with ccmake
* Various fixes and improvements
-
- 23 Feb, 2022 1 commit
-
-
Shucai Xiao authored
This PR resolves two problems in issue #999, i.e., non-standard-shape input to reshape and reduce_mean. Three fixes:
* Any operator that has a standard shape requirement gets a contiguous added for its input.
* In eliminate_contiguous, when computing whether a contiguous can be removed, use all the updated args, not just the one that is being checked.
* In two optimizations in simplify_reshape, remove contiguous from the reshaper name list, since eliminate_contiguous will remove the contiguous if it can be removed.
The solution is to add an attribute to each operator that requires a standard input shape; the auto_contiguous pass then adds a contiguous to every input of such operators.
-
- 16 Feb, 2022 2 commits
-
-
Umang Yadav authored
Support nonstandard shapes like slice, broadcast and transpose for the unsqueeze op
-
kahmed10 authored
-
- 11 Feb, 2022 1 commit
-
-
kahmed10 authored
* add submodule test
* remove for loop
* simplify reshape test
-
- 09 Feb, 2022 2 commits
-
-
Paul Fultz II authored
There is now a MIGRAPHX_DISABLE_POINTWISE_FUSION environment variable to disable pointwise fusion.
-
Umang Yadav authored
Support slice, broadcast and transpose shapes for the squeeze op.
-
- 08 Feb, 2022 2 commits
-
-
Paul Fultz II authored
This causes incorrect memory coloring, which was causing the accuracy failures in the vision model when enabling the pointwise fusions. Resnet50, inceptionv3 and inceptionv4 do verify now in the driver.
-
Paul Fultz II authored
Enforce types to avoid compilation error in pointwise fusions. This fixes a compile failure: gpt-2, fp16 on Navi.
-