- 22 Mar, 2022 1 commit
-
-
Shucai Xiao authored
-
- 21 Mar, 2022 1 commit
-
-
Charlie Lin authored
* LpNormalization ONNX parser
-
- 18 Mar, 2022 2 commits
-
-
turneram authored
Add exclusive and reverse modes to the GPU implementation of prefix_scan_sum, which completes support for the ONNX op CumSum
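
For readers unfamiliar with the two modes, here is a minimal 1-D reference sketch of CumSum semantics as defined by ONNX. It is plain Python for illustration only, not the GPU kernel from this commit; the function name `cumsum` and its keyword arguments are assumptions chosen for this example.

```python
def cumsum(values, exclusive=False, reverse=False):
    """Reference prefix sum over a 1-D list (illustrative, not the GPU code)."""
    # In reverse mode, scan from the last element toward the first.
    items = list(reversed(values)) if reverse else list(values)
    out = []
    running = 0
    for v in items:
        if exclusive:
            # Exclusive: emit the sum of the elements strictly before v.
            out.append(running)
            running += v
        else:
            # Inclusive: emit the sum up to and including v.
            running += v
            out.append(running)
    # Undo the reversal so outputs line up with the input positions.
    return list(reversed(out)) if reverse else out


print(cumsum([1, 2, 3]))                                # [1, 3, 6]
print(cumsum([1, 2, 3], exclusive=True))                # [0, 1, 3]
print(cumsum([1, 2, 3], reverse=True))                  # [6, 5, 3]
print(cumsum([1, 2, 3], exclusive=True, reverse=True))  # [5, 3, 0]
```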
-
Paul Fultz II authored
The get_context may change in the future (when we support multi-targets), so make this experimental for now.
-
- 15 Mar, 2022 2 commits
-
-
Umang Yadav authored
The API includes the following: create_module, get_main_module, add_instruction without module args, add_instruction with module args, add_parameter, and add_return.
-
Paul Fultz II authored
This adds iterators to tensor_view, which can allow kernels to work with non-standard shapes like for roialign. To improve the performance of indexing when using the iterators, the shape class was updated to use integral_constants since the compiler doesn't always fold the const values. An integral_constant will at least enforce that in the AST. Finally, since index calculations with single integers are improved, I also updated pointwise to use single index rather than multi index. There is about 4% improvement in some cases.
-
- 14 Mar, 2022 3 commits
-
-
Charlie Lin authored
Have git ignore the build directory and compiled Python files. Add ignores from other ROCm projects that look applicable. Ignore downloaded models in test/. Remove including Visual Studio settings.
-
Shucai Xiao authored
Change the max number of groups in a kernel to 1B for greater performance
-
Paul Fultz II authored
* Show the operator fields in the driver
-
- 11 Mar, 2022 1 commit
-
-
Shucai Xiao authored
The module::debug_print(ins) is very slow, which makes trace_eval==1/2 very slow. The reason is that printing an instruction involves searching the whole module to find the instruction and then printing it. This change fixes that by calling module::print() once to get the names of all instructions in a program, then printing each instruction by looking up its name in a hash map.
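
The idea behind this fix can be sketched in a few lines of Python. This is an illustration of the general technique only (replace a per-print linear scan with a name map built once), not the MIGraphX code; the `Instruction` class, the `@N` naming scheme, and both helper functions are hypothetical.

```python
class Instruction:
    """Hypothetical stand-in for an instruction in a module."""

    def __init__(self, op):
        self.op = op


def slow_name(instructions, ins):
    # Old approach: scan the whole module on every print, O(n) per call.
    for index, current in enumerate(instructions):
        if current is ins:
            return f"@{index}"
    raise KeyError("instruction not found in module")


def build_name_map(instructions):
    # New approach: walk the module once and remember every name.
    # Keyed by object identity, so each later lookup is O(1) on average.
    return {id(ins): f"@{index}" for index, ins in enumerate(instructions)}


module = [Instruction("add"), Instruction("mul"), Instruction("relu")]
names = build_name_map(module)

for ins in module:
    # Both approaches agree; only the lookup cost differs.
    assert names[id(ins)] == slow_name(module, ins)
    print(names[id(ins)], "=", ins.op)
```

When every instruction is printed during a trace, the scan-per-print approach costs time quadratic in the number of instructions, while building the map once keeps the total roughly linear.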
-
- 10 Mar, 2022 4 commits
-
-
Shucai Xiao authored
-
Shucai Xiao authored
-
Shucai Xiao authored
-
Shucai Xiao authored
-
- 09 Mar, 2022 4 commits
-
-
Charlie Lin authored
Add Celu ONNX operator
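
As a reminder of what the operator computes, ONNX defines Celu elementwise as max(0, x) + min(0, alpha * (exp(x / alpha) - 1)). A scalar Python sketch of that formula follows; it is illustrative only, and the function name `celu` is an assumption for this example rather than the parser code added in this commit.

```python
import math


def celu(x, alpha=1.0):
    """Scalar Celu per the ONNX definition (illustrative sketch)."""
    # The positive part passes through unchanged; the negative part
    # follows a scaled exponential that saturates at -alpha.
    return max(0.0, x) + min(0.0, alpha * (math.exp(x / alpha) - 1.0))


print(celu(2.0))   # positive inputs are returned as-is
print(celu(-1.0))  # negative inputs follow alpha * (exp(x / alpha) - 1)
```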
-
Paul Fultz II authored
Add python API to construct shape class
-
Shucai Xiao authored
-
kahmed10 authored
Add a callable C++ API to migraphx
-
- 08 Mar, 2022 6 commits
-
-
Shucai Xiao authored
-
Shucai Xiao authored
-
Charlie Lin authored
* Implement size ONNX operator and tests
-
Shucai Xiao authored
-
Shucai Xiao authored
-
Shucai Xiao authored
-
- 07 Mar, 2022 3 commits
-
-
Shucai Xiao authored
-
Shucai Xiao authored
-
Umang Yadav authored
add_common_op for parse_clip. Should fix #1119.
-
- 04 Mar, 2022 7 commits
-
-
Shucai Xiao authored
-
Shucai Xiao authored
-
Charlie Lin authored
Adds EyeLike ONNX parser and unit tests.
-
bpickrel authored
Changed the pooling values for two structures from strings to specialized enum classes. Many test and operator parsing changes to support this. Introduces one new source file, op_enums.cpp.
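
The motivation for replacing string values with enum classes can be illustrated with a small Python sketch. It is not the C++ code from this commit: `PoolingMode`, its members, and the two helper functions are hypothetical names chosen for the example.

```python
from enum import Enum


class PoolingMode(Enum):
    """Hypothetical enum replacing free-form mode strings."""

    AVERAGE = "average"
    MAX = "max"


def parse_pooling_mode(name):
    # Convert the string once, at the parsing boundary, so a typo
    # fails here instead of deep inside an operator.
    try:
        return PoolingMode(name)
    except ValueError:
        raise ValueError(f"unknown pooling mode: {name!r}") from None


def pool(window, mode):
    # Downstream code compares enum members, never raw strings.
    if mode is PoolingMode.MAX:
        return max(window)
    if mode is PoolingMode.AVERAGE:
        return sum(window) / len(window)
    raise TypeError("mode must be a PoolingMode")


mode = parse_pooling_mode("max")
print(pool([1, 5, 3], mode))
```

With strings, a misspelled mode can only be detected when it is compared at runtime; with an enum, invalid values are rejected when they are parsed, and in C++ an enum class additionally gives compile-time checking.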
-
Shucai Xiao authored
-
Shucai Xiao authored
-
Shucai Xiao authored
-
- 03 Mar, 2022 6 commits
-
-
Shucai Xiao authored
-
Shucai Xiao authored
-
Paul Fultz II authored
Boost the max number of workgroups for pointwise ops by matching what we are doing in launch.hpp
-
Shucai Xiao authored
-
kahmed10 authored
better performance doing it this way
-
turneram authored
Add onnx parser and ref and gpu implementations of ONNX op ScatterND
-