"src/tf/parse_biasadd.cpp" did not exist on "3d24a21cfc088e8530fac5648651f29434c6855e"
- 30 Nov, 2021 1 commit
-
-
Paul Fultz II authored
-
- 25 Nov, 2021 1 commit
-
-
Shucai Xiao authored
Resolves a problem in parsing the ssd-10 model. The problem is, after inserting contiguous in the auto_contiguous pass, standard output shape of some operators becomes non-standard. Then, if the next operator requires standard input shape, an exception is throw. For example, if we pass the following model: Input (standard shape) -> transpose (transposed) -> softmax (transposed) -> transpose (standard) -> gather. It works fine, and no contiguous is required. In the auto_contiguous pass, a contiguous is inserted after the first transpose. Then we need to replace the first transpose with the contiguous and recompute all shapes. When it comes to the gather operator, its input is a transposed shape, and an exception is thrown. The solution is in the recompute_shape() function. If it is called by the auto_contiguous pass and shape of an instruction is changed, and the shape is non_standard, we do not recompute shape of its output. The reason is: since its output shape is non_standard, a contiguous op will be added after the instruction, which will recompute shape for later operators.
-
- 24 Nov, 2021 1 commit
-
-
Paul Fultz II authored
* Check jit kernels files with clang-tidy
-
- 22 Nov, 2021 1 commit
-
-
kahmed10 authored
Allows --fp16 to be used in the driver to compare the target fp16 result and the ref fp32 result.
-
- 18 Nov, 2021 1 commit
-
-
Paul Fultz II authored
Do compilation in parallel
-
- 17 Nov, 2021 1 commit
-
-
Paul Fultz II authored
Currently, eliminate_contiguous will never remove contiguous for operators that use module inputs due to the fact that it doesn't pass the module inputs to compute_shape. - Update to pass the module inputs correctly to compute_shape - Fix the overloads of compute_shape so that when passed an empty vector of module inputs it will call the overload without module inputs - Add tests with contiguous and pointwise module function. - Move add_pointwise function to a seperate header to reuse across different tests
-
- 15 Nov, 2021 1 commit
-
-
kahmed10 authored
Currently we have the option of passing in --batch to the driver to change the batch size when the model has a dynamic dim value. We can use this flag to adjust the perf report's rate.
-
- 11 Nov, 2021 1 commit
-
-
Paul Fultz II authored
This enables the pointwise fusions using the MIGRAPHX_ENABLE_POINTWISE_FUSION env variable. Its disabled by default since MIOpen fusions need to be refactored. This also adds a compile_ops pass to compile the pointwise modules. All tests except test_gpu_fast_math passes with MIGRAPHX_ENABLE_POINTWISE_FUSION=1 set.
-
- 09 Nov, 2021 1 commit
-
-
turneram authored
* Add workaround for devices that do not support miopen conv fusions
-
- 05 Nov, 2021 1 commit
-
-
kahmed10 authored
Moving our Docker file from ROCm 4.3 to 4.5 Add Navi base GPUs in to the CI infrastructure
-
- 03 Nov, 2021 1 commit
-
-
Umang Yadav authored
In migraphx, DepthToSpace (d2s) is implemented as reshape --> transpose --> contiguous --> reshape. If there is trailing binary pointwise operator after depthToSpace then, migraphx can move binary operator before contiguous and reshape of the depthtospce. So, it becomes reshape-->transpose-->binary_op-->contiguous-->reshape. Explicit contiguous wouldn't be required since binary_op outputs standard shape. So, it becomes reshape-->transpose-->binary-->reshape. simplify_reshapes already has matcher that can do this transformation. This PR adds test for cases like depthtospace +binary op. solves #905
-
- 28 Oct, 2021 3 commits
-
-
Shucai Xiao authored
This PR is the ref implementation of the nonmaxsuppression operator. It always returns the max possible output shape, which is the problem tracked in issue #948.
-
Umang Yadav authored
In migraphx, DepthToSpace (d2s) is implemented as reshape --> transpose --> contiguous --> reshape. This PR adds matcher to find d2s + unary pointwise ops. Application of the matcher moves the pointwise unary operation before the contiguous and reshape of the d2s. So it becomes reshape --> transpose --> unary --> contiguous --> reshape. Motivation is that, later pointwise module would be created out of unary --> contiguous --> reshape. Codegen for this pointwise module can write out buffer such that explicit contiguous and reshape wouldn't be required. This transformation is not always guaranteed to improve performance, since unary op will operate on non-standard shape. So, we would need some tuning mechanism to make decision. #905 pending PR for binary operations.
-
Shucai Xiao authored
GPU implementation of the roialign operator, using the jit approach to reduce the lib size.
-
- 20 Oct, 2021 1 commit
-
-
Shucai Xiao authored
Implementation of the roialign operator. For now, we have only the ref implementation. When we run a model on the GPU, we fall back the execution to use the ref implementation.
-
- 19 Oct, 2021 2 commits
-
-
Paul Fultz II authored
pthread linking errors on SLES.
-
Paul Fultz II authored
Adds a pass to fuse pointwise operators into one "pointwsie" op that has a submodule which does the calculation.
-
- 18 Oct, 2021 2 commits
-
-
Paul Fultz II authored
Designed to allow a user to format the values needed for the json_string: migraphx::operation("reduce_mean", "{axes : [%i, %i, %i, %i]}", axes[0], axes[1], axes[2], axes[3]) instead of needing to use string concat or stringstream -
Paul Fultz II authored
Enable a cppcheck rule to catch these redundant casts in the future
-
- 15 Oct, 2021 1 commit
-
-
Cagri authored
Added features: This enables wrapping each migraphx operator with rocTX markers. It adds new knob trace to migraphx-driver binary. Limitation: rocTX standalone does not output a file, it needs to be used with rocprof. Example command line: /opt/rocm/bin/rocprof -i ./in.txt --hip-trace --roctx-trace --flush-rate 10ms --timestamp on -d cagri_out --obj-tracking on /opt/rocm/bin/migraphx-driver trace ./resnet50-v2-7.onnx --onnx --gpu Co-authored-by:Shucai Xiao <shucai@gmail.com>
-
- 14 Oct, 2021 1 commit
-
-
Umang Yadav authored
Inverse of DepthToSpace op Co-authored-by:Shucai Xiao <shucai@gmail.com>
-
- 13 Oct, 2021 2 commits
-
-
Shucai Xiao authored
when running a model on GPU, migraphx tries to print out content from gpu memory, which causes a segfault. The solution is to copy the gpu memory content back to CPU before the print.
-
Paul Fultz II authored
-
- 08 Oct, 2021 2 commits
-
-
Shucai Xiao authored
This PR is for the nonzero operator with static output shape. Co-authored-by:
Paul Fultz II <pfultz2@yahoo.com> Co-authored-by:
mvermeulen <5479696+mvermeulen@users.noreply.github.com>
-
Umang Yadav authored
Previously dot operator was defined as C = alpha * A . B + beta * C where * is scalar multiplication and . is dot product or matrix multiplication depending on dimension of the inputs. Aim is to have the definition of dot operator as C = A . B without having alpha or beta. In order to achieve the same effect as alpha and beta (1) it multiplies the one of the inputs to the dot operator with alpha value. (2) if beta is present then, multiplies the C with beta and then adds into the output from step 1.
-
- 01 Oct, 2021 2 commits
-
-
turneram authored
Add multinomial op to onnx parser with ref and GPU implementations. The onnx parser inserts a literal of shape {batch_size, sample_size} with random values in the range [0, 1) and inserts existing ops to compute the cumulative density function. The multinomial operator multiplies the random values by the sum of the CDF and returns the index of the first element of the CDF that is greater than the result, representing samples randomly drawn from [0, class_size) that follow the log-probability distribution. Resolves #821 Co-authored-by:Shucai Xiao <shucai@gmail.com>
-
turneram authored
Add RandomNormal, RandomNormalLike, RandomUniform, and RandomUniformLike to onnx parser and onnx tests Each pair of Random*/Random*Like is implemented using a single op_parser because the ops share the same essential attributes and algorithm with the difference that Random*Like get the output type and/or shape from an input argument and Random* take both from attributes. Resolves #907 Resolves #959
-
- 29 Sep, 2021 1 commit
-
-
Cagri Eryilmaz authored
Supports 1,11,13 ONNX Operator Set
-
- 27 Sep, 2021 1 commit
-
-
kahmed10 authored
Checks wavefront size, then changes implementation and number of threads for DPP reduce
-
- 23 Sep, 2021 1 commit
-
-
Umang Yadav authored
Add forward compatibility support for compile options
-
- 21 Sep, 2021 1 commit
-
-
Paul Fultz II authored
Needed to bypass passes when fusing pointwise operators into a module.
-
- 17 Sep, 2021 3 commits
-
-
Paul Fultz II authored
This reverts commit 9e43cb8b.
-
Umang Yadav authored
This PR aims to remove alpha and beta attributes from dot operator completely. Previously dot operator was defined as C = alpha * A . B + beta * C where * is scalar multiplication and . is dot product or matrix multiplication depending on dimension of the inputs. Aim is to have the definition of dot operator as C = A . B without having alpha or beta. In order to achieve the same effect as alpha and beta (1) it multiplies the one of the inputs to the dot operator with alpha value. (2) if beta is present then, multiplies the C with beta and then adds into the output from step 1.
-
Umang Yadav authored
make file_options struct an opaque object for ABI compatibility, uses make generate to auto-generate and includes modified tests. Co-authored-by:Paul Fultz II <pfultz2@yahoo.com>
-
- 16 Sep, 2021 1 commit
-
-
Shucai Xiao authored
Add Loop operator for opset version 13. Notes: 1) Default max iteration number is 10 if no max iteration number is provided 2) To change the max iter number, a user can set the max_loop_iterations in the onnx_option struct when parsing a model. 3) The returned shape of the scan output is from the max_loop_iterations even the actual loop num is less than that. This issue also applies to other operators like NonZero and NonMaxSuppression. A issue #948 is created to track this and to be resolved later. Co-authored-by:
Paul <pfultz2@yahoo.com> Co-authored-by:
mvermeulen <5479696+mvermeulen@users.noreply.github.com>
-
- 10 Sep, 2021 2 commits
-
-
Paul Fultz II authored
Assert shapes dont change Co-authored-by:Shucai Xiao <Shucai.Xiao@amd.com>
-
turneram authored
Add ability to parse ThresholdedRelu ONNX operator. Resolves #888 Co-authored-by:Shucai Xiao <shucai@gmail.com>
-
- 07 Sep, 2021 3 commits
-
-
Shucai Xiao authored
Add operators, refactor parsers, add rewrite passes, add tests Add ref implementations Move broadcasting of scales and zero points to onnx parser Allow for x and zero_point to have different types in quantizelinear; fix zero_point default type fp16 and fp8 quantization to include subgraph and parameters fix unit test to use qdq operators for int8 quantization Co-authored-by:turneram <alturner@amd.com>
-
Paul Fultz II authored
Improve check_context to handle submodules better Co-authored-by:Shucai Xiao <shucai@gmail.com>
-
Paul Fultz II authored
* Add module pass manage
-