"git@developer.sourcefind.cn:OpenDAS/torchaudio.git" did not exist on "71214b48548b1dcb6ebd581dd36a9d0e60af6837"
- 27 Jun, 2023 1 commit
-
-
Ted Themistokleous authored
We can't change the behaviour of the nonzero op, and we currently pad its output with zeros. Unfortunately, this obscures the following cases:
1. When the only nonzero element is at the first index, the rest of the output is padded with zeros, so it is not obvious whether the first value is a valid index or padding.
2. When the nonzero output is used as indices, a padded value of 0 is still a valid index, so gather/gatherND and other ops will assume the 0 index is valid and operate accordingly.
By adding a sentinel value equal to the number of static elements in the desired shape, the nonzero output can now track how many elements are valid by checking whether each value falls in the valid range. Originally I intended to use -1, but not every datatype can represent it, for example when the vectors hold unsigned values or booleans.
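The ambiguity described in this commit message can be illustrated with a minimal sketch. It assumes a simplified one-dimensional output; the helper names `padded_nonzero` and `valid_indices` are hypothetical and are not part of MIGraphX.

```python
def padded_nonzero(values, use_sentinel=False):
    """Return indices of nonzero entries, padded to len(values).

    Hypothetical 1-D simplification of a statically shaped nonzero output.
    Without the sentinel the padding is 0 (ambiguous with a real index 0);
    with it the padding is the element count, which is never a valid index.
    """
    n = len(values)
    pad = n if use_sentinel else 0
    indices = [i for i, v in enumerate(values) if v != 0]
    return indices + [pad] * (n - len(indices))


def valid_indices(padded, n):
    """Keep only the entries that fall in the valid index range [0, n)."""
    return [i for i in padded if 0 <= i < n]


data = [7, 0, 0, 0]  # only the first element is nonzero
zero_padded = padded_nonzero(data)                          # [0, 0, 0, 0]
sentinel_padded = padded_nonzero(data, use_sentinel=True)   # [0, 4, 4, 4]
recovered = valid_indices(sentinel_padded, len(data))       # [0]
```

With zero padding the result for `[7, 0, 0, 0]` cannot be told apart from four real hits on index 0, whereas the sentinel-padded form lets a consumer filter by range. Because the sentinel is non-negative, it is also representable in unsigned index types.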
-
- 26 Jun, 2023 1 commit
-
-
Umang Yadav authored
-
- 23 Jun, 2023 1 commit
-
-
Umang Yadav authored
Fixes #1852 Fixes #1847
-
- 22 Jun, 2023 2 commits
-
-
Zhuoran Yin authored
Add mlir quant_dot operator support
-
Ted Themistokleous authored
* Update install_prereqs.sh to handle 22.04 defines
  Needed to run containers with 22.04
* Add Dockerfile for Ubuntu 22.04 and ROCm 5.5
  Updated dockerfile to use ROCm 5.5 and Ubuntu 22.04 for use with building MIGraphX
  Able to run make -j$(nproc) check successfully with this
* Clean this up since it's breaking CI
* Clean up install prereqs some more.
  - use one protobuf version
  - remove extra python3.8 installs from 3.10 case
* Move comment for protobuf comment
* Move Dockerfile for 22.04 to Dockerfiles/ folder
* Move and rename 2204 docker file
  Remove Docker_** from the name. Move these to tools/docker
* Add pip3 installs to be shared between python versions
* Add Package pin from repo.radeon.com
* Add CMAKE_ARG ONNX_USE_PROTOBUF_SHARED_LIBS for every default python dist
  Set this to be default as part of installing prereqs
---------
Co-authored-by: Charlie Lin <charlie.lin@amd.com>
Co-authored-by: Umang Yadav <29876643+umangyadav@users.noreply.github.com>
Co-authored-by: kahmed10 <15948690+kahmed10@users.noreply.github.com>
-
- 21 Jun, 2023 2 commits
-
-
Paul Fultz II authored
Co-authored-by: kahmed10 <15948690+kahmed10@users.noreply.github.com>
-
Umang Yadav authored
Co-authored-by: kahmed10 <15948690+kahmed10@users.noreply.github.com>
-
- 20 Jun, 2023 1 commit
-
-
github-actions[bot] authored
Co-authored-by: causten <causten@users.noreply.github.com>
Co-authored-by: Ted Themistokleous <107195283+TedThemistokleous@users.noreply.github.com>
-
- 17 Jun, 2023 3 commits
-
-
Ted Themistokleous authored
* Add trace for SIMPLIFY_ALGEBRA matches
* Fix format
* Handle review comments from Umang
  - int to size_t for trace
  - move env arg to top of simplify_algebra.cpp
  - handle overload better for find_matches
* Rename trace_mod param to trace_pass
  More representative naming for what this trace flag does
-
turneram authored
* Add initial ck_gemm code
* Format
* Add additional src files
* Format
* Add include
* Simplify fuse_ck
* Format
* Rename var
* Enable pass
* Update ck version
* Fix include
* Add group stride
* Disable warnings for ck headers
* Format
* Add unpack array
* Add interface to enable tuning
* Format
* Update compile_ops to handle tuning config
* Format
* Add some comments
* Move time_op to migraphx_gpu
* Add benchmarking
* Refactor
* Format
* Add lift class macro
* Use device name
* Format
* Generate configs
* Format
* Pass tuning parameter
* Move data type to is_ck_gemm matcher
* Format
* Add problem_cache to avoid retuning same configs
* Format
* Format
* Mark the problems
* Format
* Use is_null
* Format
* Resize vector
* Only tune with exhaustive tuning
* Format
* Use assert
* Format
* Tidy fixes
* More tidy fixes
* Format
* Add license to missing files
* Format
* Use transform
* Format
* Fix tidy
* Format
* Fix cppcheck issues
* Format
* Add static_assert
* Add ops header
* Add assertion in batcher
* Format
* Improve the batch fold check
* Format
* Add where op workaround for CK
* Skip if any input is not a supported ck type
* Format
* Check batch is standard
* Format
* Remove redundant static keyword
* Update commit hash
* Fix error when running without --exhaustive-tune
* Formatting
* Formatting
* Remove fuse_ck_gemm_softmax_gemm
* Update ck hash
* Correct spelling mistake
* Remove commented out logic from fuse_ck
* Remove unused include and add comment
* Formatting
* Remove redundant get_shape and remove ck_gemm from names
* Formatting
* Allow for mixed types with int8 gemms
* Formatting
* Add back find_package from merge
* Update CK commit hash and add gfx940 to fuse_ops supported archs
* Formatting
* Update CK hash
-
Umang Yadav authored
* Fix convert for the NaNs
* NaNs can't be compared, use std::isnan()
* formatting
* formatting
* formatting
* add extra tests
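Why NaNs need an explicit check rather than a comparison can be sketched as follows. This is an illustrative Python analogue, not the MIGraphX convert implementation; `naive_clamp` and `nan_aware_clamp` are hypothetical names, and `math.isnan` stands in for `std::isnan()`.

```python
import math

HALF_MAX = 65504.0  # largest finite IEEE 754 half-precision value


def naive_clamp(x, lo, hi):
    """Clamp using comparisons only; every comparison with NaN is False."""
    return max(lo, min(hi, x))


def nan_aware_clamp(x, lo, hi):
    """Clamp to [lo, hi] but pass NaN through, detected with isnan()."""
    if math.isnan(x):
        return x
    return max(lo, min(hi, x))


nan = float("nan")
naive_result = naive_clamp(nan, -HALF_MAX, HALF_MAX)      # NaN is silently lost
aware_result = nan_aware_clamp(nan, -HALF_MAX, HALF_MAX)  # NaN is preserved
```

Because `nan < hi`, `nan > lo`, and `nan == nan` all evaluate to false, comparison-only logic quietly turns a NaN into an ordinary finite value. An explicit `isnan` test keeps it visible.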
-
- 16 Jun, 2023 2 commits
-
-
Charlie Lin authored
* initial
* Added tests and new functionality
* Update optimals handling
* Simplify conditionals
* Ref test, update docs
* Remove comment, suggestion unclear
---------
Co-authored-by: Umang Yadav <29876643+umangyadav@users.noreply.github.com>
-
Paul Fultz II authored
-
- 15 Jun, 2023 2 commits
-
-
Umang Yadav authored
-
Brian Pickrell authored
* Fix parse_instancenorm to create broadcast and multibroadcast instructions with two dynamic shape arguments instead of one. Their make_op() functions don't support dynamic shapes when called with one input. This caused an error when parsing an ONNX 3d-unet model.
* Use add_common_op() to create multibroadcast op.
* Add verification and parsing test for instance_norm with dynamic input. Parse test doesn't pass.
* Fix for test; still doesn't pass
* Another fix for test; still doesn't pass
* Work in progress, instance_norm_dyn_batch_test works but instance_norm_test doesn't
* Fix onnx instancenorm tests to match parser changes. Passes all check tests
* Updated comments explaining usage of add_common_op()
* Hand-merged conflicts with develop
* Fix instance_norm_half_test after merge
* Add Onnx test instance_norm_dyn_batch_half_test
* Add shape test cases broadcast_1in_dyn_error and multibroadcast_1in_dyn_error_0
-
- 14 Jun, 2023 2 commits
-
-
Umang Yadav authored
* add fix for the trace_eval
* Add throw for the debug builds
* Formatting
---------
Co-authored-by: Chris Austen <causten@users.noreply.github.com>
-
Umang Yadav authored
-
- 13 Jun, 2023 1 commit
-
-
Charlie Lin authored
-
- 12 Jun, 2023 1 commit
-
-
Paul Fultz II authored
-
- 09 Jun, 2023 3 commits
-
-
Chris Austen authored
-
Umang Yadav authored
-
Umang Yadav authored
#1791 added a hash function for the value class. It uses the Visit function and has specializations for bool_type and the <vector> type, but was missing a specialization for nullptr. The missing nullptr case caused compilation issues on RHEL, SLES and CentOS.
-
- 08 Jun, 2023 2 commits
-
-
Paul Fultz II authored
Enable with MIGRAPHX_ENABLE_CK=1 and the --exhaustive-tune flag
-
Chris Austen authored
-
- 06 Jun, 2023 2 commits
-
-
Umang Yadav authored
-
Umang Yadav authored
A sigmoid approximation for GeLU was introduced in #1299 for fp16. The sigmoid approximation is known to give better performance but lower accuracy. https://arxiv.org/pdf/1606.08415.pdf
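The accuracy trade-off can be checked numerically with a small sketch. It compares the erf-based GeLU definition with the `x * sigmoid(1.702 * x)` approximation given in the linked paper. The function names are illustrative, and whether MIGraphX uses exactly this constant is an assumption.

```python
import math


def gelu_exact(x):
    """GeLU via the Gaussian CDF: x * Phi(x)."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def gelu_sigmoid(x):
    """Sigmoid approximation from the GeLU paper: x * sigmoid(1.702 * x)."""
    return x / (1.0 + math.exp(-1.702 * x))


# Largest absolute difference on a grid over [-6, 6] with step 0.01.
max_abs_error = max(
    abs(gelu_exact(i / 100.0) - gelu_sigmoid(i / 100.0))
    for i in range(-600, 601)
)
```

The approximation replaces `erf` with a single exponential, which is where the performance gain comes from. `max_abs_error` gives a feel for how much accuracy is given up in return.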
-
- 05 Jun, 2023 1 commit
-
-
Charlie Lin authored
Changed the doc for find_permutation(shape) to be more clear that it is finding the permutation that would make the shape standard
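What "the permutation that would make the shape standard" means can be sketched as below. It assumes that a standard shape is packed row-major, with strides decreasing from the first axis to the last. The helpers `find_permutation_sketch`, `reorder`, and `is_standard` are hypothetical and are not the MIGraphX implementation.

```python
def find_permutation_sketch(lens, strides):
    """Order axes from largest stride to smallest (ties keep axis order)."""
    return sorted(range(len(lens)), key=lambda axis: (-strides[axis], axis))


def reorder(values, perm):
    """Apply a permutation: output position k takes values[perm[k]]."""
    return [values[axis] for axis in perm]


def is_standard(lens, strides):
    """True when the strides describe a packed row-major layout."""
    expected = 1
    for n, s in zip(reversed(lens), reversed(strides)):
        if s != expected:
            return False
        expected *= n
    return True


# An NCHW view (N=2, C=3, H=4, W=5) over memory laid out as NHWC.
lens = [2, 3, 4, 5]
strides = [60, 1, 15, 3]

perm = find_permutation_sketch(lens, strides)  # [0, 2, 3, 1]
standard_lens = reorder(lens, perm)            # [2, 4, 5, 3]
standard_strides = reorder(strides, perm)      # [60, 15, 3, 1]
```

Applying the permutation to both the lengths and the strides yields a shape that passes `is_standard`, while the original view does not.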
-
- 04 Jun, 2023 1 commit
-
-
Igor Mirosavljevic authored
-
- 02 Jun, 2023 1 commit
-
-
Chris Austen authored
-
- 01 Jun, 2023 1 commit
-
-
Umang Yadav authored
By converting to fp32, the fp16 3d-unet model accuracy comes out the same as the fp32 accuracy. By using the reduce_sum method in fp16, accuracy comes out ~0.9% lower than fp32 while keeping the entire model in fp16.
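One reason an fp16 reduction can lose accuracy relative to a wider accumulator is sketched below. The example emulates half precision with the standard library `struct` module's `e` format and uses a Python float as a stand-in for the wider (fp32) accumulator. It is a toy illustration and does not reproduce the 3d-unet measurement above.

```python
import struct


def to_half(x):
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def sum_in_half(values):
    """Accumulate in half precision: round after every addition."""
    total = 0.0
    for v in values:
        total = to_half(total + to_half(v))
    return total


def sum_in_wider(values):
    """Accumulate in a wider type and round to half only at the end."""
    total = 0.0
    for v in values:
        total += to_half(v)
    return to_half(total)


ones = [1.0] * 4096
half_sum = sum_in_half(ones)    # stalls once the spacing between halves exceeds 1
wider_sum = sum_in_wider(ones)  # reaches the exact total
```

Once the running total reaches 2048, neighbouring half-precision values are 2 apart, so adding 1 rounds back to the same total. Accumulating in a wider type avoids this stall at the cost of a conversion.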
-
- 31 May, 2023 2 commits
-
-
Paul Fultz II authored
-
Umang Yadav authored
Partially solves #1656. This PR only handles the compilation part of multitarget.
-
- 30 May, 2023 2 commits
-
-
Paul Fultz II authored
Use generate_argument instead of generate_literal for python output, as generate_literal doesn't exist.
Shorten the names for variables from the main module.
Use prefix p_ for parameters.
Use shorter variable m for main module in python.
-
Paul Fultz II authored
-
- 29 May, 2023 2 commits
-
-
Pavle Jacovic authored
-
Chris Austen authored
-
- 28 May, 2023 1 commit
-
-
Paul Fultz II authored
Allow quantizing for both int8 and fp16
-
- 25 May, 2023 1 commit
-
-
Ted Themistokleous authored
Use std::numeric_limits::min/max() functions plus the appropriate value to encode -inf/inf
-
- 24 May, 2023 2 commits
-
-
Paul Fultz II authored
Enable retrieving the code object to do tuning in the future.
-
kahmed10 authored
Refactor supported gfx archs
-