- 30 Nov, 2023 1 commit
-
-
Manupa Karunaratne authored
-
- 23 Nov, 2023 1 commit
-
-
Paul Fultz II authored
-
- 30 Oct, 2023 1 commit
-
-
Ahsan Saghir authored
-
- 16 Oct, 2023 1 commit
-
-
Paul Fultz II authored
This will enable MLIR by default for these cases:
* Any convolution fusion
* Any int8 gemm fusion
* All Navi3 standalone convolutions
A flag (i.e. MIGRAPHX_ENABLE_MLIR) enables MLIR for floating-point gemm fusions.
Except:
* 3x3 Winograd convolution fusions (except on Navi)
* K > 2048 on gemm (as CK)
There is also MIGRAPHX_DISABLE_MLIR to disable MLIR completely.
-
- 12 Oct, 2023 1 commit
-
-
Chris Austen authored
-
- 02 Oct, 2023 1 commit
-
-
Chris Austen authored
-
- 01 Oct, 2023 1 commit
-
-
Chris Austen authored
-
- 29 Sep, 2023 1 commit
-
-
Umang Yadav authored
Add flags for CK and enable CK with hipRTC. CK can be used by setting MIGRAPHX_ENABLE_CK=1 and MIGRAPHX_TUNE_CK=1.
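A minimal sketch of how these two flags might be applied, assuming they are read from the process environment as the commit message suggests. The helper name `ck_environment` is hypothetical, and the snippet only builds an environment mapping; it does not launch any MIGraphX tool.

```python
import os


def ck_environment(base=None, tune=True):
    """Return a copy of an environment with the CK flags from the commit message set.

    `base` defaults to the current process environment. The function name and
    the `tune` parameter are illustrative only, not part of MIGraphX.
    """
    env = dict(os.environ if base is None else base)
    env["MIGRAPHX_ENABLE_CK"] = "1"  # flag named in the commit message
    if tune:
        env["MIGRAPHX_TUNE_CK"] = "1"  # flag named in the commit message
    return env


if __name__ == "__main__":
    env = ck_environment({"PATH": "/usr/bin"})
    print(env["MIGRAPHX_ENABLE_CK"], env["MIGRAPHX_TUNE_CK"])
```

The resulting mapping can be passed as the `env` argument of `subprocess.run` when starting whatever program should see the flags.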
-
- 28 Sep, 2023 2 commits
-
-
Ted Themistokleous authored
-
Ted Themistokleous authored
Avoid the vega cards for the ORT build runs.
-
- 18 Aug, 2023 1 commit
-
-
Paul Fultz II authored
-
- 09 Aug, 2023 1 commit
-
-
Paul Fultz II authored
-
- 28 Jul, 2023 1 commit
-
-
Paul Fultz II authored
The --py output can be loaded back in the driver. This will embed the migraphx interpreter so we can execute the Python directly. There is a migraphx_py library which will dynamically load the version of the library for whichever Python version is available on the system.
-
- 27 Jul, 2023 1 commit
-
-
Artur Wojcik authored
* rename function 'near' to 'make_near'
* try disabling vega10 machine
-
- 21 Jul, 2023 2 commits
-
-
Umang Yadav authored
Fixes #1957. Clamping was removed in #1853. It turns out clamping is necessary to handle overflow/underflow cases: during downcasting, a value that overflowed was returned as infinity when it was not clamped.
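The idea of clamping before a downcast can be sketched in Python. This is an illustration only, not the MIGraphX implementation: the function names are hypothetical, and half precision is used as an assumed target type. Note one difference from the behavior described in the commit message: Python's `struct` module raises `OverflowError` on an out-of-range half conversion instead of returning infinity, so the sketch only shows the clamped path.

```python
import struct

HALF_MAX = 65504.0  # largest finite IEEE 754 half-precision value


def round_trip_half(x):
    """Convert a Python float to half precision and back via struct's 'e' format."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def clamp_to_half(x):
    """Clamp x into the finite half range, then downcast.

    NaN handling is out of scope for this sketch.
    """
    clamped = max(-HALF_MAX, min(HALF_MAX, x))
    return round_trip_half(clamped)


if __name__ == "__main__":
    # Values beyond the half range saturate at the largest finite value.
    print(clamp_to_half(1e6))
    print(clamp_to_half(-1e6))
    # In-range values pass through (subject to half-precision rounding).
    print(clamp_to_half(1.5))
```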
-
Umang Yadav authored
HIP requires the number of global work items to be a multiple of the number of local work items. If it is not, correct results are not guaranteed. Fixes #1977. Fixes #1644. MIGraphX CI has moved to rocm-5.6, which doesn't require hipRTC workarounds.
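One common way to satisfy this constraint is to round the global size up to the next multiple of the local size. The sketch below shows only that arithmetic; the function name is hypothetical, and the commit message does not say whether MIGraphX uses exactly this approach.

```python
def round_up_global(n_items, local_size):
    """Return the smallest multiple of local_size that is >= n_items."""
    if n_items < 0 or local_size <= 0:
        raise ValueError("n_items must be >= 0 and local_size must be > 0")
    groups = (n_items + local_size - 1) // local_size  # ceiling division
    return groups * local_size


if __name__ == "__main__":
    print(round_up_global(1000, 256))  # 1024
    print(round_up_global(1024, 256))  # 1024
```

When the global size is padded this way, the kernel typically needs a bounds check so that the extra work items beyond `n_items` do nothing.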
-
- 18 Jul, 2023 1 commit
-
-
Paul Fultz II authored
-
- 17 Jul, 2023 1 commit
-
-
Krzysztof Drewniak authored
This commit removes the build options to disable threading and removes the mutex in compile_mlir. The commit being tested is a draft PR on rocMLIR that'll get merged if this passes
-
- 02 Jul, 2023 1 commit
-
-
Paul Fultz II authored
* Add a CI job to test CK
* Add MIGRAPHX_TUNE_CK env variable to only do tuning for CK
* Continue tuning even when there are invalid configs
* Fix a bug with parallel compilation not using all available threads
* Add additional test for gemms using half types
* Removed int32 as supported type since it doesn't pass our test suite
-
- 31 May, 2023 2 commits
-
-
Paul Fultz II authored
-
Umang Yadav authored
Partially solves #1656. This PR only handles the compilation part of multitarget.
-
- 29 May, 2023 1 commit
-
-
Chris Austen authored
-
- 19 May, 2023 1 commit
-
-
Chris Austen authored
Co-authored-by: Sam Wu <sam.wu2@amd.com>
Co-authored-by: Paul <pfultz2@yahoo.com>
-
- 22 Mar, 2023 1 commit
-
-
Umang Yadav authored
Prevent dynamically loading a target library that was not compiled with the same version of the MIGraphX core lib.
-
- 13 Mar, 2023 1 commit
-
-
Manupa Karunaratne authored
* [MLIR] Adds a runtime switch to trigger MLIR
-
- 16 Feb, 2023 1 commit
-
-
Umang Yadav authored
* deprecate HCC
-
- 31 Jan, 2023 1 commit
-
-
Umang Yadav authored
Added a CMake flag for hipRTC: MIGRAPHX_USE_HIPRTC. Added stages in Jenkins for hipRTC. Fixes for some of the pending issues from hipRTC.
-
- 06 Jan, 2023 1 commit
-
-
Paul Fultz II authored
Run a stage using MIGRAPHX_GPU_DEBUG=1.
-
- 26 Sep, 2022 1 commit
-
-
Charlie Lin authored
Rewrites the BatchNormalization ONNX operator into other MIGX operators
  - Added handling of 1D input tensor case (edge case in ONNX spec)
Removes the spatial and per_activation functionality (not in the ONNX spec)
  - Did not remove the batch_norm_inference related code as the TensorFlow parser still uses it
  - Can remove that code when the TF version is updated
-
- 12 Jul, 2022 1 commit
-
-
Paul Fultz II authored
This will ensure that migraphx.h can be included from a C compiler, and check that the C API can be called. This includes stdbool.h which is needed when using bool from C.
-
- 16 Jun, 2022 1 commit
-
-
Paul authored
-
- 29 Mar, 2022 1 commit
-
-
Chris Austen authored
modify CI temporarily to stop using Navi hardware
-
- 05 Nov, 2021 1 commit
-
-
kahmed10 authored
Move our Dockerfile from ROCm 4.3 to 4.5. Add Navi-based GPUs to the CI infrastructure.
-
- 28 Sep, 2021 1 commit
-
-
Paul Fultz II authored
No longer avoid dependency problems and install the half package
-
- 26 Jul, 2021 1 commit
-
-
Paul authored
-
- 25 Jul, 2021 1 commit
-
-
Paul authored
-
- 29 Apr, 2021 1 commit
-
-
SJW authored
* MLIR MIOpen Dialect integration (phase 1) (#768)
* Added Findmlir.cmake (using environment variables to import)
* Added mlir_conv pass to GPU target
* Apply to any gpu::convolution if supported by MLIR
* Call MLIR C-API to generate iGEMM kernel with configuration from gpu::convolution
* Capture binary in dictionary for matching convolutions
* Build a code_object_op with the binary and execution dimensions
* Substitute for the gpu::convolution
* Changed the parameters for the code_object to reflect the generated MLIR kernel
* Expanded out MemRefDescriptor fields in param list
* Also updated for MLIR C-API changes
* fixed global_size calculation
* MLIR MIOpen Dialect integration (phase 1) (#768)
* Added Findmlir.cmake (using environment variables to import)
* Added mlir_conv pass to GPU target
* Apply to any gpu::convolution if supported by MLIR
* Call MLIR C-API to generate iGEMM kernel with configuration from gpu::convolution
* Capture binary in dictionary for matching convolutions
* Build a code_object_op with the binary and execution dimensions
* Substitute for the gpu::convolution
* Changed the parameters for the code_object to reflect the generated MLIR kernel
* Expanded out MemRefDescriptor fields in param list
* Also updated for MLIR C-API changes
* Added command line option: --enable_mlir
* fixed command line switch
* updated for new MLIR API changes
* Added cget llvm-project-mlir to import MIIR API libraries into Dockerfile
* removed cmake Findmlir
* updated for changes in MIIR C-API
* updated CMakeLists.txt to allow disable of MLIR import
* fixed memory leaks and removed copies
* updated for 5D memrefs
* formatting
* fixed review comments
* fixed merge issues
* hip gcnDeviceName now includes specifiers at the end
* use major/minor values instead
* disable MLIR by default
* removed command-line switch --enable-mlir
* fix unused when MLIR disabled
* enable jenkins enable/test MLIR
* format
* fixed clang-tidy
* added new type
Co-authored-by: Paul Fultz II <pfultz2@yahoo.com>
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>
-
- 09 Apr, 2021 1 commit
-
-
Paul Fultz II authored
* Fix tidy warnings for 4.1
* Formatting
* Upgrade to 4.1 in docker
* Remove hcc build and enable ubsan on clang debug
* Add missing openmp package
* Construct directly
* Construct directly
* Upgrade rocm-cmake version
-
- 08 Jan, 2021 1 commit
-
-
Paul Fultz II authored
* Add build and test github workflow
* Fix cget command
* Remove def-requirements.txt
* Add tmate session to debug workflow
* Run tmate session after installing dependencies
* Print date periodically
* Add clang tidy action
* Separate build and run container in two different jobs
* Run bash script
* Remove interactive flag
* Try to mount the files
* Try to use the github workspace
* Without double braces
* Use env variable
* Pipe bash script in
* Run using hip-clang
* Use correct path
* Add verbose
* Remove j flag
* Only run for onnx file to debug
* Manually run clang-tidy
* Remove quiet flag
* Print header file
* Printout environment
* Remove extra defines
* Remove fixits and config flag
* Show ldd
* Add tmate session
* Run onnx protobuf first
* Generate proto for tensorflow
* Update cppcheck version
* Fix some cppcheck issues
* Add const
* Cppcheck fixes
* Formatting
* Fix more cppcheck issues
* Run two jobs
* Cache analysis and run format checking
* Fix yaml issues
* Fix yaml issues
* Fix indentation
* Switch to hip-clang for main docker file
* Use hip-clang in the readme
* Fixes for jenkins
* Use ccache to build
* Combine file
* Set restore keys
* Change stage name
* Build with ccache
* Add missing dependency for ccache
* Build debug with codecov
* Fix workflow syntax
* Fix list
* Use quotes
* Go to correct build path
* Install lcov
* Use sudo
* Echo all commands
* Setup tmate
* Add verbose output
* Build with cmake directly
* Add pthread flag
* Remove python config
* Continue on error
* Use on or off for cmake flag
* Use always upload cache
* Verbose output
* Verbose output from build
* Build one target
* Reduce debug symbols
* Increase garbage collection
* Remove dmesg
* Increase it to 20
* Update rocm cmake version
* Remove jobs from jenkins
* Run on all 3 ubuntus
* Remove gcc 5 jobs
* Don't add flag on 16.04
* Only upload coverage on 18.04
* Don't build for ubuntu 20.04
* Use matrix.os
* Use O2 for hip-clang since lower optimizations are broken
* Use rocm 3.0
* Pass ccache as cmake variable instead of env variable
* Build miopen from source
* Show ccache statistics
* Print log information
* Set compression level
* Use hash dir
* Set hashdir
* Install clang ocl from system
* Up compression level
* Add locale
* Increase cache size to 1G
* Lower compression level to 9
* Remove split dwarf
* Remove Og
* Add back Og
* Separate debug and codecov
* Add missing backslash
* Garbage collect more often
* Add missing locales package
* Use Os
* Install onednn in docker and run tests
* Include target headers in tests
* Increase timeout
* Remove if condition
* Make flag public
* Suppress memory leaks in onednn
* Use equal
* Add gh annotations
* Update rocm-cmake version
* Add ldconfig
Co-authored-by: Shucai Xiao <shucai@gmail.com>
-
- 14 Dec, 2020 1 commit
-
-
Paul Fultz II authored
* Add flag to enable cpu backend
* Make buffers shared
* Enable optimizations
* Add onednn
* Formatting
* Formatting
* Add dnnl header
* Formatting
* Rewrite rnn first
* Formatting
* Call reference implementation
* Formatting
* Make literal data shared
* Formatting
* Add convolution
* Formatting
* Compensate for dilation
* Formatting
* Use name/make_op instead
* Formatting
* Rename gemm header
* Formatting
* Add dnnl convolution/gemm operators
* Formatting
* Add eliminate_contiguous
* Add faster pointwise operators
* Formatting
* Formatting
* Formatting
* Add dnnl op class
* Formatting
* Add add op
* Formatting
* Add concat operator
* Formatting
* Add more ops
* Create descriptor during finalization
* Formatting
* Don't rewrite pooling
* Enable memory coloring
* Formatting
* Add output aliases
* Formatting
* Fix errors
* Formatting
* Convert literals
* Add missing file
* Remove batch_norm
* Formatting
* Use strides
* Formatting
* Add some debug checks
* Formatting
* Fix bug in adjusting shape for gemm
* Formatting
* Fix fallback dot operator
* Zero initialize buffers
* Add support for group convolutions
* Formatting
* Make adjust allocation target independent
* Formatting
* Enable adjust_allocation for gpu/cpu
* Formatting
* Add copy to allocation model
* Formatting
* Add copy operator
* Formatting
* Better handling of output parameters in adjust_allocation
* Formatting
* Build with dnnl
* Make dnnl required
* Fix compile error
* Tidy fixes
* Formatting
* Tidy fixes
* Formatting
* Fix more tidy issues
* Formatting
* Add mul op
* Add mul op
* Set c compiler to clang as well
* Compensate for normalized compute shape
* Formatting
* Fix cppcheck errors
* Formatting
* Add onednn library to hcc
* Guard clang pragmas
* Disable cpu mode for gcc for now
* Leave it enabled for gcc 7
* Fix cppcheck suppression
* Fix compile error on gcc 5
* Remove unused code
Co-authored-by: Shucai Xiao <shucai.xiao@amd.com>
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>
-