1. 30 Nov, 2023 1 commit
  2. 23 Nov, 2023 1 commit
  3. 30 Oct, 2023 1 commit
  4. 16 Oct, 2023 1 commit
    • Enable MLIR by default for more cases (#2274) · 650ba45f
      Paul Fultz II authored
      This enables MLIR by default for these cases:

      - Any convolution fusion
      - Any int8 gemm fusion
      - All Navi3 standalone convolutions
      - Floating-point gemm fusions, but only when a flag (i.e. MIGRAPHX_ENABLE_MLIR) is set

      Except:

      - 3x3 Winograd convolution fusions (except on Navi)
      - K > 2048 on gemm (as CK)

      MIGRAPHX_DISABLE_MLIR is also available to disable MLIR completely.
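One way to read the rules in the commit message above is as a single predicate. This is only a sketch of that reading, not MIGraphX's actual selection code: the function name, its parameters, and the choice to treat any non-empty environment value as "set" are assumptions made for illustration. Only the two environment variable names come from the commit message.

```python
def mlir_enabled(env, *, op, fused, int8=False, navi=False, navi3=False,
                 winograd_3x3=False, k=0):
    """Return True when the rules listed in the commit message would pick MLIR.

    env is a mapping such as os.environ. Treating any non-empty value as
    "set" is an assumption of this sketch, not MIGraphX's real parsing.
    """
    navi = navi or navi3  # Navi3 is a Navi part
    if env.get("MIGRAPHX_DISABLE_MLIR"):
        return False  # global off switch
    if op == "convolution":
        if fused:
            # Exception: 3x3 Winograd convolution fusions, unless on Navi.
            return navi or not winograd_3x3
        return navi3  # standalone convolutions: Navi3 only
    if op == "gemm" and fused:
        if k > 2048:
            return False  # exception: K > 2048 on gemm
        if int8:
            return True  # int8 gemm fusions are on by default
        # Floating-point gemm fusions are opt-in through the flag.
        return bool(env.get("MIGRAPHX_ENABLE_MLIR"))
    return False
```

For example, `mlir_enabled({"MIGRAPHX_ENABLE_MLIR": "1"}, op="gemm", fused=True)` is true, while the same call with an empty mapping is false. Whether the K > 2048 exception also applies to int8 gemm fusions is not stated in the message; the sketch assumes it does.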
  5. 12 Oct, 2023 1 commit
  6. 02 Oct, 2023 1 commit
  7. 01 Oct, 2023 1 commit
  8. 29 Sep, 2023 1 commit
  9. 28 Sep, 2023 2 commits
  10. 18 Aug, 2023 1 commit
  11. 09 Aug, 2023 1 commit
  12. 28 Jul, 2023 1 commit
    • Load python files in the driver (#1793) · b164ceef
      Paul Fultz II authored
      The --py output can be loaded back into the driver. This embeds the Python interpreter so the Python can be executed directly. There is a migraphx_py library which dynamically loads the build of the library that matches the Python version available on the system.
  13. 27 Jul, 2023 1 commit
  14. 21 Jul, 2023 2 commits
    • Add back clamping and add tests (#1969) · 6957243c
      Umang Yadav authored
      Fixes #1957
      
      Clamping was removed in #1853.
      
      It turns out clamping is necessary to handle overflow/underflow cases. During downcasting, a value that overflowed came back as infinity when it was not clamped.
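The overflow behaviour described above can be reproduced in a few lines. The commit message does not say which types were involved, so this sketch assumes a downcast to IEEE half precision. The helper names are hypothetical, and the standard `struct` module's `"e"` format is used only to model the rounding.

```python
import math
import struct

HALF_MAX = 65504.0  # largest finite IEEE half-precision value


def to_half(x):
    """Round x to the nearest half-precision value, returned as a float.

    struct refuses to pack values that overflow the "e" format, so the
    overflow is mapped to +/-inf here, as an IEEE downcast would do.
    """
    try:
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except OverflowError:
        return math.copysign(math.inf, x)


def clamped_to_half(x):
    """Clamp to the finite half range first, then downcast."""
    return to_half(max(-HALF_MAX, min(HALF_MAX, x)))
```

With these helpers, `to_half(1e5)` is infinity, while `clamped_to_half(1e5)` saturates at `65504.0`.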
    • Make global workitems multiple of local workitems (#1976) · 3216fe52
      Umang Yadav authored
      HIP requires the number of global work items to be a multiple of the number of local work items. If it is not, correct results are not guaranteed.
      Fixes #1977
      Fixes #1644
      MIGraphX CI has moved to rocm-5.6, which doesn't require hipRTC workarounds.
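The usual way to satisfy the requirement above is to round the global size up to the next multiple of the local size. The sketch below shows only that arithmetic. The function name is hypothetical and this is not the code from the commit.

```python
def round_up_global(n_items, local_size):
    """Smallest multiple of local_size that is >= n_items."""
    if local_size <= 0:
        raise ValueError("local_size must be positive")
    # Ceiling division gives the number of work-groups needed.
    groups = (n_items + local_size - 1) // local_size
    return groups * local_size
```

For 1000 elements and a local size of 256 this gives 1024. The extra 24 work items have no element to process, so a kernel launched this way needs a bounds check on its index.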
  15. 18 Jul, 2023 1 commit
  16. 17 Jul, 2023 1 commit
    • Enable threading in MLIR (#1899) · 5f5356cc
      Krzysztof Drewniak authored
      This commit removes the build options to disable threading and removes the mutex in compile_mlir.
      The commit being tested is a draft PR on rocMLIR that'll get merged if this passes
  17. 02 Jul, 2023 1 commit
    • Improvement to ck integration (#1859) · 3c9df3b4
      Paul Fultz II authored
      Add a CI job to test CK
      Add MIGRAPHX_TUNE_CK env variable to only do tuning for CK
      Continue tuning even when there are invalid configs
      Fix a bug with parallel compilation not using all available threads
      Add additional test for gemms using half types
      Removed int32 as a supported type since it doesn't pass our test suite
  18. 31 May, 2023 2 commits
  19. 29 May, 2023 1 commit
  20. 19 May, 2023 1 commit
  21. 22 Mar, 2023 1 commit
  22. 13 Mar, 2023 1 commit
  23. 16 Feb, 2023 1 commit
  24. 31 Jan, 2023 1 commit
    • hipRTC fixes (#1531) · 91cc7242
      Umang Yadav authored
      Added a CMake flag for hipRTC: MIGRAPHX_USE_HIPRTC.
      Added stages in Jenkins for hipRTC.
      Fixes for some of the pending issues from hipRTC.
  25. 06 Jan, 2023 1 commit
  26. 26 Sep, 2022 1 commit
    • Rewrite ONNX parse batch norm (#1362) · c00f8202
      Charlie Lin authored
      Rewrites the BatchNormalization ONNX operator into other MIGX operators
      - Added handling of 1D input tensor case (edge case in ONNX spec)
      Removes the spatial and per_activation functionality (not in the ONNX spec)
      - Did not remove the batch_norm_inference related code as the TensorFlow parser still uses it
      - Can remove that code when the TF version is updated
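The rewrite described above works because inference-mode BatchNormalization is just a chain of elementwise operations. In the ONNX definition, Y = (X - mean) / sqrt(var + epsilon) * scale + B, applied per channel. The sketch below spells that chain out on plain Python lists. The function name and data layout are assumptions for illustration, and it does not show which MIGraphX operators the parser actually emits.

```python
import math


def batch_norm_inference(x, scale, bias, mean, var, eps=1e-5):
    """Apply inference-mode batch norm per channel.

    x is a list of channels, each a flat list of floats; scale, bias,
    mean and var hold one value per channel.
    """
    out = []
    for c, channel in enumerate(x):
        denom = math.sqrt(var[c] + eps)          # add, sqrt
        out.append([(v - mean[c]) / denom        # sub, div
                    * scale[c] + bias[c]         # mul, add
                    for v in channel])
    return out
```

Each step (subtract, add epsilon, square root, divide, multiply, add) maps to an ordinary elementwise operator, so no dedicated batch norm operator is needed.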
  27. 12 Jul, 2022 1 commit
    • Add tests for C API (#1266) · a7a32a9e
      Paul Fultz II authored
      This will ensure that migraphx.h can be included from a C compiler, and check that the C API can be called. This includes stdbool.h which is needed when using bool from C.
  28. 16 Jun, 2022 1 commit
  29. 29 Mar, 2022 1 commit
  30. 05 Nov, 2021 1 commit
  31. 28 Sep, 2021 1 commit
  32. 26 Jul, 2021 1 commit
  33. 25 Jul, 2021 1 commit
  34. 29 Apr, 2021 1 commit
    • MLIR MIOpen Dialect integration (phase 1) (#768) (#769) · 56584fa2
      SJW authored
      
      
      * MLIR MIOpen Dialect integration (phase 1) (#768)
      
      * Added Findmlir.cmake (using environment variables to import)
      
      * Added mlir_conv pass to GPU target
      
        * Apply to any gpu::convolution if supported by MLIR
      
        * Call MLIR C-API to generate iGEMM kernel with configuration from gpu::convolution
      
        * Capture binary in dictionary for matching convolutions
      
        * Build a code_object_op with the binary and execution dimensions
      
        * Substitute for the gpu::convolution
      
      * Changed the parameters for the code_object to reflect the generated MLIR kernel
      
      * Expanded out MemRefDescriptor fields in param list
      
      * Also updated for MLIR C-API changes
      
      * * fixed global_size calculation
      
      * * Added command line option: --enable_mlir
      
      * * fixed command line switch
      
      * updated for new MLIR API changes
      
      * * Added cget llvm-project-mlir to import MIIR API libraries into Dockerfile
        * removed cmake Findmlir
      
      * updated for changes in MIIR C-API
      
      * * updated CMakeLists.txt to allow disable of MLIR import
      
      * fixed memory leaks and removed copies
      
      * updated for 5D memrefs
      
      * * formatting
      
      * * fixed review comments
      
      * * fixed merge issues
      
      * hip gcnDeviceName now includes specifiers at the end
        * use major/minor values instead
      
      * * disable MLIR by default
      
      * * removed command-line switch --enable-mlir
      
      * * fix unused when MLIR disabled
      
      * * enable jenkins enable/test MLIR
      
      * * format
      
      * * fixed clang-tidy
      
      * * added new type
      Co-authored-by: Paul Fultz II <pfultz2@yahoo.com>
      Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>
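The "capture binary in dictionary for matching convolutions" step in the commit above amounts to memoizing compilation on the convolution's configuration. The sketch below illustrates that pattern only. `KernelCache`, the key layout, and the injected `compile_fn` are hypothetical stand-ins, not the mlir_conv pass or the MLIR C-API.

```python
class KernelCache:
    """Compile each distinct convolution configuration once and reuse it."""

    def __init__(self, compile_fn):
        self._compile = compile_fn   # stand-in for the real kernel generator
        self._binaries = {}          # configuration key -> compiled binary
        self.compile_calls = 0       # counts cache misses

    def get(self, key):
        # key must be hashable, e.g. a tuple of shapes, strides and padding.
        if key not in self._binaries:
            self.compile_calls += 1
            self._binaries[key] = self._compile(key)
        return self._binaries[key]
```

Two convolutions with the same configuration then share one binary, and only the first lookup pays the compilation cost.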
  35. 09 Apr, 2021 1 commit
    • Upgrade docker to rocm 4.1 and drop hcc (#795) · 6d937d80
      Paul Fultz II authored
      * Fix tidy warnings for 4.1
      
      * Formatting
      
      * Upgrade to 4.1 in docker
      
      * Remove hcc build and enable ubsan on clang debug
      
      * Add missing openmp package
      
      * Construct directly
      
      * Construct directly
      
      * Upgrade rocm-cmake version
  36. 08 Jan, 2021 1 commit
    • Revamp CI infrastructure (#706) · ceb4ca09
      Paul Fultz II authored
      
      
      * Add build and test github workflow
      
      * Fix cget command
      
      * Remove def-requirements.txt
      
      * Add tmate session to debug workflow
      
      * Run tmate session after installing dependencies
      
      * Print date periodically
      
      * Add clang tidy action
      
      * Separate build and run container in two different jobs
      
      * Run bash script
      
      * Remove interactive flag
      
      * Try to mount the files
      
      * Try to use the github workspace
      
      * Without double braces
      
      * Use env variable
      
      * Pipe bash script in
      
      * Run using hip-clang
      
      * Use correct path
      
      * Add verbose
      
      * Remove j flag
      
      * Only run for onnx file to debug
      
      * Manually run clang-tidy
      
      * Remove quiet flag
      
      * Print header file
      
      * Printout environment
      
      * Remove extra defines
      
      * Remove fixits and config flag
      
      * Show ldd
      
      * Add tmate session
      
      * Run onnx protobuf first
      
      * Generate proto for tensorflow
      
      * Update cppcheck version
      
      * Fix some cppcheck issues
      
      * Add const
      
      * Cppcheck fixes
      
      * Formatting
      
      * Fix more cppcheck issues
      
      * Run two jobs
      
      * Cache analysis and run format checking
      
      * Fix yaml issues
      
      * Fix yaml issues
      
      * Fix indentation
      
      * Switch to hip-clang for main docker file
      
      * Use hip-clang in the readme
      
      * Fixes for jenkins
      
      * Use ccache to build
      
      * Combine file
      
      * Set restore keys
      
      * Change stage name
      
      * Build with ccache
      
      * Add missing dependency for ccache
      
      * Build debug with codecov
      
      * Fix workflow syntax
      
      * Fix list
      
      * Use quotes
      
      * Go to correct build path
      
      * Install lcov
      
      * Use sudo
      
      * Echo all commands
      
      * Setup tmate
      
      * Add verbose output
      
      * Build with cmake directly
      
      * Add pthread flag
      
      * Remove python config
      
      * Continue on error
      
      * Use on or off for cmake flag
      
      * Use always upload cache
      
      * Verbose output
      
      * Verbose output from build
      
      * Build one target
      
      * Reduce debug symbols
      
      * Increase garbage collection
      
      * Remove dmesg
      
      * Increase it to 20
      
      * Update rocm cmake version
      
      * Remove jobs from jenkins
      
      * Run on all 3 ubuntus
      
      * Remove gcc 5 jobs
      
      * Don't add flag on 16.04
      
      * Only upload coverage on 18.04
      
      * Don't build for ubuntu 20.04
      
      * Use matrix.os
      
      * Use O2 for hip-clang since lower optimizations are broken
      
      * Use rocm 3.0
      
      * Pass ccache as cmake variable instead of env variable
      
      * Build miopen from source
      
      * Show ccache statistics
      
      * Print log information
      
      * Set compression level
      
      * Use hash dir
      
      * Set hashdir
      
      * Install clang ocl from system
      
      * Up compression level
      
      * Add locale
      
      * Increase cache size to 1G
      
      * Lower compression level to 9
      
      * Remove split dwarf
      
      * Remove Og
      
      * Add back Og
      
      * Separate debug and codecov
      
      * Add missing backslash
      
      * Garbage collect more often
      
      * Add missing locales package
      
      * Use Os
      
      * Install onednn in docker and run tests
      
      * Include target headers in tests
      
      * Increase timeout
      
      * Remove if condition
      
      * Make flag public
      
      * Suppress memory leaks in onednn
      
      * Use equal
      
      * Add gh annotations
      
      * Update rocm-cmake version
      
      * Add ldconfig
      Co-authored-by: Shucai Xiao <shucai@gmail.com>
  37. 14 Dec, 2020 1 commit
    • Use dnnl for cpu backend (#688) · 406afeb8
      Paul Fultz II authored
      
      
      * Add flag to enable cpu backend
      
      * Make buffers shared
      
      * Enable optimizations
      
      * Add onednn
      
      * Formatting
      
      * Formatting
      
      * Add dnnl header
      
      * Formatting
      
      * Rewrite rnn first
      
      * Formatting
      
      * Call reference implementation
      
      * Formatting
      
      * Make literal data shared
      
      * Formatting
      
      * Add convolution
      
      * Formatting
      
      * Compensate for dilation
      
      * Formatting
      
      * Use name/make_op instead
      
      * Formatting
      
      * Rename gemm header
      
      * Formatting
      
      * Add dnnl convolution/gemm operators
      
      * Formatting
      
      * Add eliminate_contiguous
      
      * Add faster pointwise operators
      
      * Formatting
      
      * Formatting
      
      * Formatting
      
      * Add dnnl op class
      
      * Formatting
      
      * Add add op
      
      * Formatting
      
      * Add concat operator
      
      * Formatting
      
      * Add more ops
      
      * Create descriptor during finalization
      
      * Formatting
      
      * Don't rewrite pooling
      
      * Enable memory coloring
      
      * Formatting
      
      * Add output aliases
      
      * Formatting
      
      * Fix errors
      
      * Formatting
      
      * Convert literals
      
      * Add missing file
      
      * Remove batch_norm
      
      * Formatting
      
      * Use strides
      
      * Formatting
      
      * Add some debug checks
      
      * Formatting
      
      * Fix bug in adjusting shape for gemm
      
      * Formatting
      
      * Fix fallback dot operator
      
      * Zero initialize buffers
      
      * Add support for group convolutions
      
      * Formatting
      
      * Make adjust allocation target independent
      
      * Formatting
      
      * Enable adjust_allocation for gpu/cpu
      
      * Formatting
      
      * Add copy to allocation model
      
      * Formatting
      
      * Add copy operator
      
      * Formatting
      
      * Better handling of output parameters in adjust_allocation
      
      * Formatting
      
      * Build with dnnl
      
      * Make dnnl required
      
      * Fix compile error
      
      * Tidy fixes
      
      * Formatting
      
      * Tidy fixes
      
      * Formatting
      
      * Fix more tidy issues
      
      * Formatting
      
      * Add mul op
      
      * Add mul op
      
      * Set c compiler to clang as well
      
      * Compensate for normalized compute shape
      
      * Formatting
      
      * Fix cppcheck errors
      
      * Formatting
      
      * Add onednn library to hcc
      
      * Guard clang pragmas
      
      * Disable cpu mode for gcc for now
      
      * Leave it enabled for gcc 7
      
      * Fix cppcheck suppression
      
      * Fix compile error on gcc 5
      
      * Remove unused code
      Co-authored-by: Shucai Xiao <shucai.xiao@amd.com>
      Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>