"src/include/blockwise_gemm.hpp" did not exist on "114fdb58af8253a1f3bc79057188fed9b4e95e8b"
  1. 13 Aug, 2022 2 commits
    • Chao Liu's avatar
      000eefbf
    • Anthony Chang's avatar
      Fused attention (#345) · cac014f1
      Anthony Chang authored
      
      
      * initial stub for gemm_gemm_xdl_cshuffle
      
      * set up example code
      
      * compiles
      
      * prevent integer overflow
      
      * harmonize interface between ref_gemm and ref_batched_gemm
      
      * batched_gemm_gemm
      
      * fix example
      
      * host tensor gen: diagonal pattern in lowest two-dimensions only
      
      * make c descriptors containing only integral constants
      
      * clean up
      
      * add BlockwiseGemmXdlops_v2 while exploring an unified approach
      
      * implement proper interface
      
      * tidy up example
      
      * fix compilation warnings
      
      * coarsely controlled 2nd gemm padding
      
      * remove rocm-cmake's hard requirement for certain revision
      
      * clang-format
      
      * resolve merge conflict
      
      * fix compilation error on gfx10
      
      * adds acc0 elementwise op to interface
      
      * attention host validation
      
      * add blockwsie softmax v1
      
      * iteratively update softmax+gemm
      
      * transpose both gemm0 and gemm1 xdl output so as to avoid broadcasting softmax max/sum
      
      * add init method for easier debugging
      
      * do away with manual thread cluster calculation
      
      * generalize blockwise softmax interface
      
      * row-wise softmax sum & max
      
      * format
      
      * rename to DeviceBatchedGemmSoftmaxGemm
      
      * add gemm_softmax_gemm instances and tests
      
      * comment
      Co-authored-by: default avatarltqin <letao.qin@amd.com>
      Co-authored-by: default avatarChao Liu <chao.liu2@amd.com>
      cac014f1
  2. 12 Aug, 2022 4 commits
  3. 11 Aug, 2022 24 commits
  4. 10 Aug, 2022 1 commit
    • zjing14's avatar
      Add batched/grouped_gemm contraction deviceOps (#349) · e08d68d2
      zjing14 authored
      
      
      * convnd_fwd fp16 example
      
      * update example
      
      * update example
      
      * update instance
      
      * updating refernce conv
      
      * update reference conv
      
      * update conv fwd profiler
      
      * update conv 1d and 3d instance
      
      * update include path
      
      * clean
      
      * update profiler for conv bwd data and weight
      
      * update conv bwd weight
      
      * clean
      
      * update conv example
      
      * update profiler for conv bwd weight
      
      * update ckprofiler for conv bwd data
      
      * fix reference conv bwd data bug; update conv bwd data test
      
      * update examples
      
      * fix initialization issue
      
      * update test for conv fwd
      
      * clean
      
      * clean
      
      * remove test case too sensitive to error threshhold
      
      * fix test
      
      * clean
      
      * fix build
      
      * adding conv multiple d
      
      * adding conv multiple D
      
      * add matrix padder
      
      * add gemm padding to convnd
      
      * adding group conv
      
      * update gemm multi-d
      
      * refactor
      
      * refactor
      
      * refactor
      
      * clean
      
      * clean
      
      * refactor
      
      * refactor
      
      * reorg
      
      * add ds
      
      * add bias
      
      * clean
      
      * add G
      
      * adding group
      
      * adding group
      
      * adding group
      
      * update Tensor
      
      * clean
      
      * update example
      
      * update DeviceGemmMultipleD_Xdl_CShuffle
      
      * update conv bwd-data and bwd-weight
      
      * upate contraction example
      
      * update gemm and batch gemm with e permute
      
      * fix example build
      
      * instance for grouped conv1d
      
      * update example
      
      * adding group conv instance
      
      * update gemm bilinear instance
      
      * update gemm+add+add+fastgelu instance
      
      * update profiler
      
      * update profiler
      
      * update test
      
      * update test and client example
      
      * clean
      
      * add grouped conv into profiler
      
      * update profiler
      
      * clean
      
      * add test grouped conv, update all conv test to gtest
      
      * update test
      
      * change gemm_c_permute with contraction
      
      * add grouped_contraction
      
      * add contraction in group_gemm
      
      * add example of grouped_gemm with contraction
      
      * add example of grouped_contraction_bias_e_permute
      
      * clean
      
      * fixed ds
      
      * add m3n2 m2n3 examples into gemm_bias_e_permute
      Co-authored-by: default avatarChao Liu <chao.liu2@amd.com>
      e08d68d2
  5. 08 Aug, 2022 1 commit
    • Illia Silin's avatar
      Fix QA, allow switching compiler versions, fix google test compilation error. (#348) · aba7fefc
      Illia Silin authored
      * allow selecting compiler version
      
      * fix typo
      
      * add Wno-deprecated flag for google tests
      
      * change git repo, fix qa log files names
      
      * change the git clone syntax
      
      * use Omkar's git credentials
      
      * try to use jenkins as git user
      
      * try using illsilin username for gerrit repo with ssh key
      
      * try new gerrit authorization
      
      * change ssh key syntax
      
      * try another way of passing ssh key to docker
      
      * add mount ssh in dockerfile
      
      * create .ssh folder
      
      * move ssh-keyscan to later
      
      * get rid of npm call
      
      * build first docker image on master
      
      * check the contents of the .ssh folder
      
      * try replacing omkars creds with gerrit creds
      
      * use open repo, clean up changes
      
      * get rid of ssh default argument
      aba7fefc
  6. 07 Aug, 2022 1 commit
  7. 03 Aug, 2022 1 commit
  8. 02 Aug, 2022 2 commits
    • Adam Osewski's avatar
      CGEMM examples bf16, fp32, int8 (#332) · fb0dc358
      Adam Osewski authored
      
      
      * Add int8 specialization for elementwise Add and Subtract.
      
      * CGEMM examples bf16, fp32, int8
      
      * Add convert reference output to CDataType.
      
      * Skip BF16 data type during testing.
      
      * Lower K value to get rid of accumulation error.
      
      * Fix merge artifact.
      
      * Fix changed function name: GetElementSpaceSize()
      
      * Fix merge artifact.
      Co-authored-by: default avatarAdam Osewski <aosewski@amd.com>
      fb0dc358
    • Illia Silin's avatar
      Run CI on MI100 nodes only, run daily QA on MI200 nodes. (#339) · 984b3722
      Illia Silin authored
      
      
      * turn on full qa only on gfx90a, use int initialization
      
      * change script syntax
      
      * update script parsing clinfo, throw exception if 0 devices
      
      * fix syntax
      
      * try using toBoolean for the QA conditions
      
      * run regular CI on MI100 only, use MI200 only for daily QA
      
      * evaluate when conditions before agent
      
      * launch QA on develop branch and update profile_reduce script
      
      * update test script
      
      * update script
      
      * remove false dependency from dockerfile
      
      * try removing rbuild completely
      Co-authored-by: default avatarChao Liu <chao.liu2@amd.com>
      Co-authored-by: default avatarChao Liu <lc.roy86@gmail.com>
      984b3722
  9. 29 Jul, 2022 1 commit
    • Chao Liu's avatar
      Clean up conv example, Instances, profiler and test (#324) · 500fa995
      Chao Liu authored
      * convnd_fwd fp16 example
      
      * update example
      
      * update example
      
      * update instance
      
      * updating refernce conv
      
      * update reference conv
      
      * update conv fwd profiler
      
      * update conv 1d and 3d instance
      
      * update include path
      
      * clean
      
      * update profiler for conv bwd data and weight
      
      * update conv bwd weight
      
      * clean
      
      * update conv example
      
      * update profiler for conv bwd weight
      
      * update ckprofiler for conv bwd data
      
      * fix reference conv bwd data bug; update conv bwd data test
      
      * update examples
      
      * fix initialization issue
      
      * update test for conv fwd
      
      * clean
      
      * clean
      
      * remove test case too sensitive to error threshhold
      
      * fix test
      
      * clean
      
      * fix build
      
      * adding conv multiple d
      
      * adding conv multiple D
      
      * add matrix padder
      
      * add gemm padding to convnd
      
      * adding group conv
      
      * update gemm multi-d
      
      * refactor
      
      * refactor
      
      * refactor
      
      * clean
      
      * clean
      
      * refactor
      
      * refactor
      
      * reorg
      
      * add ds
      
      * add bias
      
      * clean
      
      * add G
      
      * adding group
      
      * adding group
      
      * adding group
      
      * update Tensor
      
      * clean
      
      * update example
      
      * update DeviceGemmMultipleD_Xdl_CShuffle
      
      * update conv bwd-data and bwd-weight
      
      * upate contraction example
      
      * update gemm and batch gemm with e permute
      
      * fix example build
      
      * instance for grouped conv1d
      
      * update example
      
      * adding group conv instance
      
      * update gemm bilinear instance
      
      * update gemm+add+add+fastgelu instance
      
      * update profiler
      
      * update profiler
      
      * update test
      
      * update test and client example
      
      * clean
      
      * add grouped conv into profiler
      
      * update profiler
      
      * clean
      
      * add test grouped conv, update all conv test to gtest
      
      * update test
      500fa995
  10. 22 Jul, 2022 2 commits
  11. 21 Jul, 2022 1 commit
    • Illia Silin's avatar
      Add full QA with verification option, few other changes. (#331) · d8415a96
      Illia Silin authored
      * add verify flag and update scripts
      
      * replace old check_error function with the new check_err
      
      * fix syntax
      
      * remove blank spaces
      
      * remove empty line
      
      * add check_err for tensors
      
      * fix syntax
      
      * replace tensors with vectors in check_err calls
      
      * fix syntax
      
      * remove blank spaces
      
      * fix syntax
      
      * add new line at end of file
      
      * disable conv2d_bwd_weight test, add gpu check
      
      * set check_gpu using export
      
      * check GPU using runShell
      
      * add definition of runShell
      
      * fix script syntax
      
      * reduce the number of threads, add full qa option
      
      * run processing scripts in bash
      
      * fix the branch and host names in performance scripts, add chronos
      
      * replace parameterizedCron with cron
      
      * archive the perf log files
      
      * try to fix git call
      
      * pass branch and host names as arguments into scripts
      
      * fix script arguments
      
      * fix script arguments
      
      * process results on master
      
      * fix pipeline
      
      * add definition of gpu_arch
      
      * run processing scripts in docker
      
      * fix the brackets
      
      * add agent master for the processing stage
      
      * get rid of show_node_info call on master
      
      * try using mici label instead of master, disable MI100 tests for now
      
      * fix syntax
      
      * simplify container for results processing
      
      * remove node(master) from the process_results stage
      
      * put all stages in original order
      
      * change the agent label from master to mici for gfx908
      d8415a96