1. 12 Oct, 2022 3 commits
  2. 11 Oct, 2022 3 commits
  3. 06 Oct, 2022 2 commits
  4. 05 Oct, 2022 2 commits
  5. 04 Oct, 2022 4 commits
  6. 03 Oct, 2022 3 commits
    • Update readme (#465) · 9d8f834a
      Chao Liu authored
      * update cmake script
      
      * update readme
      
      * Update README.md
      
      * add citation
      
      * add images
      
      * Update README.md
      
      * update
      
      * Update README.md
      
      * Update CONTRIBUTORS.md
      
      * Update README.md
      
      * Update CITATION.cff
      
      * Update README.md
      
      * Update CITATION.cff
      
      * update doc
      
      * Update CONTRIBUTORS.md
      
      * Update LICENSE
      
      * update
      9d8f834a
    • Update doc (#464) · 6de749e2
      Chao Liu authored
      * update cmake script
      
      * update readme
      
      * Update README.md
      
      * add citation
      
      * add images
      
      * Update README.md
      
      * update
      
      * Update README.md
      
      * Update CONTRIBUTORS.md
      
      * Update README.md
      
      * Update CITATION.cff
      
      * Update README.md
      
      * Update CITATION.cff
      
      * update doc
      
      * Update CONTRIBUTORS.md
      
      * Update LICENSE
      6de749e2
    • Update documentation: README, contributors, citation (#463) · 473ba5bc
      Chao Liu authored
      * update cmake script
      
      * update readme
      
      * Update README.md
      
      * add citation
      
      * add images
      
      * Update README.md
      
      * update
      
      * Update README.md
      
      * Update CONTRIBUTORS.md
      
      * Update README.md
      
      * Update CITATION.cff
      
      * Update README.md
      
      * Update CITATION.cff
      473ba5bc
  7. 01 Oct, 2022 1 commit
    • Allow setting ROCM version, activate ccache, etc. (#462) · 7fc3ed76
      Illia Silin authored
      * enable ccache and decouple it from MIOpen ccache use
      
      * fix the ccache check script
      
      * use another method to get server name
      
      * fix syntax
      
      * add quotes around the server name variable
      
      * use check_host as function
      
      * change syntax
      
      * fix syntax
      
      * test if server name is parsed correctly
      
      * try different syntax
      
      * check the env var value
      
      * test new check node function
      
      * add ROCMVERSION parameter and fix script syntax
      
      * fix script syntax
      
      * add missing instances of rocm version
      
      * install ccache in the docker image
      
      * do not check GPU in clang format stage, clean up old code
      
      * update defaults and clean up
      7fc3ed76
  8. 28 Sep, 2022 3 commits
  9. 27 Sep, 2022 2 commits
    • Fix build issues, set new compiler default, etc. (#451) · b8825547
      Illia Silin authored
      * add an option to select specific compiler commit
      
      * change the logic of forcing building a docker
      
      * add check for compiler commit in dockerfile
      
      * compiler check syntax fix
      
      * change compiler selection logic
      
      * fix the new compiler build issue
      
      * set new compiler as default, update dev-requirements
      
      * fix jenkins syntax
      
      * fix docker syntax
      
      * get rid of hipcc.pl editing in jenkinsfile
      
      * fix the hipcc.pl in both places
      
      * try to fix the 10738 compiler linking bug
      
      * fix syntax
      
      * use dockerhub to store images
      
      * use newer amd-stg-open commit as default
      b8825547
    • fixed NumDim dimension error · 76b44c60
      Astha Rai authored
      76b44c60
  10. 26 Sep, 2022 3 commits
  11. 25 Sep, 2022 4 commits
  12. 23 Sep, 2022 1 commit
  13. 22 Sep, 2022 2 commits
  14. 21 Sep, 2022 3 commits
    • Updated the supported components (#435) · 7acbf104
      Lixun Zhang authored
      7acbf104
    • Build the CK targets only once. (#433) · 85b0920d
      Illia Silin authored
      * build CK only once, use deb package in all subsequent stages
      
      * update jenkins file
      
      * change prefix for build_CK stage
      
      * update writing deb metadata to control file
      
      * update ubuntu source for docker, script syntax for deb package metadata
      
      * try different way to create deb metadata
      
      * clean up DEBIAN before creating one
      
      * fix the CI folder names, fix splitK qa
      
      * use correct docker in all stages, separate tests for splitK verification and performance
      
      * clean old comments, change dir before packaging
      
      * use different package syntax
      
      * change packaging syntax
      
      * package with cmake
      
      * remove unnecessary build prefix
      
      * get rid of unnecessary paths
      
      * change paths during unpacking
      
      * change script syntax while unpacking
      
      * get rid of unnecessary steps
      
      * get rid of comments in the scripts
      
      * use double quotes for scripts
      
      * add ccache during build, try dpkg -x
      
      * pull and install each package separately
      
      * use full package names
      
      * try to use stashing for packages
      
      * change stash/unstash syntax
      
      * move unstash out of shell, run tests on any gpu node
      
      * unpack each package separately
      
      * try re-using existing workspace
      
      * merge the build and test stages, only stash ckProfiler
      
      * merge the build and test stages, only stash zipped ckProfiler
      
      * fix syntax
      
      * add GPU check before build and test, rename docker to usual name
      85b0920d
    • fixed G offset calc for long_index (#428) · 01876afa
      zjing14 authored
      01876afa
  15. 20 Sep, 2022 4 commits
    • fix build (#427) · 567f70f5
      Chao Liu authored
      * fix build
      
      * fix build
      567f70f5
    • MNKO padding support on bmm+masking+scale+softmax+bmm+permute (#425) · ebab84b6
      Shaojie WANG authored
      
      
      * add lower triangle bmm
      
      * init code for tile skipping
      
      * functionality is correct with lower triangle mask
      
      * add decoder lower triangular mask calculation
      
      * use 7*13 group
      
      * fix n2 compute error
      
      * attention with lower triangle mask with tile skipping
      
      * add template to distinguish masking kernel
      
      * rename template and remove default template value
      
      * remove lower triangle gemm reference struct
      
      * add some comments on example
      
      * add 10 instances for masking bmm + scale + softmax + bmm + permute kernels
      
      * add test
      
      * add test file
      
      * add gtest for bmm masking scale softmax bmm permute
      
      * clang-format
      
      * fix compile error
      
      * check left bottom corner for tile skipping
      
      * fix error: check left bottom corner for tile skipping
      
      * add k padding
      
      * add test and instance for MNK padding
      
      * passing a mask struct
      
      * fix instances
      
      * delete unused comments
      
      * format
      Co-authored-by: danyao12 <yaodan@dc-smc-13.amd.com>
      Co-authored-by: Chao Liu <chao.liu2@amd.com>
      ebab84b6
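
The tile-skipping check mentioned in the commit above ("check left bottom corner for tile skipping") can be illustrated with a minimal sketch. It assumes a causal (lower triangular) mask in which column n is masked for row m whenever n > m. The function names and the tiling loop are hypothetical and written for illustration only; they are not the kernel's actual code.

```python
def is_masked(m, n):
    """Causal (lower triangular) mask: column n is masked for row m when n > m."""
    return n > m


def tile_is_skippable(m0, m_rows, n0, M):
    """Return True when every element of the tile starting at (m0, n0) is masked.

    Only the bottom-left corner is checked: it has the largest row index and the
    smallest column index in the tile, so it is the last element to become masked.
    """
    last_row = min(m0 + m_rows, M) - 1  # clamp for a padded (partial) last tile
    return is_masked(last_row, n0)


def count_tiles(M, N, m_per_tile, n_per_tile):
    """Count (computed, skipped) tiles of an M x N attention score matrix."""
    computed = skipped = 0
    for m0 in range(0, M, m_per_tile):
        for n0 in range(0, N, n_per_tile):
            if tile_is_skippable(m0, m_per_tile, n0, M):
                skipped += 1
            else:
                computed += 1
    return computed, skipped
```

For a 4 x 4 score matrix split into 2 x 2 tiles, `count_tiles(4, 4, 2, 2)` returns `(3, 1)`: only the upper-right tile lies entirely above the diagonal and can be skipped.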
    • Illia Silin's avatar
    • Group norm (#417) · 4eba345f
      rocking5566 authored
      
      
      * Add groupnorm example by layernorm
      1. Reference is not ready
      2. Shape of gamma and beta needs to be fixed
      
      * Allow the shape of gamma and beta to be the same as x
      
      * Modify test, instance and client example
      
      * [What] Fix layernorm bug for inputs with more than 2 dimensions.
      [Why] We need to get upper length from merge transform instead of embed transform.
      
      * Add reference for groupnorm
      
      * Fuse sigmoid after groupnorm
      
      * [What] Rename original layernorm into layernorm2d
      [Why] Prepare to add groupnorm using layernorm5d
      
      * clang-format
      
      * Add groupnorm test
      
      * Refine error message
      
      * Add groupnorm ckProfiler
      
      * Test groupnorm kernel from device_instance
      
      * update example
      
      * update profiler
      
      * Fix test naming
      
      * Fix argc number
      
      * Move descriptor and sweeponce to argument for quick debugging
      Co-authored-by: Chao Liu <chao.liu2@amd.com>
      4eba345f
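
The idea of expressing group normalization as a 5-D layer normalization, as the commit above describes, can be sketched in plain Python. This is an illustration under stated assumptions, not the library's implementation: the `[N][H][W][G][C]` layout, the `[G][C]` shape for gamma and beta, the `eps` default, and the optional `activation` hook (standing in for the fused sigmoid) are all assumptions chosen for the example.

```python
import math


def sigmoid(v):
    """Logistic function, usable as the optional fused activation."""
    return 1.0 / (1.0 + math.exp(-v))


def group_norm(x, gamma, beta, eps=1e-5, activation=None):
    """Group norm as a layernorm over the H, W and C axes of a 5-D input.

    x:           nested lists indexed [N][H][W][G][C] (assumed layout)
    gamma, beta: nested lists indexed [G][C] (assumed shape)
    activation:  optional elementwise function applied to the normalized output
    """
    N, H, W = len(x), len(x[0]), len(x[0][0])
    G, C = len(x[0][0][0]), len(x[0][0][0][0])
    y = [[[[[0.0] * C for _ in range(G)] for _ in range(W)]
          for _ in range(H)] for _ in range(N)]
    for n in range(N):
        for g in range(G):
            # Statistics are shared by all H * W * C elements of one (n, g) group.
            vals = [x[n][h][w][g][c]
                    for h in range(H) for w in range(W) for c in range(C)]
            mean = sum(vals) / len(vals)
            var = sum((v - mean) ** 2 for v in vals) / len(vals)
            inv_std = 1.0 / math.sqrt(var + eps)
            for h in range(H):
                for w in range(W):
                    for c in range(C):
                        out = (x[n][h][w][g][c] - mean) * inv_std
                        out = out * gamma[g][c] + beta[g][c]
                        y[n][h][w][g][c] = activation(out) if activation else out
    return y
```

Passing `activation=sigmoid` applies the sigmoid in the same pass as the normalization, which mirrors the fusion the commit mentions without claiming to match how the kernel does it.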