Commits · 30072aec376a2fc6c6a465fe632bc7be050ec5ea · yangql / composable_kernel-1


10 Jun, 2021 1 commit

Restructure gridwise and blockwise GEMM, add tensor contraction and FWD-v4r5 (#36) · 30072aec

Chao Liu authored Jun 09, 2021

* experimenting with magic number division

* overhauling fwd-v4r4 to clearly reflect transformation graph

* added fwd-v4r5

* bug fix for make_dynamic_naive_tensor_descriptor_aligned_v2

* bug fix and added sanity-check in transform_dynamic_tensor_descriptor

* added conv_driver_v2

30072aec

12 May, 2021 1 commit

Use DynamicBuffer instead of raw pointer (#32) · 78b987fb

Chao Liu authored May 12, 2021

* Use DynamicBuffer to hold raw pointer (to global and LDS memory)

* add workaround for compiler issue (inefficient ISA) of ds_write for int8x4, int8x8, int8x16

78b987fb

28 Apr, 2021 1 commit
- Use Tuple and vector_type instead of Array for holding tensor data (#30) · d075adf1
  Chao Liu authored Apr 28, 2021
```
* replacing array with tuple and vector for tensor data
```
  d075adf1
13 Apr, 2021 2 commits
- Overhaul vector_type and use real vector for int8x4_t instead of aliasing from int32_t (#29) · e4790c25
  Chao Liu authored Apr 12, 2021
```
* overhaul vector_type, make int8x4_t real vector instead of aliasing from int32_t
```
  e4790c25
- Initial implementation of magic number division and "Merge" transformation that uses it (#28) · 3bf52e60
  Chao Liu authored Apr 12, 2021
```
* initial implementation for magic number division and DynamicMerge_v2_magic_division that uses it

* turn off DynamicMerge_v2_magic_division, which uses magic number division, by default
```
  3bf52e60
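For readers unfamiliar with the term, "magic number division" replaces an integer division by a runtime-constant divisor with a multiply, an add, and a shift, using two numbers precomputed from the divisor. The following is a minimal host-side sketch of the general technique, not the code from this commit: the names `MagicNumbers`, `compute_magic_numbers`, and `magic_divide` are hypothetical, and the sketch assumes an unsigned 32-bit dividend and a divisor between 1 and 2^31 - 1.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical names for illustration; not the repository's API.
struct MagicNumbers
{
    std::uint32_t multiplier;
    std::uint32_t shift;
};

// Precompute the multiplier and shift for a fixed divisor.
// Assumption of this sketch: 1 <= divisor <= 2^31 - 1.
inline MagicNumbers compute_magic_numbers(std::uint32_t divisor)
{
    assert(divisor >= 1 && divisor <= 0x7fffffffu);

    // shift = smallest s such that 2^s >= divisor
    std::uint32_t shift = 0;
    while(shift < 32 && (std::uint64_t{1} << shift) < divisor)
    {
        ++shift;
    }

    // multiplier = floor(2^32 * (2^shift - divisor) / divisor) + 1
    const std::uint64_t one = 1;
    const std::uint64_t multiplier =
        ((one << 32) * ((one << shift) - divisor)) / divisor + 1;

    return {static_cast<std::uint32_t>(multiplier), shift};
}

// Compute dividend / divisor without a divide instruction.
inline std::uint32_t magic_divide(std::uint32_t dividend, MagicNumbers m)
{
    // Upper 32 bits of the 64-bit product dividend * multiplier.
    const std::uint64_t high =
        (static_cast<std::uint64_t>(dividend) * m.multiplier) >> 32;

    // The sum is held in 64 bits, so it cannot overflow before the shift.
    return static_cast<std::uint32_t>((high + dividend) >> m.shift);
}
```

The precomputation would typically run once on the host for each tensor dimension length, so that index calculations inside a kernel only need the cheap multiply, add, and shift path.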
06 Apr, 2021 2 commits
- Fix performance issue when passing tensor descriptor from host to kernel by void pointers (#27) · d2217f30
  Chao Liu authored Apr 06, 2021
```
* use address_space(4) in kernel signature to fix performance issue when passing tensor descriptor from host to kernel by (void) pointers

* remove passing by pointer* option (only use pass by value or void*)
```
  d2217f30
- bug fix for buffer resource setting (#26) · 6a5ea493
  zjing14 authored Apr 06, 2021
  
  6a5ea493
25 Mar, 2021 1 commit

Dynamic tensor descriptor (#24) · fcbb9788

Chao Liu authored Mar 25, 2021



* support dynamic tensor descriptor

* use buffer load OOB feature for padding case

* add navi support

* add int8x4 inference kernel

Co-authored-by: Chao Liu <chao@ixt-rack-81.local.lan>
Co-authored-by: Jing Zhang <jizhan@amd.com>

fcbb9788

29 Jul, 2020 1 commit

Improve buffer address for out of bound check (#21) · ac62d13e

Chao Liu authored Jul 29, 2020

* Use buffer load built-in OOB check. Buffer size is limited to 2 GB.
* buffer APIs use combined wave and thread offset
* use uint32_t for addr shift in buffer addressing

ac62d13e

24 Jun, 2020 1 commit

Code clean up (#20) · 5c7cec11

Chao Liu authored Jun 23, 2020



* tuning parameters

* testing on v100

* add fp16

* remove deprecated tensor descriptor

* sync with MIOpen

* update build script

Co-authored-by: Jing Zhang <jizhan@amd.com>

5c7cec11

17 Feb, 2020 1 commit
- MIOpen integration (#13) · 1a66e35b
  Chao Liu authored Feb 17, 2020
```
* update for MIOpen integration: cosmetic refactor
```
  1a66e35b
27 Jan, 2020 1 commit
- Update for recent MIOpen integration (#11) · 3406a114
  Chao Liu authored Jan 27, 2020
```
* update for MIOpen integration
```
  3406a114
20 Jan, 2020 1 commit

Added bwd data v3r1 v4r1, tweaking v1 (#10) · c5da0377

Chao Liu authored Jan 20, 2020

* Added bwd data v3r1: breaking down compute into a series of load balanced GEMM, and launch in a single kernel
* Added bwd data v4r1: like v3r1, but launch GEMMs in multiple kernels
* Tweaked v1r1 and v1r2 (atomic) on AMD GPU

c5da0377

03 Dec, 2019 1 commit

backward data (#7) · 8f5f6496

Chao Liu authored Dec 03, 2019

* enabled atomic add in tensor copy
* added gridwise GEMM
* added backward data conv using GEMM + atomic
* added backward data conv using GEMM, no atomic

8f5f6496

04 Nov, 2019 1 commit
- MIOpen integration: recent bug fixes from MIOpen (#5) · 562e1e27
  Chao Liu authored Nov 04, 2019
  
  562e1e27
11 Oct, 2019 1 commit
- Refactor for MIOpen integration (#4) · 52c3fe05
  Chao Liu authored Oct 11, 2019
```
Refactor so that multi-index transformation and padding support can be brought into MIOpen
```
  52c3fe05