- 04 Jun, 2021 1 commit
-
-
Jing Zhang authored
-
- 03 Jun, 2021 1 commit
-
-
Jing Zhang authored
-
- 02 Jun, 2021 1 commit
-
-
Jing Zhang authored
-
- 01 Jun, 2021 7 commits
-
-
Jing Zhang authored
-
Jing Zhang authored
-
Jing Zhang authored
-
Jing Zhang authored
-
Jing Zhang authored
-
Jing Zhang authored
-
Jing Zhang authored
-
- 31 May, 2021 1 commit
-
-
Jing Zhang authored
-
- 26 May, 2021 2 commits
-
-
Jing Zhang authored
-
Jing Zhang authored
-
- 25 May, 2021 1 commit
-
-
Jing Zhang authored
-
- 21 May, 2021 3 commits
-
-
Jing Zhang authored
-
Jing Zhang authored
-
Jing Zhang authored
-
- 20 May, 2021 2 commits
-
-
Jing Zhang authored
-
Jing Zhang authored
-
- 19 May, 2021 2 commits
-
-
Jing Zhang authored
-
Jing Zhang authored
-
- 18 May, 2021 5 commits
-
-
Jing Zhang authored
-
Jing Zhang authored
-
Jing Zhang authored
-
Jing Zhang authored
-
Jing Zhang authored
-
- 17 May, 2021 1 commit
-
-
Jing Zhang authored
-
- 16 May, 2021 2 commits
-
-
Jing Zhang authored
-
Jing Zhang authored
-
- 13 May, 2021 1 commit
-
-
Jing Zhang authored
-
- 12 May, 2021 3 commits
-
-
Jing Zhang authored
-
Chao Liu authored
-
Chao Liu authored
* Use DynamicBuffer to hold raw pointer (to global and LDS memory)
* add workaround for compiler issue (inefficient ISA) of ds_write for int8x4, int8x8, int8x16
-
- 11 May, 2021 2 commits
-
-
Jing Zhang authored
-
Chao Liu authored
* Replace most raw index calculation with coordinate transformation
* Overhaul blockwise and threadwise GEMM
* Overhaul driver for gridwise GEMM kernel
Co-authored-by: Jing Zhang <jizhan@amd.com>
-
- 28 Apr, 2021 1 commit
-
-
Chao Liu authored
* replacing array with tuple and vector for tensor data
-
- 13 Apr, 2021 2 commits
-
-
Chao Liu authored
* overhaul vector_type, make int8x4_t a real vector type instead of an alias of int32_t
-
Chao Liu authored
* initial implementation of magic number division and of DynamicMerge_v2_magic_division, which uses it
* turn off DynamicMerge_v2_magic_division (which uses magic number division) by default
-
- 07 Apr, 2021 1 commit
-
-
zjing14 authored
* Hybrid direct + implicit GEMM forward convolution NCHWc v5r1. Input tensor bypasses LDS. Supports fp32/fp16/int8
-
- 06 Apr, 2021 1 commit
-
-
Chao Liu authored
* use address_space(4) in kernel signature to fix performance issue when passing tensor descriptor from host to kernel by (void) pointers
* remove passing by pointer* option (only use pass by value or void*)
-