- 04 Feb, 2025 2 commits
-
-
Andriy Roshchenko authored
-
Andriy Roshchenko authored
-
- 03 Feb, 2025 1 commit
-
-
Andriy Roshchenko authored
-
- 01 Feb, 2025 1 commit
-
-
Andriy Roshchenko authored
-
- 31 Jan, 2025 4 commits
-
-
Andriy Roshchenko authored
-
Andriy Roshchenko authored
Test the functionality of V_MFMA_F32_16X16X128_F8F6F4 and V_MFMA_F32_32X32X64_F8F6F4 instructions. (#293)
* Introduced MFMA tests
* Verified f8f6f4 MFMA Instructions
-
Andriy Roshchenko authored
-
Andriy Roshchenko authored
-
- 30 Jan, 2025 5 commits
-
-
Andriy Roshchenko authored
-
Andriy Roshchenko authored
-
Andriy Roshchenko authored
-
Andriy Roshchenko authored
-
Andriy Roshchenko authored
-
- 29 Jan, 2025 6 commits
-
-
Andriy Roshchenko authored
-
Andriy Roshchenko authored
-
Andriy Roshchenko authored
-
Andriy Roshchenko authored
-
Andriy Roshchenko authored
-
Andriy Roshchenko authored
-
- 24 Jan, 2025 1 commit
-
-
Andriy Roshchenko authored
-
- 22 Jan, 2025 4 commits
-
-
Illia Silin authored
Fix build logic when building for multiple targets, including gfx950.
-
illsilin authored
-
illsilin authored
-
illsilin authored
-
- 21 Jan, 2025 2 commits
-
-
Andriy Roshchenko authored
-
illsilin authored
-
- 20 Jan, 2025 1 commit
-
-
illsilin authored
-
- 17 Jan, 2025 4 commits
-
-
illsilin authored
-
Illia Silin authored
Merge from public
-
illsilin authored
-
Aviral Goel authored
* smoke and regression targets working with tests
* test filters work for both examples and test
* removed unnecessary comments
* added a missing comment
* added a missing comment
* fixed typo in the comments
* updated README
* Update PULL_REQUEST_TEMPLATE.md updating the template for future addition of test cases
* Update PULL_REQUEST_TEMPLATE.md
-
- 16 Jan, 2025 2 commits
-
-
Bartłomiej Kocot authored
* Fix and optimize dynamic unary elementwise
* fix
-
carlushuang authored
* fix mock token id
* prepare host for g1u1
* reformat inline-asm
* restructure uk_0
* restructure gate_up
* done
* change default to init=1
* update readme
* fix a bug in interleave pipeline
* rcp for silu
-
- 15 Jan, 2025 4 commits
-
-
Illia Silin authored
-
Bartłomiej Kocot authored
* Add rounding for float to bf16 conversion
* Add bhalf test
* Add inf test bhalf
* Refactor
* update cmake
* Fixes
-
ruanjm authored
* Add shortcut to RMSNorm
* Modify test for adding shortcut for RMSNorm
* Add fused parameter into tests
* 1. Add YDataType.
  2. rmsnorm2d_fwd_traits_ from rmsnorm2d_fwd.hpp to rmsnorm2d_fwd_api.cpp and rmsnorm2d_fwd_instance_common.hpp
* 1. Supports various stride and precisions.
* Add support of Epilogue
* Add fuse and epilogue support to rmsnorm ref
* Modify rmsnorm example
* Refactor tests/examples
* Bug fix for newly added tests/examples
* Bug fix for new tests 2
* Modify smoke test scripts remove dbg code
* Supports non-smooth dynamic quant
* Update Rmsnorm2dFwd::GetName()
* rename xscale and prec_sx to smoothscale and prec_sm
  Bug fix after rename
  Remove files
* change example_rmsnorm2d_fwd.cpp
* update performance calculator
* Fix issue in two-pass when fuse add is enabled
* Remove comment of beta
---------
Co-authored-by: rocking <ChunYu.Lai@amd.com>
-
Andriy Roshchenko authored
Temporarily uses the `DeviceGemmMultiD_ABScale_Xdl_CShuffle_V3` kernel and 128x128 scaling matrices. Must be modified to use an MX-native GEMM kernel with 16 or 32 component vectors per scale. Verified on the emulator.
-
- 14 Jan, 2025 1 commit
-
-
Andriy Roshchenko authored
-
- 13 Jan, 2025 2 commits
-
-
Max Podkorytov authored
add unit test for gen instances for gemms
add unit tests for conv and batched gemms
add unit test for preselected gemm instances
apply ruff lint
add license header for the unit test
add inductor pytest to CI
verbose pip install
switch the directory before installing python packages
move the inductor codegen test
try yet another workdir
Update Jenkinsfile
The directory looks right, fixing pip module not found by invoking pip directly
Update Jenkinsfile
invoke pytest directly since the module is not found
Update Dockerfile
Install setuptools
update package structure
bump setuptools
maybe fix data path for library sources
fix library search path for conv instances
fix path in pyproject definition
compare path used in gen_instances with one in pyproject.toml; fix the difference
Co-authored-by: Illia Silin <98187287+illsilin@users.noreply.github.com>
-
feli authored
* port tiles from a8w8
* rm debug used files
* add instances
* remove all non gemm in cmake
* merge; impl fp16
* recover cmake from develop
* add missed files; fix clang format
---------
Co-authored-by: coderfeli <coderfeli@163.com>
-