- 18 Oct, 2023 1 commit
-
-
zjing14 authored
* Add a condition to build fp8 instances * simplified buffer_load/store * add bfp8/fp8 * fixed * remove all f8/bf8 condition include folder * fixed cmake conditions * fixed DTYPES=fp16/bfp16 * fix * fixed buffer_load * fixed buffer_store * fix * clean example cmake files * fixed ci * fixed cit --------- Co-authored-by:
Rostyslav Geyyer <rosty.geyyer@amd.com> Co-authored-by:
Jing Zhang <jizha@amd.com>
-
- 21 Sep, 2023 1 commit
-
-
Illia Silin authored
* refactor cmake files for the tests * refactor cmake files for examples * fix cmake for gemm example * fix the cmake file for all examples * add splitting by data types in gemm_splitk instance header * rename test to reflect only dl instances are used * clean up CI workspace, update cmake for instances * change the jenkinsfile syntax * build all instances except DL on gfx11 * move workspace cleanup after stages * clean up workspace after every stage * isolate data types in grouped_conv_fwd header * isolate dl instances for grouped_conv2d_fwd * fix syntax * fix cmake and batchnorm instances * fix typo * fix reduction instances * fix grouped_conv headers * fix syntax * replace parsing logic for instances, replace bfp16 with bf16 * fix the client examples build * clean up DTYPES from instances cmake files * update the parsing logic in cmake files * make an exception for reduction kernels * update few remaining cmake files to handle DTYPES * fix syntax * fix cmake conflicts * replace f8 with fp8 test name * resolve conflicts for dpp instances
-
- 07 Aug, 2023 1 commit
-
-
Illia Silin authored
* properly split conv_nd_bwd_data instances * split conv2d_fwd instance data types * split the gemm, conv2d_fwd and batched_gemm_softamx_gemm * split the tests by data types where possible * filter examples by DTYPES * split few remaining examples by DTYPES * filter most instances by DTYPES * add new lines at end of headers, fix grouped_gemm profiler * fix syntax * split the ckprofiler instances by DTYPES * split the conv2d and quantization DL and XDL instances * fix the splitting of conv2d DL instances * split softmax and pool_fwd tests for fp16 and fp32 types * fix syntax * fix the dl_int8 quantization instances isolation
-
- 20 Sep, 2022 1 commit
-
-
Po Yen Chen authored
* Add example folder for 'DeviceElementwise' * Re-structure example files * Move common parts into common.hpp * Use more strict input * Add more helper methods in 'DeviceElementwise' * Use more specific method to write example * Allow specify problem through command line argument * Allow specify problem 'axes' through command line argument * Add check to template type argument * Add transpose_shape() to generalize shape permute * Generalize transpose utility functions * Use better name for tensor indices * Add checks in helper functions * Remove debug messages * Refine error message for check_err() * Generalize variable naming in example code * Add device op 'DevicePermute' This device op is clone of 'DeviceElementwise' * Use 'DevicePermute' device op in example * Remove 'elementwise' from identifiers * Remove 'elementwise' from file paths * Remove base class of 'DevicePermute' * Let 'DevicePermute' ...
-