Commit ca34d4d2 authored by yanjl1

Initial

# hygon_samples
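This project provides usage examples for the hipDNN (HIP Deep Neural Network) frontend API, covering common deep learning operators, fused operators, and PyTorch integration on Hygon DCU (Deep Computing Unit) hardware.
## Requirements
- **DTK version**: ≥ 25.04.2 (26.04 recommended)
- **Supported architectures**: `gfx906`, `gfx926`, `gfx928`, `gfx936`, `gfx938`, `gfx92a`
- **Dependencies**: `hipdnn` (Python/C++), `hip::host`, `hipdnn_frontend`, `PyTorch` (Python examples)
Load the DTK environment before any development or execution: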
```bash
source /data/dtk-26.04/env.sh
```
> If the environment is not loaded, the C++ build fails with `Must be source dtk/env.sh` (`ROCM_PATH` is not set).
## Directory Structure
```
.
├── cpp/ # C++ examples (hipDNN Frontend C++ API)
│ ├── CMakeLists.txt
│ ├── utils.hpp # Error-checking macros (HIP_CHECK / HIPDNN_CHECK / HIPDNN_FE_CHECK)
│ ├── build/ # Build output directory
│ ├── convolution/ # Convolution forward / backward / weight gradient
│ ├── conv_fusion/ # Convolution fusion: bias + ReLU/Swish/PReLU/Add, etc.
│ ├── conv_depthtospace_fusion/ # Convolution + DepthToSpace fusion
│ ├── concat_conv_fusion/ # Concat + convolution fusion
│ ├── matmul/ # Matrix multiplication
│ ├── matmul_fusion/ # MatMul + bias + activation
│ ├── batchnorm/ # BatchNorm inference / training / backward
│ ├── layernorm/ # LayerNorm
│ ├── groupnorm/ # GroupNorm
│ ├── instancenorm/ # InstanceNorm
│ ├── rmsnorm/ # RMSNorm
│ ├── sdpa/ # Scaled Dot-Product Attention
│ ├── rope/ # RoPE (rotary position embedding)
│ ├── deformconvolution/ # Deformable convolution
│ ├── deformattention/ # Deformable attention
│ ├── adamw/ # AdamW optimizer
│ ├── softmax/ # Softmax
│ ├── reduction/ # Reduce / Pointwise+Reduce
│ ├── transpose/ # Transpose
│ ├── pointwise/ # Element-wise binary operations
│ ├── ctc_loss/ # CTC Loss
│ ├── kthvalue/ # Top-K / KthValue
│ ├── multi_margin_loss/ # MultiMarginLoss
│ ├── soft_margin_loss/ # SoftMarginLoss
│ ├── block_scale/ # Block quantization / dequantization
│ └── ...
├── python/ # Python examples (hipdnn Python API + PyTorch)
│ ├── convolution/
│ ├── conv_fusion/
│ ├── matmul/
│ ├── sdpa/
│ ├── batchnorm/
│ ├── layernorm/
│ ├── groupnorm/
│ ├── adamw/
│ ├── torch_wrapper/ # PyTorch module wrappers (e.g. TorchPReLU)
│ └── ...
└── CLAUDE.md # Development guide for this project
```
## Building the C++ Examples
```bash
cd cpp/build
cmake -G Ninja ..
ninja
```
After the build finishes, the executables are in `cpp/build/bin/`. To build a single example:
```bash
ninja conv_forward
ninja sdpa_inference
```
> Some examples are commented out in `CMakeLists.txt` (for example `bn_finalize`, `block_scale_quantize`, `slice`, `rng`). To enable one, uncomment the corresponding `add_hipdnn_sample(...)` line.
## Running the Examples
**C++ examples:**
```bash
./cpp/build/bin/conv_forward
./cpp/build/bin/softmax
./cpp/build/bin/sdpa_inference
```
**Python examples:**
```bash
cd python/softmax
python softmax.py
```
The Python examples rely on `import hipdnn` and `import torch`; tensors must be created on `device="cuda"`.
Before running them, install the hipdnn Python wheel (with the DTK environment already loaded):
```bash
pip install ${ROCM_PATH}/share/hipdnn/wheels/hipdnn-*.whl
```
## Operator Examples by Category
| Category | C++ path | Python path | Description |
|------|----------|-------------|------|
| Convolution | `convolution/`, `conv_fusion/`, `conv_depthtospace_fusion/`, `concat_conv_fusion/` | `convolution/`, `conv_fusion/`, `conv_depthtospace_fusion/`, `concat_conv_fusion/` | Forward, backward, weight gradient; fusion with bias/activation/ReLU/Swish/PReLU/INT8/DepthToSpace |
| Matrix multiplication | `matmul/`, `matmul_fusion/` | `matmul/`, `matmul_fusion/` | MatMul, MatMul + bias + activation |
| Normalization | `batchnorm/`, `layernorm/`, `groupnorm/`, `instancenorm/`, `rmsnorm/` | `batchnorm/`, `layernorm/`, `groupnorm/`, `instancenorm/`, `rmsnorm/` | Inference, training, backward |
| Attention | `sdpa/`, `rope/`, `deformattention/` | `sdpa/`, `rope/`, `deformattention/` | SDPA, RoPE, deformable attention |
| Optimizer | `adamw/` | `adamw/` | AdamW, Transformer AdamW variant (`TransformerAdamw`) |
| Fused operators | `fusion/`, `conv_bn_fusion/` | `fusion/`, `conv_bn_fusion/` | add+layernorm, groupnorm+swish, pointwise+conv+genstats, scale/bias fusion |
| Quantization | `block_scale/`, `conv_fusion/Int8*` | `block_scale/`, `conv_fusion/convint8_*` | INT8 convolution, block quantization / dequantization |
| PyTorch wrappers | — | `torch_wrapper/` | Module-level wrappers such as `hipdnn.TorchPReLU()` |
| Other | `softmax/`, `reduction/`, `transpose/`, `pointwise/`, `ctc_loss/`, `kthvalue/`, etc. | `softmax/`, `reduction/`, `transpose/`, `pointwise/`, `ctc_loss/`, `kthvalue/`, etc. | Common operators and losses |
## Quick Start
1. Load the DTK environment:
```bash
source /data/dtk-26.04/env.sh
```
2. Build and run the C++ examples:
```bash
cd cpp/build && cmake -G Ninja .. && ninja
./bin/conv_forward
```
3. Run a Python example (from the repository root):
```bash
cd python/softmax
python softmax.py
```
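## Troubleshooting
| Symptom | Cause | Fix |
|------|------|----------|
| `Must be source dtk/env.sh` | `ROCM_PATH` is not set | Run `source /data/dtk-26.04/env.sh` first |
| `hipdnn` module not found | hipDNN is not available in the Python environment | Confirm that the DTK environment is loaded and that `hipdnn` is on `PYTHONPATH` |
| CMake cannot find `hipdnn_frontend` | hipDNN is not installed or the environment is not loaded | Check that `${ROCM_PATH}/lib/cmake/hipdnn/` exists |
| CUDA-related errors | PyTorch tensors are not on the GPU | Create tensors with `device="cuda"` |
| Build warnings treated as errors | CMake enables `-Werror` | Fix the warnings in the code, or temporarily remove `-Werror` from `CMakeLists.txt` |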
## Data Types and Layouts
- **Default data type**: `float` (C++) / `torch.float32` (Python).
- **FP16**: `hipdnn_data_sdk::types::half` / `torch.float16`.
- **INT8**: `int8_t` / `torch.int8`, using the **NCHWc32** blocked layout (`vector_count=32`). In the INT8 examples, quantization and dequantization are done with explicit SUB/MUL/DIV/ADD nodes.
- **Layout**: NCHW by default; some convolution examples use channels-last (`torch.channels_last`); INT8 uses NCHWc32.
## License
The code files are licensed under the MIT License (SPDX-License-Identifier: MIT).
{
"session_id": "eafc2bdd-f5a2-481a-9368-0fc8c1ceb3a0",
"ended_at": "2026-06-02T10:11:57.127Z",
"reason": "prompt_input_exit",
"agents_spawned": 0,
"agents_completed": 0,
"modes_used": []
}
# Copyright © Advanced Micro Devices, Inc., or its affiliates.
# SPDX-License-Identifier: MIT
cmake_minimum_required(VERSION 3.25.2)
# Enable PIC/PIE to ensure compatibility with the plugin loader system (dlopen). This prevents
# potential Thread Local Storage (TLS) model mismatches between the executable and dynamically
# loaded backend plugins.
set(CMAKE_POSITION_INDEPENDENT_CODE ON)
add_compile_definitions(__HIP_PLATFORM_AMD__)
if(DEFINED ENV{ROCM_PATH})
set(ROCM_PATH "$ENV{ROCM_PATH}")
else()
message(FATAL_ERROR "Must be source dtk/env.sh")
endif()
project(hipdnn_samples VERSION 0.1.0 LANGUAGES C CXX)
include(GNUInstallDirs)
set(CMAKE_CXX_STANDARD 17)
find_package(hip REQUIRED)
find_package(Threads REQUIRED)
if(NOT TARGET hipdnn_frontend)
find_package(hipdnn_frontend CONFIG REQUIRED)
endif()
include_directories(${CMAKE_CURRENT_LIST_DIR})
set(CMAKE_RUNTIME_OUTPUT_DIRECTORY ${CMAKE_BINARY_DIR}/bin)
list(PREPEND HIPDNN_WARNING_COMPILE_OPTIONS
-Werror # Treat all warnings as errors
-Wall # Enable most common warnings
-Wextra # Enable additional warnings not covered by -Wall
-Wpedantic # Enforce strict ISO C++ compliance
-Wshadow # Warn about variable shadowing
-Wnon-virtual-dtor # Warn if a class with virtual functions has a non-virtual destructor
-Wold-style-cast # Warn about C-style casts
-Wcast-align # Warn about potential performance issues with misaligned casts
-Woverloaded-virtual # Warn if a base class function is hidden by a derived class function with the same name
-Wconversion # Warn about implicit type conversions that may alter a value
-Wsign-conversion # Warn about implicit conversions between signed and unsigned types
-Wnull-dereference # Warn about dereferencing null pointers
-Wdouble-promotion # Warn when a float is implicitly promoted to a double
-Wformat=2 # Enable stricter format string checks
-Winit-self # Warn about variables initialized with itself
-Wunreachable-code # Warn about unreachable code
-Wno-return-type # Disable return-type warnings (required for DTK 25.04.2)
-Wswitch-default # Warn if a switch statement does not have a default case
)
function(add_hipdnn_sample NAME SOURCE)
add_executable(${NAME} ${SOURCE})
target_compile_options(${NAME} PRIVATE ${HIPDNN_WARNING_COMPILE_OPTIONS})
target_link_libraries(${NAME} PRIVATE hip::host Threads::Threads hipdnn_frontend)
endfunction()
add_hipdnn_sample(bn_inference batchnorm/BnInference.cpp)
#add_hipdnn_sample(bn_finalize batchnorm/BnFinalize.cpp)
add_hipdnn_sample(bn_training batchnorm/BnTraining.cpp)
add_hipdnn_sample(bn_backward batchnorm/BnBackward.cpp)
#add_hipdnn_sample(bn_backward_weight batchnorm/BnBackwardWeight.cpp)
add_hipdnn_sample(conv_forward convolution/ConvForward.cpp)
add_hipdnn_sample(conv_backward convolution/ConvBackward.cpp)
add_hipdnn_sample(conv_wrw convolution/ConvBackwardWeight.cpp)
add_hipdnn_sample(conv_bias_prelu conv_fusion/ConvBiasPrelu.cpp)
add_hipdnn_sample(conv_bias_prelu_add conv_fusion/ConvBiasPreluAdd.cpp)
add_hipdnn_sample(conv_bias_swish_add conv_fusion/ConvBiasSwishAdd.cpp)
add_hipdnn_sample(conv_bias_swish conv_fusion/ConvBiasSwish.cpp)
add_hipdnn_sample(conv_bias_relu conv_fusion/ConvBiasRelu.cpp)
add_hipdnn_sample(conv_bias_add conv_fusion/ConvBiasAdd.cpp)
add_hipdnn_sample(conv_bias conv_fusion/ConvBias.cpp)
add_hipdnn_sample(conv_bias_add_relu conv_fusion/ConvBiasAddRelu.cpp)
add_hipdnn_sample(convbwd_bias_relu conv_fusion/ConvbwdBiasRelu.cpp)
add_hipdnn_sample(convint8_bias conv_fusion/Int8ConvBias.cpp)
add_hipdnn_sample(convint8_bias_add conv_fusion/Int8ConvBiasAdd.cpp)
add_hipdnn_sample(convint8_bias_add_relu conv_fusion/Int8ConvBiasAddRelu.cpp)
add_hipdnn_sample(convint8_bias_relu conv_fusion/Int8ConvBiasRelu.cpp)
add_hipdnn_sample(convint8_bias_relu_add conv_fusion/Int8ConvBiasReluAdd.cpp)
add_hipdnn_sample(convfp16_bias_relu conv_fusion/Fp16ConvBiasRelu.cpp)
add_hipdnn_sample(ln_inference layernorm/LnInference.cpp)
#add_hipdnn_sample(ln_backward layernorm/LnBackward.cpp)
add_hipdnn_sample(rms_forward rmsnorm/RmsnormForward.cpp)
add_hipdnn_sample(deform_conv_fprop deformconvolution/DeformConvForward.cpp)
add_hipdnn_sample(deform_conv_dgrad deformconvolution/DeformConvBackward.cpp)
add_hipdnn_sample(deform_conv_wgrad deformconvolution/DeformConvBackwardWeight.cpp)
add_hipdnn_sample(gn_training groupnorm/GNTraining.cpp)
add_hipdnn_sample(gn_inference groupnorm/GNInference.cpp)
add_hipdnn_sample(gn_backward groupnorm/GNBackward.cpp)
add_hipdnn_sample(add_layernorm fusion/AddLayernorm.cpp)
add_hipdnn_sample(gn_swish fusion/GroupnormSwish.cpp)
add_hipdnn_sample(sdpa_inference sdpa/SDPAInference.cpp)
add_hipdnn_sample(reduction reduction/Reduction.cpp)
add_hipdnn_sample(reluBwd_reduction reduction/PointwiseReduction.cpp)
add_hipdnn_sample(transpose transpose/Transpose.cpp)
#add_hipdnn_sample(genstats genstats/Genstats.cpp)
add_hipdnn_sample(reshape_transpose fusion/ReshapeTranspose.cpp)
#add_hipdnn_sample(resample resample/Resample.cpp)
add_hipdnn_sample(deform_attn_fprop deformattention/DeformAttnForward.cpp)
add_hipdnn_sample(deform_attn_dgrad deformattention/DeformAttnBackward.cpp)
add_hipdnn_sample(instancenorm_inference instancenorm/InstancenormInference.cpp)
add_hipdnn_sample(instancenorm_backward instancenorm/InstancenormBackward.cpp)
add_hipdnn_sample(instancenorm_training instancenorm/InstancenormTraining.cpp)
#add_hipdnn_sample(block_scale_dequantize block_scale/BlockScaleDequantize.cpp)
#add_hipdnn_sample(block_scale_quantize block_scale/BlockScaleQuantize.cpp)
#add_hipdnn_sample(slice slice/Slice.cpp)
#add_hipdnn_sample(rng rng/Rng.cpp)
add_hipdnn_sample(adamw adamw/Adamw.cpp)
add_hipdnn_sample(transformer_adamw adamw/TransformerAdamw.cpp)
add_hipdnn_sample(concatenate concatenate/Concatenate.cpp)
# add_hipdnn_sample(pw_conv_genstats fusion/PointwiseConvGenstats.cpp)
add_hipdnn_sample(concat_conv concat_conv_fusion/ConcatConv.cpp)
add_hipdnn_sample(concat_conv_bias concat_conv_fusion/ConcatConvBias.cpp)
add_hipdnn_sample(concat_conv_bias_add concat_conv_fusion/ConcatConvBiasAdd.cpp)
add_hipdnn_sample(concat_conv_bias_leakyRelu concat_conv_fusion/ConcatConvBiasLeakyRelu.cpp)
add_hipdnn_sample(concat_conv_bias_leakyRelu_add concat_conv_fusion/ConcatConvBiasLeakyReluAdd.cpp)
add_hipdnn_sample(conv_bias_depthToSpace conv_depthtospace_fusion/ConvBiasDepthToSpace.cpp)
add_hipdnn_sample(conv_bias_depthToSpace_add conv_depthtospace_fusion/ConvBiasDepthToSpaceAdd.cpp)
add_hipdnn_sample(conv_bias_add_depthToSpace conv_depthtospace_fusion/ConvBiasAddDepthToSpace.cpp)
add_hipdnn_sample(conv_bias_depthToSpace_clippedRelu conv_depthtospace_fusion/ConvBiasDepthToSpaceClippedRelu.cpp)
add_hipdnn_sample(conv_bias_depthToSpace_clippedRelu_add conv_depthtospace_fusion/ConvBiasDepthToSpaceClippedReluAdd.cpp)
add_hipdnn_sample(conv_depthToSpace conv_depthtospace_fusion/ConvDepthToSpace.cpp)
add_hipdnn_sample(matmul matmul/Matmul.cpp)
add_hipdnn_sample(matmul_bias matmul_fusion/MatmulBias.cpp)
add_hipdnn_sample(matmul_bias_swish matmul_fusion/MatmulBiasSwish.cpp)
add_hipdnn_sample(rope_forward rope/RopeForward.cpp)
add_hipdnn_sample(rope_backward rope/RopeBackward.cpp)
add_hipdnn_sample(pointwise_binary pointwise/BinaryPointwise.cpp)
add_hipdnn_sample(softmax softmax/Softmax.cpp)
add_hipdnn_sample(ctc_loss ctc_loss/CtcLoss.cpp)
add_hipdnn_sample(kthvalue2d kthvalue/Kthvalue2D.cpp)
add_hipdnn_sample(kthvalue4d kthvalue/Kthvalue4D.cpp)
add_hipdnn_sample(multi_margin_loss multi_margin_loss/MultiMarginLoss.cpp)
add_hipdnn_sample(soft_margin_loss soft_margin_loss/SoftMarginLossForward.cpp)
add_hipdnn_sample(soft_margin_loss_backward soft_margin_loss/SoftMarginLossBackward.cpp)
add_hipdnn_sample(getitem_indices_backward getitem_backward/GetitemBackwardIndices.cpp)
add_hipdnn_sample(getitem_slice_backward getitem_backward/GetitemBackwardSlice.cpp)
add_hipdnn_sample(scale_bias_relu_conv_genstats conv_bn_fusion/ScaleBiasReluConvGenstats.cpp)
add_hipdnn_sample(scale_bias_relu_convwrw conv_bn_fusion/ScaleBiasReluConvwrw.cpp)
add_hipdnn_sample(mul_mul_add_add conv_bn_fusion/MulMulAddAdd.cpp)
add_hipdnn_sample(sub_mul_mul_add_convbwd_relubwd_bnwrw conv_bn_fusion/SubMulMulAddConvbwdRelubwdBnwrw.cpp)
add_hipdnn_sample(conv_genstats conv_bn_fusion/ConvGenstats.cpp)
add_hipdnn_sample(scale_bias conv_bn_fusion/ScaleBias.cpp)
#include <cstdint>
#include <iostream>
#include <memory>
#include <tuple>
#include <unordered_map>
#include "utils.hpp"
#include <hipdnn_data_sdk/utilities/Tensor.hpp>
#include <hipdnn_data_sdk/utilities/Workspace.hpp>
#include <hipdnn_frontend.hpp>
int main()
{
using InputType = hipdnn_data_sdk::types::half;
const int64_t n = 2; // Batch size
// Input
const int64_t c = 3; // Number of channels
const int64_t h = 4; // Height
const int64_t w = 5; // Width
auto buildAdamwGraph = [=](hipdnnHandle_t handle) {
auto graph = std::make_shared<hipdnn_frontend::graph::Graph>();
graph->set_name("adamw_graph")
.set_io_data_type(hipdnn_frontend::getDataTypeEnumFromType<InputType>())
.set_intermediate_data_type(hipdnn_frontend::getDataTypeEnumFromType<InputType>())
.set_compute_data_type(hipdnn_frontend::DataType::FLOAT);
auto params = std::make_shared<hipdnn_frontend::graph::TensorAttributes>(
hipdnn_frontend::graph::Tensor_attributes()
.set_name("params")
.set_dim({n, c, h, w})
.set_stride({c * h * w, h * w, w, 1}));
auto grads = std::make_shared<hipdnn_frontend::graph::TensorAttributes>(
hipdnn_frontend::graph::Tensor_attributes()
.set_name("grads")
.set_dim({n, c, h, w})
.set_stride({c * h * w, h * w, w, 1}));
auto expAvgs = std::make_shared<hipdnn_frontend::graph::TensorAttributes>(
hipdnn_frontend::graph::Tensor_attributes()
.set_name("exp_avgs")
.set_dim({n, c, h, w})
.set_stride({c * h * w, h * w, w, 1}));
auto expAvgSqs = std::make_shared<hipdnn_frontend::graph::TensorAttributes>(
hipdnn_frontend::graph::Tensor_attributes()
.set_name("exp_avg_sqs")
.set_dim({n, c, h, w})
.set_stride({c * h * w, h * w, w, 1}));
auto maxExpAvgSqs = std::make_shared<hipdnn_frontend::graph::TensorAttributes>(
hipdnn_frontend::graph::Tensor_attributes()
.set_name("max_exp_avg_sqs")
.set_dim({n, c, h, w})
.set_stride({c * h * w, h * w, w, 1}));
auto adamwAttributes = hipdnn_frontend::graph::AdamwAttributes()
.set_name("adamw_node")
.set_transformeradamw(false)
.set_max_exp_avg_sqs(maxExpAvgSqs);
graph->adamw(params, grads, expAvgs, expAvgSqs, adamwAttributes);
// build graph
HIPDNN_FE_CHECK(graph->build(handle));
return std::make_tuple(graph, params, grads, expAvgs, expAvgSqs, maxExpAvgSqs);
};
auto backend = hipdnn_frontend::detail::hipdnnBackend();
if(!backend)
{
std::cout << "Failed to create backend.\n";
return 1;
}
hipdnnHandle_t handle;
HIPDNN_CHECK(backend->create(&handle));
auto [graph, params, grads, expAvgs, expAvgSqs, maxExpAvgSqs] = buildAdamwGraph(handle);
// Allocate DCU memory
hipdnn_data_sdk::utilities::Tensor<InputType> paramsTensor(params->get_dim(),
params->get_stride());
hipdnn_data_sdk::utilities::Tensor<InputType> gradsTensor(grads->get_dim(),
grads->get_stride());
hipdnn_data_sdk::utilities::Tensor<InputType> expAvgsTensor(expAvgs->get_dim(),
expAvgs->get_stride());
hipdnn_data_sdk::utilities::Tensor<InputType> expAvgSqsTensor(expAvgSqs->get_dim(),
expAvgSqs->get_stride());
hipdnn_data_sdk::utilities::Tensor<InputType> maxExpAvgSqsTensor(maxExpAvgSqs->get_dim(),
maxExpAvgSqs->get_stride());
std::unordered_map<int64_t, void*> variantPack;
variantPack[params->get_uid()] = paramsTensor.memory().deviceData();
variantPack[grads->get_uid()] = gradsTensor.memory().deviceData();
variantPack[expAvgs->get_uid()] = expAvgsTensor.memory().deviceData();
variantPack[expAvgSqs->get_uid()] = expAvgSqsTensor.memory().deviceData();
variantPack[maxExpAvgSqs->get_uid()] = maxExpAvgSqsTensor.memory().deviceData();
int64_t workspaceSize = 0;
HIPDNN_FE_CHECK(graph->get_workspace_size(workspaceSize));
const hipdnn_data_sdk::utilities::Workspace workspace(static_cast<size_t>(workspaceSize));
HIPDNN_FE_CHECK(graph->execute(handle, variantPack, workspace.get()));
std::cout << "Adamw graph execution complete. \n";
HIPDNN_CHECK(backend->destroy(handle));
return 0;
}
#include <cstdint>
#include <iostream>
#include <memory>
#include <tuple>
#include <unordered_map>
#include "utils.hpp"
#include <hipdnn_data_sdk/utilities/Tensor.hpp>
#include <hipdnn_data_sdk/utilities/Workspace.hpp>
#include <hipdnn_frontend.hpp>
int main()
{
using InputType = hipdnn_data_sdk::types::half;
const int64_t n = 2; // Batch size
// Input
const int64_t c = 3; // Number of channels
const int64_t h = 4; // Height
const int64_t w = 5; // Width
auto buildTransformerAdamwGraph = [=](hipdnnHandle_t handle) {
auto graph = std::make_shared<hipdnn_frontend::graph::Graph>();
graph->set_name("transformer_adamw_graph")
.set_io_data_type(hipdnn_frontend::getDataTypeEnumFromType<InputType>())
.set_intermediate_data_type(hipdnn_frontend::getDataTypeEnumFromType<InputType>())
.set_compute_data_type(hipdnn_frontend::DataType::FLOAT);
auto params = std::make_shared<hipdnn_frontend::graph::TensorAttributes>(
hipdnn_frontend::graph::Tensor_attributes()
.set_name("params")
.set_dim({n, c, h, w})
.set_stride({c * h * w, h * w, w, 1}));
auto grads = std::make_shared<hipdnn_frontend::graph::TensorAttributes>(
hipdnn_frontend::graph::Tensor_attributes()
.set_name("grads")
.set_dim({n, c, h, w})
.set_stride({c * h * w, h * w, w, 1}));
auto expAvgs = std::make_shared<hipdnn_frontend::graph::TensorAttributes>(
hipdnn_frontend::graph::Tensor_attributes()
.set_name("exp_avgs")
.set_dim({n, c, h, w})
.set_stride({c * h * w, h * w, w, 1}));
auto expAvgSqs = std::make_shared<hipdnn_frontend::graph::TensorAttributes>(
hipdnn_frontend::graph::Tensor_attributes()
.set_name("exp_avg_sqs")
.set_dim({n, c, h, w})
.set_stride({c * h * w, h * w, w, 1}));
auto adamwAttributes = hipdnn_frontend::graph::AdamwAttributes()
.set_name("transformer_adamw_node")
.set_correct_bias(false)
.set_transformeradamw(true);
graph->adamw(params, grads, expAvgs, expAvgSqs, adamwAttributes);
// build graph
HIPDNN_FE_CHECK(graph->build(handle));
return std::make_tuple(graph, params, grads, expAvgs, expAvgSqs);
};
auto backend = hipdnn_frontend::detail::hipdnnBackend();
if(!backend)
{
std::cout << "Failed to create backend.\n";
return 1;
}
hipdnnHandle_t handle;
HIPDNN_CHECK(backend->create(&handle));
auto [graph, params, grads, expAvgs, expAvgSqs] = buildTransformerAdamwGraph(handle);
// Allocate DCU memory
hipdnn_data_sdk::utilities::Tensor<InputType> paramsTensor(params->get_dim(),
params->get_stride());
hipdnn_data_sdk::utilities::Tensor<InputType> gradsTensor(grads->get_dim(),
grads->get_stride());
hipdnn_data_sdk::utilities::Tensor<InputType> expAvgsTensor(expAvgs->get_dim(),
expAvgs->get_stride());
hipdnn_data_sdk::utilities::Tensor<InputType> expAvgSqsTensor(expAvgSqs->get_dim(),
expAvgSqs->get_stride());
std::unordered_map<int64_t, void*> variantPack;
variantPack[params->get_uid()] = paramsTensor.memory().deviceData();
variantPack[grads->get_uid()] = gradsTensor.memory().deviceData();
variantPack[expAvgs->get_uid()] = expAvgsTensor.memory().deviceData();
variantPack[expAvgSqs->get_uid()] = expAvgSqsTensor.memory().deviceData();
int64_t workspaceSize = 0;
HIPDNN_FE_CHECK(graph->get_workspace_size(workspaceSize));
const hipdnn_data_sdk::utilities::Workspace workspace(static_cast<size_t>(workspaceSize));
HIPDNN_FE_CHECK(graph->execute(handle, variantPack, workspace.get()));
std::cout << "TransformerAdamw graph execution complete. \n";
HIPDNN_CHECK(backend->destroy(handle));
return 0;
}
#include <cstdint>
#include <iostream>
#include <memory>
#include <tuple>
#include <unordered_map>
#include "utils.hpp"
#include <hipdnn_data_sdk/utilities/Tensor.hpp>
#include <hipdnn_data_sdk/utilities/Workspace.hpp>
#include <hipdnn_frontend.hpp>
int main()
{
using InputType = hipdnn_data_sdk::types::half;
const int64_t n = 16; // Batch size
// Input
const int64_t c = 16; // Number of channels
const int64_t h = 16; // Height
const int64_t w = 16; // Width
auto buildBnBackwardGraph = [=](hipdnnHandle_t handle) {
auto graph = std::make_shared<hipdnn_frontend::graph::Graph>();
graph->set_name("bn_backward_graph")
.set_io_data_type(hipdnn_frontend::getDataTypeEnumFromType<InputType>())
.set_intermediate_data_type(hipdnn_frontend::getDataTypeEnumFromType<InputType>())
.set_compute_data_type(hipdnn_frontend::DataType::FLOAT);
auto dy = std::make_shared<hipdnn_frontend::graph::TensorAttributes>(
hipdnn_frontend::graph::Tensor_attributes()
.set_name("dy")
.set_dim({n, c, h, w})
.set_stride({c * h * w, 1, c * w, c}));
auto x = std::make_shared<hipdnn_frontend::graph::TensorAttributes>(
hipdnn_frontend::graph::Tensor_attributes()
.set_name("x")
.set_dim({n, c, h, w})
.set_stride({c * h * w, 1, c * w, c}));
auto scale = std::make_shared<hipdnn_frontend::graph::TensorAttributes>(
hipdnn_frontend::graph::Tensor_attributes()
.set_name("scale")
.set_dim({1, c, 1, 1})
.set_stride({c, 1, c, c}));
auto savedMean = std::make_shared<hipdnn_frontend::graph::TensorAttributes>(
hipdnn_frontend::graph::Tensor_attributes()
.set_name("save_mean")
.set_dim({1, c, 1, 1})
.set_stride({c, 1, c, c}));
auto savedInvVariance = std::make_shared<hipdnn_frontend::graph::TensorAttributes>(
hipdnn_frontend::graph::Tensor_attributes()
.set_name("save_inv_variance")
.set_dim({1, c, 1, 1})
.set_stride({c, 1, c, c}));
auto bnBwdAttributes = hipdnn_frontend::graph::BatchnormBackwardAttributes()
.set_name("bn_backward_node")
.set_saved_mean_and_inv_variance(savedMean, savedInvVariance);
auto [dx, dscale, dbias] = graph->batchnorm_backward(dy, x, scale, bnBwdAttributes);
dx->set_output(true);
dscale->set_output(true);
dbias->set_output(true);
// build graph
HIPDNN_FE_CHECK(graph->build(handle));
return std::make_tuple(graph, dy, x, scale, savedMean, savedInvVariance, dx, dscale, dbias);
};
auto backend = hipdnn_frontend::detail::hipdnnBackend();
if(!backend)
{
std::cout << "Failed to create backend.\n";
return 1;
}
hipdnnHandle_t handle;
HIPDNN_CHECK(backend->create(&handle));
auto [graph, dy, x, scale, savedMean, savedInvVariance, dx, dscale, dbias]
= buildBnBackwardGraph(handle);
// Allocate DCU memory
hipdnn_data_sdk::utilities::Tensor<InputType> dyTensor(dy->get_dim(), dy->get_stride());
hipdnn_data_sdk::utilities::Tensor<InputType> xTensor(x->get_dim(), x->get_stride());
hipdnn_data_sdk::utilities::Tensor<InputType> scaleTensor(scale->get_dim());
hipdnn_data_sdk::utilities::Tensor<InputType> savedMeanTensor(savedMean->get_dim());
hipdnn_data_sdk::utilities::Tensor<InputType> savedInvVarTensor(savedInvVariance->get_dim());
hipdnn_data_sdk::utilities::Tensor<InputType> dxTensor(dx->get_dim(), dx->get_stride());
hipdnn_data_sdk::utilities::Tensor<InputType> dscaleTensor(dscale->get_dim());
hipdnn_data_sdk::utilities::Tensor<InputType> dbiasTensor(dbias->get_dim());
std::unordered_map<int64_t, void*> variantPack;
variantPack[dy->get_uid()] = dyTensor.memory().deviceData();
variantPack[x->get_uid()] = xTensor.memory().deviceData();
variantPack[scale->get_uid()] = scaleTensor.memory().deviceData();
variantPack[savedMean->get_uid()] = savedMeanTensor.memory().deviceData();
variantPack[savedInvVariance->get_uid()] = savedInvVarTensor.memory().deviceData();
variantPack[dx->get_uid()] = dxTensor.memory().deviceData();
variantPack[dscale->get_uid()] = dscaleTensor.memory().deviceData();
variantPack[dbias->get_uid()] = dbiasTensor.memory().deviceData();
int64_t workspaceSize = 0;
HIPDNN_FE_CHECK(graph->get_workspace_size(workspaceSize));
const hipdnn_data_sdk::utilities::Workspace workspace(static_cast<size_t>(workspaceSize));
HIPDNN_FE_CHECK(graph->execute(handle, variantPack, workspace.get()));
std::cout << "Batch normalization backward graph execution complete. \n";
HIPDNN_CHECK(backend->destroy(handle));
return 0;
}
#include <cstdint>
#include <iostream>
#include <memory>
#include <tuple>
#include <unordered_map>
#include <hipdnn_data_sdk/utilities/Tensor.hpp>
#include <hipdnn_frontend.hpp>
#include "hipdnn_data_sdk/utilities/Workspace.hpp"
#include "utils.hpp"
int main()
{
using InputType = float;
const int64_t n = 16; // Batch size
// Input
const int64_t c = 16; // Number of channels
const int64_t h = 16; // Height
const int64_t w = 16; // Width
auto buildBnBackwarWeightdGraph = [=](hipdnnHandle_t handle) {
auto graph = std::make_shared<hipdnn_frontend::graph::Graph>();
graph->set_name("bn_backward_weight_graph")
.set_io_data_type(hipdnn_frontend::getDataTypeEnumFromType<InputType>())
.set_intermediate_data_type(hipdnn_frontend::getDataTypeEnumFromType<InputType>())
.set_compute_data_type(hipdnn_frontend::DataType::FLOAT);
auto dy = std::make_shared<hipdnn_frontend::graph::TensorAttributes>(
hipdnn_frontend::graph::Tensor_attributes()
.set_name("dy")
.set_dim({n, c, h, w})
.set_stride({c * h * w, 1, c * w, c}));
auto x = std::make_shared<hipdnn_frontend::graph::TensorAttributes>(
hipdnn_frontend::graph::Tensor_attributes()
.set_name("x")
.set_dim({n, c, h, w})
.set_stride({c * h * w, 1, c * w, c}));
auto scale = std::make_shared<hipdnn_frontend::graph::TensorAttributes>(
hipdnn_frontend::graph::Tensor_attributes()
.set_name("scale")
.set_dim({1, c, 1, 1})
.set_stride({c, 1, c, c}));
auto savedMean = std::make_shared<hipdnn_frontend::graph::TensorAttributes>(
hipdnn_frontend::graph::Tensor_attributes()
.set_name("save_mean")
.set_dim({1, c, 1, 1})
.set_stride({c, 1, c, c}));
auto savedInvVariance = std::make_shared<hipdnn_frontend::graph::TensorAttributes>(
hipdnn_frontend::graph::Tensor_attributes()
.set_name("save_inv_variance")
.set_dim({1, c, 1, 1})
.set_stride({c, 1, c, c}));
auto bnBwdWeightAttributes
= hipdnn_frontend::graph::BatchnormBackwardWeightAttributes().set_name(
"bn_backward_weight_node");
auto [dscale, dbias, eqScaleDy, eqScaleX, eqBias]
= graph->dbn_weight(dy, x, savedMean, savedInvVariance, scale, bnBwdWeightAttributes);
dscale->set_output(true);
dbias->set_output(true);
eqScaleDy->set_output(true);
eqScaleX->set_output(true);
eqBias->set_output(true);
// build graph
HIPDNN_FE_CHECK(graph->build(handle));
return std::make_tuple(graph,
dy,
x,
scale,
savedMean,
savedInvVariance,
dscale,
dbias,
eqScaleDy,
eqScaleX,
eqBias);
};
auto backend = hipdnn_frontend::detail::hipdnnBackend();
if(!backend)
{
std::cout << "Failed to create backend.\n";
return 1;
}
hipdnnHandle_t handle;
HIPDNN_CHECK(backend->create(&handle));
auto [graph,
dy,
x,
scale,
savedMean,
savedInvVariance,
dscale,
dbias,
eqScaleDy,
eqScaleX,
eqBias]
= buildBnBackwarWeightdGraph(handle);
hipdnn_data_sdk::utilities::Tensor<InputType> dyTensor(dy->get_dim(), dy->get_stride());
hipdnn_data_sdk::utilities::Tensor<InputType> xTensor(x->get_dim(), x->get_stride());
hipdnn_data_sdk::utilities::Tensor<InputType> scaleTensor(scale->get_dim());
hipdnn_data_sdk::utilities::Tensor<InputType> savedMeanTensor(savedMean->get_dim());
hipdnn_data_sdk::utilities::Tensor<InputType> savedInvVarTensor(savedInvVariance->get_dim());
hipdnn_data_sdk::utilities::Tensor<InputType> dscaleTensor(dscale->get_dim());
hipdnn_data_sdk::utilities::Tensor<InputType> dbiasTensor(dbias->get_dim());
hipdnn_data_sdk::utilities::Tensor<InputType> eqScaleDyTensor(eqScaleDy->get_dim(),
eqScaleDy->get_stride());
hipdnn_data_sdk::utilities::Tensor<InputType> eqScaleXTensor(eqScaleX->get_dim(),
eqScaleX->get_stride());
hipdnn_data_sdk::utilities::Tensor<InputType> eqBiasTensor(eqBias->get_dim(),
eqBias->get_stride());
std::unordered_map<int64_t, void*> variantPack;
variantPack[dy->get_uid()] = dyTensor.memory().deviceData();
variantPack[x->get_uid()] = xTensor.memory().deviceData();
variantPack[scale->get_uid()] = scaleTensor.memory().deviceData();
variantPack[savedMean->get_uid()] = savedMeanTensor.memory().deviceData();
variantPack[savedInvVariance->get_uid()] = savedInvVarTensor.memory().deviceData();
variantPack[dscale->get_uid()] = dscaleTensor.memory().deviceData();
variantPack[dbias->get_uid()] = dbiasTensor.memory().deviceData();
variantPack[eqScaleDy->get_uid()] = eqScaleDyTensor.memory().deviceData();
variantPack[eqScaleX->get_uid()] = eqScaleXTensor.memory().deviceData();
variantPack[eqBias->get_uid()] = eqBiasTensor.memory().deviceData();
int64_t workspaceSize = 0;
HIPDNN_FE_CHECK(graph->get_workspace_size(workspaceSize));
const hipdnn_data_sdk::utilities::Workspace workspace(static_cast<size_t>(workspaceSize));
HIPDNN_FE_CHECK(graph->execute(handle, variantPack, workspace.get()));
std::cout << "Batch normalization backward weight graph execution complete.\n";
HIPDNN_CHECK(backend->destroy(handle));
return 0;
}
#include <cstdint>
#include <iostream>
#include <memory>
#include <tuple>
#include <unordered_map>
#include <hipdnn_data_sdk/utilities/Tensor.hpp>
#include <hipdnn_frontend.hpp>
#include "hipdnn_data_sdk/utilities/Workspace.hpp"
#include "utils.hpp"
int main()
{
using InputType = hipdnn_data_sdk::types::half;
const int64_t n = 1; // Batch size
// Input
const int64_t c = 32; // Number of channels
const int64_t h = 1; // Height
const int64_t w = 1; // Width
auto buildBnFinalizeGraph = [=](hipdnnHandle_t handle) {
auto graph = std::make_shared<hipdnn_frontend::graph::Graph>();
graph->set_name("bn_finalize_graph")
.set_io_data_type(hipdnn_frontend::getDataTypeEnumFromType<InputType>())
.set_intermediate_data_type(hipdnn_frontend::getDataTypeEnumFromType<InputType>())
.set_compute_data_type(hipdnn_frontend::DataType::FLOAT);
auto sum = std::make_shared<hipdnn_frontend::graph::TensorAttributes>(
hipdnn_frontend::graph::Tensor_attributes()
.set_name("sum")
.set_dim({n, c, h, w})
.set_stride({c * h * w, 1, c * w, c}));
auto sqSum = std::make_shared<hipdnn_frontend::graph::TensorAttributes>(
hipdnn_frontend::graph::Tensor_attributes()
.set_name("sq_sum")
.set_dim({n, c, h, w})
.set_stride({c * h * w, 1, c * w, c}));
auto scale = std::make_shared<hipdnn_frontend::graph::TensorAttributes>(
hipdnn_frontend::graph::Tensor_attributes()
.set_name("scale")
.set_dim({n, c, h, w})
.set_stride({c * h * w, 1, c * w, c}));
auto bias = std::make_shared<hipdnn_frontend::graph::TensorAttributes>(
hipdnn_frontend::graph::Tensor_attributes()
.set_name("bias")
.set_dim({n, c, h, w})
.set_stride({c * h * w, 1, c * w, c}));
auto prevRunningMean = std::make_shared<hipdnn_frontend::graph::TensorAttributes>(
hipdnn_frontend::graph::Tensor_attributes()
.set_name("prev_running_mean")
.set_dim({n, c, h, w})
.set_stride({c * h * w, 1, c * w, c}));
auto prevRunningVar = std::make_shared<hipdnn_frontend::graph::TensorAttributes>(
hipdnn_frontend::graph::Tensor_attributes()
.set_name("prev_running_var")
.set_dim({n, c, h, w})
.set_stride({c * h * w, 1, c * w, c}));
auto momentum = std::make_shared<hipdnn_frontend::graph::TensorAttributes>(
hipdnn_frontend::graph::Tensor_attributes()
.set_name("momentum")
.set_dim({1, 1, 1, 1})
.set_stride({1, 1, 1, 1}));
auto epsilon = std::make_shared<hipdnn_frontend::graph::TensorAttributes>(
hipdnn_frontend::graph::Tensor_attributes()
.set_name("epsilon")
.set_dim({1, 1, 1, 1})
.set_stride({1, 1, 1, 1}));
auto accumCount = std::make_shared<hipdnn_frontend::graph::TensorAttributes>(
hipdnn_frontend::graph::Tensor_attributes()
.set_name("accum_count")
.set_dim({1, 1, 1, 1})
.set_stride({1, 1, 1, 1}));
epsilon->set_value(1e-5f);
momentum->set_value(0.001f);
accumCount->set_value(static_cast<int32_t>(n * h * w));
auto bnFinalizeAttributes
= hipdnn_frontend::graph::BatchnormFinalizeAttributes()
.set_name("bn_finalize_node")
.set_previous_running_stats(prevRunningMean, prevRunningVar, momentum);
auto [eqScale, eqBias, mean, invVariance, nextRunningMean, nextRunningVar]
= graph->bn_finalize(
sum, sqSum, scale, bias, epsilon, accumCount, bnFinalizeAttributes);
eqScale->set_output(true);
eqBias->set_output(true);
mean->set_output(true);
invVariance->set_output(true);
nextRunningMean->set_output(true);
nextRunningVar->set_output(true);
// build graph
HIPDNN_FE_CHECK(graph->build(handle));
return std::make_tuple(graph,
sum,
sqSum,
scale,
bias,
epsilon,
prevRunningMean,
prevRunningVar,
momentum,
accumCount,
eqScale,
eqBias,
mean,
invVariance,
nextRunningMean,
nextRunningVar);
};
auto backend = hipdnn_frontend::detail::hipdnnBackend();
if(!backend)
{
std::cout << "Failed to create backend.\n";
return 1;
}
hipdnnHandle_t handle;
HIPDNN_CHECK(backend->create(&handle));
auto [graph,
sum,
sqSum,
scale,
bias,
epsilon,
prevRunningMean,
prevRunningVar,
momentum,
accumCount,
eqScale,
eqBias,
mean,
invVariance,
nextRunningMean,
nextRunningVar]
= buildBnFinalizeGraph(handle);
hipdnn_data_sdk::utilities::Tensor<InputType> sumTensor(sum->get_dim(), sum->get_stride());
hipdnn_data_sdk::utilities::Tensor<InputType> sqSumTensor(sqSum->get_dim(),
sqSum->get_stride());
hipdnn_data_sdk::utilities::Tensor<InputType> scaleTensor(scale->get_dim(),
scale->get_stride());
hipdnn_data_sdk::utilities::Tensor<InputType> biasTensor(bias->get_dim(), bias->get_stride());
hipdnn_data_sdk::utilities::Tensor<InputType> prevMeanTensor(prevRunningMean->get_dim(),
prevRunningMean->get_stride());
hipdnn_data_sdk::utilities::Tensor<InputType> prevVarTensor(prevRunningVar->get_dim(),
prevRunningVar->get_stride());
hipdnn_data_sdk::utilities::Tensor<InputType> momentumTensor(momentum->get_dim());
hipdnn_data_sdk::utilities::Tensor<InputType> epsilonTensor(epsilon->get_dim());
hipdnn_data_sdk::utilities::Tensor<InputType> accumCountTensor(accumCount->get_dim());
hipdnn_data_sdk::utilities::Tensor<InputType> eqScaleTensor(eqScale->get_dim(),
eqScale->get_stride());
hipdnn_data_sdk::utilities::Tensor<InputType> eqBiasTensor(eqBias->get_dim(),
eqBias->get_stride());
hipdnn_data_sdk::utilities::Tensor<InputType> nextMeanTensor(nextRunningMean->get_dim(),
nextRunningMean->get_stride());
hipdnn_data_sdk::utilities::Tensor<InputType> nextVarTensor(nextRunningVar->get_dim(),
nextRunningVar->get_stride());
hipdnn_data_sdk::utilities::Tensor<InputType> meanTensor(mean->get_dim(), mean->get_stride());
hipdnn_data_sdk::utilities::Tensor<InputType> invVarTensor(invVariance->get_dim(),
invVariance->get_stride());
std::unordered_map<int64_t, void*> variantPack;
variantPack[sum->get_uid()] = sumTensor.memory().deviceData();
variantPack[sqSum->get_uid()] = sqSumTensor.memory().deviceData();
variantPack[scale->get_uid()] = scaleTensor.memory().deviceData();
variantPack[bias->get_uid()] = biasTensor.memory().deviceData();
variantPack[prevRunningMean->get_uid()] = prevMeanTensor.memory().deviceData();
variantPack[prevRunningVar->get_uid()] = prevVarTensor.memory().deviceData();
variantPack[momentum->get_uid()] = momentumTensor.memory().deviceData();
variantPack[epsilon->get_uid()] = epsilonTensor.memory().deviceData();
variantPack[accumCount->get_uid()] = accumCountTensor.memory().deviceData();
variantPack[eqScale->get_uid()] = eqScaleTensor.memory().deviceData();
variantPack[eqBias->get_uid()] = eqBiasTensor.memory().deviceData();
variantPack[nextRunningMean->get_uid()] = nextMeanTensor.memory().deviceData();
variantPack[nextRunningVar->get_uid()] = nextVarTensor.memory().deviceData();
variantPack[mean->get_uid()] = meanTensor.memory().deviceData();
variantPack[invVariance->get_uid()] = invVarTensor.memory().deviceData();
int64_t workspaceSize = 0;
HIPDNN_FE_CHECK(graph->get_workspace_size(workspaceSize));
const hipdnn_data_sdk::utilities::Workspace workspace(static_cast<size_t>(workspaceSize));
HIPDNN_FE_CHECK(graph->execute(handle, variantPack, workspace.get()));
std::cout << "Batch normalization finalize graph execution complete. \n";
HIPDNN_CHECK(backend->destroy(handle));
return 0;
}
#include <cstdint>
#include <iostream>
#include <memory>
#include <tuple>
#include <unordered_map>
#include <hipdnn_data_sdk/utilities/Tensor.hpp>
#include <hipdnn_frontend.hpp>
#include "hipdnn_data_sdk/utilities/Workspace.hpp"
#include "utils.hpp"
int main()
{
using InputType = hipdnn_data_sdk::types::half;
// Input tensor shape; the strides used below describe an NHWC memory layout.
const int64_t n = 16; // Batch size
const int64_t c = 16; // Number of channels
const int64_t h = 16; // Height
const int64_t w = 16; // Width
auto buildBnInferenceGraph = [=](hipdnnHandle_t handle) {
auto graph = std::make_shared<hipdnn_frontend::graph::Graph>();
graph->set_name("bn_inference_graph")
.set_io_data_type(hipdnn_frontend::getDataTypeEnumFromType<InputType>())
.set_intermediate_data_type(hipdnn_frontend::getDataTypeEnumFromType<InputType>())
.set_compute_data_type(hipdnn_frontend::DataType::FLOAT);
auto x = std::make_shared<hipdnn_frontend::graph::TensorAttributes>(
hipdnn_frontend::graph::Tensor_attributes()
.set_name("x")
.set_dim({n, c, h, w})
.set_stride({c * h * w, 1, c * w, c}));
auto scale = std::make_shared<hipdnn_frontend::graph::TensorAttributes>(
hipdnn_frontend::graph::Tensor_attributes()
.set_name("scale")
.set_dim({1, c, 1, 1})
.set_stride({c, 1, c, c}));
auto bias = std::make_shared<hipdnn_frontend::graph::TensorAttributes>(
hipdnn_frontend::graph::Tensor_attributes()
.set_name("bias")
.set_dim({1, c, 1, 1})
.set_stride({c, 1, c, c}));
auto mean = std::make_shared<hipdnn_frontend::graph::TensorAttributes>(
hipdnn_frontend::graph::Tensor_attributes()
.set_name("mean")
.set_dim({1, c, 1, 1})
.set_stride({c, 1, c, c}));
auto variance = std::make_shared<hipdnn_frontend::graph::TensorAttributes>(
hipdnn_frontend::graph::Tensor_attributes()
.set_name("variance")
.set_dim({1, c, 1, 1})
.set_stride({c, 1, c, c}));
auto epsilon = std::make_shared<hipdnn_frontend::graph::TensorAttributes>(
hipdnn_frontend::graph::Tensor_attributes()
.set_name("epsilon")
.set_dim({1, 1, 1, 1})
.set_stride({1, 1, 1, 1})
.set_value(1e-5));
auto bnInferenceAttributes
= hipdnn_frontend::graph::BatchnormInferenceAttributesVarianceExt().set_name(
"bn_inference_node");
auto y = graph->batchnorm_inference_variance_ext(
x, mean, variance, scale, bias, epsilon, bnInferenceAttributes);
y->set_output(true);
// build graph
HIPDNN_FE_CHECK(graph->build(handle));
return std::make_tuple(graph, x, scale, bias, mean, variance, epsilon, y);
};
auto backend = hipdnn_frontend::detail::hipdnnBackend();
if(!backend)
{
std::cout << "Failed to create backend.\n";
return 1;
}
hipdnnHandle_t handle;
HIPDNN_CHECK(backend->create(&handle));
auto [graph, x, scale, bias, mean, variance, epsilon, y] = buildBnInferenceGraph(handle);
hipdnn_data_sdk::utilities::Tensor<InputType> xTensor(x->get_dim(), x->get_stride());
hipdnn_data_sdk::utilities::Tensor<InputType> scaleTensor(scale->get_dim());
hipdnn_data_sdk::utilities::Tensor<InputType> biasTensor(bias->get_dim());
hipdnn_data_sdk::utilities::Tensor<InputType> meanTensor(mean->get_dim());
hipdnn_data_sdk::utilities::Tensor<InputType> varianceTensor(variance->get_dim());
hipdnn_data_sdk::utilities::Tensor<InputType> epsilonTensor(epsilon->get_dim());
hipdnn_data_sdk::utilities::Tensor<InputType> yTensor(y->get_dim(), y->get_stride());
std::unordered_map<int64_t, void*> variantPack;
variantPack[x->get_uid()] = xTensor.memory().deviceData();
variantPack[scale->get_uid()] = scaleTensor.memory().deviceData();
variantPack[bias->get_uid()] = biasTensor.memory().deviceData();
variantPack[mean->get_uid()] = meanTensor.memory().deviceData();
variantPack[variance->get_uid()] = varianceTensor.memory().deviceData();
variantPack[epsilon->get_uid()] = epsilonTensor.memory().deviceData();
variantPack[y->get_uid()] = yTensor.memory().deviceData();
int64_t workspaceSize = 0;
HIPDNN_FE_CHECK(graph->get_workspace_size(workspaceSize));
const hipdnn_data_sdk::utilities::Workspace workspace(static_cast<size_t>(workspaceSize));
HIPDNN_FE_CHECK(graph->execute(handle, variantPack, workspace.get()));
std::cout << "Batch normalization inference graph execution complete.\n";
HIPDNN_CHECK(backend->destroy(handle));
return 0;
}
#include <cstdint>
#include <iostream>
#include <memory>
#include <tuple>
#include <unordered_map>
#include <hipdnn_data_sdk/utilities/Tensor.hpp>
#include <hipdnn_frontend.hpp>
#include "hipdnn_data_sdk/utilities/Workspace.hpp"
#include "utils.hpp"
int main()
{
using InputType = hipdnn_data_sdk::types::half;
// Input tensor shape; the strides used below describe an NHWC memory layout.
const int64_t n = 16; // Batch size
const int64_t c = 16; // Number of channels
const int64_t h = 16; // Height
const int64_t w = 16; // Width
auto buildBnTrainingGraph = [=](hipdnnHandle_t handle) {
auto graph = std::make_shared<hipdnn_frontend::graph::Graph>();
graph->set_name("bn_training_graph")
.set_io_data_type(hipdnn_frontend::getDataTypeEnumFromType<InputType>())
.set_intermediate_data_type(hipdnn_frontend::getDataTypeEnumFromType<InputType>())
.set_compute_data_type(hipdnn_frontend::DataType::FLOAT);
auto x = std::make_shared<hipdnn_frontend::graph::TensorAttributes>(
hipdnn_frontend::graph::Tensor_attributes()
.set_name("x")
.set_dim({n, c, h, w})
.set_stride({c * h * w, 1, c * w, c}));
auto scale = std::make_shared<hipdnn_frontend::graph::TensorAttributes>(
hipdnn_frontend::graph::Tensor_attributes()
.set_name("scale")
.set_dim({1, c, 1, 1})
.set_stride({c, 1, c, c}));
auto bias = std::make_shared<hipdnn_frontend::graph::TensorAttributes>(
hipdnn_frontend::graph::Tensor_attributes()
.set_name("bias")
.set_dim({1, c, 1, 1})
.set_stride({c, 1, c, c}));
auto prevRunningMean = std::make_shared<hipdnn_frontend::graph::TensorAttributes>(
hipdnn_frontend::graph::Tensor_attributes()
.set_name("prev_running_mean")
.set_dim({1, c, 1, 1})
.set_stride({c, 1, c, c}));
auto prevRunningVar = std::make_shared<hipdnn_frontend::graph::TensorAttributes>(
hipdnn_frontend::graph::Tensor_attributes()
.set_name("prev_running_variance")
.set_dim({1, c, 1, 1})
.set_stride({c, 1, c, c}));
auto momentum = std::make_shared<hipdnn_frontend::graph::TensorAttributes>(
hipdnn_frontend::graph::Tensor_attributes()
.set_name("momentum")
.set_dim({1, 1, 1, 1})
.set_stride({1, 1, 1, 1}));
auto epsilon = std::make_shared<hipdnn_frontend::graph::TensorAttributes>(
hipdnn_frontend::graph::Tensor_attributes()
.set_name("epsilon")
.set_dim({1, 1, 1, 1})
.set_stride({1, 1, 1, 1}));
epsilon->set_value(1e-5);
momentum->set_value(0.1);
auto bnTrainingAttributes
= hipdnn_frontend::graph::BatchnormAttributes()
.set_name("bn_training_node")
.set_epsilon(epsilon)
.set_previous_running_stats(prevRunningMean, prevRunningVar, momentum);
auto [y, savedMean, savedInvVariance, nextRunningMean, nextRunningVar]
= graph->batchnorm(x, scale, bias, bnTrainingAttributes);
y->set_output(true);
nextRunningMean->set_output(true);
nextRunningVar->set_output(true);
savedMean->set_output(true);
savedInvVariance->set_output(true);
// build graph
HIPDNN_FE_CHECK(graph->build(handle));
return std::make_tuple(graph,
x,
scale,
bias,
prevRunningMean,
prevRunningVar,
momentum,
epsilon,
y,
savedMean,
savedInvVariance,
nextRunningMean,
nextRunningVar);
};
auto backend = hipdnn_frontend::detail::hipdnnBackend();
if(!backend)
{
std::cout << "Failed to create backend.\n";
return 1;
}
hipdnnHandle_t handle;
HIPDNN_CHECK(backend->create(&handle));
auto [graph,
x,
scale,
bias,
prevRunningMean,
prevRunningVar,
momentum,
epsilon,
y,
savedMean,
savedInvVariance,
nextRunningMean,
nextRunningVar]
= buildBnTrainingGraph(handle);
hipdnn_data_sdk::utilities::Tensor<InputType> xTensor(x->get_dim(), x->get_stride());
hipdnn_data_sdk::utilities::Tensor<InputType> scaleTensor(scale->get_dim());
hipdnn_data_sdk::utilities::Tensor<InputType> biasTensor(bias->get_dim());
hipdnn_data_sdk::utilities::Tensor<InputType> prevMeanTensor(prevRunningMean->get_dim());
hipdnn_data_sdk::utilities::Tensor<InputType> prevVarTensor(prevRunningVar->get_dim());
hipdnn_data_sdk::utilities::Tensor<InputType> momentumTensor(momentum->get_dim());
hipdnn_data_sdk::utilities::Tensor<InputType> epsilonTensor(epsilon->get_dim());
hipdnn_data_sdk::utilities::Tensor<InputType> yTensor(y->get_dim(), y->get_stride());
hipdnn_data_sdk::utilities::Tensor<InputType> savedMeanTensor(savedMean->get_dim());
hipdnn_data_sdk::utilities::Tensor<InputType> savedInvVarTensor(savedInvVariance->get_dim());
std::unordered_map<int64_t, void*> variantPack;
variantPack[x->get_uid()] = xTensor.memory().deviceData();
variantPack[scale->get_uid()] = scaleTensor.memory().deviceData();
variantPack[bias->get_uid()] = biasTensor.memory().deviceData();
variantPack[prevRunningMean->get_uid()] = prevMeanTensor.memory().deviceData();
variantPack[prevRunningVar->get_uid()] = prevVarTensor.memory().deviceData();
variantPack[momentum->get_uid()] = momentumTensor.memory().deviceData();
variantPack[epsilon->get_uid()] = epsilonTensor.memory().deviceData();
variantPack[y->get_uid()] = yTensor.memory().deviceData();
// hipDNN keeps the running statistics in two separate buffers (before and after the update),
// whereas MIOpen uses a single buffer for both.
// To bridge this difference, the prev and next statistics are bound to the same device
// address here, and the plugin layer passes that address to MIOpen.
variantPack[nextRunningMean->get_uid()] = prevMeanTensor.memory().deviceData();
variantPack[nextRunningVar->get_uid()] = prevVarTensor.memory().deviceData();
variantPack[savedMean->get_uid()] = savedMeanTensor.memory().deviceData();
variantPack[savedInvVariance->get_uid()] = savedInvVarTensor.memory().deviceData();
int64_t workspaceSize = 0;
HIPDNN_FE_CHECK(graph->get_workspace_size(workspaceSize));
const hipdnn_data_sdk::utilities::Workspace workspace(static_cast<size_t>(workspaceSize));
HIPDNN_FE_CHECK(graph->execute(handle, variantPack, workspace.get()));
std::cout << "Batch normalization training graph execution complete.\n";
HIPDNN_CHECK(backend->destroy(handle));
return 0;
}
#include <cstdint>
#include <iostream>
#include <memory>
#include <tuple>
#include <unordered_map>
#include <vector>
#include "utils.hpp"
#include <hipdnn_data_sdk/utilities/Tensor.hpp>
#include <hipdnn_data_sdk/utilities/Workspace.hpp>
#include <hipdnn_frontend.hpp>
int main()
{
using InputType = hipdnn_data_sdk::types::half;
// Input tensor shape; the strides used below describe an NHWC memory layout.
const int64_t n = 2; // Batch size
const int64_t c = 32; // Number of channels
const int64_t h = 32; // Height
const int64_t w = 32; // Width
// Block shape {1, 32}: the scale tensor's W extent is w / blockSize[1] (see scaleW below).
const std::vector<int32_t> blockSize = {1, 32};
const int64_t scaleW = w / blockSize[1];
auto buildBlockScaleDequantizeGraph = [=](hipdnnHandle_t handle) {
auto graph = std::make_shared<hipdnn_frontend::graph::Graph>();
graph->set_name("block_scale_dequantize_graph")
.set_io_data_type(hipdnn_frontend::getDataTypeEnumFromType<InputType>())
.set_intermediate_data_type(hipdnn_frontend::getDataTypeEnumFromType<InputType>())
.set_compute_data_type(hipdnn_frontend::DataType::FLOAT);
auto x = std::make_shared<hipdnn_frontend::graph::TensorAttributes>(
hipdnn_frontend::graph::Tensor_attributes()
.set_name("x")
.set_dim({n, c, h, w})
.set_stride({c * h * w, 1, c * w, c}));
auto scale = std::make_shared<hipdnn_frontend::graph::TensorAttributes>(
hipdnn_frontend::graph::Tensor_attributes()
.set_name("scale")
.set_dim({n, c, h, scaleW})
.set_stride({c * h * scaleW, 1, c * scaleW, c}));
auto blockScaleDequantizeAttributes
= hipdnn_frontend::graph::BlockScaleDequantizeAttributes()
.set_name("block_scale_dequantize_node")
.set_block_size(blockSize)
.set_is_negative_scale(true);
auto y = graph->block_scale_dequantize(x, scale, blockScaleDequantizeAttributes);
y->set_output(true);
// build graph
HIPDNN_FE_CHECK(graph->build(handle));
return std::make_tuple(graph, x, scale, y);
};
auto backend = hipdnn_frontend::detail::hipdnnBackend();
if(!backend)
{
std::cout << "Failed to create backend.\n";
return 1;
}
hipdnnHandle_t handle;
HIPDNN_CHECK(backend->create(&handle));
auto [graph, x, scale, y] = buildBlockScaleDequantizeGraph(handle);
hipdnn_data_sdk::utilities::Tensor<InputType> xTensor(x->get_dim(), x->get_stride());
hipdnn_data_sdk::utilities::Tensor<InputType> scaleTensor(scale->get_dim(),
scale->get_stride());
hipdnn_data_sdk::utilities::Tensor<InputType> yTensor(y->get_dim(), y->get_stride());
int64_t workspaceSize = 0;
HIPDNN_FE_CHECK(graph->get_workspace_size(workspaceSize));
const hipdnn_data_sdk::utilities::Workspace workspace(static_cast<size_t>(workspaceSize));
std::unordered_map<int64_t, void*> variantPack;
variantPack[x->get_uid()] = xTensor.memory().deviceData();
variantPack[scale->get_uid()] = scaleTensor.memory().deviceData();
variantPack[y->get_uid()] = yTensor.memory().deviceData();
HIPDNN_FE_CHECK(graph->execute(handle, variantPack, workspace.get()));
std::cout << "\nBlockScaleDequantize graph execution complete. \n";
HIPDNN_CHECK(backend->destroy(handle));
return 0;
}
#include <cstdint>
#include <iostream>
#include <memory>
#include <tuple>
#include <unordered_map>
#include "utils.hpp"
#include <hipdnn_data_sdk/utilities/Tensor.hpp>
#include <hipdnn_data_sdk/utilities/Workspace.hpp>
#include <hipdnn_frontend.hpp>
int main()
{
using InputType = hipdnn_data_sdk::types::half;
// Input tensor shape; the strides used below describe an NHWC memory layout.
const int64_t n = 2; // Batch size
const int64_t c = 32; // Number of channels
const int64_t h = 32; // Height
const int64_t w = 32; // Width
const int32_t blockSize = 1;
auto buildBlockScaleQuantizeGraph = [=](hipdnnHandle_t handle) {
auto graph = std::make_shared<hipdnn_frontend::graph::Graph>();
graph->set_name("block_scale_quantize_graph")
.set_io_data_type(hipdnn_frontend::getDataTypeEnumFromType<InputType>())
.set_intermediate_data_type(hipdnn_frontend::getDataTypeEnumFromType<InputType>())
.set_compute_data_type(hipdnn_frontend::DataType::FLOAT);
auto x = std::make_shared<hipdnn_frontend::graph::TensorAttributes>(
hipdnn_frontend::graph::Tensor_attributes()
.set_name("x")
.set_dim({n, c, h, w})
.set_stride({c * h * w, 1, c * w, c}));
auto blockScaleQuantizeAttributes = hipdnn_frontend::graph::BlockScaleQuantizeAttributes()
.set_name("block_scale_quantize_node")
.set_block_size(blockSize);
auto [y, scale] = graph->block_scale_quantize(x, blockScaleQuantizeAttributes);
y->set_output(true);
scale->set_output(true);
// build graph
HIPDNN_FE_CHECK(graph->build(handle));
return std::make_tuple(graph, x, y, scale);
};
auto backend = hipdnn_frontend::detail::hipdnnBackend();
if(!backend)
{
std::cout << "Failed to create backend.\n";
return 1;
}
hipdnnHandle_t handle;
HIPDNN_CHECK(backend->create(&handle));
auto [graph, x, y, scale] = buildBlockScaleQuantizeGraph(handle);
hipdnn_data_sdk::utilities::Tensor<InputType> xTensor(x->get_dim(), x->get_stride());
hipdnn_data_sdk::utilities::Tensor<InputType> scaleTensor(scale->get_dim(),
scale->get_stride());
hipdnn_data_sdk::utilities::Tensor<InputType> yTensor(y->get_dim(), y->get_stride());
int64_t workspaceSize = 0;
HIPDNN_FE_CHECK(graph->get_workspace_size(workspaceSize));
const hipdnn_data_sdk::utilities::Workspace workspace(static_cast<size_t>(workspaceSize));
std::unordered_map<int64_t, void*> variantPack;
variantPack[x->get_uid()] = xTensor.memory().deviceData();
variantPack[scale->get_uid()] = scaleTensor.memory().deviceData();
variantPack[y->get_uid()] = yTensor.memory().deviceData();
HIPDNN_FE_CHECK(graph->execute(handle, variantPack, workspace.get()));
std::cout << "\nBlockScaleQuantize graph execution complete. \n";
HIPDNN_CHECK(backend->destroy(handle));
return 0;
}
// Copyright © Advanced Micro Devices, Inc., or its affiliates.
// SPDX-License-Identifier: MIT
#include <cstdint>
#include <iostream>
#include <memory>
#include <tuple>
#include <unordered_map>
#include "utils.hpp"
#include <hipdnn_data_sdk/utilities/Tensor.hpp>
#include <hipdnn_data_sdk/utilities/Workspace.hpp>
#include <hipdnn_frontend.hpp>
int main()
{
using InputType = hipdnn_data_sdk::types::half;
// params
const int64_t n = 1; // Batch size
const int64_t c = 32; // Number of channels
const int64_t h = 128; // Height
const int64_t w = 128; // Width
const int64_t k = 32; // Number of filters
const int64_t r = 2; // Filter height
const int64_t s = 2; // Filter width
const int64_t axis = 1;
// create graph
auto buildConcatConvGraph = [=](hipdnnHandle_t handle) {
auto graph = std::make_shared<hipdnn_frontend::graph::Graph>();
const auto inputType = hipdnn_frontend::getDataTypeEnumFromType<InputType>();
graph->set_name("concat_conv_graph")
.set_io_data_type(inputType)
.set_intermediate_data_type(inputType)
.set_compute_data_type(hipdnn_frontend::DataType::FLOAT);
// create concat
auto x1 = std::make_shared<hipdnn_frontend::graph::TensorAttributes>(
hipdnn_frontend::graph::Tensor_attributes()
.set_name("x1")
.set_dim({n, c, h, w})
.set_stride({c * h * w, 1, c * w, c})
.set_data_type(inputType));
auto x2 = std::make_shared<hipdnn_frontend::graph::TensorAttributes>(
hipdnn_frontend::graph::Tensor_attributes()
.set_name("x2")
.set_dim({n, c, h, w})
.set_stride({c * h * w, 1, c * w, c})
.set_data_type(inputType));
auto concatenateAttributes = hipdnn_frontend::graph::ConcatenateAttributes().set_axis(axis);
auto concatOutput = graph->concatenate({x1, x2}, concatenateAttributes);
// create conv
const int64_t c2 = c * 2;
auto filter = std::make_shared<hipdnn_frontend::graph::TensorAttributes>(
hipdnn_frontend::graph::Tensor_attributes()
.set_name("filter")
.set_dim({k, c2, r, s})
.set_stride({c2 * r * s, 1, c2 * s, c2}));
auto convFpropAttributes = hipdnn_frontend::graph::ConvFpropAttributes()
.set_name("conv_fprop_node")
.set_padding({1, 1})
.set_stride({1, 1})
.set_dilation({1, 1});
auto y = graph->conv_fprop(concatOutput, filter, convFpropAttributes);
y->set_output(true);
// build graph
HIPDNN_FE_CHECK(graph->build(handle));
return std::make_tuple(graph, x1, x2, filter, y);
};
auto backend = hipdnn_frontend::detail::hipdnnBackend();
if(!backend)
{
std::cout << "Failed to create backend.\n";
return 1;
}
hipdnnHandle_t handle;
HIPDNN_CHECK(backend->create(&handle));
auto [graph, x1, x2, filter, y] = buildConcatConvGraph(handle);
hipdnn_data_sdk::utilities::Tensor<InputType> x1Tensor(x1->get_dim(), x1->get_stride());
hipdnn_data_sdk::utilities::Tensor<InputType> x2Tensor(x2->get_dim(), x2->get_stride());
hipdnn_data_sdk::utilities::Tensor<InputType> filterTensor(filter->get_dim(),
filter->get_stride());
hipdnn_data_sdk::utilities::Tensor<InputType> yTensor(y->get_dim(), y->get_stride());
std::unordered_map<int64_t, void*> variantPack;
variantPack[x1->get_uid()] = x1Tensor.memory().deviceData();
variantPack[x2->get_uid()] = x2Tensor.memory().deviceData();
variantPack[filter->get_uid()] = filterTensor.memory().deviceData();
variantPack[y->get_uid()] = yTensor.memory().deviceData();
int64_t workspaceSize = 0;
HIPDNN_FE_CHECK(graph->get_workspace_size(workspaceSize));
const hipdnn_data_sdk::utilities::Workspace workspace(static_cast<size_t>(workspaceSize));
HIPDNN_FE_CHECK(graph->execute(handle, variantPack, workspace.get()));
std::cout << "ConcatConv graph execution complete.\n";
HIPDNN_CHECK(backend->destroy(handle));
return 0;
}
// Copyright © Advanced Micro Devices, Inc., or its affiliates.
// SPDX-License-Identifier: MIT
#include <cstdint>
#include <iostream>
#include <memory>
#include <tuple>
#include <unordered_map>
#include "utils.hpp"
#include <hipdnn_data_sdk/utilities/Tensor.hpp>
#include <hipdnn_data_sdk/utilities/Workspace.hpp>
#include <hipdnn_frontend.hpp>
int main()
{
using InputType = hipdnn_data_sdk::types::half;
// params
const int64_t n = 1; // Batch size
const int64_t c = 32; // Number of channels
const int64_t h = 128; // Height
const int64_t w = 128; // Width
const int64_t k = 32; // Number of filters
const int64_t r = 2; // Filter height
const int64_t s = 2; // Filter width
const int64_t axis = 1;
// create graph
auto buildConcatConvGraph = [=](hipdnnHandle_t handle) {
auto graph = std::make_shared<hipdnn_frontend::graph::Graph>();
const auto inputType = hipdnn_frontend::getDataTypeEnumFromType<InputType>();
graph->set_name("concat_conv_pointwise_graph")
.set_io_data_type(inputType)
.set_intermediate_data_type(inputType)
.set_compute_data_type(hipdnn_frontend::DataType::FLOAT);
// create concat
auto x1 = std::make_shared<hipdnn_frontend::graph::TensorAttributes>(
hipdnn_frontend::graph::Tensor_attributes()
.set_name("x1")
.set_dim({n, c, h, w})
.set_stride({c * h * w, 1, c * w, c})
.set_data_type(inputType));
auto x2 = std::make_shared<hipdnn_frontend::graph::TensorAttributes>(
hipdnn_frontend::graph::Tensor_attributes()
.set_name("x2")
.set_dim({n, c, h, w})
.set_stride({c * h * w, 1, c * w, c})
.set_data_type(inputType));
auto concatenateAttributes = hipdnn_frontend::graph::ConcatenateAttributes().set_axis(axis);
auto concatOutput = graph->concatenate({x1, x2}, concatenateAttributes);
// create conv
const int64_t c2 = c * 2;
auto filter = std::make_shared<hipdnn_frontend::graph::TensorAttributes>(
hipdnn_frontend::graph::Tensor_attributes()
.set_name("filter")
.set_dim({k, c2, r, s})
.set_stride({c2 * r * s, 1, c2 * s, c2}));
auto convFpropAttributes = hipdnn_frontend::graph::ConvFpropAttributes()
.set_name("conv_fprop_node")
.set_padding({1, 1})
.set_stride({1, 1})
.set_dilation({1, 1});
auto y = graph->conv_fprop(concatOutput, filter, convFpropAttributes);
// create bias
auto bias = std::make_shared<hipdnn_frontend::graph::TensorAttributes>(
hipdnn_frontend::graph::Tensor_attributes()
.set_name("bias")
.set_dim({1, k, 1, 1})
.set_stride({k, 1, k, k}));
auto biasAttributes = hipdnn_frontend::graph::PointwiseAttributes()
.set_name("bias_node")
.set_mode(hipdnn_frontend::PointwiseMode_t::ADD);
auto biasOutput = graph->pointwise(y, bias, biasAttributes);
biasOutput->set_output(true);
// build graph
HIPDNN_FE_CHECK(graph->build(handle));
return std::make_tuple(graph, x1, x2, filter, bias, biasOutput);
};
auto backend = hipdnn_frontend::detail::hipdnnBackend();
if(!backend)
{
std::cout << "Failed to create backend.\n";
return 1;
}
hipdnnHandle_t handle;
HIPDNN_CHECK(backend->create(&handle));
auto [graph, x1, x2, filter, bias, y] = buildConcatConvGraph(handle);
hipdnn_data_sdk::utilities::Tensor<InputType> x1Tensor(x1->get_dim(), x1->get_stride());
hipdnn_data_sdk::utilities::Tensor<InputType> x2Tensor(x2->get_dim(), x2->get_stride());
hipdnn_data_sdk::utilities::Tensor<InputType> filterTensor(filter->get_dim(),
filter->get_stride());
hipdnn_data_sdk::utilities::Tensor<InputType> biasTensor(bias->get_dim(), bias->get_stride());
hipdnn_data_sdk::utilities::Tensor<InputType> yTensor(y->get_dim(), y->get_stride());
std::unordered_map<int64_t, void*> variantPack;
variantPack[x1->get_uid()] = x1Tensor.memory().deviceData();
variantPack[x2->get_uid()] = x2Tensor.memory().deviceData();
variantPack[filter->get_uid()] = filterTensor.memory().deviceData();
variantPack[bias->get_uid()] = biasTensor.memory().deviceData();
variantPack[y->get_uid()] = yTensor.memory().deviceData();
int64_t workspaceSize = 0;
HIPDNN_FE_CHECK(graph->get_workspace_size(workspaceSize));
const hipdnn_data_sdk::utilities::Workspace workspace(static_cast<size_t>(workspaceSize));
HIPDNN_FE_CHECK(graph->execute(handle, variantPack, workspace.get()));
std::cout << "ConcatConvPointwise graph execution complete. \n";
HIPDNN_CHECK(backend->destroy(handle));
return 0;
}
// Copyright © Advanced Micro Devices, Inc., or its affiliates.
// SPDX-License-Identifier: MIT
#include <cstdint>
#include <iostream>
#include <memory>
#include <tuple>
#include <unordered_map>
#include "utils.hpp"
#include <hipdnn_data_sdk/utilities/Tensor.hpp>
#include <hipdnn_data_sdk/utilities/Workspace.hpp>
#include <hipdnn_frontend.hpp>
int main()
{
using InputType = hipdnn_data_sdk::types::half;
// params
const int64_t n = 1; // Batch size
const int64_t c = 32; // Number of channels
const int64_t h = 128; // Height
const int64_t w = 128; // Width
const int64_t k = 32; // Number of filters
const int64_t r = 3; // Filter height
const int64_t s = 3; // Filter width
const int64_t axis = 1;
// create graph
auto buildConcatConvGraph = [=](hipdnnHandle_t handle) {
auto graph = std::make_shared<hipdnn_frontend::graph::Graph>();
const auto inputType = hipdnn_frontend::getDataTypeEnumFromType<InputType>();
graph->set_name("concat_conv_pointwise_graph")
.set_io_data_type(inputType)
.set_intermediate_data_type(inputType)
.set_compute_data_type(hipdnn_frontend::DataType::FLOAT);
// create concat
auto x1 = std::make_shared<hipdnn_frontend::graph::TensorAttributes>(
hipdnn_frontend::graph::Tensor_attributes()
.set_name("x1")
.set_dim({n, c, h, w})
.set_stride({c * h * w, 1, c * w, c})
.set_data_type(inputType));
auto x2 = std::make_shared<hipdnn_frontend::graph::TensorAttributes>(
hipdnn_frontend::graph::Tensor_attributes()
.set_name("x2")
.set_dim({n, c, h, w})
.set_stride({c * h * w, 1, c * w, c})
.set_data_type(inputType));
auto concatenateAttributes = hipdnn_frontend::graph::ConcatenateAttributes().set_axis(axis);
auto concatOutput = graph->concatenate({x1, x2}, concatenateAttributes);
// create conv
const int64_t c2 = c * 2;
auto filter = std::make_shared<hipdnn_frontend::graph::TensorAttributes>(
hipdnn_frontend::graph::Tensor_attributes()
.set_name("filter")
.set_dim({k, c2, r, s})
.set_stride({c2 * r * s, 1, c2 * s, c2}));
auto convFpropAttributes = hipdnn_frontend::graph::ConvFpropAttributes()
.set_name("conv_fprop_node")
.set_padding({1, 1})
.set_stride({1, 1})
.set_dilation({1, 1});
auto y = graph->conv_fprop(concatOutput, filter, convFpropAttributes);
// create bias
auto bias = std::make_shared<hipdnn_frontend::graph::TensorAttributes>(
hipdnn_frontend::graph::Tensor_attributes()
.set_name("bias")
.set_dim({1, k, 1, 1})
.set_stride({k, 1, k, k}));
auto biasAttributes = hipdnn_frontend::graph::PointwiseAttributes()
.set_name("bias_node")
.set_mode(hipdnn_frontend::PointwiseMode_t::ADD);
auto biasOutput = graph->pointwise(y, bias, biasAttributes);
// create add
auto add = std::make_shared<hipdnn_frontend::graph::TensorAttributes>(
hipdnn_frontend::graph::Tensor_attributes()
.set_name("add")
.set_dim({n, k, h, w})
.set_stride({k * h * w, 1, k * w, k}));
auto addAttributes = hipdnn_frontend::graph::PointwiseAttributes()
.set_name("add_node")
.set_mode(hipdnn_frontend::PointwiseMode_t::ADD);
auto output = graph->pointwise(biasOutput, add, addAttributes);
output->set_output(true);
// build graph
HIPDNN_FE_CHECK(graph->build(handle));
return std::make_tuple(graph, x1, x2, filter, bias, add, output);
};
auto backend = hipdnn_frontend::detail::hipdnnBackend();
if(!backend)
{
std::cout << "Failed to create backend.\n";
return 1;
}
hipdnnHandle_t handle;
HIPDNN_CHECK(backend->create(&handle));
auto [graph, x1, x2, filter, bias, add, y] = buildConcatConvGraph(handle);
hipdnn_data_sdk::utilities::Tensor<InputType> x1Tensor(x1->get_dim(), x1->get_stride());
hipdnn_data_sdk::utilities::Tensor<InputType> x2Tensor(x2->get_dim(), x2->get_stride());
hipdnn_data_sdk::utilities::Tensor<InputType> filterTensor(filter->get_dim(),
filter->get_stride());
hipdnn_data_sdk::utilities::Tensor<InputType> biasTensor(bias->get_dim(), bias->get_stride());
hipdnn_data_sdk::utilities::Tensor<InputType> addTensor(add->get_dim(), add->get_stride());
hipdnn_data_sdk::utilities::Tensor<InputType> yTensor(y->get_dim(), y->get_stride());
std::unordered_map<int64_t, void*> variantPack;
variantPack[x1->get_uid()] = x1Tensor.memory().deviceData();
variantPack[x2->get_uid()] = x2Tensor.memory().deviceData();
variantPack[filter->get_uid()] = filterTensor.memory().deviceData();
variantPack[bias->get_uid()] = biasTensor.memory().deviceData();
variantPack[add->get_uid()] = addTensor.memory().deviceData();
variantPack[y->get_uid()] = yTensor.memory().deviceData();
int64_t workspaceSize = 0;
HIPDNN_FE_CHECK(graph->get_workspace_size(workspaceSize));
const hipdnn_data_sdk::utilities::Workspace workspace(static_cast<size_t>(workspaceSize));
HIPDNN_FE_CHECK(graph->execute(handle, variantPack, workspace.get()));
std::cout << "ConcatConvPointwise graph execution complete. \n";
HIPDNN_CHECK(backend->destroy(handle));
return 0;
}
// Copyright © Advanced Micro Devices, Inc., or its affiliates.
// SPDX-License-Identifier: MIT
#include <cstdint>
#include <iostream>
#include <memory>
#include <tuple>
#include <unordered_map>
#include "utils.hpp"
#include <hipdnn_data_sdk/utilities/Tensor.hpp>
#include <hipdnn_data_sdk/utilities/Workspace.hpp>
#include <hipdnn_frontend.hpp>
int main()
{
using InputType = hipdnn_data_sdk::types::half;
// params
const int64_t n = 1; // Batch size
const int64_t c = 32; // Number of channels
const int64_t h = 128; // Height
const int64_t w = 128; // Width
const int64_t k = 32; // Number of filters
const int64_t r = 2; // Filter height
const int64_t s = 2; // Filter width
const int64_t axis = 1;
// create graph
auto buildConcatConvGraph = [=](hipdnnHandle_t handle) {
auto graph = std::make_shared<hipdnn_frontend::graph::Graph>();
const auto inputType = hipdnn_frontend::getDataTypeEnumFromType<InputType>();
graph->set_name("concat_conv_pointwise_graph")
.set_io_data_type(inputType)
.set_intermediate_data_type(inputType)
.set_compute_data_type(hipdnn_frontend::DataType::FLOAT);
// create concat
auto x1 = std::make_shared<hipdnn_frontend::graph::TensorAttributes>(
hipdnn_frontend::graph::Tensor_attributes()
.set_name("x1")
.set_dim({n, c, h, w})
.set_stride({c * h * w, 1, c * w, c})
.set_data_type(inputType));
auto x2 = std::make_shared<hipdnn_frontend::graph::TensorAttributes>(
hipdnn_frontend::graph::Tensor_attributes()
.set_name("x2")
.set_dim({n, c, h, w})
.set_stride({c * h * w, 1, c * w, c})
.set_data_type(inputType));
auto concatenateAttributes = hipdnn_frontend::graph::ConcatenateAttributes().set_axis(axis);
auto concatOutput = graph->concatenate({x1, x2}, concatenateAttributes);
// create conv
const int64_t c2 = c * 2;
auto filter = std::make_shared<hipdnn_frontend::graph::TensorAttributes>(
hipdnn_frontend::graph::Tensor_attributes()
.set_name("filter")
.set_dim({k, c2, r, s})
.set_stride({c2 * r * s, 1, c2 * s, c2}));
auto convFpropAttributes = hipdnn_frontend::graph::ConvFpropAttributes()
.set_name("conv_fprop_node")
.set_padding({1, 1})
.set_stride({1, 1})
.set_dilation({1, 1});
auto y = graph->conv_fprop(concatOutput, filter, convFpropAttributes);
// create bias
auto bias = std::make_shared<hipdnn_frontend::graph::TensorAttributes>(
hipdnn_frontend::graph::Tensor_attributes()
.set_name("bias")
.set_dim({1, k, 1, 1})
.set_stride({k, 1, k, k}));
auto biasAttributes = hipdnn_frontend::graph::PointwiseAttributes()
.set_name("bias_node")
.set_mode(hipdnn_frontend::PointwiseMode_t::ADD);
auto biasOutput = graph->pointwise(y, bias, biasAttributes);
// create leaky relu
auto reluAttributes = hipdnn_frontend::graph::PointwiseAttributes()
.set_name("relu_node")
.set_mode(hipdnn_frontend::PointwiseMode_t::RELU_FWD)
.set_relu_lower_clip_slope(0.1f);
auto reluOutput = graph->pointwise(biasOutput, reluAttributes);
reluOutput->set_output(true);
// build graph
HIPDNN_FE_CHECK(graph->build(handle));
return std::make_tuple(graph, x1, x2, filter, bias, reluOutput);
};
auto backend = hipdnn_frontend::detail::hipdnnBackend();
if(!backend)
{
std::cout << "Create backend failed. \n";
return 1;
}
hipdnnHandle_t handle;
HIPDNN_CHECK(backend->create(&handle));
auto [graph, x1, x2, filter, bias, y] = buildConcatConvGraph(handle);
hipdnn_data_sdk::utilities::Tensor<InputType> x1Tensor(x1->get_dim(), x1->get_stride());
hipdnn_data_sdk::utilities::Tensor<InputType> x2Tensor(x2->get_dim(), x2->get_stride());
hipdnn_data_sdk::utilities::Tensor<InputType> filterTensor(filter->get_dim(),
filter->get_stride());
hipdnn_data_sdk::utilities::Tensor<InputType> biasTensor(bias->get_dim(), bias->get_stride());
hipdnn_data_sdk::utilities::Tensor<InputType> yTensor(y->get_dim(), y->get_stride());
std::unordered_map<int64_t, void*> variantPack;
variantPack[x1->get_uid()] = x1Tensor.memory().deviceData();
variantPack[x2->get_uid()] = x2Tensor.memory().deviceData();
variantPack[filter->get_uid()] = filterTensor.memory().deviceData();
variantPack[bias->get_uid()] = biasTensor.memory().deviceData();
variantPack[y->get_uid()] = yTensor.memory().deviceData();
int64_t workspaceSize = 0;
HIPDNN_FE_CHECK(graph->get_workspace_size(workspaceSize));
const hipdnn_data_sdk::utilities::Workspace workspace(static_cast<size_t>(workspaceSize));
HIPDNN_FE_CHECK(graph->execute(handle, variantPack, workspace.get()));
std::cout << "ConcatConvPointwise graph execution complete. \n";
HIPDNN_CHECK(backend->destroy(handle));
return 0;
}
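The sample above pairs a 2x2 filter with padding 1, stride 1 and dilation 1, so the convolution output does not keep the input's spatial size. A minimal sketch of the standard convolution output-size formula is shown below. `convOutputDim` is a hypothetical helper, not part of the hipDNN API, and the sketch assumes hipDNN infers output dimensions with the same floor-division formula.

```cpp
#include <cassert>
#include <cstdint>

// Standard convolution output-size formula (floor division).
// Assumption: hipDNN infers output dims the same way; this helper is
// illustrative only and is not a hipDNN function.
int64_t convOutputDim(int64_t in, int64_t filter, int64_t pad, int64_t stride, int64_t dilation)
{
    assert(in > 0 && filter > 0 && stride > 0 && dilation > 0 && pad >= 0);
    // A dilated filter covers dilation * (filter - 1) + 1 input elements.
    const int64_t effectiveFilter = dilation * (filter - 1) + 1;
    return (in + 2 * pad - effectiveFilter) / stride + 1;
}
```

Under that assumption, `convOutputDim(128, 2, 1, 1, 1)` evaluates to 129 for this sample, whereas a 3x3 filter with the same padding gives 128.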
// Copyright © Advanced Micro Devices, Inc., or its affiliates.
// SPDX-License-Identifier: MIT
#include <iostream>
#include <memory>
#include <tuple>
#include <unordered_map>
#include "utils.hpp"
#include <hipdnn_data_sdk/utilities/Tensor.hpp>
#include <hipdnn_data_sdk/utilities/Workspace.hpp>
#include <hipdnn_frontend.hpp>
int main()
{
using InputType = hipdnn_data_sdk::types::half;
// params
const int64_t n = 1; // Batch size
const int64_t c = 32; // Number of channels
const int64_t h = 128; // Height
const int64_t w = 128; // Width
const int64_t k = 32; // Number of filters
const int64_t r = 3; // Filter height
const int64_t s = 3; // Filter width
const int64_t axis = 1;
// create graph
auto buildConcatConvGraph = [=](hipdnnHandle_t handle) {
auto graph = std::make_shared<hipdnn_frontend::graph::Graph>();
const auto inputType = hipdnn_frontend::getDataTypeEnumFromType<InputType>();
graph->set_name("concat_conv_pointwise_graph")
.set_io_data_type(inputType)
.set_intermediate_data_type(inputType)
.set_compute_data_type(hipdnn_frontend::DataType::FLOAT);
// create concat
auto x1 = std::make_shared<hipdnn_frontend::graph::TensorAttributes>(
hipdnn_frontend::graph::Tensor_attributes()
.set_name("x1")
.set_dim({n, c, h, w})
.set_stride({c * h * w, 1, c * w, c})
.set_data_type(inputType));
auto x2 = std::make_shared<hipdnn_frontend::graph::TensorAttributes>(
hipdnn_frontend::graph::Tensor_attributes()
.set_name("x2")
.set_dim({n, c, h, w})
.set_stride({c * h * w, 1, c * w, c})
.set_data_type(inputType));
auto concatenateAttributes = hipdnn_frontend::graph::ConcatenateAttributes().set_axis(axis);
auto concatOutput = graph->concatenate({x1, x2}, concatenateAttributes);
// create conv
const int64_t c2 = c * 2;
auto filter = std::make_shared<hipdnn_frontend::graph::TensorAttributes>(
hipdnn_frontend::graph::Tensor_attributes()
.set_name("filter")
.set_dim({k, c2, r, s})
.set_stride({c2 * r * s, 1, c2 * s, c2}));
auto convFpropAttributes = hipdnn_frontend::graph::ConvFpropAttributes()
.set_name("conv_fprop_node")
.set_padding({1, 1})
.set_stride({1, 1})
.set_dilation({1, 1});
auto y = graph->conv_fprop(concatOutput, filter, convFpropAttributes);
// create bias
auto bias = std::make_shared<hipdnn_frontend::graph::TensorAttributes>(
hipdnn_frontend::graph::Tensor_attributes()
.set_name("bias")
.set_dim({1, k, 1, 1})
.set_stride({k, 1, k, k}));
auto biasAttributes = hipdnn_frontend::graph::PointwiseAttributes()
.set_name("bias_node")
.set_mode(hipdnn_frontend::PointwiseMode_t::ADD);
auto biasOutput = graph->pointwise(y, bias, biasAttributes);
// create leaky relu (negative slope 0.1)
auto reluAttributes = hipdnn_frontend::graph::PointwiseAttributes()
.set_name("relu_node")
.set_mode(hipdnn_frontend::PointwiseMode_t::RELU_FWD)
.set_relu_lower_clip_slope(0.1f);
auto reluOutput = graph->pointwise(biasOutput, reluAttributes);
// create add
auto add = std::make_shared<hipdnn_frontend::graph::TensorAttributes>(
hipdnn_frontend::graph::Tensor_attributes()
.set_name("add")
.set_dim({n, k, h, w})
.set_stride({k * h * w, 1, k * w, k}));
auto addAttributes = hipdnn_frontend::graph::PointwiseAttributes()
.set_name("add_node")
.set_mode(hipdnn_frontend::PointwiseMode_t::ADD);
auto output = graph->pointwise(reluOutput, add, addAttributes);
output->set_output(true);
// build graph
HIPDNN_FE_CHECK(graph->build(handle));
return std::make_tuple(graph, x1, x2, filter, bias, add, output);
};
auto backend = hipdnn_frontend::detail::hipdnnBackend();
if(!backend)
{
std::cout << "Create backend failed. \n";
return 1;
}
hipdnnHandle_t handle;
HIPDNN_CHECK(backend->create(&handle));
auto [graph, x1, x2, filter, bias, add, y] = buildConcatConvGraph(handle);
hipdnn_data_sdk::utilities::Tensor<InputType> x1Tensor(x1->get_dim(), x1->get_stride());
hipdnn_data_sdk::utilities::Tensor<InputType> x2Tensor(x2->get_dim(), x2->get_stride());
hipdnn_data_sdk::utilities::Tensor<InputType> filterTensor(filter->get_dim(),
filter->get_stride());
hipdnn_data_sdk::utilities::Tensor<InputType> biasTensor(bias->get_dim(), bias->get_stride());
hipdnn_data_sdk::utilities::Tensor<InputType> addTensor(add->get_dim(), add->get_stride());
hipdnn_data_sdk::utilities::Tensor<InputType> yTensor(y->get_dim(), y->get_stride());
std::unordered_map<int64_t, void*> variantPack;
variantPack[x1->get_uid()] = x1Tensor.memory().deviceData();
variantPack[x2->get_uid()] = x2Tensor.memory().deviceData();
variantPack[filter->get_uid()] = filterTensor.memory().deviceData();
variantPack[bias->get_uid()] = biasTensor.memory().deviceData();
variantPack[add->get_uid()] = addTensor.memory().deviceData();
variantPack[y->get_uid()] = yTensor.memory().deviceData();
int64_t workspaceSize = 0;
HIPDNN_FE_CHECK(graph->get_workspace_size(workspaceSize));
const hipdnn_data_sdk::utilities::Workspace workspace(static_cast<size_t>(workspaceSize));
HIPDNN_FE_CHECK(graph->execute(handle, variantPack, workspace.get()));
std::cout << "ConcatConvPointwise graph execution complete. \n";
HIPDNN_CHECK(backend->destroy(handle));
return 0;
}
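The pointwise chain that follows the convolution in the sample above (bias add, ReLU with a lower-clip slope, then a residual add) can be sketched per element on the host. `biasLeakyReluAdd` is a hypothetical name. The sketch assumes that `RELU_FWD` combined with `set_relu_lower_clip_slope` acts as a leaky ReLU, which the "leaky relu" comment in the neighbouring sample suggests but which is not confirmed here.

```cpp
#include <cassert>

// Per-element reference for the epilogue: (conv + bias) -> leaky ReLU -> + residual.
// Assumption: negative inputs are scaled by `slope`; non-negative inputs pass through.
// Hypothetical helper for host-side checks, not a hipDNN function.
float biasLeakyReluAdd(float convValue, float bias, float residual, float slope)
{
    assert(slope >= 0.0f);
    const float biased = convValue + bias;
    const float activated = biased > 0.0f ? biased : slope * biased;
    return activated + residual;
}
```

A helper of this kind could be applied over a host copy of the tensors to spot-check the fused result within a half-precision tolerance.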
// Copyright © Advanced Micro Devices, Inc., or its affiliates.
// SPDX-License-Identifier: MIT
#include <iostream>
#include <memory>
#include <tuple>
#include <unordered_map>
#include "utils.hpp"
#include <hipdnn_data_sdk/utilities/Tensor.hpp>
#include <hipdnn_data_sdk/utilities/Workspace.hpp>
#include <hipdnn_frontend.hpp>
int main()
{
using InputType = hipdnn_data_sdk::types::half;
// params
const int64_t n = 1; // Batch size
const int64_t c = 16; // Number of channels
const int64_t h = 16; // Height
const int64_t w = 16; // Width
const int64_t axis = 0;
// create graph
auto buildConcatenateGraph = [=](hipdnnHandle_t handle) {
auto graph = std::make_shared<hipdnn_frontend::graph::Graph>();
const auto inputType = hipdnn_frontend::getDataTypeEnumFromType<InputType>();
graph->set_name("ConcatenateGraph")
.set_io_data_type(inputType)
.set_intermediate_data_type(inputType)
.set_compute_data_type(hipdnn_frontend::DataType::FLOAT);
auto x1 = std::make_shared<hipdnn_frontend::graph::TensorAttributes>(
hipdnn_frontend::graph::Tensor_attributes()
.set_name("x1")
.set_dim({n, c, h, w})
.set_stride({c * h * w, h * w, w, 1})
.set_data_type(inputType));
auto x2 = std::make_shared<hipdnn_frontend::graph::TensorAttributes>(
hipdnn_frontend::graph::Tensor_attributes()
.set_name("x2")
.set_dim({n, c, h, w})
.set_stride({c * h * w, h * w, w, 1})
.set_data_type(inputType));
auto concatenateAttributes = hipdnn_frontend::graph::ConcatenateAttributes().set_axis(axis);
auto y = graph->concatenate({x1, x2}, concatenateAttributes);
y->set_output(true);
// build graph
HIPDNN_FE_CHECK(graph->build(handle));
return std::make_tuple(graph, x1, x2, y);
};
auto backend = hipdnn_frontend::detail::hipdnnBackend();
if(!backend)
{
std::cout << "Create backend failed. \n";
return 1;
}
hipdnnHandle_t handle;
HIPDNN_CHECK(backend->create(&handle));
auto [graph, x1, x2, y] = buildConcatenateGraph(handle);
hipdnn_data_sdk::utilities::Tensor<InputType> x1Tensor(x1->get_dim(), x1->get_stride());
hipdnn_data_sdk::utilities::Tensor<InputType> x2Tensor(x2->get_dim(), x2->get_stride());
hipdnn_data_sdk::utilities::Tensor<InputType> yTensor(y->get_dim(), y->get_stride());
std::unordered_map<int64_t, void*> variantPack;
variantPack[x1->get_uid()] = x1Tensor.memory().deviceData();
variantPack[x2->get_uid()] = x2Tensor.memory().deviceData();
variantPack[y->get_uid()] = yTensor.memory().deviceData();
int64_t workspaceSize = 0;
HIPDNN_FE_CHECK(graph->get_workspace_size(workspaceSize));
const hipdnn_data_sdk::utilities::Workspace workspace(static_cast<size_t>(workspaceSize));
HIPDNN_FE_CHECK(graph->execute(handle, variantPack, workspace.get()));
std::cout << "Concatenate graph execution complete. \n";
HIPDNN_CHECK(backend->destroy(handle));
return 0;
}
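Concatenation sums the extents along the chosen axis and leaves every other dimension unchanged, which is why the fused samples size their filter with `c2 = c * 2` after concatenating on axis 1. A minimal host-side sketch of that shape rule follows; `concatDims` is a hypothetical helper and not part of hipDNN.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Shape of a concatenation: extents add up along `axis`, all other dims must match.
// Hypothetical helper for illustration only.
std::vector<int64_t> concatDims(const std::vector<std::vector<int64_t>>& inputs, std::size_t axis)
{
    assert(!inputs.empty() && axis < inputs[0].size());
    std::vector<int64_t> out = inputs[0];
    for(std::size_t i = 1; i < inputs.size(); ++i)
    {
        assert(inputs[i].size() == out.size());
        for(std::size_t d = 0; d < out.size(); ++d)
        {
            if(d == axis)
            {
                out[d] += inputs[i][d]; // accumulate along the concat axis
            }
            else
            {
                assert(inputs[i][d] == out[d]); // non-concat dims must agree
            }
        }
    }
    return out;
}
```

For the sample above (two `{1, 16, 16, 16}` inputs, axis 0) this rule gives `{2, 16, 16, 16}`.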
// Copyright © Advanced Micro Devices, Inc., or its affiliates.
// SPDX-License-Identifier: MIT
#include <iostream>
#include <memory>
#include <tuple>
#include <unordered_map>
#include <hipdnn_data_sdk/utilities/Tensor.hpp>
#include <hipdnn_data_sdk/utilities/Workspace.hpp>
#include <hipdnn_frontend.hpp>
#include "../utils.hpp"
int main()
{
using InputType = float;
const int64_t n = 4;
const int64_t c = 64;
const int64_t h = 16;
const int64_t w = 16;
const int64_t k = 32;
const int64_t r = 3;
const int64_t s = 3;
auto buildConvGenstatsGraph = [=](hipdnnHandle_t handle) {
auto graph = std::make_shared<hipdnn_frontend::graph::Graph>();
graph->set_name("conv_genstats_graph")
.set_io_data_type(hipdnn_frontend::getDataTypeEnumFromType<InputType>())
.set_intermediate_data_type(hipdnn_frontend::getDataTypeEnumFromType<InputType>())
.set_compute_data_type(hipdnn_frontend::DataType::FLOAT);
auto x = std::make_shared<hipdnn_frontend::graph::TensorAttributes>(
hipdnn_frontend::graph::Tensor_attributes()
.set_name("x")
.set_dim({n, c, h, w})
.set_stride({c * h * w, 1, c * w, c}));
auto filter = std::make_shared<hipdnn_frontend::graph::TensorAttributes>(
hipdnn_frontend::graph::Tensor_attributes()
.set_name("filter")
.set_dim({k, c, r, s})
.set_stride({c * r * s, 1, c * s, c}));
auto convAttrs = hipdnn_frontend::graph::ConvFpropAttributes()
.set_name("conv_fprop_node")
.set_padding({1, 1})
.set_stride({1, 1})
.set_dilation({1, 1});
auto y = graph->conv_fprop(x, filter, convAttrs);
auto genstatsAttrs = hipdnn_frontend::graph::GenstatsAttributes().set_name("genstats_node");
auto [sum, sqSum] = graph->genstats(y, genstatsAttrs);
y->set_output(true);
sum->set_output(true);
sqSum->set_output(true);
HIPDNN_FE_CHECK(graph->build(handle));
return std::make_tuple(graph, x, filter, y, sum, sqSum);
};
auto backend = hipdnn_frontend::detail::hipdnnBackend();
if(!backend)
{
std::cout << "Create backend failed. \n";
return 1;
}
hipdnnHandle_t handle;
HIPDNN_CHECK(backend->create(&handle));
auto [graph, x, filter, y, sum, sqSum] = buildConvGenstatsGraph(handle);
hipdnn_data_sdk::utilities::Tensor<InputType> xTensor(x->get_dim(), x->get_stride());
hipdnn_data_sdk::utilities::Tensor<InputType> filterTensor(filter->get_dim(),
filter->get_stride());
hipdnn_data_sdk::utilities::Tensor<InputType> yTensor(y->get_dim(), y->get_stride());
hipdnn_data_sdk::utilities::Tensor<InputType> sumTensor(sum->get_dim(), sum->get_stride());
hipdnn_data_sdk::utilities::Tensor<InputType> sqSumTensor(sqSum->get_dim(),
sqSum->get_stride());
std::unordered_map<int64_t, void*> variantPack;
variantPack[x->get_uid()] = xTensor.memory().deviceData();
variantPack[filter->get_uid()] = filterTensor.memory().deviceData();
variantPack[y->get_uid()] = yTensor.memory().deviceData();
variantPack[sum->get_uid()] = sumTensor.memory().deviceData();
variantPack[sqSum->get_uid()] = sqSumTensor.memory().deviceData();
int64_t workspaceSize = 0;
HIPDNN_FE_CHECK(graph->get_workspace_size(workspaceSize));
const hipdnn_data_sdk::utilities::Workspace workspace(static_cast<size_t>(workspaceSize));
HIPDNN_FE_CHECK(graph->execute(handle, variantPack, workspace.get()));
std::cout << "ConvGenstats graph execution complete. \n";
HIPDNN_CHECK(backend->destroy(handle));
return 0;
}
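The genstats node above returns two tensors named `sum` and `sqSum`. A host-side sketch of what such statistics would look like is given below, assuming they are the per-channel sum and sum of squares of the convolution output, as the names suggest; the source does not confirm the exact semantics. `genstatsReference` is a hypothetical helper, and it expects an NHWC-packed buffer, matching the strides used in the sample.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Per-channel sum and sum of squares over an NHWC-packed buffer.
// Assumption: this mirrors what the genstats node computes; not verified against hipDNN.
std::pair<std::vector<float>, std::vector<float>>
    genstatsReference(const std::vector<float>& y, std::size_t channels)
{
    assert(channels > 0 && y.size() % channels == 0);
    std::vector<float> sum(channels, 0.0f);
    std::vector<float> sqSum(channels, 0.0f);
    for(std::size_t i = 0; i < y.size(); ++i)
    {
        const std::size_t ch = i % channels; // NHWC: channel is the fastest-varying index
        sum[ch] += y[i];
        sqSum[ch] += y[i] * y[i];
    }
    return {sum, sqSum};
}
```

Statistics of this form are what batch normalization needs to derive a mean and variance per channel.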
// Copyright © Advanced Micro Devices, Inc., or its affiliates.
// SPDX-License-Identifier: MIT
#include <iostream>
#include <memory>
#include <tuple>
#include <unordered_map>
#include <hipdnn_data_sdk/utilities/Tensor.hpp>
#include <hipdnn_data_sdk/utilities/Workspace.hpp>
#include <hipdnn_frontend.hpp>
#include "../utils.hpp"
int main()
{
using InputType = float;
const int64_t n = 1;
const int64_t c = 4;
const int64_t h = 32;
const int64_t w = 32;
auto buildMulMulAddAddGraph = [=](hipdnnHandle_t handle) {
auto graph = std::make_shared<hipdnn_frontend::graph::Graph>();
graph->set_name("mul_mul_add_add_graph")
.set_io_data_type(hipdnn_frontend::getDataTypeEnumFromType<InputType>())
.set_intermediate_data_type(hipdnn_frontend::getDataTypeEnumFromType<InputType>())
.set_compute_data_type(hipdnn_frontend::DataType::FLOAT);
auto a = std::make_shared<hipdnn_frontend::graph::TensorAttributes>(
hipdnn_frontend::graph::Tensor_attributes()
.set_name("a")
.set_dim({1, c, 1, 1})
.set_stride({c, 1, c, c}));
auto x = std::make_shared<hipdnn_frontend::graph::TensorAttributes>(
hipdnn_frontend::graph::Tensor_attributes()
.set_name("x")
.set_dim({n, c, h, w})
.set_stride({c * h * w, 1, c * w, c}));
auto b = std::make_shared<hipdnn_frontend::graph::TensorAttributes>(
hipdnn_frontend::graph::Tensor_attributes()
.set_name("b")
.set_dim({1, c, 1, 1})
.set_stride({c, 1, c, c}));
auto y = std::make_shared<hipdnn_frontend::graph::TensorAttributes>(
hipdnn_frontend::graph::Tensor_attributes()
.set_name("y")
.set_dim({n, c, h, w})
.set_stride({c * h * w, 1, c * w, c}));
auto bias = std::make_shared<hipdnn_frontend::graph::TensorAttributes>(
hipdnn_frontend::graph::Tensor_attributes()
.set_name("bias")
.set_dim({1, c, 1, 1})
.set_stride({c, 1, c, c}));
auto mulAttrs0 = hipdnn_frontend::graph::PointwiseAttributes()
.set_name("mul0_node")
.set_mode(hipdnn_frontend::PointwiseMode::MUL);
auto mulOut0 = graph->pointwise(a, x, mulAttrs0);
auto mulAttrs1 = hipdnn_frontend::graph::PointwiseAttributes()
.set_name("mul1_node")
.set_mode(hipdnn_frontend::PointwiseMode::MUL);
auto mulOut1 = graph->pointwise(b, y, mulAttrs1);
auto addAttrs0 = hipdnn_frontend::graph::PointwiseAttributes()
.set_name("add0_node")
.set_mode(hipdnn_frontend::PointwiseMode::ADD);
auto addOut0 = graph->pointwise(mulOut0, mulOut1, addAttrs0);
auto addAttrs1 = hipdnn_frontend::graph::PointwiseAttributes()
.set_name("add1_node")
.set_mode(hipdnn_frontend::PointwiseMode::ADD);
auto z = graph->pointwise(addOut0, bias, addAttrs1);
z->set_output(true);
HIPDNN_FE_CHECK(graph->build(handle));
return std::make_tuple(graph, a, x, b, y, bias, z);
};
auto backend = hipdnn_frontend::detail::hipdnnBackend();
if(!backend)
{
std::cout << "Create backend failed. \n";
return 1;
}
hipdnnHandle_t handle;
HIPDNN_CHECK(backend->create(&handle));
auto [graph, a, x, b, y, bias, z] = buildMulMulAddAddGraph(handle);
hipdnn_data_sdk::utilities::Tensor<InputType> aTensor(a->get_dim(), a->get_stride());
hipdnn_data_sdk::utilities::Tensor<InputType> xTensor(x->get_dim(), x->get_stride());
hipdnn_data_sdk::utilities::Tensor<InputType> bTensor(b->get_dim(), b->get_stride());
hipdnn_data_sdk::utilities::Tensor<InputType> yTensor(y->get_dim(), y->get_stride());
hipdnn_data_sdk::utilities::Tensor<InputType> biasTensor(bias->get_dim(), bias->get_stride());
hipdnn_data_sdk::utilities::Tensor<InputType> zTensor(z->get_dim(), z->get_stride());
std::unordered_map<int64_t, void*> variantPack;
variantPack[a->get_uid()] = aTensor.memory().deviceData();
variantPack[x->get_uid()] = xTensor.memory().deviceData();
variantPack[b->get_uid()] = bTensor.memory().deviceData();
variantPack[y->get_uid()] = yTensor.memory().deviceData();
variantPack[bias->get_uid()] = biasTensor.memory().deviceData();
variantPack[z->get_uid()] = zTensor.memory().deviceData();
int64_t workspaceSize = 0;
HIPDNN_FE_CHECK(graph->get_workspace_size(workspaceSize));
const hipdnn_data_sdk::utilities::Workspace workspace(static_cast<size_t>(workspaceSize));
HIPDNN_FE_CHECK(graph->execute(handle, variantPack, workspace.get()));
std::cout << "MulMulAddAdd graph execution complete. \n";
HIPDNN_CHECK(backend->destroy(handle));
return 0;
}
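The graph above computes `z = a * x + b * y + bias`, where `a`, `b` and `bias` have shape `{1, c, 1, 1}` and broadcast over the `{n, c, h, w}` tensors. A minimal host-side reference is sketched below; `mulMulAddAddReference` is a hypothetical helper, and it assumes NHWC-packed buffers (as in the sample's strides) so that the channel of element `i` is `i % c`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Host reference for z = a * x + b * y + bias with per-channel broadcast.
// Assumption: x and y are NHWC-packed, so the channel index is i % channels.
// Hypothetical helper, not part of hipDNN.
std::vector<float> mulMulAddAddReference(const std::vector<float>& a,
                                         const std::vector<float>& x,
                                         const std::vector<float>& b,
                                         const std::vector<float>& y,
                                         const std::vector<float>& bias)
{
    const std::size_t channels = a.size();
    assert(channels > 0 && b.size() == channels && bias.size() == channels);
    assert(x.size() == y.size() && x.size() % channels == 0);
    std::vector<float> z(x.size());
    for(std::size_t i = 0; i < x.size(); ++i)
    {
        const std::size_t ch = i % channels;
        z[i] = a[ch] * x[i] + b[ch] * y[i] + bias[ch];
    }
    return z;
}
```

Comparing a result of this kind against a host copy of `zTensor` is one way to validate the fused graph.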