Unverified commit a075d629, authored by Edgar Andrés Margffoy Tuay, committed by GitHub

PR: Add CMake build and function tracing tests (#2577)



* Add CMake build pipeline

* Add CMake build workflow

* Add executable permissions to script

* Install cmake on Windows/MacOS

* Install conda-build before setting up MSVC

* Install PyTorch from nightly

* Do not use conda-build variables

* Add path to CMake

* Install libpng and libjpeg

* Perform make

* Call msbuild on Windows

* Add missing yq

* Use vc_env_helper

* Use string instruction

* Escape configuration option

* Remove configuration flag

* Try to pass -p

* Use caret to escape equal sign

* Escape string option in Windows

* Try to call other bat

* Remove Windows/GPU CMake

* Add tracing cpp test

* Script model instead of tracing it

* Try to register operators manually

* Use manylinux-cuda102

* Activate conda env on Linux

* Build and run sample tracing test

* Add empty echo

* Remove unnecessary register

* Copy headers on Mac

* Revert to 2xlarge

* Include /usr/local/include on Mac

* Install pillow on Windows

* Install future

* Install torchvision on Windows

* Set include flag

* Add torchlib to PATH

* Normalize path via cygpath

* Register ops on Windows

* Minor error correction

* Register CPU/GPU ops on DLL library and register ops via reference

* Install dataclasses

* Install dataclasses using pip

* Address clang formatting issue

* Try to use an actual GPU instance on Linux

* Remove extra environment section

* Declare environment explicitly

* Regenerate

* Pass env variables to Docker

* Regenerate circleci

* Test tracing on GPU

* Use GPU medium

* Regenerate

* Use cuda101

* Regenerate

* Do not use pre-trained weights

Avoids having to download pretrained files, which could cause flaky tests
Co-authored-by: Francisco Massa <fvsmassa@gmail.com>
parent 3b31b724
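The note above about skipping pretrained weights maps to the trace_model.py file further down in this diff: the detection model is built with pretrained=False and then scripted, so the CI job never has to download a checkpoint. A minimal sketch of that flow, using the same output filename that the C++ test later loads (the sanity-check call at the end is illustrative, not part of the PR):

import torch
import torchvision

# Build Faster R-CNN without pretrained weights so CI does not depend on a download.
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(pretrained=False)
model.eval()

# Script (rather than trace) the model to preserve its data-dependent control flow.
scripted = torch.jit.script(model)
scripted.save("fasterrcnn_resnet50_fpn.pt")

# Illustrative sanity check: the scripted model takes a List[Tensor] of 3xHxW images.
with torch.no_grad():
    print(scripted([torch.rand(3, 256, 275), torch.rand(3, 256, 275)]))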
...@@ -551,6 +551,76 @@ jobs:
      - store_test_results:
          path: test-results
cmake_linux_cpu:
<<: *binary_common
docker:
- image: "pytorch/manylinux-cuda102"
resource_class: 2xlarge+
steps:
- checkout_merge
- run:
name: Setup conda
command: .circleci/unittest/linux/scripts/setup_env.sh
- run: packaging/build_cmake.sh
cmake_linux_gpu:
<<: *binary_common
machine:
image: ubuntu-1604-cuda-10.1:201909-23
resource_class: gpu.small
environment:
PYTHON_VERSION: << parameters.python_version >>
PYTORCH_VERSION: << parameters.pytorch_version >>
UNICODE_ABI: << parameters.unicode_abi >>
CU_VERSION: << parameters.cu_version >>
steps:
- checkout_merge
- run:
name: Setup conda
command: docker run -e CU_VERSION -e PYTHON_VERSION -e UNICODE_ABI -e PYTORCH_VERSION -t --gpus all -v $PWD:$PWD -w $PWD << parameters.wheel_docker_image >> .circleci/unittest/linux/scripts/setup_env.sh
- run:
name: Build torchvision C++ distribution and test
command: docker run -e CU_VERSION -e PYTHON_VERSION -e UNICODE_ABI -e PYTORCH_VERSION -t --gpus all -v $PWD:$PWD -w $PWD << parameters.wheel_docker_image >> packaging/build_cmake.sh
cmake_macos_cpu:
<<: *binary_common
macos:
xcode: "9.0"
steps:
- checkout_merge
- run:
command: |
curl -o conda.sh https://repo.anaconda.com/miniconda/Miniconda3-latest-MacOSX-x86_64.sh
sh conda.sh -b
source $HOME/miniconda3/bin/activate
conda install -yq conda-build cmake
packaging/build_cmake.sh
cmake_windows_cpu:
<<: *binary_common
executor:
name: windows-cpu
steps:
- checkout_merge
- run:
command: |
set -ex
source packaging/windows/internal/vc_install_helper.sh
packaging/build_cmake.sh
cmake_windows_gpu:
<<: *binary_common
executor:
name: windows-gpu
steps:
- checkout_merge
- run:
command: |
set -ex
source packaging/windows/internal/vc_install_helper.sh
packaging/windows/internal/cuda_install.bat
packaging/build_cmake.sh
workflows:
  build:
    jobs:
...@@ -921,6 +991,27 @@ workflows:
          cu_version: cu101
          name: unittest_windows_gpu_py3.8
          python_version: '3.8'
cmake:
jobs:
- cmake_linux_cpu:
cu_version: cpu
name: cmake_linux_cpu
python_version: '3.8'
- cmake_linux_gpu:
cu_version: cu101
name: cmake_linux_gpu
python_version: '3.8'
wheel_docker_image: pytorch/manylinux-cuda101
- cmake_windows_cpu:
cu_version: cpu
name: cmake_windows_cpu
python_version: '3.8'
- cmake_macos_cpu:
cu_version: cpu
name: cmake_macos_cpu
python_version: '3.8'
  nightly:
    jobs:
      - circleci_consistency
...
...@@ -551,6 +551,76 @@ jobs:
      - store_test_results:
          path: test-results
cmake_linux_cpu:
<<: *binary_common
docker:
- image: "pytorch/manylinux-cuda102"
resource_class: 2xlarge+
steps:
- checkout_merge
- run:
name: Setup conda
command: .circleci/unittest/linux/scripts/setup_env.sh
- run: packaging/build_cmake.sh
cmake_linux_gpu:
<<: *binary_common
machine:
image: ubuntu-1604-cuda-10.1:201909-23
resource_class: gpu.small
environment:
PYTHON_VERSION: << parameters.python_version >>
PYTORCH_VERSION: << parameters.pytorch_version >>
UNICODE_ABI: << parameters.unicode_abi >>
CU_VERSION: << parameters.cu_version >>
steps:
- checkout_merge
- run:
name: Setup conda
command: docker run -e CU_VERSION -e PYTHON_VERSION -e UNICODE_ABI -e PYTORCH_VERSION -t --gpus all -v $PWD:$PWD -w $PWD << parameters.wheel_docker_image >> .circleci/unittest/linux/scripts/setup_env.sh
- run:
name: Build torchvision C++ distribution and test
command: docker run -e CU_VERSION -e PYTHON_VERSION -e UNICODE_ABI -e PYTORCH_VERSION -t --gpus all -v $PWD:$PWD -w $PWD << parameters.wheel_docker_image >> packaging/build_cmake.sh
cmake_macos_cpu:
<<: *binary_common
macos:
xcode: "9.0"
steps:
- checkout_merge
- run:
command: |
curl -o conda.sh https://repo.anaconda.com/miniconda/Miniconda3-latest-MacOSX-x86_64.sh
sh conda.sh -b
source $HOME/miniconda3/bin/activate
conda install -yq conda-build cmake
packaging/build_cmake.sh
cmake_windows_cpu:
<<: *binary_common
executor:
name: windows-cpu
steps:
- checkout_merge
- run:
command: |
set -ex
source packaging/windows/internal/vc_install_helper.sh
packaging/build_cmake.sh
cmake_windows_gpu:
<<: *binary_common
executor:
name: windows-gpu
steps:
- checkout_merge
- run:
command: |
set -ex
source packaging/windows/internal/vc_install_helper.sh
packaging/windows/internal/cuda_install.bat
packaging/build_cmake.sh
workflows:
  build:
{%- if True %}
...@@ -564,6 +634,11 @@ workflows:
  unittest:
    jobs:
      {{ unittest_workflows() }}
cmake:
jobs:
{{ cmake_workflows() }}
  nightly:
{%- endif %}
    jobs:
...
...@@ -184,6 +184,25 @@ def unittest_workflows(indentation=6):
    return indent(indentation, jobs)
def cmake_workflows(indentation=6):
jobs = []
python_version = '3.8'
for os_type in ['linux', 'windows', 'macos']:
        # Right now CMake builds are failing on Windows (GPU)
device_types = ['cpu', 'gpu'] if os_type == 'linux' else ['cpu']
for device in device_types:
job = {
'name': f'cmake_{os_type}_{device}',
'python_version': python_version
}
job['cu_version'] = 'cu101' if device == 'gpu' else 'cpu'
if device == 'gpu':
job['wheel_docker_image'] = 'pytorch/manylinux-cuda101'
jobs.append({f'cmake_{os_type}_{device}': job})
return indent(indentation, jobs)
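For reference, a rough sketch of the entries this generator emits. It mirrors the loop above and uses PyYAML purely to visualize the output (an assumption for this sketch); regenerate.py itself relies on its own indent() helper and the Jinja template:

import yaml  # PyYAML, assumed here only for pretty-printing

def cmake_jobs(python_version='3.8'):
    # One CPU job per OS, plus a GPU job on Linux (Windows GPU is disabled for now).
    jobs = []
    for os_type in ['linux', 'windows', 'macos']:
        device_types = ['cpu', 'gpu'] if os_type == 'linux' else ['cpu']
        for device in device_types:
            job = {
                'name': f'cmake_{os_type}_{device}',
                'python_version': python_version,
                'cu_version': 'cu101' if device == 'gpu' else 'cpu',
            }
            if device == 'gpu':
                job['wheel_docker_image'] = 'pytorch/manylinux-cuda101'
            jobs.append({f'cmake_{os_type}_{device}': job})
    return jobs

# Prints YAML close to the cmake: workflow section added to config.yml above.
print(yaml.dump(cmake_jobs(), default_flow_style=False, sort_keys=False))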
if __name__ == "__main__": if __name__ == "__main__":
d = os.path.dirname(__file__) d = os.path.dirname(__file__)
env = jinja2.Environment( env = jinja2.Environment(
...@@ -196,4 +215,5 @@ if __name__ == "__main__": ...@@ -196,4 +215,5 @@ if __name__ == "__main__":
f.write(env.get_template('config.yml.in').render( f.write(env.get_template('config.yml.in').render(
build_workflows=build_workflows, build_workflows=build_workflows,
unittest_workflows=unittest_workflows, unittest_workflows=unittest_workflows,
cmake_workflows=cmake_workflows,
)) ))
#!/bin/bash
set -ex
if [[ "$(uname)" != Darwin && "$OSTYPE" != "msys" ]]; then
eval "$(./conda/bin/conda shell.bash hook)"
conda activate ./env
fi
script_dir="$( cd "$( dirname "${BASH_SOURCE[0]}" )" >/dev/null 2>&1 && pwd )"
. "$script_dir/pkg_helpers.bash"
export BUILD_TYPE=conda
setup_env 0.8.0
export SOURCE_ROOT_DIR="$PWD"
setup_conda_pytorch_constraint
setup_conda_cudatoolkit_plain_constraint
if [[ "$OSTYPE" == "msys" ]]; then
conda install -yq conda-build cmake pillow future
pip install dataclasses
fi
setup_visual_studio_constraint
setup_junit_results_folder
conda install -yq pytorch=$PYTORCH_VERSION $CONDA_CUDATOOLKIT_CONSTRAINT $CONDA_CPUONLY_FEATURE -c pytorch-nightly
TORCH_PATH=$(dirname $(python -c "import torch; print(torch.__file__)"))
if [[ "$(uname)" == Darwin || "$OSTYPE" == "msys" ]]; then
conda install -yq libpng jpeg
else
yum install -y libpng-devel libjpeg-turbo-devel
fi
mkdir cpp_build
pushd cpp_build
# Generate libtorchvision files
cmake .. -DTorch_DIR=$TORCH_PATH/share/cmake/Torch -DWITH_CUDA=$CMAKE_USE_CUDA
# Compile and install libtorchvision
if [[ "$OSTYPE" == "msys" ]]; then
"$script_dir/windows/internal/vc_env_helper.bat" "$script_dir/windows/internal/build_cmake.bat"
CONDA_PATH=$(dirname $(which python))
cp -r "C:/Program Files (x86)/torchvision/include/torchvision" $CONDA_PATH/include
else
make
make install
if [[ "$(uname)" == Darwin ]]; then
CONDA_PATH=$(dirname $(dirname $(which python)))
cp -r /usr/local/include/torchvision $CONDA_PATH/include/
export C_INCLUDE_PATH=/usr/local/include
export CPLUS_INCLUDE_PATH=/usr/local/include
fi
fi
popd
# Install torchvision locally
python setup.py develop
# Trace, compile and run project that uses Faster-RCNN
cd test/tracing/frcnn
mkdir build
# Trace model
python trace_model.py
cp fasterrcnn_resnet50_fpn.pt build
cd build
cmake .. -DTorch_DIR=$TORCH_PATH/share/cmake/Torch -DWITH_CUDA=$CMAKE_USE_CUDA
if [[ "$OSTYPE" == "msys" ]]; then
"$script_dir/windows/internal/vc_env_helper.bat" "$script_dir/windows/internal/build_frcnn.bat"
mv fasterrcnn_resnet50_fpn.pt Release
cd Release
export PATH=$(cygpath "C:/Program Files (x86)/torchvision/bin"):$(cygpath $TORCH_PATH)/lib:$PATH
else
make
fi
# Run traced program
./test_frcnn_tracing
...@@ -289,6 +289,39 @@ setup_conda_cudatoolkit_constraint() {
  fi
}
setup_conda_cudatoolkit_plain_constraint() {
export CONDA_CPUONLY_FEATURE=""
export CMAKE_USE_CUDA=1
if [[ "$(uname)" == Darwin ]]; then
export CONDA_CUDATOOLKIT_CONSTRAINT=""
export CMAKE_USE_CUDA=0
else
case "$CU_VERSION" in
cu102)
export CONDA_CUDATOOLKIT_CONSTRAINT="cudatoolkit=10.2"
;;
cu101)
export CONDA_CUDATOOLKIT_CONSTRAINT="cudatoolkit=10.1"
;;
cu100)
export CONDA_CUDATOOLKIT_CONSTRAINT="cudatoolkit=10.0"
;;
cu92)
export CONDA_CUDATOOLKIT_CONSTRAINT="cudatoolkit=9.2"
;;
cpu)
export CONDA_CUDATOOLKIT_CONSTRAINT=""
export CONDA_CPUONLY_FEATURE="cpuonly"
export CMAKE_USE_CUDA=0
;;
*)
echo "Unrecognized CU_VERSION=$CU_VERSION"
exit 1
;;
esac
fi
}
# Build the proper compiler package before building the final package
setup_visual_studio_constraint() {
  if [[ "$OSTYPE" == "msys" ]]; then
...
@echo on
msbuild "-p:Configuration=Release" torchvision.vcxproj
msbuild "-p:Configuration=Release" INSTALL.vcxproj
@echo on
set CL=/I"C:\Program Files (x86)\torchvision\include"
msbuild "-p:Configuration=Release" test_frcnn_tracing.vcxproj
cmake_minimum_required(VERSION 3.1 FATAL_ERROR)
project(test_frcnn_tracing)
find_package(Torch REQUIRED)
find_package(TorchVision REQUIRED)
# This is needed because some headers import Python.h
find_package(Python3 COMPONENTS Development)
add_executable(test_frcnn_tracing test_frcnn_tracing.cpp)
target_compile_features(test_frcnn_tracing PUBLIC cxx_range_for)
target_link_libraries(test_frcnn_tracing ${TORCH_LIBRARIES} TorchVision::TorchVision Python3::Python)
set_property(TARGET test_frcnn_tracing PROPERTY CXX_STANDARD 14)
#include <ATen/ATen.h>
#include <torch/script.h>
#include <torch/torch.h>
#include <torchvision/ROIAlign.h>
#include <torchvision/cpu/vision_cpu.h>
#include <torchvision/nms.h>
#ifdef _WIN32
// Windows only
// This is necessary until operators are automatically registered on include
static auto _nms = &nms_cpu;
#endif
int main() {
torch::DeviceType device_type;
device_type = torch::kCPU;
torch::jit::script::Module module;
try {
std::cout << "Loading model\n";
// Deserialize the ScriptModule from a file using torch::jit::load().
module = torch::jit::load("fasterrcnn_resnet50_fpn.pt");
std::cout << "Model loaded\n";
} catch (const torch::Error& e) {
std::cout << "error loading the model\n";
return -1;
} catch (const std::exception& e) {
std::cout << "Other error: " << e.what() << "\n";
return -1;
}
// TorchScript models require a List[IValue] as input
std::vector<torch::jit::IValue> inputs;
// Faster RCNN accepts a List[Tensor] as main input
std::vector<torch::Tensor> images;
images.push_back(torch::rand({3, 256, 275}));
images.push_back(torch::rand({3, 256, 275}));
inputs.push_back(images);
auto output = module.forward(inputs);
std::cout << "ok\n";
std::cout << "output" << output << "\n";
if (torch::cuda::is_available()) {
// Move traced model to GPU
module.to(torch::kCUDA);
// Add GPU inputs
images.clear();
inputs.clear();
torch::TensorOptions options = torch::TensorOptions{torch::kCUDA};
images.push_back(torch::rand({3, 256, 275}, options));
images.push_back(torch::rand({3, 256, 275}, options));
inputs.push_back(images);
auto output = module.forward(inputs);
std::cout << "ok\n";
std::cout << "output" << output << "\n";
}
return 0;
}
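The same smoke test can be reproduced from Python, which is handy when debugging the C++ side; a hedged sketch assuming the module produced by trace_model.py below is on disk:

import torch

# Load the scripted Faster R-CNN saved by trace_model.py.
module = torch.jit.load("fasterrcnn_resnet50_fpn.pt")
module.eval()

# The model expects a List[Tensor] of 3xHxW images, like the C++ test above.
images = [torch.rand(3, 256, 275), torch.rand(3, 256, 275)]
with torch.no_grad():
    print(module(images))

# Repeat on GPU when available, matching the CUDA branch of the C++ test.
if torch.cuda.is_available():
    module.to("cuda")
    gpu_images = [img.to("cuda") for img in images]
    with torch.no_grad():
        print(module(gpu_images))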
import os.path as osp
import torch
import torchvision
HERE = osp.dirname(osp.abspath(__file__))
ASSETS = osp.dirname(osp.dirname(HERE))
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(pretrained=False)
model.eval()
traced_model = torch.jit.script(model)
traced_model.save("fasterrcnn_resnet50_fpn.pt")
#pragma once
#include <torch/extension.h>

#ifdef _WIN32
#if defined(torchvision_EXPORTS)
#define VISION_API __declspec(dllexport)
#else
#define VISION_API __declspec(dllimport)
#endif
#else
#define VISION_API
#endif

VISION_API std::tuple<at::Tensor, at::Tensor> ROIPool_forward_cpu(
    const at::Tensor& input,
    const at::Tensor& rois,
    const float spatial_scale,
    const int pooled_height,
    const int pooled_width);

VISION_API at::Tensor ROIPool_backward_cpu(
    const at::Tensor& grad,
    const at::Tensor& rois,
    const at::Tensor& argmax,
...@@ -20,7 +30,7 @@ at::Tensor ROIPool_backward_cpu(
    const int height,
    const int width);

VISION_API at::Tensor ROIAlign_forward_cpu(
    const at::Tensor& input,
    const at::Tensor& rois,
    const double spatial_scale,
...@@ -29,7 +39,7 @@ at::Tensor ROIAlign_forward_cpu(
    const int64_t sampling_ratio,
    const bool aligned);

VISION_API at::Tensor ROIAlign_backward_cpu(
    const at::Tensor& grad,
    const at::Tensor& rois,
    const double spatial_scale,
...@@ -42,14 +52,14 @@ at::Tensor ROIAlign_backward_cpu(
    const int64_t sampling_ratio,
    const bool aligned);

VISION_API std::tuple<at::Tensor, at::Tensor> PSROIPool_forward_cpu(
    const at::Tensor& input,
    const at::Tensor& rois,
    const float spatial_scale,
    const int pooled_height,
    const int pooled_width);

VISION_API at::Tensor PSROIPool_backward_cpu(
    const at::Tensor& grad,
    const at::Tensor& rois,
    const at::Tensor& mapping_channel,
...@@ -61,7 +71,7 @@ at::Tensor PSROIPool_backward_cpu(
    const int height,
    const int width);

VISION_API std::tuple<at::Tensor, at::Tensor> PSROIAlign_forward_cpu(
    const at::Tensor& input,
    const at::Tensor& rois,
    const float spatial_scale,
...@@ -69,7 +79,7 @@ std::tuple<at::Tensor, at::Tensor> PSROIAlign_forward_cpu(
    const int pooled_width,
    const int sampling_ratio);

VISION_API at::Tensor PSROIAlign_backward_cpu(
    const at::Tensor& grad,
    const at::Tensor& rois,
    const at::Tensor& mapping_channel,
...@@ -82,12 +92,12 @@ at::Tensor PSROIAlign_backward_cpu(
    const int height,
    const int width);

VISION_API at::Tensor nms_cpu(
    const at::Tensor& dets,
    const at::Tensor& scores,
    const double iou_threshold);

VISION_API at::Tensor DeformConv2d_forward_cpu(
    const at::Tensor& input,
    const at::Tensor& weight,
    const at::Tensor& offset,
...@@ -98,7 +108,7 @@ at::Tensor DeformConv2d_forward_cpu(
    int groups,
    int deformable_groups);

VISION_API std::tuple<at::Tensor, at::Tensor, at::Tensor, at::Tensor>
DeformConv2d_backward_cpu(
    const at::Tensor& grad_out,
    const at::Tensor& input,
...
#pragma once
#include <torch/extension.h>

#ifdef _WIN32
#if defined(torchvision_EXPORTS)
#define VISION_API __declspec(dllexport)
#else
#define VISION_API __declspec(dllimport)
#endif
#else
#define VISION_API
#endif

VISION_API at::Tensor ROIAlign_forward_cuda(
    const at::Tensor& input,
    const at::Tensor& rois,
    const double spatial_scale,
...@@ -10,7 +20,7 @@ at::Tensor ROIAlign_forward_cuda(
    const int64_t sampling_ratio,
    const bool aligned);

VISION_API at::Tensor ROIAlign_backward_cuda(
    const at::Tensor& grad,
    const at::Tensor& rois,
    const double spatial_scale,
...@@ -23,14 +33,14 @@ at::Tensor ROIAlign_backward_cuda(
    const int64_t sampling_ratio,
    const bool aligned);

VISION_API std::tuple<at::Tensor, at::Tensor> ROIPool_forward_cuda(
    const at::Tensor& input,
    const at::Tensor& rois,
    const float spatial_scale,
    const int pooled_height,
    const int pooled_width);

VISION_API at::Tensor ROIPool_backward_cuda(
    const at::Tensor& grad,
    const at::Tensor& rois,
    const at::Tensor& argmax,
...@@ -42,14 +52,14 @@ at::Tensor ROIPool_backward_cuda(
    const int height,
    const int width);

VISION_API std::tuple<at::Tensor, at::Tensor> PSROIPool_forward_cuda(
    const at::Tensor& input,
    const at::Tensor& rois,
    const float spatial_scale,
    const int pooled_height,
    const int pooled_width);

VISION_API at::Tensor PSROIPool_backward_cuda(
    const at::Tensor& grad,
    const at::Tensor& rois,
    const at::Tensor& mapping_channel,
...@@ -61,7 +71,7 @@ at::Tensor PSROIPool_backward_cuda(
    const int height,
    const int width);

VISION_API std::tuple<at::Tensor, at::Tensor> PSROIAlign_forward_cuda(
    const at::Tensor& input,
    const at::Tensor& rois,
    const float spatial_scale,
...@@ -69,7 +79,7 @@ std::tuple<at::Tensor, at::Tensor> PSROIAlign_forward_cuda(
    const int pooled_width,
    const int sampling_ratio);

VISION_API at::Tensor PSROIAlign_backward_cuda(
    const at::Tensor& grad,
    const at::Tensor& rois,
    const at::Tensor& mapping_channel,
...@@ -82,12 +92,12 @@ at::Tensor PSROIAlign_backward_cuda(
    const int height,
    const int width);

VISION_API at::Tensor nms_cuda(
    const at::Tensor& dets,
    const at::Tensor& scores,
    const double iou_threshold);

VISION_API at::Tensor DeformConv2d_forward_cuda(
    const at::Tensor& input,
    const at::Tensor& weight,
    const at::Tensor& offset,
...@@ -98,7 +108,7 @@ at::Tensor DeformConv2d_forward_cuda(
    int groups,
    int deformable_groups);

VISION_API std::tuple<at::Tensor, at::Tensor, at::Tensor, at::Tensor>
DeformConv2d_backward_cuda(
    const at::Tensor& grad_out,
    const at::Tensor& input,
...