- 22 Nov, 2021 1 commit
-
-
Joao Gomes authored
* change to stable sort in nms implementations
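Why stability matters here can be shown with a small sketch. It is not the torchvision kernel: `order_by_score` is a hypothetical helper, and the sketch assumes NMS visits boxes in descending score order. With a stable sort, boxes that share a score keep their input order, so the visiting order (and therefore the kept set) does not depend on how the sort breaks ties.

```python
def order_by_score(scores):
    """Return indices sorted by descending score; ties keep their input order.

    Python's sorted() is guaranteed to be stable, so equal keys are never reordered.
    """
    return sorted(range(len(scores)), key=lambda i: -scores[i])


scores = [0.9, 0.5, 0.9, 0.5]
order = order_by_score(scores)
print(order)  # indices 0 and 2 tie, as do 1 and 3; each pair stays in input order
```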
-
- 15 Feb, 2021 1 commit
-
-
Vasilis Vryniotis authored
* Replace type T with accumulator.
* Upcast tensors of box ops to avoid overflow in multiplications.
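The overflow that upcasting guards against can be illustrated with a minimal sketch. The 16-bit and 64-bit widths are assumptions chosen for illustration (the commit message does not name the dtypes involved), and `area_int16` and `area_upcast` are hypothetical helpers, not torchvision functions.

```python
import ctypes


def area_int16(x1, y1, x2, y2):
    """Box area with the product stored back into a 16-bit integer (wraps on overflow)."""
    w = ctypes.c_int16(x2 - x1).value
    h = ctypes.c_int16(y2 - y1).value
    return ctypes.c_int16(w * h).value


def area_upcast(x1, y1, x2, y2):
    """Same computation, with the operands and product held in 64-bit integers."""
    w = ctypes.c_int64(x2 - x1).value
    h = ctypes.c_int64(y2 - y1).value
    return ctypes.c_int64(w * h).value


# A 300 x 300 box: the true area (90000) does not fit in a signed 16-bit integer.
print(area_int16(0, 0, 300, 300))
print(area_upcast(0, 0, 300, 300))
```

The coordinates themselves fit comfortably in the narrow type; only the product overflows, which is why the accumulator type rather than the input type needs to be widened.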
-
- 29 Jan, 2021 1 commit
-
-
Francisco Massa authored
Co-authored-by: Vasilis Vryniotis <datumbox@users.noreply.github.com>
-
- 04 Jan, 2021 1 commit
-
-
Vasilis Vryniotis authored
* Adding TORCH_SELECTIVE_* macros on op registration.
* Adding torchvision namespace.
-
- 10 Dec, 2020 1 commit
-
-
Vasilis Vryniotis authored
Summary:
* Reduce unnecessary header inclusions in models and io.
* Move autocast to separate folder and hide autograd implementation in an anonymous namespace.
* Moving files in subfolders.

Reviewed By: fmassa
Differential Revision: D25461523
fbshipit-source-id: 756eeb6848aacaa474de4825ed4c1045d17e2cea
-
- 08 Dec, 2020 1 commit
-
-
Vasilis Vryniotis authored
* Moving deform_conv2d op registration.
* Moving nms op registration.
* Moving new_empty_tensor op registration.
* Moving ps_roi_align op registration.
* Moving ps_roi_pool op registration.
* Moving roi_align op registration.
* Moving roi_pool op registration.
* Restoring headers for forward/backward and fixing styles.
* Restoring the test hack on windows.
* Stricter header inclusion.
-
- 02 Dec, 2020 1 commit
-
-
Vasilis Vryniotis authored
* Encapsulate and standardize deform_conv2d (#3074)
  * Rename files.
  * Standardizing method names.
  * Adding anonymous namespaces.
  * Applying C++ naming rules and aligning variable names across headers and cpp files.
  * Syncing names across implementations.
  * Rename deform_conv2d.h to deform_conv2d.cpp
  * Use header files:
    - Create header files for kernel implementation and remove definitions from vision_*.h files.
    - Eliminate unnecessary headers and ensure all cpp files include their headers.
  * Change the naming convention for kernel implementations.
  * Remove the _param postfix from the variables and standardizing names.
  * Exposing public forward/backward methods to the C++ API and moving methods around to minimize git blame changes.
* Encapsulate and standardize nms (#3081)
  * Syncing, where possible, the names of functions across devices.
  * Adding all internal functions in anonymous namespaces.
  * Renaming C++/CUDA kernel files and moving operator code from header to cpp file.
  * Create for each cpp file a separate header file with "public" functions.
  * Removing unnecessary repeated includes.
  * Update CMakeLists.txt to include all headers.
* Encapsulate and standardize ps_roi_align (#3082)
  * Renaming C++ files & methods according to recommended naming conventions and aligning them with Python's API. Syncing, where possible, the names of functions across devices.
  * Adding all internal functions in anonymous namespaces.
  * Renaming C++/CUDA kernel files and moving operator code from header to cpp file.
  * Create for each cpp file a separate header file with "public" functions.
  * Removing unnecessary repeated includes.
* Encapsulate and standardize ps_roi_pool (#3084)
  * Renaming C++ files & methods according to recommended naming conventions and aligning them with Python's API.
  * Adding all internal functions in anonymous namespaces.
  * Renaming C++/CUDA kernel files and moving operator code from header to cpp file.
  * Create for each cpp file a separate header file with "public" functions.
  * Removing unnecessary repeated includes.
* Encapsulate and standardize roi_align (#3085)
  * Renaming C++ files & methods according to recommended naming conventions and aligning them with Python's API.
  * Adding all internal functions in anonymous namespaces.
  * Renaming C++/CUDA kernel files and moving operator code from header to cpp file.
  * Create for each cpp file a separate header file with "public" functions.
  * Removing unnecessary repeated includes.
* Encapsulate and standardize roi_pool (#3088)
  * Renaming C++ files & methods according to recommended naming conventions and aligning them with Python's API.
  * Adding all internal functions in anonymous namespaces.
  * Syncing variable names between the cpp files and their header files.
  * Renaming C++/CUDA kernel files and moving operator code from header to cpp file.
  * Create for each cpp file a separate header file with "public" functions.
  * Removing unnecessary repeated includes.
* Encapsulate and standardize new_empty_tensor_op (#3089)
  * Renaming C++ files & methods according to recommended naming conventions and aligning them with Python's API.
  * Create for each cpp file a separate header file with "public" functions.
  * Adding all internal functions in anonymous namespaces.
  * Convert to const ref all possible parameters.
  * Removing unnecessary repeated includes.
* Encapsulate and standardize C++ Ops - Clean up (#3094)
  * Removing unnecessary repeated includes.
  * Remove unnecessary vision_cpu.h, vision_cuda.h, autocast.h.
  * Fixing naming convention and correcting method names on macros.
  * Turn on clang formatter for cu files and fixing broken styles.
  * Replace "#ifndef ... #define ... #endif" with "#pragma once" on header files.
* Adding operator methods in vision::ops namespace. (#3096)
  * Adding operator methods in vision::ops namespace.
  * Replace general.h with macros.h
  * Adding vision.h to the necessary cpp files.
-
- 30 Oct, 2020 1 commit
-
-
Vasilis Vryniotis authored
* Clean up and refactor ROIAlign implementation:
  - Remove primitive const declaration from method names.
  - Remove unnecessary headers.
  - Aligning method names between cpu and cuda.
* Adding back include for cpu.
* Restoring method names of private methods to avoid conflicts.
* Restore include headers.
-
- 13 Oct, 2020 1 commit
-
-
vfdev authored
* Added rois shape check in C++
* Fixes code formatting
* Remove accidental include
* Updated code according to the review
* Replaced old AT_ASSERT/ERROR by new TORCH_CHECK
-
- 02 Sep, 2020 1 commit
-
-
Ashish Farmer authored
* add autocasting on ROCm
* enable ROIAlign autocasting on ROCm
* enable NMS autocasting on ROCm
* fix to use correct torch CUDA APIs
-
- 09 Jul, 2020 1 commit
-
-
mcarilli authored
* Fixes Xiao's repro
* Ports nms to use full dispatcher
* Move HIPGuard to nms_cuda
* clang-format
* run models in test_models.py on GPU if available
* Francisco's comment, also disable cuda model tests to see if CPU alone still passes
* cuda tests now pass locally, although still not comparing to saved numerics
* add note for thing to ask francisco
* Allow cuda and cpu tests to share a data file
* ignore suffix if unneeded
* Skip autocast numerics checks for a few models
* Add roi_align test

Co-authored-by: Michael Carilli <mcarilli@nvidia.com>
-
- 04 May, 2020 1 commit
-
-
Gao, Xiang authored
* Don't include CUDAApplyUtils.cuh
* fix format
* fix atomic
-
- 23 Apr, 2020 1 commit
-
-
Yuxin Wu authored
* fix the use of contiguous() in kernels
* clang-format
* add a contiguous in nms

Co-authored-by: Yuxin Wu <ppwwyyxx@users.noreply.github.com>
-
- 07 Apr, 2020 2 commits
-
-
Brian Hart authored
Torchvision includes at least 3 pieces of code that calculate box Intersection over Union values (and usually compare them to a threshold):
- box_iou in torchvision/ops/boxes.py
- devIoU in torchvision/csrc/cuda/nms_cuda.cu
- nms_cpu_kernel in torchvision/csrc/cpu/nms_cpu.cpp

The calculations were performed slightly differently between those, leading to occasional differences in results. Update devIoU to use the same method as the others for better consistency.

This change improves agreement between the CPU and CUDA calculations, but the results can still differ slightly. Setting NVCC_FLAGS to include "--fmad=false" would provide still better agreement, but with a likely cost to performance.
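As a rough illustration of the quantity all three code paths compute, here is a minimal pure-Python sketch. It is not copied from any of the three implementations; it assumes boxes are given as (x1, y1, x2, y2) corners and that widths and heights are computed without adding 1. The function name `box_iou_sketch` is hypothetical.

```python
def box_iou_sketch(a, b):
    """Intersection over Union of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    # Clamp at zero so disjoint boxes produce an empty intersection.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    # Guard against degenerate (zero-area) inputs.
    return inter / union if union > 0 else 0.0


print(box_iou_sketch((0, 0, 2, 2), (1, 1, 3, 3)))  # overlap 1, union 7
```

Even with identical formulas, the order of floating-point operations and fused multiply-add contraction can change the last bits of the result, which is the kind of residual CPU/CUDA difference the message describes.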
-
AhnDW authored
* Replace **.is_cuda() to just is_cuda()
* Replace type to scalar_type
* Fix lint, clang-format
* Fix lint, clang-format
-
- 02 Jan, 2020 1 commit
-
-
Yuxin Wu authored
1. Let the IoU function compare with the threshold. This avoids a division. A similar strategy is also used in https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/kernels/non_max_suppression_op.cu.cc
2. Only compute the upper triangle of the mask. This speeds up the kernel by about 20% (tested on a GTX 1080Ti, with 20 input cases dumped from a Mask R-CNN inference job).
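Both ideas can be sketched in a few lines. This is an illustrative sketch only, not the CUDA kernel: `overlaps` and `upper_triangle_pairs` are hypothetical names, the union is assumed to be positive, and the boxes are assumed to be already sorted by descending score.

```python
def overlaps(inter, union, threshold):
    """True when inter / union > threshold, tested without dividing.

    For union > 0 the two comparisons are equivalent, because multiplying
    both sides of the inequality by a positive number preserves it.
    """
    return inter > threshold * union


def upper_triangle_pairs(n):
    """Yield (i, j) with i < j.

    With boxes sorted by descending score, box j can only be suppressed by a
    higher-ranked box i, so the pairs with j <= i never need to be evaluated.
    """
    for i in range(n):
        for j in range(i + 1, n):
            yield i, j


print(overlaps(1.0, 7.0, 0.1))            # 1 > 0.7
print(len(list(upper_triangle_pairs(4))))  # 4 * 3 / 2 pairs instead of 16
```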
-
- 05 Nov, 2019 1 commit
-
-
Francisco Massa authored
* Fix inconsistent NMS implementation
* Improve tests for NMS
* Remove unnecessary using statement
-
- 29 Aug, 2019 1 commit
-
-
Yuxin Wu authored
* Use Tensor.data_ptr instead of .data
* use pytorch-nightly in CI
-
- 23 May, 2019 2 commits
-
-
Francisco Massa authored
* #944 MSBuild Compile time casting Error
* #944 MSBuild Error static_cast<Long> to static_cast<int64_t>
* Add eval.py Not Work find_contours
* Remove unnecessary file
* Lint
-
Varun Agrawal authored
Updated nms_cuda signature to accept detections and scores as separate tensors. This also required updating the indexing in the NMS CUDA kernel. Also made the iou_threshold parameter name consistent across implementations.
-
- 07 May, 2019 1 commit
-
-
Francisco Massa authored
* Initial layout for layers with cpp extensions
* Move files around
* Fix import after move
* Add support for multiple types to ROIAlign
* Different organization CUDA extensions work now
* Cleanups
* Reduce memory requirements for backwards
* Replace runtime_error by AT_ERROR
* Add nms test
* Add support for compilation using CPP extensions
* Change folder structure
* Add ROIPool cuda
* Cleanups
* Add roi_pool.py
* Fix lint
* Add initial structures folder for bounding boxes
* Assertion macros compatible with pytorch master (#540)
* Support for ROI Pooling (#592)
* ROI Pooling with tests. Fix for cuda context in ROI Align.
* renamed bottom and top to follow torch conventions
* remove .type().tensor() calls in favor of the new approach to tensor initialization (#626)
* Consistent naming for rois variable (#627)
* remove .type().tensor() calls in favor of the new approach to tensor initialization
* Consistent naming for rois variable in ROIPool
* ROIPool: Support for all datatypes (#632)
* Use of torch7 naming scheme for ROIAlign forward and backward
* use common cuda helpers in ROIAlign
* use .options() in favor of .type() where applicable
* Added tests for forward pass of ROIAlign, as well as more consistent naming scheme for CPU vs CUDA
* working ROIAlign cuda backwards pass
* working ROIAlign backwards pass for CPU
* added relevant headers for ROIAlign backwards
* tests for ROIAlign layer
* replace .type() with .options() for tensor initialization in ROIAlign layers
* support for Half types in ROIAlign
* gradcheck tests for ROIAlign
* updated ROIPool on CPU to work with all datatypes
* updated and cleaned tests for ROI Pooling
* Fix rebase problem
* Remove structures folder
* Improve cleanup and bugfix in test_layers
* Update C++ headers
* Add CUDAGuard to cu files
* Add more checks to layers
* Add CUDA NMS and tests
* Add multi-type support for NMS CUDA
* Avoid using THCudaMalloc
* Add clang-format and reformat c++ code
* Remove THC includes
* Rename layers to ops
* Add documentation and rename functions
* Improve the documentation a bit
* Fix some lint errors
* Fix remaining lint issues
* Area computation doesn't add +1 in NMS
* Update CI to use PyTorch nightly
* Make NMS return indices sorted according to the score
* Address reviewer comments
* Lint fixes
* Improve doc for roi_align and roi_pool
* move to xenial
* Fix bug pointed out by @lopuhin
* Fix RoIPool reference implementation in Python 2. Also fixes a bug in the clip_boxes_to_image -- this function needs a test!
* Remove change in .travis
-