Use copy_if and parallel execution by leveraging TBB.
- Add support for TBB in MIGraphX
- Add include for TBB in Dockerfile
- Replace inner loop with copy_if and use std::execution::par to filter
- Change heap to vector and sort in parallel in filter_boxes_per_score()

With Paul's help, this cuts NMS in ref from around 43-44s down to about 2s.
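
For reviewers unfamiliar with the pattern, here is a minimal sketch of the filter-then-sort approach described above. It is not the actual MIGraphX code: `scored_box`, `filter_by_score`, the `>=` threshold comparison, and the `NMS_SKETCH_PARALLEL` macro are all hypothetical names and choices made for illustration. The macro exists only so the sketch also builds where TBB is not installed; with libstdc++, the parallel policies typically require linking against TBB (`-ltbb`).

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical switch: define NMS_SKETCH_PARALLEL (and link TBB when using
// libstdc++) to run the algorithms with std::execution::par.
#ifdef NMS_SKETCH_PARALLEL
#include <execution>
#define NMS_SKETCH_POLICY std::execution::par,
#else
#define NMS_SKETCH_POLICY
#endif

// Hypothetical stand-in for a (score, box index) pair.
using scored_box = std::pair<float, std::size_t>;

// Keep boxes whose score meets the threshold, ordered by descending score.
std::vector<scored_box> filter_by_score(const std::vector<float>& scores, float threshold)
{
    std::vector<scored_box> all(scores.size());
    for(std::size_t i = 0; i < scores.size(); ++i)
        all[i] = {scores[i], i};

    // The parallel overload of copy_if needs a forward iterator as output,
    // so std::back_inserter cannot be used; preallocate and trim afterwards.
    std::vector<scored_box> kept(all.size());
    auto last = std::copy_if(NMS_SKETCH_POLICY all.begin(),
                             all.end(),
                             kept.begin(),
                             [threshold](const scored_box& sb) { return sb.first >= threshold; });
    kept.erase(last, kept.end());

    // A plain vector sorted once replaces the heap; the sort can also take
    // the parallel policy.
    std::sort(NMS_SKETCH_POLICY kept.begin(), kept.end(), std::greater<scored_box>{});
    return kept;
}
```

Calling `filter_by_score({0.1f, 0.9f, 0.5f, 0.3f}, 0.3f)` keeps the boxes at indices 1, 2 and 3, in that order.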