improve stability of test_nms_cuda (#2044)

* improve stability of test_nms_cuda

  This change addresses two issues:

  _create_tensors_with_iou() creates test data for the NMS tests. It takes care to ensure that at least one pair of boxes (the first and last) has an IoU around the threshold for the test. However, the constructed IoU for that pair is _so_ close to the threshold that rounding differences (presumably) between the CPU and CUDA implementations may result in one suppressing a box in the pair and the other not. Adjust the construction to ensure the IoU for the box pair is near the threshold, but far enough above it that both implementations should agree.

  Where two boxes have nearly or exactly the same score, the CPU and CUDA implementations may order them differently. Adjust test_nms_cuda() to check only that the non-suppressed box lists include the same members, without regard for ordering.

* adjust assertion in test_nms_cuda

  The CPU and CUDA nms implementations each sort the box scores as part of their work, but the sorts they use are not stable. So boxes with the same score may be processed in opposite order by the two implementations. Relax the assertion in test_nms_cuda (following the model in pytorch's test_topk()) to allow the test to pass if the output differences are caused by similarly-scored boxes.

* improve stability of test_nms_cuda

  Adjust _create_tensors_with_iou() to ensure we create at least one box just over threshold that should be suppressed.
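The construction described in the message can be checked by hand. The following is a minimal pure-Python sketch, not the test's actual code: `box_iou` and `widened_copy` are hypothetical helpers introduced here for illustration, and the box coordinates are made up. It shows that widening a copy of a box by d = (x1 - x0) * (1 - t) / t gives an IoU of t, so computing d from a slightly raised threshold places the pair just above the original one.

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes given as [x0, y0, x1, y1]."""
    ix0, iy0 = max(a[0], b[0]), max(a[1], b[1])
    ix1, iy1 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix1 - ix0) * max(0.0, iy1 - iy0)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def widened_copy(b0, target_iou):
    """Copy b0 and extend its right edge so that IoU(b0, copy) == target_iou."""
    x0, y0, x1, y1 = b0
    d = (x1 - x0) * (1 - target_iou) / target_iou
    return [x0, y0, x1 + d, y1]


iou_thresh = 0.5
b0 = [10.0, 20.0, 40.0, 60.0]  # illustrative box, not taken from the test

# Built from the threshold itself: IoU lands on the threshold, where
# rounding could push it to either side.
exact = widened_copy(b0, iou_thresh)

# Built from a slightly raised threshold: IoU sits just above iou_thresh.
nudged = widened_copy(b0, iou_thresh + 1e-5)

print(box_iou(b0, exact), box_iou(b0, nudged))
```

The 1e-5 margin mirrors the value used in the diff. It is large relative to single-precision rounding error for values of this size, which is presumably why both implementations are expected to agree on the pair.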

Commit e61538cb, authored Apr 03, 2020 by Brian Hart; committed by GitHub Apr 03, 2020

Showing with 10 additions and 1 deletion

test/test_ops.py +10 -1

--- a/test/test_ops.py
+++ b/test/test_ops.py
@@ -374,10 +374,14 @@ class NMSTester(unittest.TestCase):
        # let b0 be [x0, y0, x1, y1], and b1 be [x0, y0, x1 + d, y1],
        # then, in order to satisfy ops.iou(b0, b1) == iou_thresh,
        # we need to have d = (x1 - x0) * (1 - iou_thresh) / iou_thresh
+        # Adjust the threshold upward a bit with the intent of creating
+        # at least one box that exceeds (barely) the threshold and so
+        # should be suppressed.
        boxes = torch.rand(N, 4) * 100
        boxes[:, 2:] += boxes[:, :2]
        boxes[-1, :] = boxes[0, :]
        x0, y0, x1, y1 = boxes[-1].tolist()
+        iou_thresh += 1e-5
        boxes[-1, 2] += (x1 - x0) * (1 - iou_thresh) / iou_thresh
        scores = torch.rand(N)
        return boxes, scores
@@ -399,7 +403,12 @@ class NMSTester(unittest.TestCase):
            r_cpu = ops.nms(boxes, scores, iou)
            r_cuda = ops.nms(boxes.cuda(), scores.cuda(), iou)
-            self.assertTrue(torch.allclose(r_cpu, r_cuda.cpu()), err_msg.format(iou))
+            is_eq = torch.allclose(r_cpu, r_cuda.cpu())
+            if not is_eq:
+                # if the indices are not the same, ensure that it's because the scores
+                # are duplicate
+                is_eq = torch.allclose(scores[r_cpu], scores[r_cuda.cpu()])
+            self.assertTrue(is_eq, err_msg.format(iou))
 class NewEmptyTensorTester(unittest.TestCase):
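
The relaxed assertion in the second hunk can be illustrated without a GPU. This is a hedged pure-Python sketch of the same idea: `keeps_agree` is a hypothetical helper, and `math.isclose` stands in for `torch.allclose` (the tolerances below copy `torch.allclose`'s defaults, though the two functions define closeness slightly differently). The kept-index lists and scores are made-up data.

```python
import math


def keeps_agree(keep_a, keep_b, scores, rel_tol=1e-5, abs_tol=1e-8):
    """Return True if two NMS results match, allowing swaps among tied scores."""
    if keep_a == keep_b:
        return True
    if len(keep_a) != len(keep_b):
        return False
    # Indices differ: accept only if the scores at each position still match,
    # i.e. the difference comes from boxes with (nearly) equal scores.
    return all(
        math.isclose(scores[i], scores[j], rel_tol=rel_tol, abs_tol=abs_tol)
        for i, j in zip(keep_a, keep_b)
    )


scores = [0.9, 0.7, 0.7, 0.2]  # boxes 1 and 2 share a score

same_order = keeps_agree([0, 1, 2, 3], [0, 1, 2, 3], scores)
tied_swap = keeps_agree([0, 1, 2, 3], [0, 2, 1, 3], scores)
real_mismatch = keeps_agree([0, 1, 3], [0, 3, 1], scores)

print(same_order, tied_swap, real_mismatch)
```

Note that this check compares scores position by position, so it tolerates reordering only among boxes whose scores are close. It does not verify that the two results contain the same set of indices.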