stabilize the training of deformable DETR with box refinement
Summary:
Major changes:
- As described in detail in appendix A.4 of the Deformable DETR paper (https://arxiv.org/abs/2010.04159), gradient back-propagation is blocked at inverse_sigmoid(bounding box x/y/w/h from the last decoder layer). This can be implemented in PyTorch by detaching the tensor from the compute graph. However, we currently detach the wrong tensor, which prevents updating the layers that predict delta x/y/w/h. Fix this bug.
- Add more comments annotating data types and tensor shapes in the code. This should NOT affect the actual implementation.

Reviewed By: zhanghang1989

Differential Revision: D29048363

fbshipit-source-id: c5b5e89793c86d530b077a7b999769881f441b69
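For reference, a minimal sketch of the iterative box refinement pattern this fix targets, with the detach placed as described in appendix A.4. Names such as `bbox_head`, `hidden_states`, and `refine_boxes` are illustrative, not the actual identifiers in this codebase:

```python
import torch

def inverse_sigmoid(x, eps=1e-5):
    # Numerically stable inverse of sigmoid: log(x / (1 - x)).
    x = x.clamp(min=0, max=1)
    x1 = x.clamp(min=eps)
    x2 = (1 - x).clamp(min=eps)
    return torch.log(x1 / x2)

def refine_boxes(reference_points, bbox_head, hidden_states):
    """
    Illustrative iterative box refinement across decoder layers.

    reference_points: (batch, num_queries, 4) normalized boxes in [0, 1]
    bbox_head:        per-layer modules predicting delta x/y/w/h in logit space
    hidden_states:    per-layer decoder outputs, each (batch, num_queries, d_model)
    """
    outputs_coords = []
    for layer_idx, hs in enumerate(hidden_states):
        # Predict the delta in un-sigmoided (logit) space; this path must stay
        # in the compute graph so the delta-prediction layers receive gradients.
        delta = bbox_head[layer_idx](hs)  # (batch, num_queries, 4)
        new_boxes = (delta + inverse_sigmoid(reference_points)).sigmoid()
        outputs_coords.append(new_boxes)

        # Block gradients only at the refined boxes fed to the NEXT layer,
        # i.e. detach the reference points, not the delta prediction.
        reference_points = new_boxes.detach()
    return torch.stack(outputs_coords)
```

The bug fixed here corresponds to applying the detach to the wrong tensor in this loop, which cuts the gradient path through `delta` and leaves the delta-prediction layers untrained.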