    Fixes bug that appears when using QA BERT and distillation. (#12026) · f8bd8c6c
    François Lagunas authored
    * Fixing bug that appears when using distillation (and potentially other uses).
    During the backward pass, PyTorch complains with:
    RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation
    This happens because the QA model code modifies the start_positions and end_positions input tensors in place, using the clamp_ function. As a consequence, the teacher and the student both modify the same inputs, and the backward pass fails.
    
    * Fixing the clamp_ bug in all QA models.
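
A minimal sketch of the difference between the in-place clamp_ and the out-of-place clamp, to illustrate why sharing the position tensors between a teacher and a student is a problem. The helper names (clamp_in_place, clamp_out_of_place) and the value of ignored_index are hypothetical and chosen only for this illustration; they are not taken from the model code. The assumption is that the fix amounts to rebinding the result of clamp instead of calling clamp_ on the input.

```python
import torch


def clamp_in_place(positions: torch.Tensor, ignored_index: int) -> torch.Tensor:
    # Hypothetical helper: clamp_ overwrites the caller's tensor, so every
    # other model that received the same tensor sees the modified values.
    positions.clamp_(0, ignored_index)
    return positions


def clamp_out_of_place(positions: torch.Tensor, ignored_index: int) -> torch.Tensor:
    # Hypothetical helper: clamp returns a new tensor and leaves the
    # caller's tensor untouched.
    return positions.clamp(0, ignored_index)


ignored_index = 5  # assumed sequence length, for illustration only
shared = torch.tensor([2, 9, -1])  # stands in for start_positions

# Out-of-place: the shared input keeps its original values.
safe = clamp_out_of_place(shared, ignored_index)
shared_after_safe = shared.tolist()
safe_values = safe.tolist()

# In-place: the shared input itself is rewritten.
clamp_in_place(shared, ignored_index)
shared_after_in_place = shared.tolist()

print(shared_after_safe, safe_values, shared_after_in_place)
```

With the out-of-place variant, a second model that receives the same tensor later does not modify a tensor that autograd may already have saved for the first model's backward pass.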
modeling_albert.py 54.2 KB