    rotary: update cos/sin cache when switching from inference mode · 70ab266a
    Volodymyr Kyrylov authored
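    This resolves the RuntimeError raised when the module is used under autograd after evaluation has run in inference mode. The cos/sin cache filled during evaluation holds inference tensors, which cannot be saved for backward: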
    
    ```
      File "/home/proger/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
        return forward_call(*args, **kwargs)
      File "/home/proger/.local/lib/python3.10/site-packages/flash_attn/modules/mha.py", line 492, in forward
        qkv = self.rotary_emb(qkv)
      File "/home/proger/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
        return forward_call(*args, **kwargs)
      File "/home/proger/.local/lib/python3.10/site-packages/flash_attn/layers/rotary.py", line 229, in forward
        return apply_rotary_emb_qkv_(
      File "/home/proger/.local/lib/python3.10/site-packages/torch/autograd/function.py", line 506, in apply
        return super().apply(*args, **kwargs)  # type: ignore[misc]
    RuntimeError: Inference tensors cannot be saved for backward. To work around you can make a clone to get a normal tensor and use it in autograd.
    ```
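
The failure mode and one way to avoid it can be illustrated with a minimal sketch. This is not the code from this commit: `CosSinCache` and its `get` method are hypothetical names, and the staleness check shown here (treat the cache as stale when it holds inference tensors but inference mode is no longer active) is an assumption about one reasonable way to implement the behavior described in the commit title. Only `torch.inference_mode`, `torch.is_inference_mode_enabled`, and `Tensor.is_inference` are real PyTorch APIs.

```python
import torch


class CosSinCache:
    """Hypothetical stand-in for a rotary embedding's cos/sin cache."""

    def __init__(self, dim, base=10000.0):
        # Inverse frequencies for dim // 2 rotation pairs.
        exponents = torch.arange(0, dim, 2, dtype=torch.float32) / dim
        self.inv_freq = 1.0 / (base ** exponents)
        self.seq_len_cached = 0
        self.cos = None
        self.sin = None

    def get(self, seqlen):
        # Rebuild the cache if it is empty, too short, or was filled under
        # torch.inference_mode() while we are now outside inference mode.
        # Inference tensors cannot be saved for backward, so reusing them
        # in a training step would raise a RuntimeError.
        stale = (
            self.cos is None
            or seqlen > self.seq_len_cached
            or (self.cos.is_inference() and not torch.is_inference_mode_enabled())
        )
        if stale:
            t = torch.arange(seqlen, dtype=torch.float32)
            freqs = torch.outer(t, self.inv_freq)
            self.cos, self.sin = freqs.cos(), freqs.sin()
            self.seq_len_cached = seqlen
        return self.cos[:seqlen], self.sin[:seqlen]


cache = CosSinCache(dim=8)

# Evaluation: the cache is filled with inference tensors.
with torch.inference_mode():
    cos_eval, _ = cache.get(16)

# Training: the cache is rebuilt with normal tensors, so autograd can save them.
cos_train, sin_train = cache.get(16)
x = torch.ones(16, 4, requires_grad=True)
(x * cos_train).sum().backward()
```

Without the `is_inference()` check, the second `cache.get(16)` call would return the tensors created during evaluation, and the multiplication with `x` would fail with the same "Inference tensors cannot be saved for backward" error shown in the traceback.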
rotary.py 12.4 KB