  1. 20 Aug, 2023 1 commit
  2. 19 Aug, 2023 1 commit
  3. 18 Aug, 2023 4 commits
  4. 17 Aug, 2023 4 commits
  5. 16 Aug, 2023 1 commit
  6. 15 Aug, 2023 1 commit
    •
      enable loading hf llama checkpoints for training (#446) · 0f7853c6
      Xuechen Li authored
      * prelim.
      * add hf conversion fn.
      * mlp.
      * change name.
      * fix bug.
      * inverse permute.
      * change comment.
      * revert style changes.
      * fix.
      * add doc.
      * revert.
      * enable load safe.
      * fix safe load.
      * fix import.
      * fix typing-related lints.
      * fix ckpt loading logic.
      * make single gpu work.
      * test with parallel.
      * ckpt format.
      * enable pretrained state dict.
      * remove unused imports.
      * remove unused.
      * mark idea related.
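      The "inverse permute" step above refers to undoing the row permutation that Hugging Face's LLaMA conversion applies to the query/key projection weights (to match its rotary-embedding convention). A minimal sketch of that round trip, assuming the standard HF permutation; the function names `hf_permute` / `inverse_permute` and the toy shapes are illustrative, not names from the actual patch:

      ```python
      import torch

      def hf_permute(w, n_heads):
          # The permutation HF applies to W_q / W_k when exporting a checkpoint:
          # interleave the two rotary halves of each head.
          dim1, dim2 = w.shape
          return (w.view(n_heads, dim1 // n_heads // 2, 2, dim2)
                   .transpose(1, 2)
                   .reshape(dim1, dim2))

      def inverse_permute(w, n_heads):
          # Undo hf_permute: swap the same two axes back before flattening.
          dim1, dim2 = w.shape
          return (w.view(n_heads, 2, dim1 // n_heads // 2, dim2)
                   .transpose(1, 2)
                   .reshape(dim1, dim2))

      w = torch.randn(64, 64)  # toy projection weight: 4 heads of dim 16
      round_trip = inverse_permute(hf_permute(w, n_heads=4), n_heads=4)
      assert torch.equal(round_trip, w)
      ```

      Because the permutation only swaps two axes of a reshaped view, applying the same transpose against the swapped layout recovers the original weight exactly.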
  7. 14 Aug, 2023 3 commits
  8. 13 Aug, 2023 2 commits
  9. 10 Aug, 2023 1 commit
  10. 01 Aug, 2023 3 commits
  11. 29 Jul, 2023 1 commit
  12. 28 Jul, 2023 3 commits
  13. 27 Jul, 2023 1 commit
  14. 26 Jul, 2023 2 commits
  15. 23 Jul, 2023 6 commits
  16. 22 Jul, 2023 1 commit
  17. 21 Jul, 2023 1 commit
  18. 18 Jul, 2023 1 commit
  19. 17 Jul, 2023 2 commits
  20. 08 Jul, 2023 1 commit
    •
      rotary: update cos/sin cache when switching from inference mode · 70ab266a
      Volodymyr Kyrylov authored
      This resolves RuntimeErrors after running evaluation in inference mode:
      
      ```
        File "/home/proger/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
          return forward_call(*args, **kwargs)
        File "/home/proger/.local/lib/python3.10/site-packages/flash_attn/modules/mha.py", line 492, in forward
          qkv = self.rotary_emb(qkv)
        File "/home/proger/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
          return forward_call(*args, **kwargs)
        File "/home/proger/.local/lib/python3.10/site-packages/flash_attn/layers/rotary.py", line 229, in forward
          return apply_rotary_emb_qkv_(
        File "/home/proger/.local/lib/python3.10/site-packages/torch/autograd/function.py", line 506, in apply
          return super().apply(*args, **kwargs)  # type: ignore[misc]
      RuntimeError: Inference tensors cannot be saved for backward. To work around you can make a clone to get a normal tensor and use it in autograd.
      ```
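      The failure mode can be reproduced outside flash-attn: a tensor cached while `torch.inference_mode()` is active becomes an "inference tensor" and cannot be saved for backward later. A minimal sketch of the fix's idea, checking `Tensor.is_inference()` and rebuilding the cache when needed; the cache structure and function below are hypothetical, not the actual `rotary.py` code:

      ```python
      import torch

      cache = {}

      def get_cos_cache(seqlen):
          # Rebuild the cache if it is missing, too short, or was created as an
          # inference tensor -- the last condition is what the commit adds.
          c = cache.get("cos")
          if c is None or c.shape[0] < seqlen or c.is_inference():
              t = torch.arange(seqlen, dtype=torch.float32)
              cache["cos"] = torch.cos(t)[:, None]
          return cache["cos"][:seqlen]

      with torch.inference_mode():
          _ = get_cos_cache(8)  # cache is built as an inference tensor here

      # Back in training mode: without the is_inference() check, reusing the
      # cached tensor in an autograd graph raises the RuntimeError above.
      x = torch.randn(8, 1, requires_grad=True)
      y = (x * get_cos_cache(8)).sum()
      y.backward()  # succeeds: the cache was rebuilt outside inference mode
      ```

      The key design choice is to detect the stale state via `is_inference()` rather than clone the tensor on every call, so the cache is only recomputed once after leaving inference mode.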