1. 27 Feb, 2020 1 commit
  2. 25 Feb, 2020 3 commits
  3. 24 Feb, 2020 1 commit
    • Change to Multihead Attention to allow Batched GEMMs larger than 64K. (#728) · 1733946a
      Kevin Stephano authored
      * Adding C++ Multihead Attention implementation to contrib.
      
      * Add reference test that at least works for forward.
      
      * Remove CublasLt support from multihead attention.
      
      * Add new Python version of self attention.
      
      * Update python model of MHA with backward pass.
      
      * Fixed Output Linear connection in MHA.
      
      * Clean up compiles and add documentation to PySelfAttention.
      
      * Add Encdec Python version of multihead attention. Clean up files.
      
      * Tests for self and encdec multihead attention.
      
      * Add reference pytorch implementation of attention with norm and add.
      
      * Add cutlass branch definition.
      
      * Add cutlass download to compile.
      
      * Add norm/add tests.
      
      * Add biases to pytorch python versions.
      
      * Add tests and fix issues with python version of attention masking.
      
      * Create README.md
      
      * Update README.md
      
      * Update README.md
      
      * Update perf test parameters.
      
      * Update README.md
      
      * Update README.md
      
      * Update README.md
      
      * Add files via upload
      
      * Update README.md
      
      * Update README.md
      
      * Update README.md
      
      * Fix matmul1 output tensor size. Fix tests that missed the issue.
      
      * Allow for Z dimensions of 64K and greater on batched GEMMs.
      
      * remove redundant imports
      
      * general cleanup, remove deprecated or unused functions
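The bullets above repeatedly reference a "reference pytorch implementation" of self attention and the two batched GEMMs whose leading dimension this commit extends past 64K. As a point of orientation only, here is a minimal PyTorch sketch of that pattern; it is not the apex.contrib code, and the function name, argument layout, and mask convention are illustrative assumptions.

```python
# Illustrative only: a minimal batched multihead self-attention in plain
# PyTorch, of the kind the reference tests above compare against.
# This is NOT the apex.contrib implementation; names/shapes are assumptions.
import torch
import torch.nn.functional as F

def ref_self_attn(x, w_qkv, w_out, heads, mask=None):
    # x: (seq, batch, embed); w_qkv: (3*embed, embed); w_out: (embed, embed)
    seq, bsz, embed = x.shape
    head_dim = embed // heads

    # Fused input projection producing Q, K and V in one GEMM.
    q, k, v = F.linear(x, w_qkv).chunk(3, dim=-1)

    # Fold heads into the batch dimension -> (batch*heads, seq, head_dim).
    # This leading (batch*heads) dimension is the "Z dimension" of the
    # batched GEMMs that the commit above allows to exceed 64K.
    def split(t):
        return t.reshape(seq, bsz * heads, head_dim).transpose(0, 1)
    q, k, v = split(q), split(k), split(v)

    # Batched GEMM 1: scaled attention scores.
    scores = torch.bmm(q, k.transpose(1, 2)) / head_dim ** 0.5
    if mask is not None:
        # mask is assumed broadcastable to (batch*heads, seq, seq)
        scores = scores.masked_fill(mask, float('-inf'))
    probs = F.softmax(scores, dim=-1)

    # Batched GEMM 2 ("matmul1"/"matmul2" in the commit messages):
    # probabilities times values, then refold the heads.
    ctx = torch.bmm(probs, v).transpose(0, 1).reshape(seq, bsz, embed)

    # Output linear connection.
    return F.linear(ctx, w_out)

# Example shapes: 128 tokens, batch 80, 1024-dim embeddings, 16 heads.
x = torch.randn(128, 80, 1024)
out = ref_self_attn(x, torch.randn(3 * 1024, 1024),
                    torch.randn(1024, 1024), heads=16)   # (128, 80, 1024)
```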
  4. 15 Feb, 2020 1 commit
  5. 10 Feb, 2020 1 commit
  6. 06 Feb, 2020 1 commit
    • Add Fast Multihead Attention to APEX Contrib (#697) · 3f94528e
      Kevin Stephano authored
      * Adding C++ Multihead Attention implementation to contrib.
      
      * Add reference test that at least works for forward.
      
      * Remove CublasLt support from multihead attention.
      
      * Add new Python version of self attention.
      
      * Update python model of MHA with backward pass.
      
      * Fixed Output Linear connection in MHA.
      
      * Clean up compiles and add documentation to PySelfAttention.
      
      * Add Encdec Python version of multihead attention. Clean up files.
      
      * Tests for self and encdec multihead attention.
      
      * Add reference pytorch implementation of attention with norm and add.
      
      * Add cutlass branch definition.
      
      * Add cutlass download to compile.
      
      * Add norm/add tests.
      
      * Add biases to pytorch python versions.
      
      * Add tests and fix issues with python version of attention masking.
      
      * Create README.md
      
      * Update README.md
      
      * Update README.md
      
      * Update perf test parameters.
      
      * Update README.md
      
      * Update README.md
      
      * Update README.md
      
      * Add f...
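For readers wondering how the new contrib layer is consumed: the import path and class name below follow the PR description (self and encdec multihead attention under apex.contrib), but the constructor and call signatures are assumptions modeled on torch.nn.MultiheadAttention and may not match the shipped API exactly.

```python
# Hypothetical usage sketch of the contrib layer this PR adds.
# SelfMultiheadAttn and the import path follow the PR description; the
# constructor/call arguments shown are ASSUMPTIONS modeled on
# torch.nn.MultiheadAttention and may differ between apex versions.
import torch
from apex.contrib.multihead_attn import SelfMultiheadAttn

embed_dim, heads = 1024, 16
attn = SelfMultiheadAttn(embed_dim, heads,
                         dropout=0.1,
                         bias=False,
                         include_norm_add=False,   # assumed flag for the fused norm/add variant
                         impl='fast').cuda().half()

# Sequence-first layout in fp16, since the fused kernels target Tensor Cores.
x = torch.randn(64, 32, embed_dim, device='cuda', dtype=torch.half)
out, _ = attn(x, x, x)   # self attention: query, key and value are the same tensor
```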
  7. 05 Feb, 2020 1 commit
  8. 27 Jan, 2020 1 commit
  9. 21 Jan, 2020 1 commit
  10. 08 Jan, 2020 1 commit
  11. 18 Dec, 2019 2 commits
  12. 05 Dec, 2019 1 commit
  13. 03 Dec, 2019 1 commit
  14. 22 Nov, 2019 1 commit
  15. 06 Nov, 2019 1 commit
  16. 30 Oct, 2019 1 commit
  17. 23 Oct, 2019 1 commit
  18. 22 Oct, 2019 1 commit
  19. 19 Oct, 2019 1 commit
  20. 10 Oct, 2019 2 commits
  21. 09 Oct, 2019 2 commits
  22. 08 Oct, 2019 1 commit
  23. 04 Oct, 2019 1 commit
  24. 03 Oct, 2019 1 commit
  25. 02 Oct, 2019 1 commit
  26. 13 Sep, 2019 1 commit
  27. 12 Sep, 2019 1 commit
  28. 11 Sep, 2019 1 commit
  29. 06 Sep, 2019 2 commits
  30. 03 Sep, 2019 2 commits
  31. 30 Aug, 2019 1 commit
  32. 27 Aug, 2019 2 commits
    • Enable Checkpointing (#420) · dec4fdd6
      ptrblck authored
      * add state_dict, load_state_dict
      
      * add test_restoring, test_loss_scale_decrease
      
      * disable amp outputs for checkpoint tests
      
      * add test for amp.state_dict, cleanup
      
      * add state_dict patch, add test
      
      * fixed testing, cleanup
      
      * add readme for checkpointing
      
      * add docs to source/amp
      
      * add review changes to doc
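The README and docs added by this PR describe saving amp's loss-scaler state alongside the model and optimizer. A sketch of that pattern follows; amp.state_dict() and amp.load_state_dict() are the calls this PR introduces, while the model, optimizer, and opt_level are placeholders.

```python
# Sketch of the checkpoint/restore pattern this PR enables.
# The model, optimizer and opt_level are placeholders; amp.state_dict()
# and amp.load_state_dict() are the calls added here.
import torch
from apex import amp

model = torch.nn.Linear(512, 512).cuda()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
model, optimizer = amp.initialize(model, optimizer, opt_level='O1')

# ... train for a while, then save everything needed to resume:
torch.save({'model': model.state_dict(),
            'optimizer': optimizer.state_dict(),
            'amp': amp.state_dict()},        # loss-scale state added by this PR
           'checkpoint.pt')

# On restart, call amp.initialize with the SAME opt_level before restoring:
model = torch.nn.Linear(512, 512).cuda()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
model, optimizer = amp.initialize(model, optimizer, opt_level='O1')
ckpt = torch.load('checkpoint.pt')
model.load_state_dict(ckpt['model'])
optimizer.load_state_dict(ckpt['optimizer'])
amp.load_state_dict(ckpt['amp'])
```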
    • Deleting test_fp16_optimizer.py · 30ed793e
      Michael Carilli authored