1. 24 Mar, 2022 3 commits
  2. 23 Mar, 2022 2 commits
  3. 18 Mar, 2022 1 commit
  4. 16 Mar, 2022 1 commit
  5. 15 Mar, 2022 4 commits
  6. 11 Mar, 2022 1 commit
  7. 08 Mar, 2022 4 commits
  8. 01 Mar, 2022 1 commit
  9. 27 Feb, 2022 1 commit
  10. 26 Feb, 2022 1 commit
  11. 25 Feb, 2022 3 commits
  12. 23 Feb, 2022 4 commits
  13. 15 Feb, 2022 1 commit
  14. 12 Feb, 2022 1 commit
  15. 11 Feb, 2022 1 commit
  16. 10 Feb, 2022 1 commit
  17. 07 Feb, 2022 1 commit
  18. 04 Feb, 2022 1 commit
  19. 01 Feb, 2022 2 commits
    • Add the permutation related support as the extension for asp lib. (#1194) · 89edb819
      ChongyuNVIDIA authored
      * Add the permutation related support as the extension for asp lib.
      
      * [Fix] Track the permutation sequence for progressive channel swap strategy.
      
      * Fix the corner case where a layer is not sparse but still needs permutation applied because of its siblings.
      
      * Fix the deprecated functions in ASP unit tests.
      
      * Fix the sparsity info typo in ASP lib.
      
      * [Enhancement] Set an identical random seed on all GPUs so that the permutation search generates the same results.
      
      * Update the README.md with identical random seed setting and NeurIPS info.
      
      * Integrate the Pybind11 enhancement of permutation search into ASP lib.
    • transformer: Allows for custom sync context in no pipelining forward backward function (#1281) · 79c01877
      Masaki Kozuki authored
      * add kwarg of `custom_sync_context_handler`
      
      * add kwargs to ignore `custom_sync_context_handler`, which was mistakenly passed to fwd/bwd funcs
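The commit message for #1281 names a `custom_sync_context_handler` kwarg but does not show how it is consumed. The following is a minimal sketch of one plausible pattern, not the library's implementation: `run_no_pipelining`, `recording_no_sync`, and `step` are hypothetical names, and the assumption that every microbatch except the last runs inside the context is mine.

```python
from contextlib import contextmanager, nullcontext


def run_no_pipelining(microbatches, step_fn, custom_sync_context_handler=None):
    """Hypothetical driver illustrating a custom sync context (assumption)."""
    # Fall back to a no-op context when the caller supplies no handler.
    handler = custom_sync_context_handler or nullcontext
    results = []
    # Assumed behaviour: all microbatches except the last run inside the
    # context, where a real handler would typically suppress gradient sync.
    with handler():
        for mb in microbatches[:-1]:
            results.append(step_fn(mb))
    # The last microbatch runs outside the context so that sync can happen.
    if microbatches:
        results.append(step_fn(microbatches[-1]))
    return results


events = []


@contextmanager
def recording_no_sync():
    # Stand-in for a real "no sync" context; it only records enter and exit.
    events.append("enter")
    try:
        yield
    finally:
        events.append("exit")


def step(mb):
    events.append(f"step {mb}")
    return mb * 2


outputs = run_no_pipelining([1, 2, 3], step,
                            custom_sync_context_handler=recording_no_sync)
```

Passing the handler as a zero-argument callable that returns a context manager keeps the driver independent of any particular data-parallel wrapper.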
  20. 31 Jan, 2022 2 commits
  21. 29 Jan, 2022 1 commit
  22. 28 Jan, 2022 2 commits
    • small changes in test and logger format (#1278) · b1c75f6f
      Masaki Kozuki authored
      * cosmetic refactor in test
      
      * log with PID
      
      * log more info: rank, pid, filename, lineNo
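The logger change in #1278 lists the fields added to each record (rank, pid, filename, lineNo) without giving the format itself. A rough sketch of such a format with the standard `logging` module might look like this; the `RANK` environment variable, `RankFilter`, `LOG_FORMAT`, and `make_logger` are assumptions made for illustration and do not come from the commit.

```python
import logging
import os

# Assumption: the process rank is exposed through a RANK environment variable.
RANK = int(os.environ.get("RANK", "0"))


class RankFilter(logging.Filter):
    """Attach the (assumed) rank to every record so the format can use it."""

    def filter(self, record):
        record.rank = RANK
        return True


# %(process)d, %(filename)s and %(lineno)d are standard LogRecord attributes.
LOG_FORMAT = (
    "[rank %(rank)s pid %(process)d %(filename)s:%(lineno)d] "
    "%(levelname)s %(message)s"
)


def make_logger(name="sketch"):
    # Hypothetical helper: a stream logger that emits the format above.
    logger = logging.getLogger(name)
    handler = logging.StreamHandler()
    handler.setFormatter(logging.Formatter(LOG_FORMAT))
    handler.addFilter(RankFilter())
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    return logger
```

Including the PID and rank in every line makes interleaved output from several worker processes easier to attribute.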
    • allow for `None` batch (#1280) · a960fe8c
      Masaki Kozuki authored
      * have get_kth_microbatch deal with None batch
      
      * broadcast based on tensor parallel rank
      
      * dtype
      
      * remove unnecessary .cuda()
      
      Processes with tensor parallel rank != 0 don't need to prepare one or more `torch.utils.data.DataLoader` instances, so the `batch` argument of the `get_kth_microbatch` function can be `None`, but the current implementation doesn't allow for it.
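The `None`-batch behaviour described in #1280 can be illustrated with a small stand-alone sketch. It uses plain Python sequences instead of tensors; the name `get_kth_microbatch_sketch` and the explicit `micro_batch_size` parameter are assumptions for illustration, not the signature of the real `get_kth_microbatch`.

```python
from typing import List, Optional, Sequence


def get_kth_microbatch_sketch(
    batch: Optional[Sequence[Sequence]],
    k: int,
    micro_batch_size: int,
) -> Optional[List[Sequence]]:
    """Return the k-th slice of every field in `batch`, or None if no batch."""
    # Ranks that did not load data pass None; hand it back unchanged.
    if batch is None:
        return None
    start = k * micro_batch_size
    end = start + micro_batch_size
    # Slice each field (for example inputs and labels) along its first axis.
    return [field[start:end] for field in batch]
```

A rank that skipped building a `DataLoader` would call this with `batch=None` and receive `None` back, while the data-loading rank gets its slices as usual.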
  23. 21 Jan, 2022 1 commit