1. 19 Apr, 2022 1 commit
  2. 07 Apr, 2022 2 commits
    • Deprecation warning: `pyprof` & `reparameterization` (#1348) · 727a6452
      Masaki Kozuki authored
      * add warning to pyprof
      
      * add warning to reparameterization
      
      Note: `reparameterization` is already not importable, as shown below:
      
      ```
      (base) root@c4bb3f161482:/vscode/apex# python -c 'import torch; import apex; from apex import reparameterization'
      /vscode/apex/apex/pyprof/__init__.py:5: FutureWarning: pyprof will be removed by the end of June, 2022
        warnings.warn("pyprof will be removed by the end of June, 2022", FutureWarning)
      /vscode/apex/apex/reparameterization/__init__.py:2: FutureWarning: reparameterization will be removed by the end of June, 2022
        warnings.warn("reparameterization will be removed by the end of June, 2022", FutureWarning)
      Traceback (most recent call last):
        File "<string>", line 1, in <module>
        File "/vscode/apex/apex/reparameterization/__init__.py", line 4, in <module>
          from .weight_norm import WeightNorm
        File "/vscode/apex/apex/reparameterization/weight_norm.py", line 3, in <module>
          from ..fp16_utils import Fused_Weight_Norm
      ImportError: cannot import name 'Fused_Weight_Norm' from 'apex.fp16_utils' (/vscode/apex/apex/fp16_utils/__init__.py)
      ```
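The log above shows each deprecated package emitting a `FutureWarning` at import time. A minimal sketch of that pattern follows; `warn_deprecated` is a hypothetical helper introduced only for illustration, and only the `warnings.warn(..., FutureWarning)` call and the message wording are taken from the log.

```python
import warnings


def warn_deprecated(module_name: str, deadline: str = "the end of June, 2022") -> str:
    """Emit a FutureWarning shaped like the ones in the log above.

    Hypothetical helper for illustration; it is not part of apex.
    """
    message = f"{module_name} will be removed by {deadline}"
    # stacklevel=2 attributes the warning to the caller rather than this helper.
    warnings.warn(message, FutureWarning, stacklevel=2)
    return message


# Record the warning instead of printing it, to show what an importer observes.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    warn_deprecated("pyprof")

print(caught[0].category.__name__, "-", caught[0].message)
```

`FutureWarning` is shown by default to end users, unlike `DeprecationWarning`, which Python hides by default outside `__main__`; that may be why it suits a removal notice like this one.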
    • [transformer] add microbatches test (#1349) · 7d903878
      Masaki Kozuki authored
      * add test
      
      * destroy model parallel was missing
  3. 30 Mar, 2022 1 commit
  4. 25 Mar, 2022 3 commits
    • update fmha (#1344) · 3c88451a
      yjk21 authored
    • [transformer] Format & Test Refactoring (#1325) · a0ed4151
      Masaki Kozuki authored
      * try PyTorch custom TestCase class
      
      * revert
      
      * initial working example
      
      * update
      
      * data utils
      
      * fix imports
      
      * hardcode backend to nccl
      
      * fix signature
      
      * fix typo
      
      * mapping
      
      * set device
      
      * init
      
      * refactor x entropy
      
      * remove unused import & destroy model parallel
      
      * refactor random
      
      * fix test
      
      * remove migrated tests
      
      * refactor
      
      * init
      
      * separate affine weight init
      
      * init model parallel
      
      * split more
      
      * weight init fix part 1
      
      * use cpu init for consistency btwn native and tensor parallel
      
      * black
      
      * add col parallel
      
      * use a 3D tensor of square matrix for column parallel linear
      
      * skip the failing cases
      
      * migrate layers test
      
      * pipeline parallel forward/backward
      
      * fix typo
      
      * fix typo
      
      * fix
      
      * fix pipeline world size
      
      * black
      
      * rm `run_pipeline_parallel_test` in favor of test_pipeline_parallel_fwd_bwd.py
      
      * stop logging
      
      * set log level
      
      * black
      
      * license and format
      
      * fix
      
      * skip tf32 as matrices are small
      
      * remove potentially inappropriate license
      
      * Apply suggestions from code review
      
      * remove `TODO` comment
      
      * `torch.testing.assert_allclose` -> `torch.testing.assert_close`
      
      * remove comment-outs
      
      * remove unused import
      
      * minor fix
    • [transformer] `parallel_state`: Position Embedding (#1343) · f10b4b89
      Masaki Kozuki authored
      * update
      
      * Add comment to `destroy_model_parallel`
  5. 24 Mar, 2022 1 commit
    • Add CUDA Focal Loss Implementation (#1337) · 28f8539c
      Masaki Kozuki authored
      
      
      Take-over of #1097
      
      * Add fast CUDA focal loss implementation.
      
      * Enable fast math for CUDA focal loss.
      
      * Correct typo.
      
      * replace deprecated macros
      
      * TORCH_CUDA_CHECK -> AT_CUDA_CHECK
      
      The former is defined in torch/csrc/profiler/cuda.cpp, so it is usually not available.
      The latter, however, is defined in ATen/cuda/Exceptions.h as an alias of C10_CUDA_CHECK.
      
      * add test
      
      * clean up
      
      * guard for torchvision
      Co-authored-by: Wil Kong <alpha0422@gmail.com>
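For readers unfamiliar with the quantity the commit above computes on the GPU, here is a scalar reference sketch of the standard sigmoid focal loss, FL = -alpha_t * (1 - p_t)^gamma * log(p_t). It is not the CUDA kernel from this commit; the function name and the default `alpha` and `gamma` values are assumptions chosen for illustration.

```python
import math


def sigmoid_focal_loss(logit: float, target: int,
                       alpha: float = 0.25, gamma: float = 2.0) -> float:
    """Scalar focal loss for one binary prediction (illustrative reference only).

    `target` is 1 for the positive class and 0 otherwise. The defaults are
    assumed values, not settings taken from the commit.
    """
    p = 1.0 / (1.0 + math.exp(-logit))           # predicted probability of class 1
    p_t = p if target == 1 else 1.0 - p          # probability assigned to the true class
    alpha_t = alpha if target == 1 else 1.0 - alpha
    # (1 - p_t) ** gamma shrinks the loss of examples that are already well classified.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


hard = sigmoid_focal_loss(0.0, 1)   # uncertain prediction, p_t = 0.5
easy = sigmoid_focal_loss(4.0, 1)   # confident and correct prediction
print(f"hard={hard:.6f} easy={easy:.6f}")
```

This direct form is not numerically robust for logits of very large magnitude; a production implementation would work from log-sigmoid terms instead.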
  6. 18 Mar, 2022 1 commit
  7. 16 Mar, 2022 1 commit
  8. 15 Mar, 2022 4 commits
  9. 11 Mar, 2022 1 commit
  10. 08 Mar, 2022 4 commits
  11. 01 Mar, 2022 1 commit
  12. 27 Feb, 2022 1 commit
  13. 26 Feb, 2022 1 commit
  14. 25 Feb, 2022 3 commits
  15. 23 Feb, 2022 4 commits
  16. 15 Feb, 2022 1 commit
  17. 12 Feb, 2022 1 commit
  18. 11 Feb, 2022 1 commit
  19. 10 Feb, 2022 1 commit
  20. 07 Feb, 2022 1 commit
  21. 04 Feb, 2022 1 commit
  22. 01 Feb, 2022 2 commits
    • Add the permutation related support as the extension for asp lib. (#1194) · 89edb819
      ChongyuNVIDIA authored
      * Add the permutation related support as the extension for asp lib.
      
      * [Fix] Track the permutation sequence for progressive channel swap strategy.
      
      * Fix the corner case where a layer is not sparse but still needs permutation applied because of its siblings.
      
      * Fix the deprecated functions in ASP unit tests.
      
      * Fix the sparsity info typo in ASP lib.
      
      * [Enhancement] Set an identical random seed on all GPUs so that the permutation search generates the same results.
      
      * Update the README.md with identical random seed setting and NeurIPS info.
      
      * Integrate the Pybind11 enhancement of permutation search into ASP lib.
    • transformer: Allows for custom sync context in no pipelining forward backward function (#1281) · 79c01877
      Masaki Kozuki authored
      * add kwarg of `custom_sync_context_handler`
      
      * add kwargs to ignore `custom_sync_context_handler`, which is mistakenly passed to fwd/bwd funcs
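The idea behind a `custom_sync_context_handler` kwarg can be sketched in plain Python. The sketch assumes the common pattern in which every microbatch except the last runs inside a caller-supplied context (for example, one that suppresses gradient synchronization) and the last runs outside it. Only the kwarg name comes from the commit message; `run_microbatches`, `no_sync`, and `step` are hypothetical, and this is not the apex implementation.

```python
from contextlib import contextmanager, nullcontext


def run_microbatches(microbatches, step_fn, custom_sync_context_handler=None):
    """Run step_fn on each microbatch; all but the last run inside the context.

    Hypothetical stand-in for a no-pipelining forward/backward loop.
    """
    if not microbatches:
        raise ValueError("at least one microbatch is required")
    # Fall back to a do-nothing context when the caller supplies none.
    handler = custom_sync_context_handler or nullcontext
    *head, last = microbatches
    outputs = []
    with handler():
        for mb in head:
            outputs.append(step_fn(mb))
    # The final microbatch runs outside the context, so synchronization can happen.
    outputs.append(step_fn(last))
    return outputs


state = {"sync": True}
events = []


@contextmanager
def no_sync():
    """Toy context that disables a 'sync' flag while active."""
    state["sync"] = False
    try:
        yield
    finally:
        state["sync"] = True


def step(mb):
    events.append((mb, state["sync"]))  # record whether sync was enabled
    return mb * 2


outputs = run_microbatches([1, 2, 3], step, custom_sync_context_handler=no_sync)
print(outputs, events)
```

Passing the handler as a callable that returns a context manager, rather than as a context-manager instance, lets the loop enter a fresh context on every call.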
  23. 31 Jan, 2022 2 commits
  24. 29 Jan, 2022 1 commit