1. 13 Dec, 2021 1 commit
  2. 09 Dec, 2021 4 commits
  3. 08 Dec, 2021 1 commit
  4. 06 Dec, 2021 2 commits
    • Replace THCudaCheck with C10_CUDA_CHECK · fec3141c
      Hubert Lu authored
    • remove THC headers/functions (#1192) · 2155dabf
      Masaki Kozuki authored
      Changes include:
      - THC headers removal
      - TH macros replacement
      - fix some typos in comments
      Conflicts:
      	apex/contrib/csrc/multihead_attn/additive_masked_softmax_dropout_cuda.cu
      	apex/contrib/csrc/multihead_attn/encdec_multihead_attn_cuda.cu
      	apex/contrib/csrc/multihead_attn/encdec_multihead_attn_norm_add_cuda.cu
      	apex/contrib/csrc/multihead_attn/masked_softmax_dropout_cuda.cu
      	apex/contrib/csrc/multihead_attn/self_multihead_attn_bias_additive_mask_cuda.cu
      	apex/contrib/csrc/multihead_attn/self_multihead_attn_bias_cuda.cu
      	apex/contrib/csrc/multihead_attn/self_multihead_attn_cuda.cu
      	apex/contrib/csrc/multihead_attn/self_multihead_attn_norm_add_cuda.cu
      	apex/contrib/csrc/multihead_attn/strided_batched_gemm.h
  5. 03 Dec, 2021 2 commits
  6. 02 Dec, 2021 4 commits
  7. 01 Dec, 2021 2 commits
  8. 29 Nov, 2021 1 commit
  9. 22 Nov, 2021 1 commit
  10. 19 Nov, 2021 5 commits
  11. 18 Nov, 2021 1 commit
  12. 17 Nov, 2021 2 commits
  13. 10 Nov, 2021 3 commits
  14. 02 Nov, 2021 3 commits
  15. 01 Nov, 2021 3 commits
  16. 29 Oct, 2021 2 commits
  17. 28 Oct, 2021 1 commit
  18. 27 Oct, 2021 2 commits
    • `FastLayerNorm` compat with `autocast` (#1203) · ae757634
      Masaki Kozuki authored
      
      
      * Persistent LayerNorm: Multi-CTA Rewrite
      
      * autocast support
      Co-authored-by: Young-Jun Ko <youngjun.ko@gmail.com>
    • Pipeline Model Parallel (#1202) · 63d5dd63
      Masaki Kozuki authored
      * Init apex.ppu (pipeline model parallel utility)
      
      Reference commit:
      
      ```
      commit 5ab646376d67831601d5552c193241d017f1b35c (HEAD -> main, internal/main)
      Merge: 14f2c684 7b293d9b
      Author: Mohammad Shoeybi <mshoeybi@nvidia.com>
      Date:   Wed Sep 22 22:57:54 2021 -0700
      
          Merge branch 'add_BOS' into 'main'
      
          Add Beginning of Sentence token option and adding semaphore while multi-threading to prevent crashes and hangs due to connection keep-alives
      
          See merge request ADLR/megatron-lm!328
      ```
      
      * removing get_args and replace import - phase 1
      
      * removing get_args and replace import - phase 2
      
      * move ppu to apex.transformer.pipeline_parallel
      
      * update two __init__.py
      
      * update READMEs
      
      * mpu -> parallel_state & tensor_parallel
      
      * fix
      
      * remove non-pipeline files
      
      * separate schedules.py - phase 1
      
      * dissect schedules.py
      
      * data_iterators -> batch
      
      * remove optimizer from forward_backward_step funcs
      
      * init test
      
      * Apply 2 suggestion(s...