- 06 Dec, 2021 1 commit
-
-
Masaki Kozuki authored
Changes include:
  - THC headers removal
  - TH macros replacement
  - fix some typos in comments
Conflicts:
  apex/contrib/csrc/multihead_attn/additive_masked_softmax_dropout_cuda.cu
  apex/contrib/csrc/multihead_attn/encdec_multihead_attn_cuda.cu
  apex/contrib/csrc/multihead_attn/encdec_multihead_attn_norm_add_cuda.cu
  apex/contrib/csrc/multihead_attn/masked_softmax_dropout_cuda.cu
  apex/contrib/csrc/multihead_attn/self_multihead_attn_bias_additive_mask_cuda.cu
  apex/contrib/csrc/multihead_attn/self_multihead_attn_bias_cuda.cu
  apex/contrib/csrc/multihead_attn/self_multihead_attn_cuda.cu
  apex/contrib/csrc/multihead_attn/self_multihead_attn_norm_add_cuda.cu
  apex/contrib/csrc/multihead_attn/strided_batched_gemm.h
-
- 03 Dec, 2021 2 commits
-
-
hubertlu-tw authored
-
hubertlu-tw authored
-
- 02 Dec, 2021 4 commits
-
-
Jithun Nair authored
* Use --cuda_ext flag to build all supported extensions
* Don't remove --cuda_ext since it'll be needed to build other extensions
* Need to clear all cmdline args so setup.py doesn't complain
-
Hubert Lu authored
Add more unit tests for both distributed and extensions
-
hubertlu-tw authored
-
Hubert Lu authored
-
- 01 Dec, 2021 2 commits
- 29 Nov, 2021 1 commit
-
-
X Wang authored
-
- 22 Nov, 2021 1 commit
-
-
Hubert Lu authored
Change python3.6 to python
-
- 19 Nov, 2021 2 commits
- 18 Nov, 2021 1 commit
-
-
Abhishree authored
-
- 17 Nov, 2021 2 commits
-
-
X Wang authored
-
Masaki Kozuki authored
-
- 02 Nov, 2021 3 commits
-
-
Hubert Lu authored
Enable multihead atten
-
Hubert Lu authored
Co-authored-by: Jeff Daily <jeff.daily@amd.com>
-
hubertlu-tw authored
-
- 01 Nov, 2021 3 commits
-
-
hubertlu-tw authored
Fix rocblas_gemmex namespace
Fix namespace
Clean up comments
-
hubertlu-tw authored
Enable HIP float to half conversion
-
hubertlu-tw authored
Fix some spacing
-
- 29 Oct, 2021 2 commits
-
-
Peng authored
-
hubertlu-tw authored
-
- 28 Oct, 2021 1 commit
-
-
hubertlu-tw authored
-
- 27 Oct, 2021 1 commit
-
- 26 Oct, 2021 1 commit
-
-
hubertlu authored
-
- 21 Oct, 2021 2 commits
-
-
Jeff Daily authored
-
Jeff Daily authored
-
- 20 Oct, 2021 1 commit
-
-
Hubert Lu authored
-
- 19 Oct, 2021 4 commits
- 15 Oct, 2021 1 commit
-
-
hubertlu-tw authored
-
- 14 Oct, 2021 2 commits
-
-
Burc Eryilmaz authored
Change chunking scheme for full-allreduce case and add parameter order argument, both to enable contiguous chunking of allgather (#1190)
-
Nan Zheng authored
1. remove the weight broadcast in the constructor
2. disable unnecessary allreduces for clip-after-ar
-
- 13 Oct, 2021 1 commit
-
-
eqy authored
-
- 08 Oct, 2021 2 commits
-
-
Masaki Kozuki authored
* run backward
* remove custom_fwd/custom_bwd
-
eqy authored
-