- 24 May, 2021 1 commit
-
-
Colin authored
- Mask some token tensors in the fmoe forward pass
- Pass a list of expert classes to specify which experts to use and in what order
-
- 23 May, 2021 1 commit
-
-
Colin authored
-
- 19 May, 2021 2 commits
- 13 May, 2021 1 commit
-
-
Rich Ho authored
-
- 27 Apr, 2021 1 commit
-
-
Rick Ho authored
-
- 26 Apr, 2021 1 commit
-
-
Rick Ho authored
-
- 23 Mar, 2021 2 commits
-
-
TiagoMAntunes authored
Bias is now calculated directly in the MOELinear layer. Added the corresponding CUDA changes and updated the forward and backward functions of MOELinear.
-
TiagoMAntunes authored
-
- 22 Mar, 2021 1 commit
-
-
Sengxian authored
-
- 13 Mar, 2021 1 commit
-
-
Rick Ho authored
-
- 09 Mar, 2021 1 commit
-
-
Sengxian authored
-
- 26 Feb, 2021 1 commit
-
-
Rick Ho authored
-
- 25 Feb, 2021 4 commits
-
-
Jiezhong Qiu authored
i.e., expand bias using torch.repeat_interleave directly
-
Jiezhong Qiu authored
-
Jiezhong Qiu authored
-
Jiezhong Qiu authored
-
- 24 Feb, 2021 1 commit
-
-
Rick Ho authored
-
- 23 Feb, 2021 4 commits
- 22 Feb, 2021 1 commit
-
-
Rick Ho authored
-
- 21 Feb, 2021 4 commits
-
-
Jiezhong Qiu authored
-
Rick Ho authored
-
Jiezhong Qiu authored
-
Jiezhong Qiu authored
-
- 20 Feb, 2021 3 commits
-
-
Jiezhong Qiu authored
-
Jiezhong Qiu authored
-
Jiezhong Qiu authored
-
- 08 Feb, 2021 2 commits
- 07 Feb, 2021 2 commits
- 05 Feb, 2021 2 commits
- 04 Feb, 2021 2 commits
- 03 Feb, 2021 1 commit
-
-
Rick Ho authored
-
- 02 Feb, 2021 1 commit
-
-
Rick Ho authored
-