- 01 Mar, 2021 4 commits
-
Jiezhong Qiu authored
-
Rick Ho authored
-
Rick Ho authored
-
Rick Ho authored
- 28 Feb, 2021 10 commits
-
Rick Ho authored
-
Rick Ho authored
-
Rick Ho authored
-
Rick Ho authored
-
Rick Ho authored
-
Rick Ho authored
-
Jiezhong Qiu authored
-
Jiezhong Qiu authored
-
Jiezhong Qiu authored
-
Jiezhong Qiu authored
remove author related info
-
- 26 Feb, 2021 9 commits
- 25 Feb, 2021 12 commits
-
Jiezhong Qiu authored
Expand the bias directly with torch.repeat_interleave
-
Jiezhong Qiu authored
-
Jiezhong Qiu authored
the true bias has been added in FMoeLinear
-
Jiezhong Qiu authored
move away layernorm/residual/dropout
-
Rick Ho authored
Reproducibility
-
Jiezhong Qiu authored
-
Jiezhong Qiu authored
-
Jiezhong Qiu authored
-
Jiezhong Qiu authored
although the order of dropout and ReLU doesn't matter
-
Rick Ho authored
Test Transformer-XL
-
Jiezhong Qiu authored
-
Jiezhong Qiu authored
-
- 24 Feb, 2021 5 commits
-
Jiezhong Qiu authored
an activation passed as a function cannot be saved
-
Jiezhong Qiu authored
-
Jiezhong Qiu authored
-
Rick Ho authored
-
Jiezhong Qiu authored