- 28 Feb, 2021 4 commits
  - Jiezhong Qiu authored
  - Jiezhong Qiu authored
  - Jiezhong Qiu authored
  - Jiezhong Qiu authored
    remove author related info
- 26 Feb, 2021 9 commits
- 25 Feb, 2021 12 commits
  - Jiezhong Qiu authored
    i.e., expand bias using torch.repeat_interleave directly
  - Jiezhong Qiu authored
  - Jiezhong Qiu authored
    the true bias has been added in FMoeLinear
  - Jiezhong Qiu authored
    move away layernorm/residual/dropout
  - Rick Ho authored
    Reproducibility
  - Jiezhong Qiu authored
  - Jiezhong Qiu authored
  - Jiezhong Qiu authored
  - Jiezhong Qiu authored
    although order of dp and relu doesn't matter
  - Rick Ho authored
    Test Transformer-XL
  - Jiezhong Qiu authored
  - Jiezhong Qiu authored
- 24 Feb, 2021 9 commits
  - Jiezhong Qiu authored
    activation passed by func can not be saved
  - Jiezhong Qiu authored
  - Jiezhong Qiu authored
  - Rick Ho authored
  - Jiezhong Qiu authored
  - Jiezhong Qiu authored
  - Jiezhong Qiu authored
  - Jiezhong Qiu authored
    follow suggestions from https://github.com/pytorch/pytorch/issues/36035#issuecomment-770960405
  - Rick Ho authored
- 23 Feb, 2021 6 commits