- 17 Jun, 2021 5 commits
-
-
Rick Ho authored
use single variable for returned value
-
Jiezhong Qiu authored
the old impl raised error "too many values to unpack (expected 1)"
-
Jiezhong Qiu authored
Fix grad of balance loss
-
Rick Ho authored
-
Rick Ho authored
-
- 16 Jun, 2021 1 commit
-
-
Rick Ho authored
* use single variable instead of vector in C functions
* expert count kernel
* remove all lists
* fix old tests
-
- 09 Jun, 2021 2 commits
-
-
Rick Ho authored
Fixed asynchronous streams in column reduce kernel call
-
TiagoMAntunes authored
-
- 31 May, 2021 7 commits
- 30 May, 2021 1 commit
-
-
Rick Ho authored
Fix bugs to run megatron with gshard gate
-
- 29 May, 2021 1 commit
-
-
Rick Ho authored
-
- 24 May, 2021 6 commits
-
-
Rick Ho authored
Update test_gates.py
-
GODVIX authored
-
Rick Ho authored
Add random routing in gshard gate
-
Rick Ho authored
-
Rick Ho authored
mask and experts list
-
Colin authored
- mask some tensors of tokens for fmoe forward - pass a list of expert classes to specify which experts to use and in what order
-
- 23 May, 2021 1 commit
-
-
Colin authored
-
- 21 May, 2021 2 commits
- 20 May, 2021 4 commits
- 19 May, 2021 5 commits
- 18 May, 2021 2 commits
- 13 May, 2021 3 commits