- 30 Jul, 2021 1 commit
-
-
Rick Ho authored
-
- 20 Jul, 2021 2 commits
-
-
Rick Ho authored
fix fp16 training with balance loss
-
-
- 08 Jul, 2021 4 commits
- 07 Jul, 2021 1 commit
-
-
Rick Ho authored
-
- 30 Jun, 2021 2 commits
-
-
Rick Ho authored
Fix typo in readme
-
Yimin Jiang authored
-
- 29 Jun, 2021 1 commit
-
-
Jiezhong Qiu authored
Fix pytorch compatibility issue !55
-
- 28 Jun, 2021 2 commits
- 18 Jun, 2021 1 commit
-
-
Tiago Antunes authored
* Added default weight initializations to FMoELinear and NoisyGate
* Following torch's naming convention
-
- 17 Jun, 2021 5 commits
-
-
Rick Ho authored
use single variable for returned value
-
Jiezhong Qiu authored
the old impl raised error "too many values to unpack (expected 1)"
-
Jiezhong Qiu authored
Fix grad of balance loss
-
Rick Ho authored
-
Rick Ho authored
-
- 16 Jun, 2021 1 commit
-
-
Rick Ho authored
* use single variable instead of vector in c functions
* expert count kernel
* remove all lists
* fix old tests
-
- 09 Jun, 2021 2 commits
-
-
Rick Ho authored
Fixed asynchronous streams in column reduce kernel call
-
TiagoMAntunes authored
-
- 31 May, 2021 7 commits
- 30 May, 2021 1 commit
-
-
Rick Ho authored
Fix bugs to run megatron with gshard gate
-
- 29 May, 2021 1 commit
-
-
Rick Ho authored
-
- 24 May, 2021 6 commits
-
-
Rick Ho authored
Update test_gates.py
-
GODVIX authored
-
Rick Ho authored
Add random routing in gshard gate
-
Rick Ho authored
-
Rick Ho authored
mask and experts list
-
Colin authored
- mask some tensors of tokens for fmoe forward - pass a list of expert classes to specify which experts to use and in what order
-
- 23 May, 2021 1 commit
-
-
Colin authored
-
- 21 May, 2021 2 commits