Paul Fultz II authored
Improves constant propagation for BERT models. Larger batch sizes no longer use such large constants. Also improves the speed of model compilation.
a8ace295