Merge branch 'activations-bias' into 'main'
Add swiglu and squared relu activations and ability to disable bias.

See merge request ADLR/megatron-lm!553
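
For readers unfamiliar with the two activations named in this merge, here is a minimal pure-Python sketch of their usual definitions. It is not the code from this merge request: the function names are hypothetical, and the choice of the first half of the input as the SwiGLU gate is an assumption.

```python
import math


def silu(x: float) -> float:
    """SiLU (swish): x * sigmoid(x), written to avoid overflow for large |x|."""
    if x >= 0:
        return x / (1.0 + math.exp(-x))
    e = math.exp(x)
    return x * e / (1.0 + e)


def swiglu(values: list[float]) -> list[float]:
    """SwiGLU over a flat vector: split in two halves, return silu(gate) * linear.

    Assumption: the first half acts as the gate. The input length must be even.
    """
    half, remainder = divmod(len(values), 2)
    if remainder:
        raise ValueError("swiglu expects an even number of values")
    gate, linear = values[:half], values[half:]
    return [silu(g) * v for g, v in zip(gate, linear)]


def squared_relu(values: list[float]) -> list[float]:
    """Squared ReLU: max(0, x) squared, applied elementwise."""
    return [max(0.0, v) ** 2 for v in values]
```

Note that SwiGLU halves the width of its input, so the projection feeding it has to produce twice the intended hidden size.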