Fix FFN dropout in TFAlbertLayer, and split dropout in TFAlbertAttent… (#4323)
* Fix FFN dropout in TFAlbertLayer, and split dropout in TFAlbertAttention into two separate dropout layers. * Same dropout fixes for PyTorch.
Showing
Please register or sign in to comment