"tests/vscode:/vscode.git/clone" did not exist on "660e0b97bd652bd3a0dfd5f847e5cf62502d0469"
Add LayerScale to NAT/DiNAT (#20325)
* Add LayerScale to NAT/DiNAT.
Completely dropped the ball on LayerScale in the original PR (#20219).
LayerScale is an optional argument in both models and is only activated for the larger variants, where it provides training stability.
* Add LayerScale to NAT/DiNAT.
Minor error fixed.
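
For readers unfamiliar with LayerScale, here is a minimal, dependency-free sketch of the idea: the output of a residual branch is multiplied by a per-channel scale vector before being added back to the input. This is an illustration, not the code from this PR. The class name `LayerScaleResidual`, the argument name `layer_scale_init_value`, and the convention that a non-positive value disables scaling are all assumptions made for the example. In a real model the scale vector would be a learnable parameter, typically initialized to a small value.

```python
from typing import Callable, List, Optional

Vector = List[float]


class LayerScaleResidual:
    """Residual connection whose branch output is optionally scaled per channel."""

    def __init__(
        self,
        dim: int,
        branch: Callable[[Vector], Vector],
        layer_scale_init_value: float = 0.0,  # hypothetical argument name
    ) -> None:
        self.branch = branch
        # Assumed convention: a non-positive init value disables LayerScale.
        # In a trained model, gamma would be a learnable parameter.
        self.gamma: Optional[Vector] = (
            [layer_scale_init_value] * dim if layer_scale_init_value > 0 else None
        )

    def __call__(self, x: Vector) -> Vector:
        out = self.branch(x)
        if self.gamma is not None:
            # Scale each channel of the branch output before the residual add.
            out = [g * o for g, o in zip(self.gamma, out)]
        return [xi + oi for xi, oi in zip(x, out)]


def double(v: Vector) -> Vector:
    """Stand-in for an attention or MLP branch."""
    return [2.0 * e for e in v]


plain = LayerScaleResidual(3, double)  # LayerScale off
scaled = LayerScaleResidual(3, double, layer_scale_init_value=0.5)  # LayerScale on

print(plain([1.0, 2.0, 3.0]))   # [3.0, 6.0, 9.0]
print(scaled([1.0, 2.0, 3.0]))  # [2.0, 4.0, 6.0]
```

With scaling disabled the block is an ordinary residual connection, which is why the option can default to off for the smaller variants.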
Co-authored-by: Ali Hassani <ahassanijr@gmail.com>