Add a use_parallel_residual argument to control how the residual is computed (#18695)
* Add a gpt_j_residual argument to control how the residual is computed
* Put duplicate code outside of the if block
* Rename parameter "gpt_j_residual" to "use_parallel_residual" and set the default value to True
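
The commit message does not include the code, so the following is only a minimal sketch of what such a flag might do. It assumes that `use_parallel_residual=True` selects the GPT-J-style parallel formulation, `x + attn(ln1(x)) + mlp(ln2(x))`, and that `False` selects the conventional sequential one, in which the MLP sees the output of the attention residual. The function `residual_block`, its parameters other than `use_parallel_residual`, and the toy sublayers are hypothetical names for illustration, not the library's API.

```python
from typing import Callable, List

Vector = List[float]
Layer = Callable[[Vector], Vector]


def add(a: Vector, b: Vector) -> Vector:
    """Element-wise sum of two equally sized vectors."""
    return [x + y for x, y in zip(a, b)]


def residual_block(
    hidden: Vector,
    attn: Layer,
    mlp: Layer,
    ln1: Layer,
    ln2: Layer,
    use_parallel_residual: bool = True,
) -> Vector:
    # The attention branch is identical in both modes, so it sits
    # outside the if block (compare the second bullet of the commit).
    attn_out = attn(ln1(hidden))

    if use_parallel_residual:
        # Assumed parallel form: both branches read the same input.
        # x = x + attn(ln1(x)) + mlp(ln2(x))
        mlp_out = mlp(ln2(hidden))
        return add(add(hidden, attn_out), mlp_out)

    # Assumed sequential form: the MLP reads the attention residual.
    # x = x + attn(ln1(x)); x = x + mlp(ln2(x))
    after_attn = add(hidden, attn_out)
    mlp_out = mlp(ln2(after_attn))
    return add(after_attn, mlp_out)


# Toy stand-ins for the real sublayers (hypothetical, for illustration only).
def identity(v: Vector) -> Vector:
    return list(v)


def double(v: Vector) -> Vector:
    return [2.0 * x for x in v]


def plus_one(v: Vector) -> Vector:
    return [x + 1.0 for x in v]


if __name__ == "__main__":
    x = [1.0]
    print(residual_block(x, double, plus_one, identity, identity, True))
    print(residual_block(x, double, plus_one, identity, identity, False))
```

With these toy sublayers the two modes give different results for the same input, which is why a checkpoint trained with one formulation would need the matching flag value at load time.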