[Bugfix] `OPTDecoderLayer` does not return attentions when...
[Bugfix] `OPTDecoderLayer` does not return attentions when `gradient_checkpointing` and `training` is enabled. (#23367) Update modeling_opt.py
Showing
Please register or sign in to comment