Fix ORTTrainer failure on gpt2 fp16 training (#18017)
* Ensure value and attn weights have the same dtype
* Remove prints
* Modify decision transformers copied from gpt2
* Nit device

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Fix style

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
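
The dtype and device items in this commit message can be illustrated with a minimal sketch. This is not the diff from this commit. It assumes the failure came from a causal-mask fill value whose dtype or device differed from the fp16 attention weights. The helper names `apply_causal_mask` and `match_value_dtype` are hypothetical and do not appear in the repository.

```python
import torch


def apply_causal_mask(attn_weights: torch.Tensor, causal_mask: torch.Tensor) -> torch.Tensor:
    """Hypothetical helper: fill masked positions without changing dtype or device."""
    # Use the most negative finite value representable in the weights' own dtype.
    min_value = torch.finfo(attn_weights.dtype).min
    # Build the fill value as a 0-dim tensor with the same dtype and device as the
    # weights, so torch.where never has to mix fp16 with fp32/fp64 operands.
    mask_value = torch.full(
        [], min_value, dtype=attn_weights.dtype, device=attn_weights.device
    )
    return torch.where(causal_mask, attn_weights, mask_value)


def match_value_dtype(attn_weights: torch.Tensor, value: torch.Tensor) -> torch.Tensor:
    """Hypothetical helper: cast the weights to the dtype of `value` before they are combined."""
    return attn_weights.type(value.dtype)
```

Usage, assuming a 4x4 lower-triangular mask: `apply_causal_mask(weights, torch.tril(torch.ones(4, 4)).bool().view(1, 1, 4, 4))` keeps the input dtype of `weights`, and `match_value_dtype(weights, value)` returns weights in the dtype of `value`.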