Enable training Llama with model or pipeline parallelism (#22329)
* Llama - Move target tokens to final pipeline device if needed

* Update src/transformers/models/llama/modeling_llama.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/llama/modeling_llama.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

---------

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
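
For context, a minimal sketch of the kind of change this commit describes, assuming the standard causal-LM loss computation in `modeling_llama.py`: when the model is split across devices via model or pipeline parallelism, the target tokens may live on a different device than the logits produced by the final pipeline stage, so they are moved to the logits' device before the loss is computed. The helper function and names below are illustrative, not the exact diff.

```python
import torch
from torch.nn import CrossEntropyLoss

def causal_lm_loss(logits: torch.Tensor, labels: torch.Tensor, vocab_size: int) -> torch.Tensor:
    # Shift so that tokens < n predict token n
    shift_logits = logits[..., :-1, :].contiguous()
    shift_labels = labels[..., 1:].contiguous()
    # Flatten the tokens
    shift_logits = shift_logits.view(-1, vocab_size)
    shift_labels = shift_labels.view(-1)
    # Enable model/pipeline parallelism: the labels may still sit on the
    # input device while the logits come off the last pipeline stage, so
    # move them to the logits' device before computing the loss.
    shift_labels = shift_labels.to(shift_logits.device)
    loss_fct = CrossEntropyLoss()
    return loss_fct(shift_logits, shift_labels)
```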