HF <-> megatron checkpoint reshaping and conversion for GPT (#19317)
* HF <-> megatron checkpoint conversion handling reshaping from different tensor and parallel sizes
* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* addressing comments
* add doc strings and 🐛 fixes

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>