"docs/source/fr/index.md" did not exist on "269b05493917af2f7e86bafc735576a1a22caf4f"
Inconsistency in PreTrainedModel.resize_token_embeddings When ZeRO3 Is Enabled (#25394)
* Inconsistency in PreTrainedModel.resize_token_embeddings

  This PR addresses https://github.com/huggingface/transformers/issues/25241. In the previous implementation, when ZeRO stage 3 was enabled, resize_token_embeddings would create independent PyTorch weights on each device. Here we ensure that new embeddings are created with DeepSpeed init and are properly partitioned across devices.

* formatting with black

* adding the removed comments back in

---------

Co-authored-by: Sina Moeini <smoeini@amazon.com>
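The fix described above can be sketched as follows. This is a simplified, hypothetical illustration, not the actual Transformers implementation: the helper name `resize_token_embeddings_sketch` is invented, and the DeepSpeed path assumes `deepspeed.zero.Init` (which allocates parameters pre-partitioned across ranks) and `deepspeed.zero.GatheredParameters` (which temporarily gathers partitioned weights so they can be copied). When DeepSpeed is not installed or ZeRO-3 is not active, the sketch falls back to plain PyTorch allocation.

```python
import torch
import torch.nn as nn


def resize_token_embeddings_sketch(old_emb: nn.Embedding, new_num_tokens: int) -> nn.Embedding:
    """Create a resized embedding, partitioning the new weight under ZeRO-3.

    Hypothetical sketch: mirrors the idea in the PR (create new embeddings
    under DeepSpeed init instead of as independent per-device tensors).
    """
    try:
        import deepspeed
        from transformers.integrations import is_deepspeed_zero3_enabled
        zero3 = is_deepspeed_zero3_enabled()
    except ImportError:
        deepspeed = None
        zero3 = False

    old_num, dim = old_emb.weight.shape
    num_to_copy = min(old_num, new_num_tokens)

    if zero3:
        # Allocate the new weight inside deepspeed.zero.Init so it is
        # partitioned across ranks rather than fully replicated per device.
        with deepspeed.zero.Init():
            new_emb = nn.Embedding(new_num_tokens, dim)
        # Gather both (partitioned) weights before copying; they are
        # re-partitioned automatically when the context exits.
        params = [old_emb.weight, new_emb.weight]
        with deepspeed.zero.GatheredParameters(params, modifier_rank=0):
            new_emb.weight.data[:num_to_copy] = old_emb.weight.data[:num_to_copy]
    else:
        # Plain PyTorch path: allocate and copy the overlapping rows.
        new_emb = nn.Embedding(new_num_tokens, dim)
        new_emb.weight.data[:num_to_copy] = old_emb.weight.data[:num_to_copy]

    return new_emb
```

The key design point is that the `nn.Embedding` constructor runs inside the DeepSpeed init context, so the new weight is born partitioned and every rank sees a consistent shard, instead of each device materializing its own full, independent copy.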