Commit ea41e18c authored by Ivan Sorokin, committed by GitHub

improve from_pretrained for zero3 multi gpus mode (#24964)



* improve from_pretrained for zero3 multi gpus mode

* Add check if torch.distributed.is_initialized

* Revert torch.distributed

---------
Co-authored-by: Stas Bekman <stas@stason.org>
parent 95f96b45
@@ -457,7 +457,11 @@ def load_state_dict(checkpoint_file: Union[str, os.PathLike]):
             )
         return safe_load_file(checkpoint_file)
     try:
-        return torch.load(checkpoint_file, map_location="cpu")
+        if is_deepspeed_zero3_enabled() and torch.distributed.is_initialized() and torch.distributed.get_rank() > 0:
+            map_location = "meta"
+        else:
+            map_location = "cpu"
+        return torch.load(checkpoint_file, map_location=map_location)
     except Exception as e:
         try:
             with open(checkpoint_file) as f:
......
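
The device selection introduced in this hunk can be isolated as a pure function, which makes the three-way condition easier to read and test. This is a minimal sketch, not the library's code: `choose_map_location` and its parameters are hypothetical names standing in for the results of `is_deepspeed_zero3_enabled()`, `torch.distributed.is_initialized()`, and `torch.distributed.get_rank()`. The presumed motivation (not stated in the commit) is that ranks other than 0 do not need the tensor data in CPU memory.

```python
def choose_map_location(zero3_enabled: bool, dist_initialized: bool, rank: int) -> str:
    """Return the map_location string that the patched code would pass to torch.load.

    Hypothetical helper: the arguments mirror the three checks in the diff.
    """
    # Only non-zero ranks in an initialized ZeRO-3 run load onto the "meta" device.
    if zero3_enabled and dist_initialized and rank > 0:
        return "meta"
    # Rank 0, and every process outside ZeRO-3, keeps the previous "cpu" behavior.
    return "cpu"


if __name__ == "__main__":
    for rank in range(3):
        print(rank, choose_map_location(True, True, rank))
```

Note that in the actual patch the rank is only queried after `torch.distributed.is_initialized()` returns true, thanks to short-circuit evaluation; the sketch takes the rank as a plain argument instead.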