"git@developer.sourcefind.cn:OpenDAS/ktransformers.git" did not exist on "193d6300bfa40be659155a50f4f6e226ef6a9553"
Commit 4e4a865c authored by Yanghan Wang, committed by Facebook GitHub Bot

support specifying concurrency level for interleave

Summary:
Pull Request resolved: https://github.com/facebookresearch/d2go/pull/481

X-link: https://github.com/facebookresearch/mobile-vision/pull/139

Also support specifying the concurrency level (how many ranks may run at the same time) when interleaving.

Reviewed By: mattcyu1

Differential Revision: D43522445

fbshipit-source-id: 790a8527c6b42c9098ef82c4fc01ec1a528e2418
parent 34a5a3e8
@@ -149,13 +149,15 @@ class FSDPCheckpointer(QATCheckpointer):
         self.tag_last_checkpoint(basename)
     def _save_file(self, data, filename):
-        with interleave_by_rank():
+        # allow 8 GPUs to write to manifold at the same time
+        with interleave_by_rank(concurrency_limit=8):
             self.logger.info("Saving checkpoint to {}".format(filename))
             with self.path_manager.open(filename, "wb") as f:
                 torch.save(data, cast(IO[bytes], f))
     def _load_file(self, f: str):
-        with interleave_by_rank():
+        # allow 8 GPUs to read from manifold at the same time
+        with interleave_by_rank(concurrency_limit=8):
             return super()._load_file(f)
......
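For readers unfamiliar with the pattern, the following is a minimal sketch of what a concurrency-limited interleave might look like. It is not the mobile-vision implementation, which this commit does not show. Here threads stand in for distributed ranks, a `threading.Barrier` stands in for a collective barrier, and the names `interleave_by_rank_sketch` and `simulate` are hypothetical. The assumed behavior is that ranks enter the guarded block in waves of at most `concurrency_limit`, with each wave starting only after the previous one has finished.

```python
import contextlib
import threading
import time


@contextlib.contextmanager
def interleave_by_rank_sketch(rank, world_size, barrier, concurrency_limit=1):
    """Hypothetical stand-in: let ranks run the body in waves of `concurrency_limit`."""
    my_wave = rank // concurrency_limit
    num_waves = (world_size + concurrency_limit - 1) // concurrency_limit
    # Wait once per earlier wave; barrier k releases only after wave k-1 is done.
    for _ in range(my_wave):
        barrier.wait()
    try:
        yield
    finally:
        # Report completion, then keep participating in the barriers that
        # gate the later waves so every rank waits num_waves times in total.
        for _ in range(num_waves - my_wave):
            barrier.wait()


def simulate(world_size=8, concurrency_limit=3):
    """Run one thread per simulated rank and record peak concurrency."""
    barrier = threading.Barrier(world_size)
    lock = threading.Lock()
    state = {"active": 0, "peak": 0, "entry_waves": []}

    def worker(rank):
        with interleave_by_rank_sketch(rank, world_size, barrier, concurrency_limit):
            with lock:
                state["active"] += 1
                state["peak"] = max(state["peak"], state["active"])
                state["entry_waves"].append(rank // concurrency_limit)
            time.sleep(0.01)  # stand-in for the checkpoint read or write
            with lock:
                state["active"] -= 1

    threads = [threading.Thread(target=worker, args=(r,)) for r in range(world_size)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state


if __name__ == "__main__":
    result = simulate(world_size=8, concurrency_limit=3)
    print("peak concurrent ranks:", result["peak"])
```

With a limit of 1 this degenerates to strictly one rank at a time, while a larger limit trades storage-side load for shorter total save and load time, which appears to be the motivation for passing `concurrency_limit=8` in the diff above.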