Commit 6ef7186b, authored by Sadra, committed by GitHub

Fix crash when deleting older checkpoints while a file matching f"{checkpoint_prefix}-*" exists (#16686)

I create archives of older checkpoints during training; each archive has a name of the form `f"{checkpoint_prefix}-*.zip"` or `f"{checkpoint_prefix}-*.tar"`.
Previously, `glob(f"{checkpoint_prefix}-*")` matched every file and folder whose name starts with the checkpoint prefix, while the later `shutil.rmtree(checkpoint)` call expects a folder. When it is eventually handed a zip file, it raises and training crashes. Adding `if os.path.isdir(x)` restricts `glob_checkpoints` to folders.
parent b0bf3011
@@ -2200,7 +2200,7 @@ class Trainer:
     ) -> List[str]:
         ordering_and_checkpoint_path = []
-        glob_checkpoints = [str(x) for x in Path(output_dir).glob(f"{checkpoint_prefix}-*")]
+        glob_checkpoints = [str(x) for x in Path(output_dir).glob(f"{checkpoint_prefix}-*") if os.path.isdir(x)]
         for path in glob_checkpoints:
             if use_mtime:
...
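To illustrate the effect of the filter outside of `Trainer`, here is a minimal standalone sketch. The helper names `list_checkpoint_dirs` and `demo`, the default prefix `"checkpoint"`, and the directory and archive names are assumptions made up for this example; only the glob-plus-`os.path.isdir` expression mirrors the change above.

```python
import os
import shutil
import tempfile
from pathlib import Path


def list_checkpoint_dirs(output_dir, checkpoint_prefix="checkpoint"):
    """Hypothetical helper: return only the directories matching the prefix."""
    return sorted(
        str(x)
        for x in Path(output_dir).glob(f"{checkpoint_prefix}-*")
        if os.path.isdir(x)  # skip archives such as checkpoint-500.zip
    )


def demo():
    """Build a scratch output dir with two checkpoint folders and one archive."""
    with tempfile.TemporaryDirectory() as output_dir:
        os.makedirs(os.path.join(output_dir, "checkpoint-500"))
        os.makedirs(os.path.join(output_dir, "checkpoint-1000"))
        # Stand-in for a user-created archive of an older checkpoint.
        Path(output_dir, "checkpoint-500.zip").write_bytes(b"")

        found = list_checkpoint_dirs(output_dir)
        for checkpoint in found:
            # Safe: every entry is a directory, so rmtree is never given a file.
            shutil.rmtree(checkpoint)

        remaining = sorted(os.listdir(output_dir))
        return [os.path.basename(p) for p in found], remaining


if __name__ == "__main__":
    print(demo())
```

In this sketch the archive is never passed to `shutil.rmtree`, so it is left in place after the checkpoint folders are deleted.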