Fixed crash when deleting older checkpoints while a file matching f"{checkpoint_prefix}-*" exists (#16686)
I create an archive of older checkpoints during training; the archive has a name of the form `f"{checkpoint_prefix}-*.zip"` or `f"{checkpoint_prefix}-*.tar"`.
Previously, `glob(f"{checkpoint_prefix}-*")` matched every file and folder whose name starts with the checkpoint prefix, and `shutil.rmtree(checkpoint)` was later called on each match, although it expects a folder name. As soon as one of the matches is a zip file, that call fails and training crashes. Adding the `if os.path.isdir(x)` check keeps only folders in `glob_checkpoints`.
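The idea can be illustrated with a minimal, standalone sketch. It is not the code from the patched file: the function names `list_checkpoint_dirs` and `delete_checkpoints`, the `output_dir` parameter, and the default prefix `"checkpoint"` are assumptions made for illustration only.

```python
import glob
import os
import shutil


def list_checkpoint_dirs(output_dir, checkpoint_prefix="checkpoint"):
    """Return checkpoint directories only (hypothetical helper).

    Archives such as checkpoint-500.zip also match the glob pattern,
    so the os.path.isdir filter is what excludes them.
    """
    pattern = os.path.join(output_dir, f"{checkpoint_prefix}-*")
    return sorted(x for x in glob.glob(pattern) if os.path.isdir(x))


def delete_checkpoints(checkpoints):
    """Remove each checkpoint directory (hypothetical helper)."""
    for checkpoint in checkpoints:
        # shutil.rmtree expects a directory; passing a regular file raises an error.
        shutil.rmtree(checkpoint)
```

With this filter, an archive sitting next to the checkpoint folders is left untouched, and only real checkpoint directories reach `shutil.rmtree`.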