  • Enhance timeout cleanup to avoid possible hanging (#405) · 8afaa376
    Yifan Xiong authored
    
    __Major Revisions__
    * Skip postprocessing (mainly the `torch.distributed` barrier and destroy calls) when an exception occurs (e.g., a timeout or a GPU crash), so that subprocesses do not hang.
    * Add cleanup to kill `sb exec` processes when the Ansible run fails for a given benchmark.
    
    __Minor Revisions__
    * Reduce the extra Ansible timeout from 300s to 60s.
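
The first major revision (skipping postprocessing after a failure) can be illustrated with a minimal sketch. This is not the code from the commit: `run_with_guarded_postprocess` and its three callables are hypothetical names chosen for illustration, and the postprocess callable only stands in for collective calls such as a barrier.

```python
from typing import Callable


def run_with_guarded_postprocess(
    preprocess: Callable[[], None],
    benchmark: Callable[[], None],
    postprocess: Callable[[], None],
) -> bool:
    """Run the three phases; skip postprocess if an earlier phase raised.

    Hypothetical helper, not part of SuperBench.
    Returns True on success and False when an exception was caught.
    """
    try:
        preprocess()
        benchmark()
    except Exception as exc:  # e.g. a TimeoutError or a GPU runtime error
        # A collective call such as a barrier could wait forever for peers
        # that have already died, so postprocess is deliberately not run.
        print(f"benchmark failed, skipping postprocess: {exc!r}")
        return False
    # Only reached when every earlier phase completed normally.
    postprocess()
    return True


def timed_out_benchmark() -> None:
    """Stand-in for a benchmark that hits its timeout."""
    raise TimeoutError("benchmark timed out")


calls = []
ok = run_with_guarded_postprocess(
    lambda: calls.append("pre"),
    timed_out_benchmark,
    lambda: calls.append("post"),
)
```

In this sketch the failing run records only the preprocess step, so the stand-in for the barrier is never reached. A real runner would probably also log the error and return a non-zero exit code.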
runner.py 19.4 KB