    Benchmark: Update overlap and sharding matmul benchmarks (#19) · a961ebd4
    one authored
    - Enable `computation-communication-overlap` and `sharding-matmul` in
    some configs through the existing PyTorch distributed mode.
    - Use `torchrun --standalone` for single-node `torch.distributed` runs
    to avoid conflicts on the fixed rendezvous port 29500.
    - Update the runner command-generation test expectation to match the
    new single-node torchrun behavior.
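The single-node change described above could be sketched roughly as follows. This is a minimal illustration, not the repository's actual runner code: the helper name `build_torchrun_command` and its parameters are hypothetical, and only the `torchrun` flags themselves (`--standalone`, `--nnodes`, `--nproc_per_node`, `--node_rank`, `--master_addr`, `--master_port`) are real CLI options.

```python
def build_torchrun_command(nproc_per_node, node_num=1, node_rank=0,
                           master_addr="localhost", master_port=29500):
    """Hypothetical helper that assembles a torchrun launch prefix."""
    cmd = ["torchrun"]
    if node_num == 1:
        # Single node: --standalone lets torchrun set up its own local
        # rendezvous, so no explicit master address or port is passed.
        cmd += ["--standalone", "--nnodes=1"]
    else:
        # Multi-node: keep the explicit rendezvous settings.
        cmd += [
            f"--nnodes={node_num}",
            f"--node_rank={node_rank}",
            f"--master_addr={master_addr}",
            f"--master_port={master_port}",
        ]
    cmd.append(f"--nproc_per_node={nproc_per_node}")
    return " ".join(cmd)


print(build_torchrun_command(8))
print(build_torchrun_command(8, node_num=2))
```

A command-generation test would then presumably compare the single-node string against an expectation containing `--standalone` rather than a fixed `--master_port`.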
test_runner.py 27.6 KB