    [feat] OSS/SDP : bucketing (#122) · 341d8b2b
    Benjamin Lefaudeux authored
    Use the same bucketing strategy for OSS and SDP:
    sort everything ahead of time, per rank and per size, smaller tensors first. Pack the smallest tensors into a fixed-size buffer (the bucket) and send it asynchronously, then send all the remaining tensors asynchronously, and come back to the bucket. Once it is done, scatter its contents if needed.
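
The ordering and packing step described in the commit message can be sketched in plain Python. This is only an illustration under stated assumptions, not the code in oss.py: the function name `plan_buckets`, the `bucket_capacity` parameter, and the use of integer element counts in place of real tensors are all hypothetical, and no actual communication takes place.

```python
from typing import Dict, List, Tuple

# Hypothetical helper: names and data layout are assumptions, not the oss.py API.
def plan_buckets(
    sizes_per_rank: Dict[int, List[int]],
    bucket_capacity: int,
) -> Dict[int, Tuple[List[Tuple[int, int]], List[int]]]:
    """For each rank, decide which tensors go in the fixed bucket.

    Returns, per rank, a pair:
      - bucketed: (offset, size) slots inside the bucket, smallest tensors first
      - direct:   sizes of the tensors that are sent individually
    """
    plan = {}
    for rank, sizes in sizes_per_rank.items():
        ordered = sorted(sizes)  # smaller tensors first
        bucketed: List[Tuple[int, int]] = []
        direct: List[int] = []
        used = 0
        for size in ordered:
            if used + size <= bucket_capacity:
                # The offset is kept so the contents can be scattered back later.
                bucketed.append((used, size))
                used += size
            else:
                # Sizes are ascending, so every later tensor is also too large.
                direct.append(size)
        plan[rank] = (bucketed, direct)
    return plan


example = plan_buckets({0: [512, 8, 64, 2048], 1: [16, 4096]}, bucket_capacity=600)
```

With these inputs, rank 0 packs the tensors of sizes 8, 64 and 512 into the bucket and sends the 2048-element tensor on its own. In a real implementation the bucket and each remaining tensor would be sent with asynchronous collectives, and the stored offsets would be used to copy the bucket contents back into the individual tensors once the transfer completes.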
oss.py 8.38 KB