    [feat] ShardedDataParallel with autoreduce (#157) · ad933b34
    Benjamin Lefaudeux authored
    * rewrite using autograd and Variable execution queue to make the reduce automatic
    * share buckets with OSS to remove duplication
    * some speedup is likely still on the table, since the measured speed vs. bucketing does not match expectations; this could be a follow-up
config.yml 8.53 KB