    [feat] OSS flatten state dict (#65) · 4f597233
    Benjamin Lefaudeux authored
    Changes the structure of the returned state dict so that the param_groups are un-sharded, which brings it closer to what a vanilla optimizer would return. The param_groups are sharded again when the state dict is loaded.
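The un-shard on save, re-shard on load round trip described in the commit message can be illustrated with a minimal sketch. This is not the code from oss.py: the function names `flatten_state_dict` and `shard_state_dict`, the `partition` key, and the dict layout are all hypothetical, chosen only to show the idea of merging per-rank `param_groups` into one flat list and recovering each rank's slice later.

```python
from typing import Any, Dict, List


def flatten_state_dict(shards: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Merge per-rank state dicts so param_groups look like a vanilla optimizer's.

    Hypothetical layout: each shard is {"state": ..., "param_groups": [...]}.
    """
    flat: Dict[str, Any] = {"state": [], "param_groups": [], "partition": []}
    for shard in shards:
        start = len(flat["param_groups"])
        # Un-shard: append this rank's groups to the single flat list.
        flat["param_groups"].extend(shard["param_groups"])
        flat["state"].append(shard["state"])
        # Remember which slice of the flat list belongs to this rank.
        flat["partition"].append((start, len(flat["param_groups"])))
    return flat


def shard_state_dict(flat: Dict[str, Any], rank: int) -> Dict[str, Any]:
    """Recover the state dict for one rank from the flattened dict."""
    start, end = flat["partition"][rank]
    return {
        "state": flat["state"][rank],
        "param_groups": flat["param_groups"][start:end],
    }


# Two illustrative shards, as two ranks might hold them (made-up values).
shards = [
    {"state": {"step": 3}, "param_groups": [{"lr": 0.1}, {"lr": 0.01}]},
    {"state": {"step": 3}, "param_groups": [{"lr": 0.001}]},
]
flat = flatten_state_dict(shards)
restored = [shard_state_dict(flat, rank) for rank in range(len(shards))]
```

Recording the slice boundaries at save time is one simple way to make loading deterministic; an actual implementation may instead recompute the partition from the parameters themselves.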
oss.py 5.33 KB