Deepak Narayanan authored
Refactor code according to Jared's comments: move pipelining and non-pipelining training loops into separate methods. Also, use mpu.get_*_model_parallel_size() instead of args.*_model_parallel_size.
1979c242