Refactor code according to Jared's comments: move the pipelining and non-pipelining training loops into separate methods. Also, use mpu.get_*_model_parallel_size() instead of args.*_model_parallel_size.
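The shape of this refactor can be illustrated with a minimal, self-contained sketch. Everything in it is hypothetical: `FakeMPU`, `forward_backward_no_pipelining`, `forward_backward_pipelining`, and `train_step` are illustrative names, not the actual functions touched by this commit, and the "pipelined" schedule is deliberately simplified (all forwards, then all backwards in reverse order). It only shows the two loops living in separate functions, with the dispatch reading the parallel size from an `mpu`-style getter rather than from `args`.

```python
# Hypothetical sketch; none of these names come from the real code base.


class FakeMPU:
    """Stand-in for an mpu-like module that reports the initialized parallel sizes."""

    def __init__(self, tensor_size, pipeline_size):
        self._tensor_size = tensor_size
        self._pipeline_size = pipeline_size

    def get_tensor_model_parallel_size(self):
        return self._tensor_size

    def get_pipeline_model_parallel_size(self):
        return self._pipeline_size


def forward_backward_no_pipelining(microbatches):
    """Non-pipelined loop: each microbatch runs forward then backward in turn."""
    return [("fwd+bwd", mb) for mb in microbatches]


def forward_backward_pipelining(microbatches):
    """Simplified pipelined loop: all forwards first, then backwards in reverse."""
    forwards = [("fwd", mb) for mb in microbatches]
    backwards = [("bwd", mb) for mb in reversed(microbatches)]
    return forwards + backwards


def train_step(mpu, microbatches):
    """Pick the loop from the getter on mpu, not from a value stored on args."""
    if mpu.get_pipeline_model_parallel_size() > 1:
        return forward_backward_pipelining(microbatches)
    return forward_backward_no_pipelining(microbatches)
```

One plausible reason for preferring the getter (an assumption, not stated in the commit) is that it reflects the state the parallel groups were actually initialized with, whereas a value on `args` only records what was requested.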