Interleaved pipeline execution and code refactoring
- Split a model's computation into multiple virtual stages as needed, and schedule communication correctly between these virtual stages
- Move schedule code into `schedules.py` and communication code into `p2p_communication.py`
- Use hyphens instead of spaces in all time logging for consistency
- Factor out code in `megatron/training.py` into helper functions
- Refactor `evaluate()` function: make it use forward_backward_schedule functions
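
As a rough illustration of the first point, the sketch below shows one way layers could be assigned to virtual stages when each pipeline rank holds several model chunks. It is not the code from this commit: the function name `assign_layers_to_virtual_stages`, its parameters, and the round-robin ordering (virtual stage index = chunk * pipeline_size + rank) are assumptions made for illustration only.

```python
def assign_layers_to_virtual_stages(num_layers, pipeline_size, virtual_size):
    """Map (pipeline rank, model chunk) -> list of layer indices.

    Hypothetical helper, not part of the commit. Assumes layers are split
    evenly and that virtual stages are handed out round-robin across ranks,
    so each rank owns `virtual_size` non-contiguous chunks of the model.
    """
    total_virtual_stages = pipeline_size * virtual_size
    if num_layers % total_virtual_stages != 0:
        raise ValueError(
            "num_layers must be divisible by pipeline_size * virtual_size"
        )
    layers_per_stage = num_layers // total_virtual_stages

    assignment = {}
    for rank in range(pipeline_size):
        for chunk in range(virtual_size):
            # Position of this chunk in the overall sequence of virtual stages.
            virtual_stage = chunk * pipeline_size + rank
            start = virtual_stage * layers_per_stage
            assignment[(rank, chunk)] = list(range(start, start + layers_per_stage))
    return assignment


if __name__ == "__main__":
    # 8 layers, 2 pipeline ranks, 2 model chunks per rank.
    for key, layers in sorted(assign_layers_to_virtual_stages(8, 2, 2).items()):
        print(key, layers)
```

Under these assumptions, activations leaving the last rank's chunk wrap around to the first rank's next chunk, which is why communication has to be scheduled between virtual stages and not only between neighboring ranks.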
megatron/schedules.py
0 → 100644