    [pipeline/rpc] implement distributed optimizer | test with assert_close (#1486) · 9145aef2
    Kirigaya Kazuto authored
    * support p2p communication with any type of object | pass test
    
    * reconstruct pipeline schedule with p2p_v2.py (supports communication with List[Any]) | pass test
    
    * [engine/schedule] use p2p_v2 to reconstruct pipeline_schedule
    
    * [pipeline/rpc] implement a demo for PP with cuda rpc framework
    
    * [pipeline/rpc] support interleaving | fix checkpoint bug | change the logic for dispatching data in work_list to ensure steady 1F1B
    
    * [pipeline/rpc] implement distributed optimizer | test with assert_close
    
test_cuda_rpc_pipeline.py 1.46 KB