  1. 12 Dec, 2022 1 commit
  2. 08 Dec, 2022 1 commit
  3. 29 Nov, 2022 1 commit
  4. 18 Oct, 2022 1 commit
    • [fx/meta/rpc] move _meta_registration.py to fx folder / register fx functions with compatibility checks / remove color debug (#1710) · 393f5940
      Super Daniel authored
      
      * [fx] move meta registration
      
      * [fx] fix tests.
      
      * [fx] fix test.
      
      * [fx] fix.
      
      * [meta] refactor meta registration.py.
      
      * [fx] add compatibility descriptions.
      
      * [fx] polish import.
      
      * [fx] add a decorator.
      
      * [fx] fix tests.
      
      * [fx] remove print.
      
      * [fx] edit raise error.
      
      * [fx] edit raise error.
      
      * [fx] add type hint.
      
      * [fx] fix import in experimental.
      
      * [rpc] remove color debug.
      
      * [meta] fix naming.
  5. 20 Sep, 2022 1 commit
    • [pipeline/chimera] test chimera | fix bug of initializing (#1615) · 170fa810
      Kirigaya Kazuto authored
      * [pipeline/tuning] improve dispatch performance in both time and space cost
      
      * [pipeline/converge] add interface for testing convergence
      
      * [NFC] polish colossalai/utils/multi_tensor_apply/multi_tensor_apply.py code style
      
      * Update PipelineBase.py
      
      * [pipeline/chimera] reconstruct PipelineBase and Worker to support more feasible custom schedule | finish Chimera
      
      * [pipeline/chimera] test chimera | fix bug of initializing
  6. 07 Sep, 2022 1 commit
  7. 01 Sep, 2022 1 commit
    • [pipeline/pipleline_process_group] finish PipelineProcessGroup to manage local and global rank in TP, DP and PP (#1508) · f1e18362
      Kirigaya Kazuto authored
      
      * support p2p communication with any type of object | pass test
      
      * reconstruct pipeline schedule with p2p_v2.py (support communication with List[Any]) | pass test

      * [engin/schedule] use p2p_v2 to reconstruct pipeline_schedule
      
      * [pipeline/rpc] implement a demo for PP with cuda rpc framework
      
      * [pipeline/rpc] support interleaving | fix checkpoint bug | change logic when dispatch data in work_list to ensure steady 1F1B
      
      * [pipeline/rpc] implement distributed optimizer | test with assert_close
      
      * [pipeline/rpc] implement distributed optimizer | test with assert_close
      
      * [pipeline/rpc] update outstanding mechanism | optimize dispatching strategy
      
      * [pipeline/rpc] update outstanding mechanism | optimize dispatching strategy
      
      * [pipeline/rpc] update outstanding mechanism | optimize dispatching strategy
      
      * [pipeline/pipleline_process_group] finish PipelineProcessGroup to manage local and global rank in TP, DP and PP
      
      * [pipeline/pipleline_process_group] remove comment
      
      * [pipeline/pipleline_process_group] remove comment
      
      * [pipeline/pipleline_process_group] skip process group test
      
      * [pipeline/pipleline_process_group] remove test named function
  8. 26 Aug, 2022 1 commit
    • [pipeline/rpc] update outstanding mechanism | optimize dispatching strategy (#1497) · 5a6fd71f
      Kirigaya Kazuto authored
      * support p2p communication with any type of object | pass test
      
      * reconstruct pipeline schedule with p2p_v2.py (support communication with List[Any]) | pass test

      * [engin/schedule] use p2p_v2 to reconstruct pipeline_schedule
      
      * [pipeline/rpc] implement a demo for PP with cuda rpc framework
      
      * [pipeline/rpc] support interleaving | fix checkpoint bug | change logic when dispatch data in work_list to ensure steady 1F1B
      
      * [pipeline/rpc] implement distributed optimizer | test with assert_close
      
      * [pipeline/rpc] implement distributed optimizer | test with assert_close
      
      * [pipeline/rpc] update outstanding mechanism | optimize dispatching strategy
      
      * [pipeline/rpc] update outstanding mechanism | optimize dispatching strategy
      
      * [pipeline/rpc] update outstanding mechanism | optimize dispatching strategy
  9. 25 Aug, 2022 1 commit
    • [pipeline/rpc] implement distributed optimizer | test with assert_close (#1486) · 9145aef2
      Kirigaya Kazuto authored
      * support p2p communication with any type of object | pass test
      
      * reconstruct pipeline schedule with p2p_v2.py (support communication with List[Any]) | pass test

      * [engin/schedule] use p2p_v2 to reconstruct pipeline_schedule
      
      * [pipeline/rpc] implement a demo for PP with cuda rpc framework
      
      * [pipeline/rpc] support interleaving | fix checkpoint bug | change logic when dispatch data in work_list to ensure steady 1F1B
      
      * [pipeline/rpc] implement distributed optimizer | test with assert_close
      
      * [pipeline/rpc] implement distributed optimizer | test with assert_close
  10. 24 Aug, 2022 1 commit
    • [pipeline/rpc] support interleaving | fix checkpoint bug | change logic when dispatch data in work_list to ensure steady 1F1B (#1483) · a6c87491
      Kirigaya Kazuto authored
      
      * support p2p communication with any type of object | pass test
      
      * reconstruct pipeline schedule with p2p_v2.py (support communication with List[Any]) | pass test

      * [engin/schedule] use p2p_v2 to reconstruct pipeline_schedule
      
      * [pipeline/rpc] implement a demo for PP with cuda rpc framework
      
      * [pipeline/rpc] support interleaving | fix checkpoint bug | change logic when dispatch data in work_list to ensure steady 1F1B
  11. 22 Aug, 2022 1 commit
    • [pipeline/rpc] implement a demo for PP with cuda rpc framework (#1470) · bb5f5289
      Kirigaya Kazuto authored
      * support p2p communication with any type of object | pass test
      
      * reconstruct pipeline schedule with p2p_v2.py (support communication with List[Any]) | pass test

      * [engin/schedule] use p2p_v2 to reconstruct pipeline_schedule
      
      * [pipeline/rpc] implement a demo for PP with cuda rpc framework
      
      * Delete p2p_v2.py
      
      * Delete _pipeline_schedule_v2.py
      
      * Delete test_object_list_p2p_v2.py
      
      * Delete test_boardcast_send_recv_v2.py
      
      * Delete test_cifar_with_data_pipeline_tensor_v2.py