  1. 13 Oct, 2023 1 commit
  2. 12 Oct, 2023 4 commits
  3. 11 Oct, 2023 4 commits
    • ffd9a3cb
      littsk authored
    • 1dcaf249
      ppt0011 authored
    • fix test llama (#4884) · fdec650b
      Xu Kai authored
    • [Pipeline Inference] Sync pipeline inference branch to main (#4820) · 08a9f76b
      Bin Jia authored
      * [pipeline inference] pipeline inference (#4492)
      
      * add pp stage manager as circle stage
      
      * fix a bug when create process group
      
      * add ppinfer basic framework
      
      * add micro batch manager and support kvcache-pp gpt2 fwd
      
      * add generate schedule
      
      * use mb size to control mb number
      
      * support generate with kv cache
      
      * add output, remove unused code
      
      * add test
      
      * reuse shardformer to build model
      
      * refactor some code and use the same attribute name of hf
      
      * fix review and add test for generation
      
      * remove unused file
      
      * fix CI
      
      * add cache clear
      
      * fix code error
      
      * fix typo
      
      * [Pipeline inference] Modify to tieweight (#4599)
      
      * add pp stage manager as circle stage
      
      * fix a bug when create process group
      
      * add ppinfer basic framework
      
      * add micro batch manager and support kvcache-pp gpt2 fwd
      
      * add generate schedule
      
      * use mb size to control mb number
      
      * support generate with kv cache
      
      * add output, remove unused code
      
      * add test
      
      * reuse shardformer to build model
      
      * refactor some code and use the same attribute name of hf
      
      * fix review and add test for generation
      
      * remove unused file
      
      * modify the way of saving newtokens
      
      * modify to tieweight
      
      * modify test
      
      * remove unused file
      
      * solve review
      
      * add docstring
      
      * [Pipeline inference] support llama pipeline inference (#4647)
      
      * support llama pipeline inference
      
      * remove tie weight operation
      
      * [pipeline inference] Fix the blocking of communication when ppsize is 2 (#4708)
      
      * add benchmark verbose
      
      * fix export tokens
      
      * fix benchmark verbose
      
      * add P2POp style to do p2p communication
      
      * modify schedule as p2p type when ppsize is 2
      
      * remove unused code and add docstring
      
      * [Pipeline inference] Refactor code, add docsting, fix bug (#4790)
      
      * add benchmark script
      
      * update argparse
      
      * fix fp16 load
      
      * refactor code style
      
      * add docstring
      
      * polish code
      
      * fix test bug
      
      * [Pipeline inference] Add pipeline inference docs (#4817)
      
      * add readme doc
      
      * add a ico
      
      * Add performance
      
      * update table of contents
      
      * refactor code (#4873)
  4. 10 Oct, 2023 5 commits
  5. 07 Oct, 2023 5 commits
  6. 06 Oct, 2023 2 commits
  7. 05 Oct, 2023 2 commits
  8. 04 Oct, 2023 2 commits
  9. 02 Oct, 2023 2 commits
    • [Infer] Serving example w/ ray-serve (multiple GPU case) (#4841) · 573f2705
      Yuanheng Zhao authored
      * fix imports
      
      * add ray-serve with Colossal-Infer tp
      
      * trivial: send requests script
      
      * add README
      
      * fix worker port
      
      * fix readme
      
      * use app builder and autoscaling
      
      * trivial: input args
      
      * clean code; revise readme
      
      * testci (skip example test)
      
      * use auto model/tokenizer
      
      * revert imports fix (fixed in other PRs)
    • [Infer] Colossal-Inference serving example w/ TorchServe (single GPU case) (#4771) · 3a74eb4b
      Yuanheng Zhao authored
      * add Colossal-Inference serving example w/ TorchServe
      
      * add dockerfile
      
      * fix dockerfile
      
      * fix dockerfile: fix commit hash, install curl
      
      * refactor file structure
      
      * revise readme
      
      * trivial
      
      * trivial: dockerfile format
      
      * clean dir; revise readme
      
      * fix comments: fix imports and configs
      
      * fix formats
      
      * remove unused requirements
  10. 28 Sep, 2023 2 commits
  11. 27 Sep, 2023 8 commits
  12. 26 Sep, 2023 3 commits