  1. 18 Oct, 2023 1 commit
  2. 11 Oct, 2023 1 commit
      [Pipeline Inference] Sync pipeline inference branch to main (#4820) · 08a9f76b
      Bin Jia authored
      * [pipeline inference] pipeline inference (#4492)
      
      * add pp stage manager as circle stage
      
      * fix a bug when create process group
      
      * add ppinfer basic framework
      
      * add micro batch manager and support kvcache-pp gpt2 fwd
      
      * add generate schedule
      
      * use mb size to control mb number
      
      * support generate with kv cache
      
      * add output, remove unused code
      
      * add test
      
      * reuse shardformer to build model
      
      * refactor some code and use the same attribute name of hf
      
      * fix review and add test for generation
      
      * remove unused file
      
      * fix CI
      
      * add cache clear
      
      * fix code error
      
      * fix typo
      
      * [Pipeline inference] Modify to tieweight (#4599)
      
      * add pp stage manager as circle stage
      
      * fix a bug when create process group
      
      * add ppinfer basic framework
      
      * add micro batch manager and support kvcache-pp gpt2 fwd
      
      * add generate schedule
      
      * use mb size to control mb number
      
      * support generate with kv cache
      
      * add output, remove unused code
      
      * add test
      
      * reuse shardformer to build model
      
      * refactor some code and use the same attribute name of hf
      
      * fix review and add test for generation
      
      * remove unused file
      
      * modify the way of saving newtokens
      
      * modify to tieweight
      
      * modify test
      
      * remove unused file
      
      * solve review
      
      * add docstring
      
      * [Pipeline inference] support llama pipeline inference (#4647)
      
      * support llama pipeline inference
      
      * remove tie weight operation
      
      * [pipeline inference] Fix the blocking of communication when ppsize is 2 (#4708)
      
      * add benchmark verbose
      
      * fix export tokens
      
      * fix benchmark verbose
      
      * add P2POp style to do p2p communication
      
      * modify schedule as p2p type when ppsize is 2
      
      * remove unused code and add docstring
      
      * [Pipeline inference] Refactor code, add docstring, fix bug (#4790)
      
      * add benchmark script
      
      * update argparse
      
      * fix fp16 load
      
      * refactor code style
      
      * add docstring
      
      * polish code
      
      * fix test bug
      
      * [Pipeline inference] Add pipeline inference docs (#4817)
      
      * add readme doc
      
      * add a ico
      
      * Add performance
      
      * update table of contents
      
      * refactor code (#4873)