1. 02 Oct, 2023 1 commit
    • [Infer] Serving example w/ ray-serve (multiple GPU case) (#4841) · 573f2705
      Yuanheng Zhao authored
      * fix imports
      
      * add ray-serve with Colossal-Infer tp
      
      * trivial: send requests script
      
      * add README
      
      * fix worker port
      
      * fix readme
      
      * use app builder and autoscaling
      
      * trivial: input args
      
      * clean code; revise readme
      
      * testci (skip example test)
      
      * use auto model/tokenizer
      
      * revert imports fix (fixed in other PRs)