  1. 06 Dec, 2023 1 commit
    • Report the inference benchmark of models with different size (#794) · ebe90bc9
      Lyu Han authored
      * update test scripts for models with different sizes
      
      * update
      
      * only test after tuning gemm
      
      * chmod +x
      
      * fix typo
      
      * benchmark on a100
      
      * fix typo
      
      * fix typo
      
      * per-token latency percentile in profile_throughput
      
      * fix
      
      * fix
      
      * rename
      
      * make the script accept parameters
      
      * minor fix
      
      * indent
      
      * reformat table
      
      * change to 3000
      
      * minor fix
  2. 04 Dec, 2023 1 commit
  3. 29 Nov, 2023 1 commit
    • Update benchmark user guide (#763) · d3e2cee4
      Lyu Han authored
      * user guide of benchmark generation
      
      * update benchmark generation guide
      
      * update profiling throughput guide
      
      * update profiling api_server guide
      
      * rename file names
      
      * update profile tis user guide
      
      * update
      
      * fix according to review comments
      
      * update
      
      * update according to review comments
      
      * update
      
      * add an example
      
      * update