    [Pipeline inference] Combine kvcache with pipeline inference (#4938) · 1db67276
    Bin Jia authored
    * merge kvcache with pipeline inference and refactor the code structure
    
    * support ppsize > 2
    
    * refactor pipeline code
    
    * do pre-commit
    
    * modify benchmark
    
    * fix benchmark
    
    * polish code
    
    * add docstring and update readme
    
    * refactor the code
    
    * fix some logic bugs in ppinfer
    
    * polish readme
    
    * fix typo
    
    * skip infer test
benchmark.py 6.27 KB