    Support Quantization (#996) · eb89439d
    qinyiqun authored
    
    
    demo131 - multiple issues regarding quantization, qy, and so forth
    
    * issue/843: success per_channel_quant_int8
    
    * issue/843: success qy quant
    
    * issue/843: modified quant
    
    * Add w8a8int8 performance tests
    
    * add infinicore op linear_w8a8i8
    
    * w8a8 linear module functional nn
    
    * issue/843: QY-GPU Support Int8 scale_mm (#68)
    
    * issue/843: success qy scaled_mm
    
    * issue/843: modified kernel.cuh as per_channel_dequant_int8.cuh
    
    * fix parallel slice in w8
    
    * w8: support multiple batch size
    
    * temp: modify quantconfig handling
    
    * fix format and delete redundant code
    
    * fix format
    
    * fix format
    
    * fix format
    
    * Refactor: add new API alongside legacy interfaces with deprecation warnings
    
    * Add w4 InfiniCore-related content and move Quantization config into InfiniCore
    
    * quantization operators support graph
    
    * solve cub version problem and fix code structure
    
    * fix format
    
    * demo131 - remove commented lines
    
    ---------
    Co-authored-by: xgqdut2016 <kenan_gewei@163.com>
    Co-authored-by: xgqdut2016 <140036308+xgqdut2016@users.noreply.github.com>
    Co-authored-by: wooway777 <wooway777@gmail.com>
xmake.lua 13.5 KB