1. 09 Mar, 2026 1 commit
  2. 03 Mar, 2026 1 commit
  3. 11 Feb, 2026 2 commits
    • zhushuang
    • qinyiqun
      Support Quantization (#996) · eb89439d
      qinyiqun authored
      
      
      demo131 - multiple issues regarding quantization, qy, and so forth
      
      * issue/843: success per_channel_quant_int8
      
      * issue/843: success qy quant
      
      * issue/843: modified quant
      
      * Add w8a8int8 performance tests
      
      * add infinicore op linear_w8a8i8
      
      * w8a8 linear module functional nn
      
      * issue/843: QY-GPU Support Int8 scale_mm (#68)
      
      * issue/843: success qy scaled_mm
      
      * issue/843: modified kernel.cuh as per_channel_dequant_int8.cuh
      
* fix parallel slice in w8
      
      * w8: support multiple batch size
      
* temp: revise quantconfig handling
      
      * fix format and delete redundancy code
      
      * fix format
      
      * fix format
      
      * fix format
      
      * Refactor: add new API alongside legacy interfaces with deprecation warnings
      
* Add w4 InfiniCore-related content, and move the Quantization config into InfiniCore
      
* Graph support for quantized operators
      
      * solve cub version problem and fix code structure
      
      * fix format
      
      * demo131 - remove commented lines
      
      ---------
Co-authored-by: xgqdut2016 <kenan_gewei@163.com>
Co-authored-by: xgqdut2016 <140036308+xgqdut2016@users.noreply.github.com>
Co-authored-by: wooway777 <wooway777@gmail.com>
      eb89439d
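The per_channel_quant_int8 / per_channel_dequant_int8 commits above point at symmetric per-channel int8 weight quantization. A minimal sketch of that scheme in plain Python, assuming a symmetric scale per output channel; the function names are illustrative, not the repository's API:

```python
# Symmetric per-channel int8 quantization: each row (output channel) gets its
# own scale amax/127, values are rounded and clamped to [-128, 127].

def quantize_per_channel(weights):
    """weights: list of float rows (output channels). Returns (int8 rows, scales)."""
    q_rows, scales = [], []
    for row in weights:
        amax = max(abs(v) for v in row)
        scale = amax / 127.0 if amax > 0 else 1.0
        q_rows.append([max(-128, min(127, round(v / scale))) for v in row])
        scales.append(scale)
    return q_rows, scales

def dequantize_per_channel(q_rows, scales):
    """Recover approximate floats by multiplying each row by its scale."""
    return [[q * s for q in row] for row, s in zip(q_rows, scales)]
```

Per-channel scales keep a large-magnitude channel from washing out the precision of the others, which is why w8a8 linear layers typically quantize weights this way rather than with a single per-tensor scale.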
  4. 04 Feb, 2026 1 commit
  5. 27 Jan, 2026 1 commit
  6. 12 Jan, 2026 1 commit
  7. 08 Jan, 2026 1 commit
  8. 30 Dec, 2025 1 commit
  9. 29 Dec, 2025 1 commit
  10. 26 Dec, 2025 2 commits
  11. 25 Dec, 2025 1 commit
  12. 24 Dec, 2025 1 commit
  13. 22 Dec, 2025 1 commit
  14. 17 Dec, 2025 1 commit
  15. 22 Nov, 2025 2 commits
  16. 21 Nov, 2025 1 commit
  17. 19 Nov, 2025 1 commit
  18. 28 Oct, 2025 1 commit
  19. 23 Oct, 2025 1 commit
  20. 16 Oct, 2025 1 commit
  21. 29 Sep, 2025 2 commits
  22. 23 Sep, 2025 1 commit
  23. 18 Sep, 2025 1 commit
  24. 17 Sep, 2025 1 commit
  25. 16 Sep, 2025 1 commit
  26. 10 Sep, 2025 1 commit
  27. 02 Sep, 2025 1 commit
  28. 14 Aug, 2025 1 commit
  29. 13 Aug, 2025 1 commit
  30. 14 Jul, 2025 1 commit
  31. 11 Jul, 2025 1 commit
  32. 09 Jul, 2025 1 commit
  33. 07 Jul, 2025 1 commit
  34. 04 Jul, 2025 1 commit
  35. 01 Jul, 2025 1 commit
    • 蒋帅宏(Shuaihong_Jiang)
issue/254: Add BF16 support for operators on CPU and CUDA, with corresponding test code (#255) · f88d4ad8
      蒋帅宏(Shuaihong_Jiang) authored
      
      
* issue/254: Add BF16 support for operators on CPU and CUDA, and add the corresponding test code
      
* issue/254: Re-format the modified operators and resubmit
      
* Resolve conflicts with the latest main
      
* After resolving conflicts, rms_norm no longer passed its original precision check; the tolerance was changed from
{"atol": 5e-3, "rtol": 5e-3} to
{"atol": 8e-3, "rtol": 8e-3}
      
* The rms_norm FP16 test case failed in debug mode (it passes locally but not on GitHub),
so the tolerance was doubled for testing
      
* Scale the rms_norm test input by 0.5 and restore the tolerance to its original value for CI testing
      
* issue/254: 1. Use the CHECK_DTYPE macro for data-type validation
2. Add a check for device BF16 support in the test utils.py
      
* issue/254: rms_norm fp16 test tolerance changed from
torch.float16: {"atol": 1e-3, "rtol": 1e-3},
to torch.float16: {"atol": 2e-3, "rtol": 2e-3},
and the 0.5 input scaling was removed
      
* issue/254: Add a BF16 special case to the debug and debug_all
methods in utils.py
      
* Revise the device-type check for BF16 test support
      
* Revise the device check for BF16 test support
      
      * issue/254: reduce redundancy in rms_norm.py
      
      * issue/254: add back the missing comment in rms_norm.py
      
      * issue/254: add fp32 tolerance condition in causal_softmax.py
      
      ---------
Co-authored-by: Zimin Li <coollizimin@gmail.com>
      f88d4ad8
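The tolerance tuning in the commit above amounts to a per-dtype atol/rtol table plus a BF16 special case in the comparison helpers. A minimal sketch of that pattern, assuming a torch.allclose-style elementwise check; the float16 values mirror the commit messages, while the bfloat16 and float32 entries are illustrative assumptions:

```python
# Per-dtype tolerance table in the spirit of this commit's tuning.
# float16 values come from the commit messages; bfloat16 and float32
# entries are assumed for illustration.
TOLERANCES = {
    "float16": {"atol": 2e-3, "rtol": 2e-3},
    "bfloat16": {"atol": 8e-3, "rtol": 8e-3},  # looser: BF16 has fewer mantissa bits
    "float32": {"atol": 1e-6, "rtol": 1e-6},
}

def allclose(actual, expected, dtype):
    """Elementwise |a - e| <= atol + rtol * |e|, the torch.allclose criterion."""
    tol = TOLERANCES[dtype]
    return all(
        abs(a - e) <= tol["atol"] + tol["rtol"] * abs(e)
        for a, e in zip(actual, expected)
    )
```

Keeping the table keyed by dtype means a BF16 special case is a one-line entry rather than a branch scattered through every test, which matches the debug/debug_all refactor described in the messages.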
  36. 20 Jun, 2025 1 commit