- 05 Mar, 2026 2 commits
PanZezhong authored
PanZezhong authored
- 11 Feb, 2026 1 commit
qinyiqun authored
* issue/204 - support graph in server scripts
* issue/208 - adapt to ali ppu
* issue/194 - add quantization
  modify configs accordingly
  support NV w8, 1 batch, 1 TP
  add JSON support
  InfiniLM: add quantization layer and global config
  add quant config support in a cleaner way
  restructure some code and remove unused code
  follow InfiniCore changes
  remove all model_config and use global_config throughout
  follow the latest InfiniLM code
  change function parameter order
  rename global config to model config
  Refactor: add new API alongside legacy interfaces with deprecation warnings
  add w4 InfiniCore-related content and move Quantization config into InfiniCore
* issue/175 - qy device support
  qy_page_131: add qy device
  success qy inference_server.py
* Issue/170 - Add HYGON support and improve device type handling.
* Issue/193: feats for deployment
  Signed-off-by: Ceng23333 <441651826@qq.com>
* skip responding eos token
  Signed-off-by: Ceng23333 <441651826@qq.com>
* issue/143 use add_rmsnorm, nt flash attn, nt kv caching
* issue/204 - support graph in server scripts
* issue/208 - adapt to ali ppu
* rebase main
* issue/216 feat: support static kv cache in server
* fix llm server cache config
* demo131 - resolve mishandled conflicts
* demo131 - further adjust attn and caching logic
* demo131 - resolve merge requirements

---------

Signed-off-by: Ceng23333 <441651826@qq.com>
Co-authored-by: wooway777 <wooway777@gmail.com>
Co-authored-by: xgqdut2016 <kenan_gewei@163.com>
Co-authored-by: gongchensu <zhuyue_134@qq.com>
Co-authored-by: Ceng23333 <441651826@qq.com>
Co-authored-by: PanZezhong <panzezhong@qiyuanlab.com>
Co-authored-by: MaYuhang <2902139028@qq.com>
- 10 Feb, 2026 1 commit
PanZezhong authored
- 08 Jan, 2026 1 commit
PanZezhong authored
- 06 Jan, 2026 1 commit
PanZezhong authored
- 26 Dec, 2025 1 commit
PanZezhong authored
- 23 Dec, 2025 2 commits
PanZezhong authored
Jiacheng Huang authored
- 08 Dec, 2025 2 commits
- 02 Dec, 2025 1 commit
Ceng23333 authored
Signed-off-by: Ceng23333 <441651826@qq.com>