demo131 - multiple issues regarding quantization, qy, etc. (71c70586) · Commits · jerrrrry / infinilm

Unverified commit 71c70586, authored Feb 11, 2026 by qinyiqun; committed by GitHub Feb 11, 2026

demo131 - multiple issues regarding quantization, qy, etc.



* issue/204 - support graph in server scripts

* issue/208 - adapt to ali ppu

* issue/194 - add quantization, modify configs accordingly

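Support NV w8 with 1 batch, 1 TP

Add JSON support

InfiniLM: add quantization layer and global config

Add quant config support in a cleaner way

Restructure some code and remove unused code

Update to follow InfiniCore changes

Remove all model_config; use global_config throughout

Update to follow the latest InfiniLM code

Change function parameter order

Rename global config to model config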

Refactor: add new API alongside legacy interfaces with deprecation warnings

Add w4 InfiniCore-related content and move Quantization config into InfiniCore

Add w4 InfiniCore-related content and move Quantization config into InfiniCore

* issue/175 - qy device support

qy_page_131: add qy device

inference_server.py runs successfully on qy

* Issue/170 - Add HYGON support and improve device type handling.

* Issue/193: feats for deployment
Signed-off-by: Ceng23333 <441651826@qq.com>

* skip responding eos token
Signed-off-by: Ceng23333 <441651826@qq.com>

* issue/143 use add_rmsnorm, nt flash attn, nt kv caching

* issue/204 - support graph in server scripts

* issue/208 - adapt to ali ppu

* rebase main

* issue/216 feat: support static kv cache in server

* fix llm server cache config

* demo131 - resolve mishandled conflicts

* demo131 - further adjust attn and caching logic

* demo131 - resolve merge requirements

---------
Signed-off-by: Ceng23333 <441651826@qq.com>
Co-authored-by: wooway777 <wooway777@gmail.com>
Co-authored-by: xgqdut2016 <kenan_gewei@163.com>
Co-authored-by: gongchensu <zhuyue_134@qq.com>
Co-authored-by: Ceng23333 <441651826@qq.com>
Co-authored-by: PanZezhong <panzezhong@qiyuanlab.com>
Co-authored-by: MaYuhang <2902139028@qq.com>

parent ee59b3f5
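
The issue/194 notes above mention a quant config carried by the model config, JSON support, and a new API kept alongside legacy interfaces with deprecation warnings. A minimal Python sketch of that pattern follows. Every name in it (`QuantConfig`, `ModelConfig`, `load_model_config`, `load_global_config`, and the `quantization_config` JSON key) is hypothetical and chosen for illustration; none is taken from the InfiniLM or InfiniCore source.

```python
import json
import warnings
from dataclasses import dataclass
from typing import Optional


@dataclass
class QuantConfig:
    # Hypothetical fields; the real quantization schema is not shown on this page.
    method: str
    bits: int


@dataclass
class ModelConfig:
    hidden_size: int
    quant: Optional[QuantConfig] = None

    @classmethod
    def from_json(cls, text: str) -> "ModelConfig":
        raw = json.loads(text)
        # The quantization section is optional; absent means unquantized.
        quant_raw = raw.get("quantization_config")
        quant = QuantConfig(**quant_raw) if quant_raw else None
        return cls(hidden_size=raw["hidden_size"], quant=quant)


def load_model_config(text: str) -> ModelConfig:
    """New-style entry point (hypothetical name)."""
    return ModelConfig.from_json(text)


def load_global_config(text: str) -> ModelConfig:
    """Legacy entry point kept for compatibility; forwards to the new API."""
    warnings.warn(
        "load_global_config is deprecated; use load_model_config",
        DeprecationWarning,
        stacklevel=2,
    )
    return load_model_config(text)
```

Callers of the legacy name keep working and see a `DeprecationWarning` that points at their own call site (`stacklevel=2`), so the old interface can be removed in a later change.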


json @ 5ed07097

Subproject commit 5ed07097faa6c50199c4a3b66e5ed37d4fbfccc2


@@ -6,6 +6,7 @@ set_toolchains("gcc")
 -- Add spdlog from third_party directory
 add_includedirs("third_party/spdlog/include")
 add_includedirs("third_party/json/single_include/")
 target("infinicore_infer")
     set_kind("shared")
-...
