- 29 Dec, 2025 1 commit
Jiacheng Huang authored
* Extracted `cpp.LlamaForCausalLM` into `infinilm.infer_engine.InferEngine`
* Split the `Config` construction logic out into `AutoConfig`
* Construct `InferEngine` directly in the `examples` scripts
* Moved the `random_sample` computation into the model
* Implemented a dedicated `generate` for `InferEngine`
* Allow passing `temperature`, `top_k`, and `top_p` via `GenerationConfig`
* Moved `random_sample` handling from `LlamaForCausalLM` into `RankWorker`
* `append(output_id)` directly in `InferEngine.generate`
* Fixed the distributed deadlock introduced by commit `13aa90c57de369f9985593c0066b6b06a7508b24`
* Aligned the `InferEngine.forward` interface with the C++-side `InferEngine.Input`
* Added a `_measure_and_log_time` parameter to re-enable the previous internal timing in `generate`
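For context on the `temperature`/`top_k`/`top_p` parameters that this commit routes through `GenerationConfig` into `random_sample`, here is a minimal NumPy sketch of that style of sampling. The function name, signature, and filtering order are assumptions for illustration, not the repository's actual implementation.

```python
import numpy as np

def random_sample(logits, temperature=1.0, top_k=0, top_p=1.0, rng=None):
    """Hypothetical sketch: sample a token id from logits using
    temperature scaling, top-k, and top-p (nucleus) filtering."""
    rng = rng or np.random.default_rng()
    logits = np.asarray(logits, dtype=np.float64)
    if temperature <= 0:
        # Degenerate case: greedy decoding.
        return int(np.argmax(logits))
    logits = logits / temperature
    if top_k > 0:
        # Keep only the top_k largest logits; mask the rest with -inf.
        kth = np.sort(logits)[-top_k]
        logits = np.where(logits < kth, -np.inf, logits)
    # Numerically stable softmax over the surviving logits.
    probs = np.exp(logits - np.max(logits))
    probs /= probs.sum()
    if top_p < 1.0:
        # Keep the smallest prefix of descending-probability tokens
        # whose cumulative mass reaches top_p; renormalize.
        order = np.argsort(probs)[::-1]
        cutoff = np.searchsorted(np.cumsum(probs[order]), top_p) + 1
        mask = np.zeros_like(probs, dtype=bool)
        mask[order[:cutoff]] = True
        probs = np.where(mask, probs, 0.0)
        probs /= probs.sum()
    return int(rng.choice(len(probs), p=probs))
```

With `temperature=0` the sketch reduces to greedy argmax, and `top_k=1` likewise always returns the most likely token; both are handy sanity checks when wiring such parameters through a config object.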
- 26 Dec, 2025 1 commit
PanZezhong authored
- 23 Dec, 2025 1 commit
PanZezhong authored
- 19 Dec, 2025 1 commit
Jiacheng Huang authored
- 17 Dec, 2025 1 commit
Jiacheng Huang authored
- 11 Dec, 2025 1 commit
thatPepe authored
Issue/121 - cache management
- 09 Dec, 2025 1 commit
pengcheng888 authored
- 08 Dec, 2025 1 commit
Ceng authored
- 07 Dec, 2025 1 commit
pengcheng888 authored
- 06 Dec, 2025 2 commits
PanZezhong authored
PanZezhong1725 authored