1. Sync to the latest version; 2. Add a batch inference interface; 3. Fix memory leaks; 4. Fix choppy streaming output for llama-series models
src/models/glm.cpp
0 → 100644
test/ops/cppOps.cpp
0 → 100644
tools/scripts/glm_export.py
0 → 100644