multi-person & animate & podcast (#554)
- 服务化功能新增(前端+后端):
1、seko-talk 模型支持多人输入
2、支持播客合成与管理
3、支持wan2.2 animate 模型
- 后端接口新增:
1、 基于火山的播客websocket合成接口,支持边合成边听
2、播客的查询管理接口
3、基于 yolo 的多人人脸检测接口
4、音频多人切分接口
- 推理代码侵入式修改
1、将 animate 相关的 输入文件路径(mask/image/pose等)从固定写死的config中移除到可变的input_info中
2、animate的预处理相关代码包装成接口供服务化使用
@xinyiqin
---------
Co-authored-by:
qinxinyi <qxy118045534@163.com>
Showing
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
This diff is collapsed.
Please register or sign in to comment