Commit 64710242 authored by dengjb

update

MIT License

Copyright (c) 2026 Meituan

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
# LongCat-Next_pytorch
## Paper
[LongCat-Next Technical Report](https://github.com/meituan-longcat/LongCat-Next/blob/main/tech_report.pdf)
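## Model overview
This work tackles the fundamental obstacles to native multimodality through a design philosophy that emphasizes simplicity: vision and audio are treated as intrinsic extensions of language. As an important step toward this goal, we present LongCat-Next, a discrete native multimodal model that achieves industrial-grade performance within a discrete framework while remaining highly competitive across many specialized domains. Built on the LongCat-Flash-Lite MoE backbone (A3B) as a multi-task learner, the model unifies language, vision, and audio in a single discrete framework. The main contributions are: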
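- 🌟 Discrete Native Autoregressive paradigm (DiNA).
We propose DiNA, a unified paradigm that extends next-token prediction from language to native multimodality by internalizing multiple modalities into a shared token space. It simplifies multimodal modeling by building modality-aware tokenizer/detokenizer pairs and reusing the mature training infrastructure of large language models.
- 🌟 Semantic completeness of discrete visual representations.
We improve discrete visual modeling by combining Semantic-and-Aligned Encoders (SAE) with Residual Vector Quantization (RVQ). This combination yields hierarchical discrete tokens that preserve both semantic abstraction and fine-grained visual detail, going beyond the limits of conventional representations.
- 🌟 Discrete Native-Resolution Vision Transformer (dNaViT).
By analogy with a language tokenizer, we propose dNaViT as a highly flexible, unified discrete interface for vision. It distills semantic features into "visual words" and builds a hierarchical representation space that supports dynamic tokenization and detokenization. dNaViT integrates seamlessly into large language models without degrading their performance.
- 🌟 Strong seeing, creating, and speaking abilities in one unified model.
Within the DiNA framework, visual understanding and generation are recast as two manifestations of the same prediction process, without sacrificing performance. This formulation bridges a long-standing architectural gap, introduces minimal interference between traditionally competing objectives, and preserves core language capabilities. Notably, LongCat-Next matches specialized understanding models on understanding tasks and retains strong generation quality (especially in text rendering) even at a 28x compression ratio, while also performing well in advanced speech understanding, low-latency voice conversation, and customizable voice cloning.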
<div align=center>
<img src="./assets/overview.png"/>
</div>
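
To make the residual vector quantization idea mentioned above more concrete, here is a minimal pure-Python sketch. It is not the LongCat-Next tokenizer: the function names and the two toy codebooks are hypothetical, and the real model applies RVQ to encoder features with learned codebooks. The sketch only shows the general mechanism, in which each level quantizes the residual left by the previous level, so a vector becomes a short stack of discrete indices.

```python
def nearest_code(codebook, vec):
    """Return the index of the codebook entry closest to vec (squared L2)."""
    return min(
        range(len(codebook)),
        key=lambda i: sum((c - v) ** 2 for c, v in zip(codebook[i], vec)),
    )


def rvq_encode(vec, codebooks):
    """Quantize vec level by level; each level encodes the previous residual."""
    residual = list(vec)
    indices = []
    for codebook in codebooks:
        idx = nearest_code(codebook, residual)
        indices.append(idx)
        # Subtract the chosen entry so the next level sees what is left over.
        residual = [r - c for r, c in zip(residual, codebook[idx])]
    return indices


def rvq_decode(indices, codebooks):
    """Reconstruct the vector by summing the selected entry of every level."""
    out = [0.0] * len(codebooks[0][0])
    for idx, codebook in zip(indices, codebooks):
        out = [o + c for o, c in zip(out, codebook[idx])]
    return out


# Toy, hand-written codebooks (hypothetical values, not taken from the model).
coarse = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
fine = [[0.0, 0.0], [0.5, 0.0], [0.0, 0.5]]

codes = rvq_encode([1.0, 0.5], [coarse, fine])
print(codes)                              # [1, 2]
print(rvq_decode(codes, [coarse, fine]))  # [1.0, 0.5]
```

The coarse level captures the bulk of the vector and the fine level corrects the remainder, which is the sense in which the resulting tokens are hierarchical.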
## Environment dependencies
| Software | Version |
| :------: | :------: |
| DTK | dtk25.04 |
| python | 3.10 |
| transformers | 5.3.0 |
| torch | 2.5.1+das.opt1.dtk25042 |
| torchaudio | 2.5.1+das.opt1.dtk25042 |

Recommended image: image.sourcefind.cn:5000/dcu/admin/base/vllm:0.9.2-ubuntu22.04-dtk25.04.2-py3.10
Adjust the `-v` mount paths to match where your model and code are stored.
```bash
docker run -it \
--shm-size 200g \
--network=host \
--name LongCat-Next \
--privileged \
--device=/dev/kfd \
--device=/dev/dri \
--device=/dev/mkfd \
--group-add video \
--cap-add=SYS_PTRACE \
--security-opt seccomp=unconfined \
-u root \
-v /opt/hyhal/:/opt/hyhal/:ro \
-v /path/your_code_data/:/path/your_code_data/ \
image.sourcefind.cn:5000/dcu/admin/base/vllm:0.9.2-ubuntu22.04-dtk25.04.2-py3.10 bash
```
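More images are available for download from [光源 (SourceFind)](https://sourcefind.cn/#/service-list).
The DCU-specific deep learning libraries required by this project can be downloaded and installed from the [光合](https://developer.sourcefind.cn/tool/) developer community.
The remaining dependencies must be installed separately: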
```
pip install -r requirements.txt
```
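
After installation, it can help to confirm that the pinned versions are the ones actually present in the container. The following is a small sketch using only the standard library; the `EXPECTED` dictionary and the `check_versions` helper are illustrative names, and the pins are copied from the environment table and `requirements.txt` of this repository.

```python
from importlib import metadata

# Pins copied from the environment table and requirements.txt of this repo.
EXPECTED = {
    "transformers": "5.3.0",
    "librosa": "0.11.0",
    "diffusers": "0.34.0",
}


def check_versions(expected):
    """Return {package: (expected, installed_or_None)} for every mismatch."""
    problems = {}
    for name, wanted in expected.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != wanted:
            problems[name] = (wanted, installed)
    return problems


if __name__ == "__main__":
    for name, (wanted, installed) in check_versions(EXPECTED).items():
        print(f"{name}: expected {wanted}, found {installed or 'not installed'}")
```

An empty result means every listed package matches its pin exactly.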
## Dataset
Not available.
## Training
Not available.
## Inference
### pytorch
#### Single-node inference
Reference inference script:
```
HIP_VISIBLE_DEVICES=0,1,2,3 python longcat-next_inference.py
```
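
The command above exposes four devices through `HIP_VISIBLE_DEVICES`, which matches the minimum card count listed in the pretrained weights table. As a hedged sketch, a launcher could validate an explicit device list before loading the model; `visible_device_ids` and `has_enough_devices` are hypothetical helpers, not part of this repository.

```python
import os

MIN_CARDS = 4  # minimum card count listed for BW1000 in this README


def visible_device_ids(env=None):
    """Parse HIP_VISIBLE_DEVICES into a list of device id strings.

    An unset or empty variable yields an empty list, so this only
    validates an explicitly provided device list.
    """
    env = os.environ if env is None else env
    raw = env.get("HIP_VISIBLE_DEVICES", "")
    return [part.strip() for part in raw.split(",") if part.strip()]


def has_enough_devices(env=None, minimum=MIN_CARDS):
    """Return True when at least `minimum` devices are listed explicitly."""
    return len(visible_device_ids(env)) >= minimum


print(has_enough_devices({"HIP_VISIBLE_DEVICES": "0,1,2,3"}))  # True
print(has_enough_devices({"HIP_VISIBLE_DEVICES": "0,1"}))      # False
```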
## Results
| Input | Output |
|:------:|:----:|
|<div align=center> <img src="./assets/book.png"/></div>| <div align=center> <img src="./assets/result.jpg"/></div> |
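
The inference script in this repository passes the input image to the model as a file path wrapped in `<longcat_img_start>` and `<longcat_img_end>` tags inside the user message. The sketch below builds that message list; the tags and the example prompt come from the script, while the `build_image_messages` helper itself is a hypothetical convenience function.

```python
IMG_START = "<longcat_img_start>"
IMG_END = "<longcat_img_end>"


def build_image_messages(question, image_path,
                         system_prompt="You are a helpful assistant."):
    """Build a chat message list with one image path embedded in the user turn."""
    user_content = f"{question}{IMG_START}{image_path}{IMG_END}"
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_content},
    ]


messages = build_image_messages("What book is this?", "./assets/book.png")
print(messages[1]["content"])
# What book is this?<longcat_img_start>./assets/book.png<longcat_img_end>
```

The resulting list can be handed to `tokenizer.apply_chat_template` in the same way the script does with its hand-written `messages`.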
### Accuracy
Accuracy on DCU is consistent with that on GPU. Inference framework: vllm.
## Pretrained weights
| Model | Weight size | DCU model | Minimum number of cards | Download |
|:------:|:----:|:----------:|:------:|:---------------------:|
| LongCat-Next | 68.5B | BW1000 | 4 | [Model Scope](https://www.modelscope.cn/models/meituan-longcat/LongCat-Next) |
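
As a rough, back-of-the-envelope check on the card count, the weight memory per card can be estimated as follows. This is a sketch under stated assumptions: that "68.5B" is a parameter count, that weights are held in bfloat16 (the dtype the inference script requests), and that they are sharded perfectly evenly. Activations, caches, and framework overhead are ignored, so real usage will be higher.

```python
PARAMS = 68.5e9       # assumption: "68.5B" in the table is a parameter count
BYTES_PER_PARAM = 2   # bfloat16, the dtype requested by the inference script
NUM_CARDS = 4         # minimum card count from the table


def weight_gib_per_card(params, bytes_per_param, num_cards):
    """Weight memory per card in GiB, assuming perfectly even sharding."""
    return params * bytes_per_param / num_cards / 2**30


per_card = weight_gib_per_card(PARAMS, BYTES_PER_PARAM, NUM_CARDS)
print(f"{per_card:.1f} GiB per card")  # 31.9 GiB per card
```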
## Source repository and issue feedback
- https://developer.sourcefind.cn/codes/modelzoo/longcat-next_pytorch
## References
- https://www.modelscope.cn/models/meituan-longcat/LongCat-Next
- https://github.com/meituan-longcat/LongCat-Next
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, AutoProcessor

# Load model (point model_name at your local copy of the weights)
model_name = "/home/dengjb/download/meituan-longcat/LongCat-Next/"
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)
model.eval()

tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True, fix_mistral_regex=True)
model.text_tokenizer = tokenizer  # Dynamic binding
processor = AutoProcessor.from_pretrained(model_name, trust_remote_code=True)

# Set messages
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What book is this?<longcat_img_start>./assets/book.png<longcat_img_end>"}
]

# Apply chat-template
text_input = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)
print(f"{text_input=}")

# Preprocessing
text_inputs, visual_inputs, audio_inputs = processor(text=text_input, return_tensors="pt")
text_inputs = text_inputs.to(model.device)
if visual_inputs is not None:
    visual_inputs = visual_inputs.to(model.device)
if audio_inputs is not None:
    audio_inputs = audio_inputs.to(model.device)

# Autoregressive (AR) generation
with torch.no_grad():
    outputs = model.generate(
        input_ids=text_inputs["input_ids"],
        visual_inputs=visual_inputs,
        audio_inputs=audio_inputs,
        return_dict_in_generate=True,
    )

# Text decoding (skip the prompt tokens at the start of the sequence)
output_input_ids = outputs.sequences
text_output = tokenizer.decode(output_input_ids[0][len(text_inputs["input_ids"][0]):], skip_special_tokens=True)
print(f"{text_output=}")

# Images decoding
output_visual_ids = outputs.visual_ids
if output_visual_ids.size(0) > 0:
    image_path_list = model.model.decode_visual_ids_and_save(
        output_visual_ids,
        save_prefix="./output_image",
        **model.generation_config.visual_generation_config["custom_params"],
    )
    print(f"{image_path_list=}")

# Audio decoding
output_audio_text_ids = outputs.audio_text_ids
output_audio_ids = outputs.audio_ids
if output_audio_text_ids.size(-1) > 0:
    audio_text = tokenizer.decode(output_audio_text_ids[0], skip_special_tokens=True)
    print(f"{audio_text=}")
if output_audio_ids.size(0) > 0:
    audio_path_list = model.model.decode_audio_ids_and_save(
        output_audio_ids,
        save_prefix="./output_audio",
        **model.generation_config.audio_generation_config["custom_params"],
    )
    print(f"{audio_path_list=}")
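# Unique model identifier
modelCode=2245
# Model name
modelName=longcat-next_pytorch
# Model description
modelDescription=LongCat-Next is a multimodal large model developed in-house by Meituan. It has strong visual understanding and natural language processing capabilities, accepts inputs in multiple modalities, and generates high-quality text output.
# Process type (推理 = inference)
processType=推理
# Application category (对话问答 = conversational Q&A)
appCategory=对话问答
# Framework type
frameType=pytorch
# Accelerator type
accelerateType=BW1000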
librosa==0.11.0
diffusers==0.34.0