Explore GitLab
Discover projects, groups and snippets. Share your projects with others.
-
Stable Diffusion 3 Medium is a Multimodal Diffusion Transformer (MMDiT) text-to-image model.
-
X-Decoder: Generalized Decoding for Pixel, Image, and Language
-
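The backbone network has only 0.45B parameters, supports accent-strength control, is suited to real-time voice interaction, and can meet diverse needs for voice accent cloning across different scenarios.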
-
Training and inference for YOLO-World, a real-time open-vocabulary object detection model
-
A Family of Open Large Multimodal Models
-
End-to-End Object Detection with Transformers
-
Chinese LLaMA-2 & Alpaca-2 LLMs (phase two of the project) with 64K long-context models
-
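A simple, easy-to-use voice conversion framework based on VITS that gives good results even when trained on a small amount of data, convenient for live-streaming entertainment.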