Explore GitLab
Discover projects, groups and snippets. Share your projects with others.
-
Diffusers is the go-to library for state-of-the-art pretrained diffusion models for generating images, audio, and even 3D structures of molecules.
-
Qwen2-Audio accepts a variety of audio signal inputs and, following speech instructions, performs audio analysis or responds directly in text.
-
phi-4 is a 14-billion-parameter model based on the Transformer architecture, focused on improving reasoning and question answering in STEM fields.
-
The most powerful and modular Stable Diffusion GUI, API, and backend with a graph/nodes interface.
-
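The unsloth framework uses Triton to optimize model training speed and GPU memory usage. When fine-tuning Mistral, Gemma, or Llama with Unsloth, training can be 2-5x faster and memory usage can drop by 70%!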
-
VibeVoice is a novel framework designed to generate expressive, long-form, multi-speaker conversational audio, such as podcasts, from text.
-
🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.
-
RepViT achieves over 80% top-1 accuracy with 1 ms latency on an iPhone 12; this algorithm, a further optimization of RepViT, exceeds 82% accuracy.
-
OpenCFD-SCU is open-source, GPU-accelerated computational fluid dynamics software for direct numerical simulation of compressible turbulence.
-
Personal TTS: personalized speech synthesis, implemented with Alibaba's KAN-TTS framework.