[Feat] Enable T5 inference and offload overlap for improved efficiency (#423)
Co-authored-by: gushiqiao <975033167@qq.com>
configs/seko_talk/L40s/2gpu/seko_talk_bf16.json
100644 → 100755
configs/seko_talk/L40s/4gpu/seko_talk_bf16.json
100644 → 100755
configs/seko_talk/L40s/8gpu/seko_talk_bf16.json
100644 → 100755