ggml: Disable unused pipeline parallelism
We're not currently using pipeline parallelism, even in cases where we could. Disabling it improves generation performance by 10-30% with multiple GPUs.