@@ -84,6 +84,14 @@ Overall, with these optimizations, we have achieved up to a 7x acceleration in o
## FAQ
1. **Question**: What should I do if model loading takes too long and an NCCL timeout occurs?
**Answer**: Try adding `--dist-timeout 3600` when launching the model. This sets the timeout to 1 hour.
2. **Question**: How do I use quantized DeepSeek models?
**Answer**: DeepSeek's MLA does not support quantization, so you need to add the `--disable-mla` flag to run a quantized model successfully. In addition, AWQ does not support BF16, so add the `--dtype half` flag if the model is quantized with AWQ. One example is as follows: