misc: speedup load safetensors (#1319)

Co-authored-by: ispobock <ISPObaoke@163.com>

Commit dc67d976 authored Sep 04, 2024 by Yineng Zhang, committed by GitHub Sep 04, 2024

Showing 1 changed file with 1 addition and 0 deletions

python/sglang/srt/model_executor/model_runner.py +1 -0

--- a/python/sglang/srt/model_executor/model_runner.py
+++ b/python/sglang/srt/model_executor/model_runner.py
@@ -162,6 +162,7 @@ class ModelRunner:
        return min_per_gpu_memory

    def load_model(self):
+        torch.set_num_threads(1)
        logger.info(
            f"Load weight begin. avail mem={get_available_gpu_memory(self.gpu_id):.2f} GB"
        )
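
The added line caps PyTorch's intra-op CPU thread pool at one thread before weights are loaded. The commit does not state why this helps; one plausible reading (an assumption, not confirmed by the source) is that several worker processes loading in parallel otherwise oversubscribe CPU cores. The commit sets the value once and never restores it. The sketch below is a variant that scopes the setting to the load and restores the previous value afterward; `single_threaded_torch` and `load_weights_stub` are hypothetical names used only for illustration and are not part of sglang.

```python
import contextlib

import torch


@contextlib.contextmanager
def single_threaded_torch():
    """Temporarily limit PyTorch intra-op CPU parallelism to one thread."""
    previous = torch.get_num_threads()
    torch.set_num_threads(1)
    try:
        yield
    finally:
        # Restore the earlier setting even if loading raises.
        torch.set_num_threads(previous)


def load_weights_stub():
    # Hypothetical stand-in for a real weight-loading routine;
    # it only reports the thread count seen while "loading".
    return torch.get_num_threads()


threads_before = torch.get_num_threads()
with single_threaded_torch():
    threads_during_load = load_weights_stub()
threads_after = torch.get_num_threads()
```

Whether restoring the thread count is desirable depends on what runs on the CPU after loading; the commit's simpler one-line form keeps the limit in place for the rest of the process.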