Commit 9a37dca5 authored by myhloli's avatar myhloli


docs: update README and README_zh-CN to reflect changes in parameter names for tensor parallelism and data parallelism
parent 997cc9d8
......@@ -737,9 +737,9 @@ mineru -p <input_path> -o <output_path> -b vlm-sglang-client -u http://127.0.0.1
> Below are some suggestions and notes for using the sglang acceleration mode:
> - The sglang acceleration mode currently supports operation on Turing architecture GPUs with a minimum of 8GB VRAM, but you may encounter VRAM shortages on GPUs with less than 24GB VRAM. You can optimize VRAM usage with the following parameters:
> - If running on a single GPU and encountering VRAM shortage, reduce the KV cache size by setting `--mem-fraction-static 0.5`. If VRAM issues persist, try lowering it further to `0.4` or below.
-> - If you have more than one GPU, you can expand available VRAM using tensor parallelism (TP) mode: `--tp 2`
+> - If you have more than one GPU, you can expand available VRAM using tensor parallelism (TP) mode: `--tp-size 2`
> - If you are already successfully using sglang to accelerate VLM inference but wish to further improve inference speed, consider the following parameters:
-> - If using multiple GPUs, increase throughput using sglang's multi-GPU parallel mode: `--dp 2`
+> - If using multiple GPUs, increase throughput using sglang's multi-GPU parallel mode: `--dp-size 2`
> - You can also enable `torch.compile` to accelerate inference speed by about 15%: `--enable-torch-compile`
> - For more information on using sglang parameters, please refer to the [sglang official documentation](https://docs.sglang.ai/backend/server_arguments.html#common-launch-commands)
> - All sglang-supported parameters can be passed to MinerU via command-line arguments, including those used with the following commands: `mineru`, `mineru-sglang-server`, `mineru-gradio`, `mineru-api`
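Taken together, a memory-constrained multi-GPU launch might combine the flags suggested above. This is a sketch, not a command from the README: the `vlm-sglang-engine` backend name and the input/output paths are assumptions, while the flag values are the ones recommended in the notes.

```bash
# Hypothetical invocation combining the VRAM and speed flags above:
# halve the KV cache's share of VRAM, shard the model across 2 GPUs,
# and enable torch.compile for roughly 15% faster inference.
mineru -p /path/to/input -o /path/to/output -b vlm-sglang-engine \
  --mem-fraction-static 0.5 --tp-size 2 --enable-torch-compile
```

If out-of-memory errors persist, the notes above suggest lowering `--mem-fraction-static` further to `0.4` or below before resorting to more GPUs.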
......@@ -761,7 +761,7 @@ mineru -p <input_path> -o <output_path> -b vlm-sglang-client -u http://127.0.0.1
> - Below are some possible use cases:
> - If you have multiple GPUs and need to specify GPU 0 and GPU 1 to launch `sglang-server` in multi-GPU mode, you can use the following command:
> ```bash
-> CUDA_VISIBLE_DEVICES=0,1 mineru-sglang-server --port 30000 --dp 2
+> CUDA_VISIBLE_DEVICES=0,1 mineru-sglang-server --port 30000 --dp-size 2
> ```
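Once a server launched as above is listening on port 30000, clients reach it with the `vlm-sglang-client` backend and `-u` URL shown in the hunk headers of this diff:

```bash
# Connect the client backend to the sglang server started above.
# <input_path> and <output_path> are placeholders, as elsewhere in the README.
mineru -p <input_path> -o <output_path> -b vlm-sglang-client -u http://127.0.0.1:30000
```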
> - If you have multiple GPUs and need to launch two `fastapi` services on GPU 0 and GPU 1 respectively, listening on different ports, you can use the following commands:
> ```bash
......
......@@ -726,9 +726,9 @@ mineru -p <input_path> -o <output_path> -b vlm-sglang-client -u http://127.0.0.1
> Below are some suggestions and notes for using the sglang acceleration mode:
> - The sglang acceleration mode currently supports running on Turing architecture GPUs with a minimum of 8GB VRAM, but GPUs with less than 24GB VRAM may run out of memory. You can optimize VRAM usage with the following parameters:
>   - If you encounter VRAM shortage on a single GPU, you may need to reduce the KV cache size with `--mem-fraction-static 0.5`; if VRAM issues persist, try lowering it further to `0.4` or below.
-> - If you have two or more GPUs, you can expand available VRAM using tensor parallelism (TP) mode: `--tp 2`
+> - If you have two or more GPUs, you can expand available VRAM using tensor parallelism (TP) mode: `--tp-size 2`
> - If you can already use sglang to accelerate VLM inference normally but want to further improve inference speed, you can try the following parameters:
-> - If you have multiple GPUs, you can use sglang's multi-GPU parallel mode to increase throughput: `--dp 2`
+> - If you have multiple GPUs, you can use sglang's multi-GPU parallel mode to increase throughput: `--dp-size 2`
>   - You can also enable `torch.compile` to speed up inference by about 15%: `--enable-torch-compile`
> - For more information on using `sglang` parameters, please refer to the [sglang official documentation](https://docs.sglang.ai/backend/server_arguments.html#common-launch-commands)
> - All parameters officially supported by sglang can be passed to MinerU via command-line arguments, including with the following commands: `mineru`, `mineru-sglang-server`, `mineru-gradio`, `mineru-api`
......@@ -750,7 +750,7 @@ mineru -p <input_path> -o <output_path> -b vlm-sglang-client -u http://127.0.0.1
> - Below are some possible use cases:
>   - If you have multiple GPUs and need to specify GPU 0 and GPU 1 to launch `sglang-server` in multi-GPU parallel mode, you can use the following command:
> ```bash
-> CUDA_VISIBLE_DEVICES=0,1 mineru-sglang-server --port 30000 --dp 2
+> CUDA_VISIBLE_DEVICES=0,1 mineru-sglang-server --port 30000 --dp-size 2
> ```
>   - If you have multiple GPUs and need to launch two `fastapi` services on GPU 0 and GPU 1 respectively, listening on different ports, you can use the following commands:
> ```bash
......