wangsen / MinerU
Commit 89c98537, authored Jun 13, 2025 by myhloli
feat: update Dockerfile and README_zh-CN for mineru installation and model download improvements
Parent: 8c6a9dba
Showing 4 changed files with 16 additions and 103 deletions (+16 -103)

README_zh-CN.md (+4 -4)
docker/ascend_npu/Dockerfile (+0 -51)
docker/china/Dockerfile (+6 -24)
docker/global/Dockerfile (+6 -24)
README_zh-CN.md
@@ -476,23 +476,23 @@ https://github.com/user-attachments/assets/4bea02c9-6d54-4cd6-97ed-dff14340982c
 ```bash
 pip install --upgrade pip
 pip install uv
-uv pip install "mineru[core]>=2.0.0" -i https://mirrors.aliyun.com/pypi/simple
+uv pip install "mineru[core]>=2.0.0"
 ```
 You can also install from source
 ```bash
 git clone https://github.com/opendatalab/MinerU.git
 cd MinerU
-uv pip install -e .[core] -i https://mirrors.aliyun.com/pypi/simple
+uv pip install -e .[core]
 ```
 If you need sglang to accelerate VLM model inference, install the full version of MinerU directly
 ```bash
-uv pip install "mineru[all]>=2.0.0" -i https://mirrors.aliyun.com/pypi/simple
+uv pip install "mineru[all]>=2.0.0"
 ```
 or
 ```bash
-uv pip install -e .[all] -i https://mirrors.aliyun.com/pypi/simple
+uv pip install -e .[all]
 ```
 #### 2. Using MinerU
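The four changed README lines differ only in the extras name and in whether `-i <mirror>` is appended. A minimal sketch of that pattern follows, assuming a hypothetical helper named `mineru_install_cmd` (not part of MinerU or uv) that only prints the command rather than running it:

```shell
# Hypothetical helper (not part of MinerU): print the install command for a
# given extras group, appending "-i <mirror>" only when a mirror URL is passed.
mineru_install_cmd() {
  extra="${1:-core}"    # "core" or "all", as used in the README
  mirror="${2:-}"       # optional PyPI mirror URL
  cmd="uv pip install \"mineru[${extra}]>=2.0.0\""
  if [ -n "$mirror" ]; then
    cmd="$cmd -i $mirror"
  fi
  printf '%s\n' "$cmd"
}

# Without a mirror, and with the Aliyun mirror that appears in the diff.
mineru_install_cmd core
mineru_install_cmd all https://mirrors.aliyun.com/pypi/simple
```

Printing the command instead of executing it keeps the sketch safe to run on a machine without `uv` installed.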
docker/ascend_npu/Dockerfile (deleted, 100644 → 0)
-# Use the official Ubuntu base image
-FROM swr.cn-central-221.ovaijisuan.com/mindformers/mindformers1.2_mindspore2.3:20240722
-
-USER root
-
-# Set environment variables to non-interactive to avoid prompts during installation
-ENV DEBIAN_FRONTEND=noninteractive
-
-# Update the package list and install necessary packages
-RUN apt-get update && \
-    apt-get install -y \
-        software-properties-common && \
-    add-apt-repository -y ppa:deadsnakes/ppa && \
-    apt-get update && \
-    apt-get install -y \
-        python3.10 \
-        python3.10-venv \
-        python3.10-distutils \
-        python3.10-dev \
-        python3-pip \
-        wget \
-        git \
-        libgl1 \
-        libglib2.0-0 \
-        && rm -rf /var/lib/apt/lists/*
-
-# Set Python 3.10 as the default python3
-RUN update-alternatives --install /usr/bin/python3 python3 /usr/bin/python3.10 1
-
-# Create a virtual environment for MinerU
-RUN python3 -m venv /opt/mineru_venv
-
-# Copy the configuration file template and install magic-pdf latest
-RUN /bin/bash -c "wget https://gcore.jsdelivr.net/gh/opendatalab/MinerU@master/magic-pdf.template.json && \
-    cp magic-pdf.template.json /root/magic-pdf.json && \
-    source /opt/mineru_venv/bin/activate && \
-    pip3 install --upgrade pip -i https://mirrors.aliyun.com/pypi/simple && \
-    pip3 install torch==2.3.1 torchvision==0.18.1 -i https://mirrors.aliyun.com/pypi/simple && \
-    pip3 install -U magic-pdf[full] 'numpy<2' decorator attrs absl-py cloudpickle ml-dtypes tornado einops -i https://mirrors.aliyun.com/pypi/simple && \
-    wget https://gitee.com/ascend/pytorch/releases/download/v6.0.rc2-pytorch2.3.1/torch_npu-2.3.1-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.whl && \
-    pip3 install torch_npu-2.3.1-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.whl"
-
-# Download models and update the configuration file
-RUN /bin/bash -c "source /opt/mineru_venv/bin/activate && \
-    pip3 install modelscope -i https://mirrors.aliyun.com/pypi/simple && \
-    wget https://gcore.jsdelivr.net/gh/opendatalab/MinerU@master/scripts/download_models.py -O download_models.py && \
-    python3 download_models.py && \
-    sed -i 's|cpu|npu|g' /root/magic-pdf.json"
-
-# Set the entry point to activate the virtual environment and run the command line tool
-ENTRYPOINT ["/bin/bash", "-c", "source /opt/mineru_venv/bin/activate && exec \"$@\"", "--"]
docker/china/Dockerfile
@@ -18,37 +18,19 @@ RUN apt-get update && \
         wget \
         git \
         libgl1 \
-        libreoffice \
-        fonts-noto-cjk \
-        fonts-wqy-zenhei \
-        fonts-wqy-microhei \
-        ttf-mscorefonts-installer \
-        fontconfig \
         libglib2.0-0 \
-        libxrender1 \
-        libsm6 \
-        libxext6 \
-        poppler-utils \
         && rm -rf /var/lib/apt/lists/*
 
 # Set Python 3.10 as the default python3
 RUN update-alternatives --install /usr/bin/python3 python3 /usr/bin/python3.10 1
 
-# Create a virtual environment for MinerU
-RUN python3 -m venv /opt/mineru_venv
-
-# Copy the configuration file template and install magic-pdf latest
-RUN /bin/bash -c "wget https://gcore.jsdelivr.net/gh/opendatalab/MinerU@master/magic-pdf.template.json && \
-    cp magic-pdf.template.json /root/magic-pdf.json && \
-    source /opt/mineru_venv/bin/activate && \
-    pip3 install --upgrade pip -i https://mirrors.aliyun.com/pypi/simple && \
-    pip3 install -U magic-pdf[full] -i https://mirrors.aliyun.com/pypi/simple"
+# install mineru latest
+RUN /bin/bash -c "pip3 install --upgrade pip -i https://mirrors.aliyun.com/pypi/simple && \
+    pip3 install uv -i https://mirrors.aliyun.com/pypi/simple && \
+    uv pip install 'mineru[all]>=2.0.0' -i https://mirrors.aliyun.com/pypi/simple"
 
 # Download models and update the configuration file
-RUN /bin/bash -c "pip3 install modelscope -i https://mirrors.aliyun.com/pypi/simple && \
-    wget https://gcore.jsdelivr.net/gh/opendatalab/MinerU@master/scripts/download_models.py -O download_models.py && \
-    python3 download_models.py && \
-    sed -i 's|cpu|cuda|g' /root/magic-pdf.json"
+RUN /bin/bash -c "mineru-models-download -s modelscope -m all"
 
 # Set the entry point to activate the virtual environment and run the command line tool
-ENTRYPOINT ["/bin/bash", "-c", "source /opt/mineru_venv/bin/activate && exec \"$@\"", "--"]
+ENTRYPOINT ["/bin/bash", "-c", "export MINERU_MODEL_SOURCE=local && exec \"$@\"", "--"]
\ No newline at end of file
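The new ENTRYPOINT uses `bash -c` with a trailing `"--"`: that argument fills `$0`, so whatever command the container is started with arrives in `"$@"` and is run via `exec` with `MINERU_MODEL_SOURCE=local` already exported. The same wrapper can be tried outside Docker. The sketch below assumes a hypothetical function name, `run_with_local_models`, and uses `printenv` only as a stand-in for the real command:

```shell
# Emulate the ENTRYPOINT wrapper outside Docker.
# "--" becomes $0 of the inner shell; the remaining arguments become "$@".
run_with_local_models() {
  /bin/bash -c 'export MINERU_MODEL_SOURCE=local && exec "$@"' -- "$@"
}

# The wrapped command inherits the exported variable.
run_with_local_models printenv MINERU_MODEL_SOURCE
```

Because the inner shell calls `exec`, the wrapped command replaces the shell process instead of running as its child, which is presumably why the Dockerfile uses this form: the command receives container signals directly.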
docker/global/Dockerfile
@@ -18,37 +18,19 @@ RUN apt-get update && \
         wget \
         git \
         libgl1 \
-        libreoffice \
-        fonts-noto-cjk \
-        fonts-wqy-zenhei \
-        fonts-wqy-microhei \
-        ttf-mscorefonts-installer \
-        fontconfig \
         libglib2.0-0 \
-        libxrender1 \
-        libsm6 \
-        libxext6 \
-        poppler-utils \
         && rm -rf /var/lib/apt/lists/*
 
 # Set Python 3.10 as the default python3
 RUN update-alternatives --install /usr/bin/python3 python3 /usr/bin/python3.10 1
 
-# Create a virtual environment for MinerU
-RUN python3 -m venv /opt/mineru_venv
-
-# Copy the configuration file template and install magic-pdf latest
-RUN /bin/bash -c "wget https://github.com/opendatalab/MinerU/raw/master/magic-pdf.template.json && \
-    cp magic-pdf.template.json /root/magic-pdf.json && \
-    source /opt/mineru_venv/bin/activate && \
-    pip3 install --upgrade pip && \
-    pip3 install -U magic-pdf[full]"
+# install mineru latest
+RUN /bin/bash -c "pip3 install --upgrade pip && \
+    pip3 install uv && \
+    uv pip install 'mineru[all]>=2.0.0'"
 
 # Download models and update the configuration file
-RUN /bin/bash -c "pip3 install huggingface_hub && \
-    wget https://github.com/opendatalab/MinerU/raw/master/scripts/download_models_hf.py -O download_models.py && \
-    python3 download_models.py && \
-    sed -i 's|cpu|cuda|g' /root/magic-pdf.json"
+RUN /bin/bash -c "mineru-models-download -s huggingface -m all"
 
 # Set the entry point to activate the virtual environment and run the command line tool
-ENTRYPOINT ["/bin/bash", "-c", "source /opt/mineru_venv/bin/activate && exec \"$@\"", "--"]
+ENTRYPOINT ["/bin/bash", "-c", "export MINERU_MODEL_SOURCE=local && exec \"$@\"", "--"]
\ No newline at end of file
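After this commit, the model download step is the only place the two Dockerfiles differ in substance: `-s modelscope` in `docker/china` and `-s huggingface` in `docker/global`. A rough sketch of that mapping follows, assuming a hypothetical helper named `models_download_cmd` that only prints the command taken from the diff:

```shell
# Hypothetical helper (not part of MinerU): print the model download command
# used by each Dockerfile variant in this commit.
models_download_cmd() {
  case "$1" in
    china)  src=modelscope ;;    # docker/china/Dockerfile
    global) src=huggingface ;;   # docker/global/Dockerfile
    *) echo "unknown variant: $1" >&2; return 1 ;;
  esac
  printf 'mineru-models-download -s %s -m all\n' "$src"
}

models_download_cmd china
models_download_cmd global
```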