### 1. Download the Model from Hugging Face
Use a Python script to download the model files from Hugging Face:
```bash
pip install huggingface_hub
wget https://github.com/opendatalab/MinerU/raw/master/docs/download_models_hf.py
python download_models_hf.py
```

When the script finishes, it prints the directory where the models were downloaded.
### 2. Additional steps

#### 1. Check whether the model directory is downloaded completely.

The model folder should have the following structure, with configuration files and weight files for each component:
```
../
├── Layout
│   ├── config.json
│   └── model_final.pth
├── MFD
│   └── weights.pt
├── MFR
│   └── UniMERNet
│       ├── config.json
│       ├── preprocessor_config.json
│       ├── pytorch_model.bin
│       ├── README.md
│       ├── tokenizer_config.json
│       └── tokenizer.json
├── TabRec
│   ├── StructEqTable
│   │   ├── config.json
│   │   ├── generation_config.json
│   │   ├── model.safetensors
│   │   ├── preprocessor_config.json
│   │   ├── special_tokens_map.json
│   │   ├── spiece.model
│   │   ├── tokenizer.json
│   │   └── tokenizer_config.json
│   └── TableMaster
│       ├── ch_PP-OCRv3_det_infer
│       │   ├── inference.pdiparams
│       │   ├── inference.pdiparams.info
│       │   └── inference.pdmodel
│       ├── ch_PP-OCRv3_rec_infer
│       │   ├── inference.pdiparams
│       │   ├── inference.pdiparams.info
│       │   └── inference.pdmodel
│       ├── table_structure_tablemaster_infer
│       │   ├── inference.pdiparams
│       │   ├── inference.pdiparams.info
│       │   └── inference.pdmodel
│       ├── ppocr_keys_v1.txt
│       └── table_master_structure_dict.txt
└── README.md
```
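
The structure check can also be scripted. The following is a minimal sketch, not part of the official tooling: the function name `find_missing_files` and the list `EXPECTED_FILES` are hypothetical, and the list covers only a few representative files taken from the tree above. It reports any listed file that is absent or empty.

```python
from pathlib import Path

# A few representative paths copied from the directory tree above.
# This list is illustrative, not exhaustive; extend it as needed.
EXPECTED_FILES = [
    "Layout/config.json",
    "Layout/model_final.pth",
    "MFD/weights.pt",
    "MFR/UniMERNet/pytorch_model.bin",
    "TabRec/StructEqTable/model.safetensors",
    "TabRec/TableMaster/ppocr_keys_v1.txt",
]


def find_missing_files(models_dir, expected=EXPECTED_FILES):
    """Return the expected relative paths that are absent or empty under models_dir."""
    root = Path(models_dir)
    missing = []
    for rel in expected:
        path = root / rel
        # A zero-byte file usually means the download was interrupted.
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(rel)
    return missing
```

Calling `find_missing_files("/path/to/models")` with the directory printed by the download script returns an empty list when every listed file is present and non-empty.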
#### 2. Check whether the model file is fully downloaded.

Check that the size of each model file in the directory matches the size listed on the web page. If possible, also verify each file's SHA-256 checksum to confirm that it downloaded completely.
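
As a sketch of the checksum step, the snippet below computes a file's SHA-256 digest with Python's standard `hashlib` module and compares it with an expected value. The helper names `sha256_of_file` and `matches_checksum` are hypothetical, and the expected digest must be copied from wherever the model files are published; none is assumed here.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Compute the SHA-256 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Chunked reads keep memory use low for multi-gigabyte weight files.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected_hex):
    """Return True if the file's digest equals the published hex digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For example, `matches_checksum("MFD/weights.pt", published_digest)` returns `False` for a truncated or corrupted download, where `published_digest` is the value you copied from the download page.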

#### 3. Update the model path in the configuration file.

In `~/magic-pdf.json`, set the model directory path to the absolute path of the `models` directory printed by the download script. Otherwise, you will encounter an error indicating that the model cannot be loaded.
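
The configuration update can be done by hand in a text editor, or scripted along the lines of the sketch below. It assumes the model path is stored under a top-level key named `models-dir`; that key name is an assumption, so check your own `magic-pdf.json` and pass a different `key` if needed. The function name `set_models_dir` is hypothetical.

```python
import json
from pathlib import Path


def set_models_dir(config_path, models_dir, key="models-dir"):
    """Write the absolute models path into the JSON config, keeping other keys.

    The default key name "models-dir" is an assumption; verify it against
    your own configuration file.
    """
    config_path = Path(config_path).expanduser()
    config = {}
    if config_path.exists():
        config = json.loads(config_path.read_text(encoding="utf-8"))
    # Store an absolute path, as the configuration requires.
    config[key] = str(Path(models_dir).expanduser().resolve())
    config_path.write_text(
        json.dumps(config, indent=4, ensure_ascii=False), encoding="utf-8"
    )
    return config[key]
```

A call such as `set_models_dir("~/magic-pdf.json", "/path/to/models")` rewrites only the chosen key and leaves the other settings in the file as they were.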