

------------------------------------
ERNIE Model Summary
------------------------------------



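The table below summarizes the pretrained weights for the ERNIE models currently supported by PaddleNLP.
For details on a specific model, see the corresponding link.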

+----------------------------------------------------------------------------------+--------------+----------------------------------------------------------------------------------+
| Pretrained Weight                                                                | Language     | Details of the model                                                             |
+==================================================================================+==============+==================================================================================+
|``ernie-1.0-base-zh``                                                             | Chinese      | 12-layer, 768-hidden,                                                            |
|                                                                                  |              | 12-heads, 108M parameters.                                                       |
|                                                                                  |              | Trained on Chinese text.                                                         |
+----------------------------------------------------------------------------------+--------------+----------------------------------------------------------------------------------+
|``ernie-1.0-base-zh-cw``                                                          | Chinese      | 12-layer, 768-hidden,                                                            |
|                                                                                  |              | 12-heads, 118M parameters.                                                       |
|                                                                                  |              | Trained on Chinese text.                                                         |
+----------------------------------------------------------------------------------+--------------+----------------------------------------------------------------------------------+
|``ernie-1.0-large-zh-cw``                                                         | Chinese      | 24-layer, 1024-hidden,                                                           |
|                                                                                  |              | 16-heads, 272M parameters.                                                       |
|                                                                                  |              | Trained on Chinese text.                                                         |
+----------------------------------------------------------------------------------+--------------+----------------------------------------------------------------------------------+
|``ernie-tiny``                                                                    | Chinese      | 3-layer, 1024-hidden,                                                            |
|                                                                                  |              | 16-heads, _M parameters.                                                         |
|                                                                                  |              | Trained on Chinese text.                                                         |
+----------------------------------------------------------------------------------+--------------+----------------------------------------------------------------------------------+
|``ernie-2.0-base-en``                                                             | English      | 12-layer, 768-hidden,                                                            |
|                                                                                  |              | 12-heads, 103M parameters.                                                       |
|                                                                                  |              | Trained on lower-cased English text.                                             |
+----------------------------------------------------------------------------------+--------------+----------------------------------------------------------------------------------+
|``ernie-2.0-base-en-finetuned-squad``                                             | English      | 12-layer, 768-hidden,                                                            |
|                                                                                  |              | 12-heads, 110M parameters.                                                       |
|                                                                                  |              | Fine-tuned on SQuAD.                                                             |
+----------------------------------------------------------------------------------+--------------+----------------------------------------------------------------------------------+
|``ernie-2.0-large-en``                                                            | English      | 24-layer, 1024-hidden,                                                           |
|                                                                                  |              | 16-heads, 336M parameters.                                                       |
|                                                                                  |              | Trained on lower-cased English text.                                             |
+----------------------------------------------------------------------------------+--------------+----------------------------------------------------------------------------------+
|``ernie-3.0-xbase-zh``                                                            | Chinese      | 20-layer, 1024-hidden,                                                           |
|                                                                                  |              | 16-heads, 296M parameters.                                                       |
|                                                                                  |              | Trained on Chinese text.                                                         |
+----------------------------------------------------------------------------------+--------------+----------------------------------------------------------------------------------+
|``ernie-3.0-base-zh``                                                             | Chinese      | 12-layer, 768-hidden,                                                            |
|                                                                                  |              | 12-heads, 118M parameters.                                                       |
|                                                                                  |              | Trained on Chinese text.                                                         |
+----------------------------------------------------------------------------------+--------------+----------------------------------------------------------------------------------+
|``ernie-3.0-medium-zh``                                                           | Chinese      | 6-layer, 768-hidden,                                                             |
|                                                                                  |              | 12-heads, 75M parameters.                                                        |
|                                                                                  |              | Trained on Chinese text.                                                         |
+----------------------------------------------------------------------------------+--------------+----------------------------------------------------------------------------------+
|``ernie-3.0-mini-zh``                                                             | Chinese      | 6-layer, 384-hidden,                                                             |
|                                                                                  |              | 12-heads, 27M parameters.                                                        |
|                                                                                  |              | Trained on Chinese text.                                                         |
+----------------------------------------------------------------------------------+--------------+----------------------------------------------------------------------------------+
|``ernie-3.0-micro-zh``                                                            | Chinese      | 4-layer, 384-hidden,                                                             |
|                                                                                  |              | 12-heads, 23M parameters.                                                        |
|                                                                                  |              | Trained on Chinese text.                                                         |
+----------------------------------------------------------------------------------+--------------+----------------------------------------------------------------------------------+
|``ernie-3.0-nano-zh``                                                             | Chinese      | 4-layer, 312-hidden,                                                             |
|                                                                                  |              | 12-heads, 18M parameters.                                                        |
|                                                                                  |              | Trained on Chinese text.                                                         |
+----------------------------------------------------------------------------------+--------------+----------------------------------------------------------------------------------+
|``rocketqa-base-cross-encoder``                                                   | Chinese      | 12-layer, 768-hidden,                                                            |
|                                                                                  |              | 12-heads, 118M parameters.                                                       |
|                                                                                  |              | Trained on DuReader retrieval text.                                              |
+----------------------------------------------------------------------------------+--------------+----------------------------------------------------------------------------------+
|``rocketqa-medium-cross-encoder``                                                 | Chinese      | 6-layer, 768-hidden,                                                             |
|                                                                                  |              | 12-heads, 75M parameters.                                                        |
|                                                                                  |              | Trained on DuReader retrieval text.                                              |
+----------------------------------------------------------------------------------+--------------+----------------------------------------------------------------------------------+
|``rocketqa-mini-cross-encoder``                                                   | Chinese      | 6-layer, 384-hidden,                                                             |
|                                                                                  |              | 12-heads, 27M parameters.                                                        |
|                                                                                  |              | Trained on DuReader retrieval text.                                              |
+----------------------------------------------------------------------------------+--------------+----------------------------------------------------------------------------------+
|``rocketqa-micro-cross-encoder``                                                  | Chinese      | 4-layer, 384-hidden,                                                             |
|                                                                                  |              | 12-heads, 23M parameters.                                                        |
|                                                                                  |              | Trained on DuReader retrieval text.                                              |
+----------------------------------------------------------------------------------+--------------+----------------------------------------------------------------------------------+
|``rocketqa-nano-cross-encoder``                                                   | Chinese      | 4-layer, 312-hidden,                                                             |
|                                                                                  |              | 12-heads, 18M parameters.                                                        |
|                                                                                  |              | Trained on DuReader retrieval text.                                              |
+----------------------------------------------------------------------------------+--------------+----------------------------------------------------------------------------------+
|``rocketqa-zh-base-query-encoder``                                                | Chinese      | 12-layer, 768-hidden,                                                            |
|                                                                                  |              | 12-heads, 118M parameters.                                                       |
|                                                                                  |              | Trained on DuReader retrieval text.                                              |
+----------------------------------------------------------------------------------+--------------+----------------------------------------------------------------------------------+
|``rocketqa-zh-base-para-encoder``                                                 | Chinese      | 12-layer, 768-hidden,                                                            |
|                                                                                  |              | 12-heads, 118M parameters.                                                       |
|                                                                                  |              | Trained on DuReader retrieval text.                                              |
+----------------------------------------------------------------------------------+--------------+----------------------------------------------------------------------------------+
|``rocketqa-zh-medium-query-encoder``                                              | Chinese      | 6-layer, 768-hidden,                                                             |
|                                                                                  |              | 12-heads, 75M parameters.                                                        |
|                                                                                  |              | Trained on DuReader retrieval text.                                              |
+----------------------------------------------------------------------------------+--------------+----------------------------------------------------------------------------------+
|``rocketqa-zh-medium-para-encoder``                                               | Chinese      | 6-layer, 768-hidden,                                                             |
|                                                                                  |              | 12-heads, 75M parameters.                                                        |
|                                                                                  |              | Trained on DuReader retrieval text.                                              |
+----------------------------------------------------------------------------------+--------------+----------------------------------------------------------------------------------+
|``rocketqa-zh-mini-query-encoder``                                                | Chinese      | 6-layer, 384-hidden,                                                             |
|                                                                                  |              | 12-heads, 27M parameters.                                                        |
|                                                                                  |              | Trained on DuReader retrieval text.                                              |
+----------------------------------------------------------------------------------+--------------+----------------------------------------------------------------------------------+
|``rocketqa-zh-mini-para-encoder``                                                 | Chinese      | 6-layer, 384-hidden,                                                             |
|                                                                                  |              | 12-heads, 27M parameters.                                                        |
|                                                                                  |              | Trained on DuReader retrieval text.                                              |
+----------------------------------------------------------------------------------+--------------+----------------------------------------------------------------------------------+
|``rocketqa-zh-micro-query-encoder``                                               | Chinese      | 4-layer, 384-hidden,                                                             |
|                                                                                  |              | 12-heads, 23M parameters.                                                        |
|                                                                                  |              | Trained on DuReader retrieval text.                                              |
+----------------------------------------------------------------------------------+--------------+----------------------------------------------------------------------------------+
|``rocketqa-zh-micro-para-encoder``                                                | Chinese      | 4-layer, 384-hidden,                                                             |
|                                                                                  |              | 12-heads, 23M parameters.                                                        |
|                                                                                  |              | Trained on DuReader retrieval text.                                              |
+----------------------------------------------------------------------------------+--------------+----------------------------------------------------------------------------------+
|``rocketqa-zh-nano-query-encoder``                                                | Chinese      | 4-layer, 312-hidden,                                                             |
|                                                                                  |              | 12-heads, 18M parameters.                                                        |
|                                                                                  |              | Trained on DuReader retrieval text.                                              |
+----------------------------------------------------------------------------------+--------------+----------------------------------------------------------------------------------+
|``rocketqa-zh-nano-para-encoder``                                                 | Chinese      | 4-layer, 312-hidden,                                                             |
|                                                                                  |              | 12-heads, 18M parameters.                                                        |
|                                                                                  |              | Trained on DuReader retrieval text.                                              |
+----------------------------------------------------------------------------------+--------------+----------------------------------------------------------------------------------+
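
As a rough illustration of how the table might be used, the sketch below copies the ERNIE 3.0 rows into a small lookup and picks the largest weight that fits a parameter budget. The names ``ErnieSpec``, ``ERNIE_SPECS``, ``largest_within_budget`` and ``load_pretrained`` are hypothetical helpers introduced here, not part of PaddleNLP. The loading step assumes ``paddlenlp`` is installed and that the weight name is accepted by ``AutoModel.from_pretrained`` and ``AutoTokenizer.from_pretrained``; it downloads the weights on first use.

```python
from typing import NamedTuple, Optional


class ErnieSpec(NamedTuple):
    """One table row: layers, hidden size, attention heads, parameters (millions)."""
    layers: int
    hidden: int
    heads: int
    params_m: int


# ERNIE 3.0 rows copied from the table above (not the full list of weights).
ERNIE_SPECS = {
    "ernie-3.0-xbase-zh": ErnieSpec(20, 1024, 16, 296),
    "ernie-3.0-base-zh": ErnieSpec(12, 768, 12, 118),
    "ernie-3.0-medium-zh": ErnieSpec(6, 768, 12, 75),
    "ernie-3.0-mini-zh": ErnieSpec(6, 384, 12, 27),
    "ernie-3.0-micro-zh": ErnieSpec(4, 384, 12, 23),
    "ernie-3.0-nano-zh": ErnieSpec(4, 312, 12, 18),
}


def largest_within_budget(max_params_m: int) -> Optional[str]:
    """Return the listed weight with the most parameters that still fits the budget.

    Hypothetical helper: returns None when no listed weight is small enough.
    """
    fitting = {name: spec for name, spec in ERNIE_SPECS.items()
               if spec.params_m <= max_params_m}
    if not fitting:
        return None
    return max(fitting, key=lambda name: fitting[name].params_m)


def load_pretrained(name: str):
    """Load the tokenizer and model for a weight name.

    Assumes paddlenlp is installed; the import is kept local so the lookup
    helpers above work without it.
    """
    from paddlenlp.transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(name)
    model = AutoModel.from_pretrained(name)
    return tokenizer, model


if __name__ == "__main__":
    name = largest_within_budget(80)
    print(name)
    # tokenizer, model = load_pretrained(name)  # uncomment when paddlenlp is available
```

Keeping the lookup separate from the loading step means a weight can be chosen (for example, for a memory-constrained deployment) before any download is triggered.
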
.. _zhui/ernie-1.0-cluecorpussmall: https://github.com/PaddlePaddle/PaddleNLP/tree/develop/community/zhui/ernie-1.0-cluecorpussmall