ERNIE Model Summary
------------------------------------

The table below summarizes the pretrained weights for the ERNIE models currently supported by PaddleNLP. Follow the corresponding links for details of each model.

+----------------------------------------+----------+----------------------------------------+
| Pretrained Weight                      | Language | Details of the model                   |
+========================================+==========+========================================+
|``ernie-1.0-base-zh``                   | Chinese  | 12-layer, 768-hidden,                  |
|                                        |          | 12-heads, 108M parameters.             |
|                                        |          | Trained on Chinese text.               |
+----------------------------------------+----------+----------------------------------------+
|``ernie-1.0-base-zh-cw``                | Chinese  | 12-layer, 768-hidden,                  |
|                                        |          | 12-heads, 118M parameters.             |
|                                        |          | Trained on Chinese text.               |
+----------------------------------------+----------+----------------------------------------+
|``ernie-1.0-large-zh-cw``               | Chinese  | 24-layer, 1024-hidden,                 |
|                                        |          | 16-heads, 272M parameters.             |
|                                        |          | Trained on Chinese text.               |
+----------------------------------------+----------+----------------------------------------+
|``ernie-tiny``                          | Chinese  | 3-layer, 1024-hidden,                  |
|                                        |          | 16-heads, _M parameters.               |
|                                        |          | Trained on Chinese text.               |
+----------------------------------------+----------+----------------------------------------+
|``ernie-2.0-base-en``                   | English  | 12-layer, 768-hidden,                  |
|                                        |          | 12-heads, 103M parameters.             |
|                                        |          | Trained on lower-cased English text.   |
+----------------------------------------+----------+----------------------------------------+
|``ernie-2.0-base-en-finetuned-squad``   | English  | 12-layer, 768-hidden,                  |
|                                        |          | 12-heads, 110M parameters.             |
|                                        |          | Finetuned on SQuAD text.               |
+----------------------------------------+----------+----------------------------------------+
|``ernie-2.0-large-en``                  | English  | 24-layer, 1024-hidden,                 |
|                                        |          | 16-heads, 336M parameters.             |
|                                        |          | Trained on lower-cased English text.   |
+----------------------------------------+----------+----------------------------------------+
|``ernie-3.0-xbase-zh``                  | Chinese  | 20-layer, 1024-hidden,                 |
|                                        |          | 16-heads, 296M parameters.             |
|                                        |          | Trained on Chinese text.               |
+----------------------------------------+----------+----------------------------------------+
|``ernie-3.0-base-zh``                   | Chinese  | 12-layer, 768-hidden,                  |
|                                        |          | 12-heads, 118M parameters.             |
|                                        |          | Trained on Chinese text.               |
+----------------------------------------+----------+----------------------------------------+
|``ernie-3.0-medium-zh``                 | Chinese  | 6-layer, 768-hidden,                   |
|                                        |          | 12-heads, 75M parameters.              |
|                                        |          | Trained on Chinese text.               |
+----------------------------------------+----------+----------------------------------------+
|``ernie-3.0-mini-zh``                   | Chinese  | 6-layer, 384-hidden,                   |
|                                        |          | 12-heads, 27M parameters.              |
|                                        |          | Trained on Chinese text.               |
+----------------------------------------+----------+----------------------------------------+
|``ernie-3.0-micro-zh``                  | Chinese  | 4-layer, 384-hidden,                   |
|                                        |          | 12-heads, 23M parameters.              |
|                                        |          | Trained on Chinese text.               |
+----------------------------------------+----------+----------------------------------------+
|``ernie-3.0-nano-zh``                   | Chinese  | 4-layer, 312-hidden,                   |
|                                        |          | 12-heads, 18M parameters.              |
|                                        |          | Trained on Chinese text.               |
+----------------------------------------+----------+----------------------------------------+
|``rocketqa-base-cross-encoder``         | Chinese  | 12-layer, 768-hidden,                  |
|                                        |          | 12-heads, 118M parameters.             |
|                                        |          | Trained on DuReader retrieval text.    |
+----------------------------------------+----------+----------------------------------------+
|``rocketqa-medium-cross-encoder``       | Chinese  | 6-layer, 768-hidden,                   |
|                                        |          | 12-heads, 75M parameters.              |
|                                        |          | Trained on DuReader retrieval text.    |
+----------------------------------------+----------+----------------------------------------+
|``rocketqa-mini-cross-encoder``         | Chinese  | 6-layer, 384-hidden,                   |
|                                        |          | 12-heads, 27M parameters.              |
|                                        |          | Trained on DuReader retrieval text.    |
+----------------------------------------+----------+----------------------------------------+
|``rocketqa-micro-cross-encoder``        | Chinese  | 4-layer, 384-hidden,                   |
|                                        |          | 12-heads, 23M parameters.              |
|                                        |          | Trained on DuReader retrieval text.    |
+----------------------------------------+----------+----------------------------------------+
|``rocketqa-nano-cross-encoder``         | Chinese  | 4-layer, 312-hidden,                   |
|                                        |          | 12-heads, 18M parameters.              |
|                                        |          | Trained on DuReader retrieval text.    |
+----------------------------------------+----------+----------------------------------------+
|``rocketqa-zh-base-query-encoder``      | Chinese  | 12-layer, 768-hidden,                  |
|                                        |          | 12-heads, 118M parameters.             |
|                                        |          | Trained on DuReader retrieval text.    |
+----------------------------------------+----------+----------------------------------------+
|``rocketqa-zh-base-para-encoder``       | Chinese  | 12-layer, 768-hidden,                  |
|                                        |          | 12-heads, 118M parameters.             |
|                                        |          | Trained on DuReader retrieval text.    |
+----------------------------------------+----------+----------------------------------------+
|``rocketqa-zh-medium-query-encoder``    | Chinese  | 6-layer, 768-hidden,                   |
|                                        |          | 12-heads, 75M parameters.              |
|                                        |          | Trained on DuReader retrieval text.    |
+----------------------------------------+----------+----------------------------------------+
|``rocketqa-zh-medium-para-encoder``     | Chinese  | 6-layer, 768-hidden,                   |
|                                        |          | 12-heads, 75M parameters.              |
|                                        |          | Trained on DuReader retrieval text.    |
+----------------------------------------+----------+----------------------------------------+
|``rocketqa-zh-mini-query-encoder``      | Chinese  | 6-layer, 384-hidden,                   |
|                                        |          | 12-heads, 27M parameters.              |
|                                        |          | Trained on DuReader retrieval text.    |
+----------------------------------------+----------+----------------------------------------+
|``rocketqa-zh-mini-para-encoder``       | Chinese  | 6-layer, 384-hidden,                   |
|                                        |          | 12-heads, 27M parameters.              |
|                                        |          | Trained on DuReader retrieval text.    |
+----------------------------------------+----------+----------------------------------------+
|``rocketqa-zh-micro-query-encoder``     | Chinese  | 4-layer, 384-hidden,                   |
|                                        |          | 12-heads, 23M parameters.              |
|                                        |          | Trained on DuReader retrieval text.    |
+----------------------------------------+----------+----------------------------------------+
|``rocketqa-zh-micro-para-encoder``      | Chinese  | 4-layer, 384-hidden,                   |
|                                        |          | 12-heads, 23M parameters.              |
|                                        |          | Trained on DuReader retrieval text.    |
+----------------------------------------+----------+----------------------------------------+
|``rocketqa-zh-nano-query-encoder``      | Chinese  | 4-layer, 312-hidden,                   |
|                                        |          | 12-heads, 18M parameters.              |
|                                        |          | Trained on DuReader retrieval text.    |
+----------------------------------------+----------+----------------------------------------+
|``rocketqa-zh-nano-para-encoder``       | Chinese  | 4-layer, 312-hidden,                   |
|                                        |          | 12-heads, 18M parameters.              |
|                                        |          | Trained on DuReader retrieval text.    |
+----------------------------------------+----------+----------------------------------------+
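Any name in the first column can be passed directly to PaddleNLP's standard ``from_pretrained`` loaders. As a minimal sketch, the size figures from the ERNIE 3.0 Chinese rows can drive model selection under a depth budget; the ``ERNIE3_ZH_SIZES`` catalog and ``pick_variant`` helper below are illustrative, not part of PaddleNLP.

```python
# (layers, hidden size, attention heads) for the ERNIE 3.0 Chinese size
# variants, copied from the table above. A helper like this can pick the
# largest variant that still fits a given depth budget.
ERNIE3_ZH_SIZES = {
    "ernie-3.0-xbase-zh":  (20, 1024, 16),
    "ernie-3.0-base-zh":   (12, 768, 12),
    "ernie-3.0-medium-zh": (6, 768, 12),
    "ernie-3.0-mini-zh":   (6, 384, 12),
    "ernie-3.0-micro-zh":  (4, 384, 12),
    "ernie-3.0-nano-zh":   (4, 312, 12),
}

def pick_variant(max_layers):
    """Return the largest listed ERNIE 3.0 variant with at most `max_layers` layers."""
    candidates = [
        (layers, hidden, name)
        for name, (layers, hidden, _heads) in ERNIE3_ZH_SIZES.items()
        if layers <= max_layers
    ]
    # Tuple comparison prefers more layers, then a wider hidden size.
    return max(candidates)[2]

# With a budget of at most 6 layers this selects "ernie-3.0-medium-zh",
# which can then be loaded as usual (requires `pip install paddlenlp`;
# weights are downloaded on first use):
#
#   from paddlenlp.transformers import ErnieModel, ErnieTokenizer
#   name = pick_variant(6)
#   tokenizer = ErnieTokenizer.from_pretrained(name)
#   model = ErnieModel.from_pretrained(name)
```

The same loading pattern applies to every weight in the table, including the RocketQA query/para encoders and cross-encoders.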