"git@developer.sourcefind.cn:chenpangpang/transformers.git" did not exist on "b97af8cce90aa2d147ca3fd9543ca372d3cb2ae3"
Commit a801c7fd, authored by Bayartsogt Yadamsuren, committed by GitHub

Creating a readme for ALBERT in Mongolian (#4603)

Here I am uploading a Mongolian masked language model (ALBERT) to your platform.
https://en.wikipedia.org/wiki/Mongolia
parent 6458c0e2
# ALBERT-Mongolian
[pretraining repo link](https://github.com/bayartsogt-ya/albert-mongolian)
## Model description
Here we provide a pretrained ALBERT model and a trained SentencePiece model for Mongolian text. The training data consists of the Mongolian Wikipedia corpus from Wikipedia Downloads and a Mongolian news corpus.
## Evaluation Results
```
loss = 1.7478163
masked_lm_accuracy = 0.6838185
masked_lm_loss = 1.6687671
sentence_order_accuracy = 0.998125
sentence_order_loss = 0.007942731
```
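The masked-LM loss above can be read as a perplexity. The snippet below is a minimal sketch that assumes the reported `masked_lm_loss` is a mean cross-entropy in nats (natural logarithm); this README does not state that, so treat the result as indicative only.

```python
import math

# Value copied from the evaluation output above.
masked_lm_loss = 1.6687671

# Assumption: the loss is a mean cross-entropy measured in nats,
# so perplexity is simply exp(loss).
masked_lm_perplexity = math.exp(masked_lm_loss)

print(f"masked LM perplexity ~ {masked_lm_perplexity:.2f}")
```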
## Fine-tuning Results on the Eduge Dataset
```
              precision    recall  f1-score   support

байгал орчин       0.83      0.76      0.80       483
   боловсрол       0.79      0.75      0.77       420
       спорт       0.98      0.96      0.97      1391
   технологи       0.85      0.83      0.84       543
     улс төр       0.88      0.87      0.87      1336
 урлаг соёл        0.89      0.94      0.91       726
       хууль       0.87      0.83      0.85       840
 эдийн засаг       0.80      0.84      0.82      1265
  эрүүл мэнд       0.84      0.90      0.87       562

    accuracy                           0.87      7566
   macro avg       0.86      0.85      0.86      7566
weighted avg       0.87      0.87      0.87      7566
```
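The `macro avg` and `weighted avg` rows can be recomputed from the per-class rows. The sketch below does so in plain Python, using the two-decimal values from the report, so the recomputed averages only match the report to within rounding. The helper names `macro_avg` and `weighted_avg` are illustrative and not part of the original fine-tuning code.

```python
# Per-class rows copied from the report above:
# label -> (precision, recall, f1-score, support)
rows = {
    "байгал орчин": (0.83, 0.76, 0.80, 483),
    "боловсрол": (0.79, 0.75, 0.77, 420),
    "спорт": (0.98, 0.96, 0.97, 1391),
    "технологи": (0.85, 0.83, 0.84, 543),
    "улс төр": (0.88, 0.87, 0.87, 1336),
    "урлаг соёл": (0.89, 0.94, 0.91, 726),
    "хууль": (0.87, 0.83, 0.85, 840),
    "эдийн засаг": (0.80, 0.84, 0.82, 1265),
    "эрүүл мэнд": (0.84, 0.90, 0.87, 562),
}

total_support = sum(r[3] for r in rows.values())


def macro_avg(metric_index: int) -> float:
    """Unweighted mean of one metric over all classes."""
    return sum(r[metric_index] for r in rows.values()) / len(rows)


def weighted_avg(metric_index: int) -> float:
    """Mean of one metric weighted by each class's support."""
    return sum(r[metric_index] * r[3] for r in rows.values()) / total_support


for name, idx in (("precision", 0), ("recall", 1), ("f1-score", 2)):
    print(f"{name:>9}  macro={macro_avg(idx):.2f}  weighted={weighted_avg(idx):.2f}")
print("total support:", total_support)
```

The macro average treats all nine classes equally, while the weighted average is dominated by the larger classes such as спорт, улс төр and эдийн засаг. Support-weighted recall equals overall accuracy, which is consistent with the 0.87 reported in the `accuracy` row.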
## Reference
1. [ALBERT - official repo](https://github.com/google-research/albert)
2. [WikiExtractor](https://github.com/attardi/wikiextractor)
3. [Mongolian BERT](https://github.com/tugstugi/mongolian-bert)
4. [ALBERT - Japanese](https://github.com/alinear-corp/albert-japanese)
5. [Mongolian Text Classification](https://github.com/sharavsambuu/mongolian-text-classification)
6. [You et al., "Large Batch Optimization for Deep Learning: Training BERT in 76 minutes"](https://arxiv.org/abs/1904.00962)
## Citation
```
@misc{albert-mongolian,
author = {Bayartsogt Yadamsuren},
title = {ALBERT Pretrained Model on Mongolian Datasets},
year = {2020},
publisher = {GitHub},
journal = {GitHub repository},
howpublished = {\url{https://github.com/bayartsogt-ya/albert-mongolian/}}
}
```
## For More Information
Please contact bayartsogtyadamsuren@icloud.com.