---
language:
- bulgarian
- czech
- polish
- russian
---

# bert-base-bg-cs-pl-ru-cased

SlavicBERT\[1\] \(Slavic \(bg, cs, pl, ru\), cased, 12-layer, 768-hidden, 12-heads, 180M parameters\) was trained
on Russian News and four Wikipedias: Bulgarian, Czech, Polish, and Russian.
The subtoken vocabulary was built from this data, and SlavicBERT was initialized from Multilingual BERT.
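
A minimal loading sketch, assuming the checkpoint is published on the Hugging Face Hub under the id `DeepPavlov/bert-base-bg-cs-pl-ru-cased` (the organization prefix is an assumption; this README only gives the model name) and that the `transformers` and `torch` packages are installed. The helper names `load_slavic_bert` and `embed` are hypothetical and chosen for illustration only.

```python
# Assumed Hub id: the "DeepPavlov/" prefix is not stated in this README.
MODEL_ID = "DeepPavlov/bert-base-bg-cs-pl-ru-cased"


def load_slavic_bert(model_id: str = MODEL_ID):
    """Return (tokenizer, model). Downloads the weights on first use."""
    # Imported lazily so that importing this module needs no heavy dependencies.
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModel.from_pretrained(model_id)
    model.eval()  # inference mode: disables dropout
    return tokenizer, model


def embed(texts, tokenizer, model):
    """Return per-token hidden states, shape (batch, seq_len, hidden_size)."""
    import torch

    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():  # no gradients needed for feature extraction
        outputs = model(**batch)
    return outputs.last_hidden_state
```

With the model loaded via `tokenizer, model = load_slavic_bert()`, a call such as `embed(["Привет, мир!", "Dzień dobry"], tokenizer, model)` should return a tensor whose last dimension matches the 768-unit hidden size listed above.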


\[1\]: Arkhipov M., Trofimova M., Kuratov Y., Sorokin A. \(2019\).
[Tuning Multilingual Transformers for Language-Specific Named Entity Recognition](https://www.aclweb.org/anthology/W19-3712/).
ACL anthology W19-3712.