Tokenization behaves the same as the original XLM preprocessing for most languages except zh, ja and th; change the API to allow specifying the language in `tokenize`
@@ -10,3 +10,5 @@ requests
 regex
 # For XLNet
 sentencepiece
+# For XLM
+sacremoses
\ No newline at end of file
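
The API change described in the commit message (passing the language to `tokenize`) can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: the diff only shows the new `sacremoses` dependency, so the function names, the `lang` default, and both stand-in tokenizers below are assumptions. In particular, the per-character split is only a placeholder for whatever dedicated segmentation zh, ja and th actually receive.

```python
import re

# Languages the commit message lists as exceptions to the default path.
SPECIAL_LANGS = {"zh", "ja", "th"}


def default_tokenize(text):
    """Hypothetical stand-in for Moses-style tokenization:
    split runs of word characters from individual punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", text)


def char_tokenize(text):
    """Hypothetical placeholder for a real word segmenter:
    emit one token per non-whitespace character."""
    return [ch for ch in text if not ch.isspace()]


def tokenize(text, lang="en"):
    """Dispatch on the language code supplied by the caller."""
    if lang in SPECIAL_LANGS:
        return char_tokenize(text)
    return default_tokenize(text)


print(tokenize("Hello, world!"))
print(tokenize("你好 世界", lang="zh"))
```

Making the language an argument of `tokenize` rather than a constructor setting lets one tokenizer instance serve a multilingual corpus, at the cost of requiring callers to know the language of each input.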