XLM_ROBERTA_START_DOCSTRING=r""" The XLM-RoBERTa model was proposed in
`Unsupervised Cross-lingual Representation Learning at Scale`_
by Alexis Conneau, Kartikay Khandelwal, Naman Goyal, Vishrav Chaudhary, Guillaume Wenzek, Francisco Guzmán, Edouard Grave, Myle Ott, Luke Zettlemoyer and Veselin Stoyanov. It is based on Facebook's RoBERTa model released in 2019.
XLM_ROBERTA_START_DOCSTRING=r"""
It is a large multi-lingual language model, trained on 2.5TB of filtered CommonCrawl data.
.. note::

    This implementation is the same as RoBERTa.
This model is a `tf.keras.Model`_ sub-class. Use it as a regular TF 2.0 Keras Model and
refer to the TF 2.0 documentation for all matters related to general usage and behavior.
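For example, a minimal usage sketch (assuming the :obj:`TFXLMRobertaModel` class defined in this file,
the :obj:`XLMRobertaTokenizer` class, and the ``xlm-roberta-base`` checkpoint)::

    import tensorflow as tf
    from transformers import XLMRobertaTokenizer, TFXLMRobertaModel

    tokenizer = XLMRobertaTokenizer.from_pretrained('xlm-roberta-base')
    model = TFXLMRobertaModel.from_pretrained('xlm-roberta-base')

    # Encode a sentence and add a batch dimension (batch size 1)
    input_ids = tf.constant(tokenizer.encode("Hello, my dog is cute"))[None, :]

    outputs = model(input_ids)
    last_hidden_states = outputs[0]  # final hidden-states, shape (batch, seq_len, hidden_size)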
TF 2.0 models accept two formats as inputs:

- having all inputs as keyword arguments (like PyTorch models), or
- having all inputs as a list, tuple or dict in the first positional argument.
This second option is useful when using the :obj:`tf.keras.Model.fit()` method, which currently requires having
all the tensors in the first argument of the model call function: :obj:`model(inputs)`.
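Both formats are sketched below, reusing the :obj:`input_ids` tensor from the example above and a
hypothetical :obj:`attention_mask` built from it::

    attention_mask = tf.ones_like(input_ids)

    # Format 1: all inputs passed as keyword arguments, as with PyTorch models
    outputs = model(input_ids=input_ids, attention_mask=attention_mask)

    # Format 2: all inputs gathered in the first positional argument, here as a dict;
    # this is the form that works with tf.keras.Model.fit()
    outputs = model({"input_ids": input_ids, "attention_mask": attention_mask})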
.. _`Unsupervised Cross-lingual Representation Learning at Scale`:
    https://arxiv.org/abs/1911.02116

.. _`tf.keras.Model`:
    https://www.tensorflow.org/api_docs/python/tf/keras/Model