index.rst 37.3 KB
Newer Older
1
Transformers
Sylvain Gugger's avatar
Sylvain Gugger committed
2
=======================================================================================================================
3

4
State-of-the-art Natural Language Processing for Jax, Pytorch and TensorFlow
thomwolf's avatar
thomwolf committed
5

6
7
🤗 Transformers (formerly known as `pytorch-transformers` and `pytorch-pretrained-bert`) provides general-purpose
architectures (BERT, GPT-2, RoBERTa, XLM, DistilBert, XLNet...) for Natural Language Understanding (NLU) and Natural
8
9
Language Generation (NLG) with over 32+ pretrained models in 100+ languages and deep interoperability between Jax,
PyTorch and TensorFlow.
10
11

This is the documentation of our repository `transformers <https://github.com/huggingface/transformers>`_.
12

LysandreJik's avatar
LysandreJik committed
13
Features
Sylvain Gugger's avatar
Sylvain Gugger committed
14
-----------------------------------------------------------------------------------------------------------------------
LysandreJik's avatar
LysandreJik committed
15
16
17
18

- High performance on NLU and NLG tasks
- Low barrier to entry for educators and practitioners

LysandreJik's avatar
LysandreJik committed
19
20
State-of-the-art NLP for everyone:

LysandreJik's avatar
LysandreJik committed
21
22
23
24
- Deep learning researchers
- Hands-on practitioners
- AI/ML/NLP teachers and educators

25
..
Sylvain Gugger's avatar
Sylvain Gugger committed
26
27
28
29
30
31
32
33
34
35
36
    Copyright 2020 The HuggingFace Team. All rights reserved.

    Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
    the License. You may obtain a copy of the License at

        http://www.apache.org/licenses/LICENSE-2.0

    Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
    an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
    specific language governing permissions and limitations under the License.

LysandreJik's avatar
LysandreJik committed
37
38
Lower compute costs, smaller carbon footprint:

LysandreJik's avatar
LysandreJik committed
39
40
41
42
- Researchers can share trained models instead of always retraining
- Practitioners can reduce compute time and production costs
- 8 architectures with over 30 pretrained models, some in more than 100 languages

LysandreJik's avatar
LysandreJik committed
43
44
Choose the right framework for every part of a model's lifetime:

LysandreJik's avatar
LysandreJik committed
45
- Train state-of-the-art models in 3 lines of code
46
47
- Deep interoperability between Jax, Pytorch and TensorFlow models
- Move a single model between Jax/PyTorch/TensorFlow frameworks at will
LysandreJik's avatar
LysandreJik committed
48
49
- Seamlessly pick the right framework for training, evaluation, production

50
The support for Jax is still experimental (with a few models right now), expect to see it grow in the coming months!
Sylvain Gugger's avatar
Sylvain Gugger committed
51

52
53
54
55
56
57
58
59
`All the model checkpoints <https://huggingface.co/models>`__ are seamlessly integrated from the huggingface.co `model
hub <https://huggingface.co>`__ where they are uploaded directly by `users <https://huggingface.co/users>`__ and
`organizations <https://huggingface.co/organizations>`__.

Current number of checkpoints: |checkpoints|

.. |checkpoints| image:: https://img.shields.io/endpoint?url=https://huggingface.co/api/shields/models&color=brightgreen

LysandreJik's avatar
LysandreJik committed
60
Contents
Sylvain Gugger's avatar
Sylvain Gugger committed
61
-----------------------------------------------------------------------------------------------------------------------
LysandreJik's avatar
LysandreJik committed
62

Sylvain Gugger's avatar
Sylvain Gugger committed
63
64
65
66
The documentation is organized in five parts:

- **GET STARTED** contains a quick tour, the installation instructions and some useful information about our philosophy
  and a glossary.
Sylvain Gugger's avatar
Sylvain Gugger committed
67
- **USING 🤗 TRANSFORMERS** contains general tutorials on how to use the library.
Sylvain Gugger's avatar
Sylvain Gugger committed
68
- **ADVANCED GUIDES** contains more advanced guides that are more specific to a given script or part of the library.
Santiago Castro's avatar
Santiago Castro committed
69
- **RESEARCH** focuses on tutorials that have less to do with how to use the library but more about general research in
Sylvain Gugger's avatar
Sylvain Gugger committed
70
  transformers model
71
- The three last section contain the documentation of each public class and function, grouped in:
Sylvain Gugger's avatar
Sylvain Gugger committed
72

73
74
75
    - **MAIN CLASSES** for the main classes exposing the important APIs of the library.
    - **MODELS** for the classes and functions related to each model implemented in the library.
    - **INTERNAL HELPERS** for the classes and functions we use internally.
Sylvain Gugger's avatar
Sylvain Gugger committed
76

77
78
The library currently contains Jax, PyTorch and Tensorflow implementations, pretrained model weights, usage scripts and
conversion utilities for the following models:
79

80
81
82
..
    This list is updated automatically from the README with `make fix-copies`. Do not update manually!

83
84
85
86
87
88
1. :doc:`ALBERT <model_doc/albert>` (from Google Research and the Toyota Technological Institute at Chicago) released
   with the paper `ALBERT: A Lite BERT for Self-supervised Learning of Language Representations
   <https://arxiv.org/abs/1909.11942>`__, by Zhenzhong Lan, Mingda Chen, Sebastian Goodman, Kevin Gimpel, Piyush
   Sharma, Radu Soricut.
2. :doc:`BART <model_doc/bart>` (from Facebook) released with the paper `BART: Denoising Sequence-to-Sequence
   Pre-training for Natural Language Generation, Translation, and Comprehension
89
90
   <https://arxiv.org/pdf/1910.13461.pdf>`__ by Mike Lewis, Yinhan Liu, Naman Goyal, Marjan Ghazvininejad, Abdelrahman
   Mohamed, Omer Levy, Ves Stoyanov and Luke Zettlemoyer.
91
92
93
94
3. :doc:`BARThez <model_doc/barthez>` (from École polytechnique) released with the paper `BARThez: a Skilled Pretrained
   French Sequence-to-Sequence Model <https://arxiv.org/abs/2010.12321>`__ by Moussa Kamal Eddine, Antoine J.-P.
   Tixier, Michalis Vazirgiannis.
4. :doc:`BERT <model_doc/bert>` (from Google) released with the paper `BERT: Pre-training of Deep Bidirectional
95
96
   Transformers for Language Understanding <https://arxiv.org/abs/1810.04805>`__ by Jacob Devlin, Ming-Wei Chang,
   Kenton Lee and Kristina Toutanova.
97
5. :doc:`BERT For Sequence Generation <model_doc/bertgeneration>` (from Google) released with the paper `Leveraging
98
99
   Pre-trained Checkpoints for Sequence Generation Tasks <https://arxiv.org/abs/1907.12461>`__ by Sascha Rothe, Shashi
   Narayan, Aliaksei Severyn.
Vasudev Gupta's avatar
Vasudev Gupta committed
100
101
102
6. :doc:`BigBird-RoBERTa <model_doc/bigbird>` (from Google Research) released with the paper `Big Bird: Transformers
   for Longer Sequences <https://arxiv.org/abs/2007.14062>`__ by Manzil Zaheer, Guru Guruganesh, Avinava Dubey, Joshua
   Ainslie, Chris Alberti, Santiago Ontanon, Philip Pham, Anirudh Ravula, Qifan Wang, Li Yang, Amr Ahmed.
Vasudev Gupta's avatar
Vasudev Gupta committed
103
104
105
106
7. :doc:`BigBird-Pegasus <model_doc/bigbird_pegasus>` (from Google Research) released with the paper `Big Bird:
   Transformers for Longer Sequences <https://arxiv.org/abs/2007.14062>`__ by Manzil Zaheer, Guru Guruganesh, Avinava
   Dubey, Joshua Ainslie, Chris Alberti, Santiago Ontanon, Philip Pham, Anirudh Ravula, Qifan Wang, Li Yang, Amr Ahmed.
8. :doc:`Blenderbot <model_doc/blenderbot>` (from Facebook) released with the paper `Recipes for building an
Lysandre's avatar
Lysandre committed
107
108
   open-domain chatbot <https://arxiv.org/abs/2004.13637>`__ by Stephen Roller, Emily Dinan, Naman Goyal, Da Ju, Mary
   Williamson, Yinhan Liu, Jing Xu, Myle Ott, Kurt Shuster, Eric M. Smith, Y-Lan Boureau, Jason Weston.
Vasudev Gupta's avatar
Vasudev Gupta committed
109
9. :doc:`BlenderbotSmall <model_doc/blenderbot_small>` (from Facebook) released with the paper `Recipes for building an
110
111
   open-domain chatbot <https://arxiv.org/abs/2004.13637>`__ by Stephen Roller, Emily Dinan, Naman Goyal, Da Ju, Mary
   Williamson, Yinhan Liu, Jing Xu, Myle Ott, Kurt Shuster, Eric M. Smith, Y-Lan Boureau, Jason Weston.
Vasudev Gupta's avatar
Vasudev Gupta committed
112
113
114
10. :doc:`BORT <model_doc/bort>` (from Alexa) released with the paper `Optimal Subarchitecture Extraction For BERT
    <https://arxiv.org/abs/2010.10499>`__ by Adrian de Wynter and Daniel J. Perry.
11. :doc:`CamemBERT <model_doc/camembert>` (from Inria/Facebook/Sorbonne) released with the paper `CamemBERT: a Tasty
Vasudev Gupta's avatar
Vasudev Gupta committed
115
116
    French Language Model <https://arxiv.org/abs/1911.03894>`__ by Louis Martin*, Benjamin Muller*, Pedro Javier Ortiz
    Suárez*, Yoann Dupont, Laurent Romary, Éric Villemonte de la Clergerie, Djamé Seddah and Benoît Sagot.
Vasudev Gupta's avatar
Vasudev Gupta committed
117
12. :doc:`ConvBERT <model_doc/convbert>` (from YituTech) released with the paper `ConvBERT: Improving BERT with
Stefan Schweter's avatar
Stefan Schweter committed
118
119
    Span-based Dynamic Convolution <https://arxiv.org/abs/2008.02496>`__ by Zihang Jiang, Weihao Yu, Daquan Zhou,
    Yunpeng Chen, Jiashi Feng, Shuicheng Yan.
Vasudev Gupta's avatar
Vasudev Gupta committed
120
13. :doc:`CPM <model_doc/cpm>` (from Tsinghua University) released with the paper `CPM: A Large-scale Generative
121
122
123
124
    Chinese Pre-trained Language Model <https://arxiv.org/abs/2012.00413>`__ by Zhengyan Zhang, Xu Han, Hao Zhou, Pei
    Ke, Yuxian Gu, Deming Ye, Yujia Qin, Yusheng Su, Haozhe Ji, Jian Guan, Fanchao Qi, Xiaozhi Wang, Yanan Zheng,
    Guoyang Zeng, Huanqi Cao, Shengqi Chen, Daixuan Li, Zhenbo Sun, Zhiyuan Liu, Minlie Huang, Wentao Han, Jie Tang,
    Juanzi Li, Xiaoyan Zhu, Maosong Sun.
Vasudev Gupta's avatar
Vasudev Gupta committed
125
14. :doc:`CTRL <model_doc/ctrl>` (from Salesforce) released with the paper `CTRL: A Conditional Transformer Language
abhishek thakur's avatar
abhishek thakur committed
126
127
    Model for Controllable Generation <https://arxiv.org/abs/1909.05858>`__ by Nitish Shirish Keskar*, Bryan McCann*,
    Lav R. Varshney, Caiming Xiong and Richard Socher.
Vasudev Gupta's avatar
Vasudev Gupta committed
128
15. :doc:`DeBERTa <model_doc/deberta>` (from Microsoft) released with the paper `DeBERTa: Decoding-enhanced BERT with
Lysandre Debut's avatar
Lysandre Debut committed
129
130
    Disentangled Attention <https://arxiv.org/abs/2006.03654>`__ by Pengcheng He, Xiaodong Liu, Jianfeng Gao, Weizhu
    Chen.
Vasudev Gupta's avatar
Vasudev Gupta committed
131
16. :doc:`DeBERTa-v2 <model_doc/deberta_v2>` (from Microsoft) released with the paper `DeBERTa: Decoding-enhanced BERT
Lysandre's avatar
Lysandre committed
132
133
    with Disentangled Attention <https://arxiv.org/abs/2006.03654>`__ by Pengcheng He, Xiaodong Liu, Jianfeng Gao,
    Weizhu Chen.
Vasudev Gupta's avatar
Vasudev Gupta committed
134
17. :doc:`DeiT <model_doc/deit>` (from Facebook) released with the paper `Training data-efficient image transformers &
NielsRogge's avatar
NielsRogge committed
135
136
    distillation through attention <https://arxiv.org/abs/2012.12877>`__ by Hugo Touvron, Matthieu Cord, Matthijs
    Douze, Francisco Massa, Alexandre Sablayrolles, Hervé Jégou.
Vasudev Gupta's avatar
Vasudev Gupta committed
137
18. :doc:`DialoGPT <model_doc/dialogpt>` (from Microsoft Research) released with the paper `DialoGPT: Large-Scale
138
139
    Generative Pre-training for Conversational Response Generation <https://arxiv.org/abs/1911.00536>`__ by Yizhe
    Zhang, Siqi Sun, Michel Galley, Yen-Chun Chen, Chris Brockett, Xiang Gao, Jianfeng Gao, Jingjing Liu, Bill Dolan.
Vasudev Gupta's avatar
Vasudev Gupta committed
140
19. :doc:`DistilBERT <model_doc/distilbert>` (from HuggingFace), released together with the paper `DistilBERT, a
141
142
143
144
145
146
    distilled version of BERT: smaller, faster, cheaper and lighter <https://arxiv.org/abs/1910.01108>`__ by Victor
    Sanh, Lysandre Debut and Thomas Wolf. The same method has been applied to compress GPT2 into `DistilGPT2
    <https://github.com/huggingface/transformers/tree/master/examples/distillation>`__, RoBERTa into `DistilRoBERTa
    <https://github.com/huggingface/transformers/tree/master/examples/distillation>`__, Multilingual BERT into
    `DistilmBERT <https://github.com/huggingface/transformers/tree/master/examples/distillation>`__ and a German
    version of DistilBERT.
Vasudev Gupta's avatar
Vasudev Gupta committed
147
20. :doc:`DPR <model_doc/dpr>` (from Facebook) released with the paper `Dense Passage Retrieval for Open-Domain
148
149
    Question Answering <https://arxiv.org/abs/2004.04906>`__ by Vladimir Karpukhin, Barlas Oğuz, Sewon Min, Patrick
    Lewis, Ledell Wu, Sergey Edunov, Danqi Chen, and Wen-tau Yih.
Vasudev Gupta's avatar
Vasudev Gupta committed
150
21. :doc:`ELECTRA <model_doc/electra>` (from Google Research/Stanford University) released with the paper `ELECTRA:
151
152
    Pre-training text encoders as discriminators rather than generators <https://arxiv.org/abs/2003.10555>`__ by Kevin
    Clark, Minh-Thang Luong, Quoc V. Le, Christopher D. Manning.
Vasudev Gupta's avatar
Vasudev Gupta committed
153
22. :doc:`FlauBERT <model_doc/flaubert>` (from CNRS) released with the paper `FlauBERT: Unsupervised Language Model
154
155
    Pre-training for French <https://arxiv.org/abs/1912.05372>`__ by Hang Le, Loïc Vial, Jibril Frej, Vincent Segonne,
    Maximin Coavoux, Benjamin Lecouteux, Alexandre Allauzen, Benoît Crabbé, Laurent Besacier, Didier Schwab.
Vasudev Gupta's avatar
Vasudev Gupta committed
156
23. :doc:`Funnel Transformer <model_doc/funnel>` (from CMU/Google Brain) released with the paper `Funnel-Transformer:
157
158
    Filtering out Sequential Redundancy for Efficient Language Processing <https://arxiv.org/abs/2006.03236>`__ by
    Zihang Dai, Guokun Lai, Yiming Yang, Quoc V. Le.
Vasudev Gupta's avatar
Vasudev Gupta committed
159
24. :doc:`GPT <model_doc/gpt>` (from OpenAI) released with the paper `Improving Language Understanding by Generative
160
161
    Pre-Training <https://blog.openai.com/language-unsupervised/>`__ by Alec Radford, Karthik Narasimhan, Tim Salimans
    and Ilya Sutskever.
Vasudev Gupta's avatar
Vasudev Gupta committed
162
25. :doc:`GPT-2 <model_doc/gpt2>` (from OpenAI) released with the paper `Language Models are Unsupervised Multitask
163
164
    Learners <https://blog.openai.com/better-language-models/>`__ by Alec Radford*, Jeffrey Wu*, Rewon Child, David
    Luan, Dario Amodei** and Ilya Sutskever**.
Vasudev Gupta's avatar
Vasudev Gupta committed
165
26. :doc:`GPT Neo <model_doc/gpt_neo>` (from EleutherAI) released in the repository `EleutherAI/gpt-neo
Suraj Patil's avatar
Suraj Patil committed
166
    <https://github.com/EleutherAI/gpt-neo>`__ by Sid Black, Stella Biderman, Leo Gao, Phil Wang and Connor Leahy.
Vasudev Gupta's avatar
Vasudev Gupta committed
167
27. :doc:`I-BERT <model_doc/ibert>` (from Berkeley) released with the paper `I-BERT: Integer-only BERT Quantization
Lysandre's avatar
Lysandre committed
168
    <https://arxiv.org/abs/2101.01321>`__ by Sehoon Kim, Amir Gholami, Zhewei Yao, Michael W. Mahoney, Kurt Keutzer
Vasudev Gupta's avatar
Vasudev Gupta committed
169
28. :doc:`LayoutLM <model_doc/layoutlm>` (from Microsoft Research Asia) released with the paper `LayoutLM: Pre-training
170
171
    of Text and Layout for Document Image Understanding <https://arxiv.org/abs/1912.13318>`__ by Yiheng Xu, Minghao Li,
    Lei Cui, Shaohan Huang, Furu Wei, Ming Zhou.
Vasudev Gupta's avatar
Vasudev Gupta committed
172
29. :doc:`LED <model_doc/led>` (from AllenAI) released with the paper `Longformer: The Long-Document Transformer
173
    <https://arxiv.org/abs/2004.05150>`__ by Iz Beltagy, Matthew E. Peters, Arman Cohan.
Vasudev Gupta's avatar
Vasudev Gupta committed
174
30. :doc:`Longformer <model_doc/longformer>` (from AllenAI) released with the paper `Longformer: The Long-Document
175
    Transformer <https://arxiv.org/abs/2004.05150>`__ by Iz Beltagy, Matthew E. Peters, Arman Cohan.
Vasudev Gupta's avatar
Vasudev Gupta committed
176
31. :doc:`LUKE <model_doc/luke>` (from Studio Ousia) released with the paper `LUKE: Deep Contextualized Entity
NielsRogge's avatar
NielsRogge committed
177
178
    Representations with Entity-aware Self-attention <https://arxiv.org/abs/2010.01057>`__ by Ikuya Yamada, Akari Asai,
    Hiroyuki Shindo, Hideaki Takeda, Yuji Matsumoto.
Vasudev Gupta's avatar
Vasudev Gupta committed
179
32. :doc:`LXMERT <model_doc/lxmert>` (from UNC Chapel Hill) released with the paper `LXMERT: Learning Cross-Modality
180
181
    Encoder Representations from Transformers for Open-Domain Question Answering <https://arxiv.org/abs/1908.07490>`__
    by Hao Tan and Mohit Bansal.
Vasudev Gupta's avatar
Vasudev Gupta committed
182
33. :doc:`M2M100 <model_doc/m2m_100>` (from Facebook) released with the paper `Beyond English-Centric Multilingual
Suraj Patil's avatar
Suraj Patil committed
183
184
185
    Machine Translation <https://arxiv.org/abs/2010.11125>`__ by by Angela Fan, Shruti Bhosale, Holger Schwenk, Zhiyi
    Ma, Ahmed El-Kishky, Siddharth Goyal, Mandeep Baines, Onur Celebi, Guillaume Wenzek, Vishrav Chaudhary, Naman
    Goyal, Tom Birch, Vitaliy Liptchinsky, Sergey Edunov, Edouard Grave, Michael Auli, Armand Joulin.
Vasudev Gupta's avatar
Vasudev Gupta committed
186
34. :doc:`MarianMT <model_doc/marian>` Machine translation models trained using `OPUS <http://opus.nlpl.eu/>`__ data by
187
188
    Jörg Tiedemann. The `Marian Framework <https://marian-nmt.github.io/>`__ is being developed by the Microsoft
    Translator Team.
Vasudev Gupta's avatar
Vasudev Gupta committed
189
35. :doc:`MBart <model_doc/mbart>` (from Facebook) released with the paper `Multilingual Denoising Pre-training for
190
191
    Neural Machine Translation <https://arxiv.org/abs/2001.08210>`__ by Yinhan Liu, Jiatao Gu, Naman Goyal, Xian Li,
    Sergey Edunov, Marjan Ghazvininejad, Mike Lewis, Luke Zettlemoyer.
Vasudev Gupta's avatar
Vasudev Gupta committed
192
36. :doc:`MBart-50 <model_doc/mbart>` (from Facebook) released with the paper `Multilingual Translation with Extensible
Suraj Patil's avatar
Suraj Patil committed
193
194
    Multilingual Pretraining and Finetuning <https://arxiv.org/abs/2008.00401>`__ by Yuqing Tang, Chau Tran, Xian Li,
    Peng-Jen Chen, Naman Goyal, Vishrav Chaudhary, Jiatao Gu, Angela Fan.
Vasudev Gupta's avatar
Vasudev Gupta committed
195
37. :doc:`Megatron-BERT <model_doc/megatron_bert>` (from NVIDIA) released with the paper `Megatron-LM: Training
196
197
    Multi-Billion Parameter Language Models Using Model Parallelism <https://arxiv.org/abs/1909.08053>`__ by Mohammad
    Shoeybi, Mostofa Patwary, Raul Puri, Patrick LeGresley, Jared Casper and Bryan Catanzaro.
Vasudev Gupta's avatar
Vasudev Gupta committed
198
38. :doc:`Megatron-GPT2 <model_doc/megatron_gpt2>` (from NVIDIA) released with the paper `Megatron-LM: Training
199
200
    Multi-Billion Parameter Language Models Using Model Parallelism <https://arxiv.org/abs/1909.08053>`__ by Mohammad
    Shoeybi, Mostofa Patwary, Raul Puri, Patrick LeGresley, Jared Casper and Bryan Catanzaro.
Vasudev Gupta's avatar
Vasudev Gupta committed
201
39. :doc:`MPNet <model_doc/mpnet>` (from Microsoft Research) released with the paper `MPNet: Masked and Permuted
StillKeepTry's avatar
StillKeepTry committed
202
203
    Pre-training for Language Understanding <https://arxiv.org/abs/2004.09297>`__ by Kaitao Song, Xu Tan, Tao Qin,
    Jianfeng Lu, Tie-Yan Liu.
Vasudev Gupta's avatar
Vasudev Gupta committed
204
40. :doc:`MT5 <model_doc/mt5>` (from Google AI) released with the paper `mT5: A massively multilingual pre-trained
Patrick von Platen's avatar
Patrick von Platen committed
205
206
    text-to-text transformer <https://arxiv.org/abs/2010.11934>`__ by Linting Xue, Noah Constant, Adam Roberts, Mihir
    Kale, Rami Al-Rfou, Aditya Siddhant, Aditya Barua, Colin Raffel.
Vasudev Gupta's avatar
Vasudev Gupta committed
207
41. :doc:`Pegasus <model_doc/pegasus>` (from Google) released with the paper `PEGASUS: Pre-training with Extracted
208
209
    Gap-sentences for Abstractive Summarization <https://arxiv.org/abs/1912.08777>`__> by Jingqing Zhang, Yao Zhao,
    Mohammad Saleh and Peter J. Liu.
Vasudev Gupta's avatar
Vasudev Gupta committed
210
42. :doc:`ProphetNet <model_doc/prophetnet>` (from Microsoft Research) released with the paper `ProphetNet: Predicting
Lysandre's avatar
Lysandre committed
211
212
    Future N-gram for Sequence-to-Sequence Pre-training <https://arxiv.org/abs/2001.04063>`__ by Yu Yan, Weizhen Qi,
    Yeyun Gong, Dayiheng Liu, Nan Duan, Jiusheng Chen, Ruofei Zhang and Ming Zhou.
Vasudev Gupta's avatar
Vasudev Gupta committed
213
43. :doc:`Reformer <model_doc/reformer>` (from Google Research) released with the paper `Reformer: The Efficient
214
    Transformer <https://arxiv.org/abs/2001.04451>`__ by Nikita Kitaev, Łukasz Kaiser, Anselm Levskaya.
Vasudev Gupta's avatar
Vasudev Gupta committed
215
44. :doc:`RoBERTa <model_doc/roberta>` (from Facebook), released together with the paper a `Robustly Optimized BERT
216
    Pretraining Approach <https://arxiv.org/abs/1907.11692>`__ by Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar
217
    Joshi, Danqi Chen, Omer Levy, Mike Lewis, Luke Zettlemoyer, Veselin Stoyanov.
Vasudev Gupta's avatar
Vasudev Gupta committed
218
45. :doc:`SpeechToTextTransformer <model_doc/speech_to_text>` (from Facebook), released together with the paper
Suraj Patil's avatar
Suraj Patil committed
219
220
    `fairseq S2T: Fast Speech-to-Text Modeling with fairseq <https://arxiv.org/abs/2010.05171>`__ by Changhan Wang, Yun
    Tang, Xutai Ma, Anne Wu, Dmytro Okhonko, Juan Pino.
Vasudev Gupta's avatar
Vasudev Gupta committed
221
46. :doc:`SqueezeBert <model_doc/squeezebert>` released with the paper `SqueezeBERT: What can computer vision teach NLP
Lysandre's avatar
Lysandre committed
222
223
    about efficient neural networks? <https://arxiv.org/abs/2006.11316>`__ by Forrest N. Iandola, Albert E. Shaw, Ravi
    Krishna, and Kurt W. Keutzer.
Vasudev Gupta's avatar
Vasudev Gupta committed
224
47. :doc:`T5 <model_doc/t5>` (from Google AI) released with the paper `Exploring the Limits of Transfer Learning with a
225
226
    Unified Text-to-Text Transformer <https://arxiv.org/abs/1910.10683>`__ by Colin Raffel and Noam Shazeer and Adam
    Roberts and Katherine Lee and Sharan Narang and Michael Matena and Yanqi Zhou and Wei Li and Peter J. Liu.
Vasudev Gupta's avatar
Vasudev Gupta committed
227
48. :doc:`TAPAS <model_doc/tapas>` (from Google AI) released with the paper `TAPAS: Weakly Supervised Table Parsing via
Lysandre's avatar
Lysandre committed
228
229
    Pre-training <https://arxiv.org/abs/2004.02349>`__ by Jonathan Herzig, Paweł Krzysztof Nowak, Thomas Müller,
    Francesco Piccinno and Julian Martin Eisenschlos.
Vasudev Gupta's avatar
Vasudev Gupta committed
230
49. :doc:`Transformer-XL <model_doc/transformerxl>` (from Google/CMU) released with the paper `Transformer-XL:
231
232
    Attentive Language Models Beyond a Fixed-Length Context <https://arxiv.org/abs/1901.02860>`__ by Zihang Dai*,
    Zhilin Yang*, Yiming Yang, Jaime Carbonell, Quoc V. Le, Ruslan Salakhutdinov.
Vasudev Gupta's avatar
Vasudev Gupta committed
233
50. :doc:`Vision Transformer (ViT) <model_doc/vit>` (from Google AI) released with the paper `An Image is Worth 16x16
234
235
236
    Words: Transformers for Image Recognition at Scale <https://arxiv.org/abs/2010.11929>`__ by Alexey Dosovitskiy,
    Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias
    Minderer, Georg Heigold, Sylvain Gelly, Jakob Uszkoreit, Neil Houlsby.
Vasudev Gupta's avatar
Vasudev Gupta committed
237
51. :doc:`Wav2Vec2 <model_doc/wav2vec2>` (from Facebook AI) released with the paper `wav2vec 2.0: A Framework for
Patrick von Platen's avatar
Patrick von Platen committed
238
239
    Self-Supervised Learning of Speech Representations <https://arxiv.org/abs/2006.11477>`__ by Alexei Baevski, Henry
    Zhou, Abdelrahman Mohamed, Michael Auli.
Vasudev Gupta's avatar
Vasudev Gupta committed
240
52. :doc:`XLM <model_doc/xlm>` (from Facebook) released together with the paper `Cross-lingual Language Model
241
    Pretraining <https://arxiv.org/abs/1901.07291>`__ by Guillaume Lample and Alexis Conneau.
Vasudev Gupta's avatar
Vasudev Gupta committed
242
53. :doc:`XLM-ProphetNet <model_doc/xlmprophetnet>` (from Microsoft Research) released with the paper `ProphetNet:
Lysandre's avatar
Lysandre committed
243
244
    Predicting Future N-gram for Sequence-to-Sequence Pre-training <https://arxiv.org/abs/2001.04063>`__ by Yu Yan,
    Weizhen Qi, Yeyun Gong, Dayiheng Liu, Nan Duan, Jiusheng Chen, Ruofei Zhang and Ming Zhou.
Vasudev Gupta's avatar
Vasudev Gupta committed
245
54. :doc:`XLM-RoBERTa <model_doc/xlmroberta>` (from Facebook AI), released together with the paper `Unsupervised
246
247
248
    Cross-lingual Representation Learning at Scale <https://arxiv.org/abs/1911.02116>`__ by Alexis Conneau*, Kartikay
    Khandelwal*, Naman Goyal, Vishrav Chaudhary, Guillaume Wenzek, Francisco Guzmán, Edouard Grave, Myle Ott, Luke
    Zettlemoyer and Veselin Stoyanov.
Vasudev Gupta's avatar
Vasudev Gupta committed
249
55. :doc:`XLNet <model_doc/xlnet>` (from Google/CMU) released with the paper `​XLNet: Generalized Autoregressive
250
251
    Pretraining for Language Understanding <https://arxiv.org/abs/1906.08237>`__ by Zhilin Yang*, Zihang Dai*, Yiming
    Yang, Jaime Carbonell, Ruslan Salakhutdinov, Quoc V. Le.
Vasudev Gupta's avatar
Vasudev Gupta committed
252
56. :doc:`XLSR-Wav2Vec2 <model_doc/xlsr_wav2vec2>` (from Facebook AI) released with the paper `Unsupervised
253
254
    Cross-Lingual Representation Learning For Speech Recognition <https://arxiv.org/abs/2006.13979>`__ by Alexis
    Conneau, Alexei Baevski, Ronan Collobert, Abdelrahman Mohamed, Michael Auli.
LysandreJik's avatar
LysandreJik committed
255

Sylvain Gugger's avatar
Sylvain Gugger committed
256

257
258
.. _bigtable:

Sylvain Gugger's avatar
Sylvain Gugger committed
259
The table below represents the current support in the library for each of those models, whether they have a Python
260
261
tokenizer (called "slow"). A "fast" tokenizer backed by the 🤗 Tokenizers library, whether they have support in Jax (via
Flax), PyTorch, and/or TensorFlow.
Sylvain Gugger's avatar
Sylvain Gugger committed
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278

..
    This table is updated automatically from the auto modules with `make fix-copies`. Do not update manually!

.. rst-class:: center-aligned-table

+-----------------------------+----------------+----------------+-----------------+--------------------+--------------+
|            Model            | Tokenizer slow | Tokenizer fast | PyTorch support | TensorFlow support | Flax Support |
+=============================+================+================+=================+====================+==============+
|           ALBERT            |       ✅       |       ✅       |       ✅        |         ✅         |      ❌      |
+-----------------------------+----------------+----------------+-----------------+--------------------+--------------+
|            BART             |       ✅       |       ✅       |       ✅        |         ✅         |      ❌      |
+-----------------------------+----------------+----------------+-----------------+--------------------+--------------+
|            BERT             |       ✅       |       ✅       |       ✅        |         ✅         |      ✅      |
+-----------------------------+----------------+----------------+-----------------+--------------------+--------------+
|       Bert Generation       |       ✅       |       ❌       |       ✅        |         ❌         |      ❌      |
+-----------------------------+----------------+----------------+-----------------+--------------------+--------------+
Vasudev Gupta's avatar
Vasudev Gupta committed
279
280
|           BigBird           |       ✅       |       ❌       |       ✅        |         ❌         |      ❌      |
+-----------------------------+----------------+----------------+-----------------+--------------------+--------------+
Vasudev Gupta's avatar
Vasudev Gupta committed
281
282
|       BigBirdPegasus        |       ❌       |       ❌       |       ✅        |         ❌         |      ❌      |
+-----------------------------+----------------+----------------+-----------------+--------------------+--------------+
Sylvain Gugger's avatar
Sylvain Gugger committed
283
284
|         Blenderbot          |       ✅       |       ❌       |       ✅        |         ✅         |      ❌      |
+-----------------------------+----------------+----------------+-----------------+--------------------+--------------+
285
|       BlenderbotSmall       |       ✅       |       ❌       |       ✅        |         ✅         |      ❌      |
286
+-----------------------------+----------------+----------------+-----------------+--------------------+--------------+
Sylvain Gugger's avatar
Sylvain Gugger committed
287
288
289
290
|            CTRL             |       ✅       |       ❌       |       ✅        |         ✅         |      ❌      |
+-----------------------------+----------------+----------------+-----------------+--------------------+--------------+
|          CamemBERT          |       ✅       |       ✅       |       ✅        |         ✅         |      ❌      |
+-----------------------------+----------------+----------------+-----------------+--------------------+--------------+
abhishek thakur's avatar
abhishek thakur committed
291
292
|          ConvBERT           |       ✅       |       ✅       |       ✅        |         ✅         |      ❌      |
+-----------------------------+----------------+----------------+-----------------+--------------------+--------------+
Sylvain Gugger's avatar
Sylvain Gugger committed
293
294
|             DPR             |       ✅       |       ✅       |       ✅        |         ✅         |      ❌      |
+-----------------------------+----------------+----------------+-----------------+--------------------+--------------+
295
|           DeBERTa           |       ✅       |       ✅       |       ✅        |         ❌         |      ❌      |
Sylvain Gugger's avatar
Sylvain Gugger committed
296
+-----------------------------+----------------+----------------+-----------------+--------------------+--------------+
297
298
|         DeBERTa-v2          |       ✅       |       ❌       |       ✅        |         ❌         |      ❌      |
+-----------------------------+----------------+----------------+-----------------+--------------------+--------------+
NielsRogge's avatar
NielsRogge committed
299
300
|            DeiT             |       ❌       |       ❌       |       ✅        |         ❌         |      ❌      |
+-----------------------------+----------------+----------------+-----------------+--------------------+--------------+
Sylvain Gugger's avatar
Sylvain Gugger committed
301
302
|         DistilBERT          |       ✅       |       ✅       |       ✅        |         ✅         |      ❌      |
+-----------------------------+----------------+----------------+-----------------+--------------------+--------------+
303
|           ELECTRA           |       ✅       |       ✅       |       ✅        |         ✅         |      ✅      |
Sylvain Gugger's avatar
Sylvain Gugger committed
304
305
306
307
308
309
310
311
312
+-----------------------------+----------------+----------------+-----------------+--------------------+--------------+
|       Encoder decoder       |       ❌       |       ❌       |       ✅        |         ❌         |      ❌      |
+-----------------------------+----------------+----------------+-----------------+--------------------+--------------+
| FairSeq Machine-Translation |       ✅       |       ❌       |       ✅        |         ❌         |      ❌      |
+-----------------------------+----------------+----------------+-----------------+--------------------+--------------+
|          FlauBERT           |       ✅       |       ❌       |       ✅        |         ✅         |      ❌      |
+-----------------------------+----------------+----------------+-----------------+--------------------+--------------+
|     Funnel Transformer      |       ✅       |       ✅       |       ✅        |         ✅         |      ❌      |
+-----------------------------+----------------+----------------+-----------------+--------------------+--------------+
Suraj Patil's avatar
Suraj Patil committed
313
314
|           GPT Neo           |       ❌       |       ❌       |       ✅        |         ❌         |      ❌      |
+-----------------------------+----------------+----------------+-----------------+--------------------+--------------+
Sehoon Kim's avatar
Sehoon Kim committed
315
316
|           I-BERT            |       ❌       |       ❌       |       ✅        |         ❌         |      ❌      |
+-----------------------------+----------------+----------------+-----------------+--------------------+--------------+
Patrick von Platen's avatar
Patrick von Platen committed
317
318
|             LED             |       ✅       |       ✅       |       ✅        |         ✅         |      ❌      |
+-----------------------------+----------------+----------------+-----------------+--------------------+--------------+
NielsRogge's avatar
NielsRogge committed
319
320
|            LUKE             |       ✅       |       ❌       |       ✅        |         ❌         |      ❌      |
+-----------------------------+----------------+----------------+-----------------+--------------------+--------------+
Sylvain Gugger's avatar
Sylvain Gugger committed
321
322
|           LXMERT            |       ✅       |       ✅       |       ✅        |         ✅         |      ❌      |
+-----------------------------+----------------+----------------+-----------------+--------------------+--------------+
Amir Tahmasbi's avatar
Amir Tahmasbi committed
323
|          LayoutLM           |       ✅       |       ✅       |       ✅        |         ✅         |      ❌      |
Sylvain Gugger's avatar
Sylvain Gugger committed
324
325
326
+-----------------------------+----------------+----------------+-----------------+--------------------+--------------+
|         Longformer          |       ✅       |       ✅       |       ✅        |         ✅         |      ❌      |
+-----------------------------+----------------+----------------+-----------------+--------------------+--------------+
Suraj Patil's avatar
Suraj Patil committed
327
328
|           M2M100            |       ✅       |       ❌       |       ✅        |         ❌         |      ❌      |
+-----------------------------+----------------+----------------+-----------------+--------------------+--------------+
StillKeepTry's avatar
StillKeepTry committed
329
330
|            MPNet            |       ✅       |       ✅       |       ✅        |         ✅         |      ❌      |
+-----------------------------+----------------+----------------+-----------------+--------------------+--------------+
Sylvain Gugger's avatar
Sylvain Gugger committed
331
332
|           Marian            |       ✅       |       ❌       |       ✅        |         ✅         |      ❌      |
+-----------------------------+----------------+----------------+-----------------+--------------------+--------------+
333
334
|        MegatronBert         |       ❌       |       ❌       |       ✅        |         ❌         |      ❌      |
+-----------------------------+----------------+----------------+-----------------+--------------------+--------------+
Sylvain Gugger's avatar
Sylvain Gugger committed
335
336
337
338
339
340
341
342
343
344
|         MobileBERT          |       ✅       |       ✅       |       ✅        |         ✅         |      ❌      |
+-----------------------------+----------------+----------------+-----------------+--------------------+--------------+
|         OpenAI GPT          |       ✅       |       ✅       |       ✅        |         ✅         |      ❌      |
+-----------------------------+----------------+----------------+-----------------+--------------------+--------------+
|        OpenAI GPT-2         |       ✅       |       ✅       |       ✅        |         ✅         |      ❌      |
+-----------------------------+----------------+----------------+-----------------+--------------------+--------------+
|           Pegasus           |       ✅       |       ✅       |       ✅        |         ✅         |      ❌      |
+-----------------------------+----------------+----------------+-----------------+--------------------+--------------+
|         ProphetNet          |       ✅       |       ❌       |       ✅        |         ❌         |      ❌      |
+-----------------------------+----------------+----------------+-----------------+--------------------+--------------+
Ratthachat (Jung)'s avatar
Ratthachat (Jung) committed
345
|             RAG             |       ✅       |       ❌       |       ✅        |         ✅         |      ❌      |
Sylvain Gugger's avatar
Sylvain Gugger committed
346
347
348
349
350
351
352
+-----------------------------+----------------+----------------+-----------------+--------------------+--------------+
|          Reformer           |       ✅       |       ✅       |       ✅        |         ❌         |      ❌      |
+-----------------------------+----------------+----------------+-----------------+--------------------+--------------+
|          RetriBERT          |       ✅       |       ✅       |       ✅        |         ❌         |      ❌      |
+-----------------------------+----------------+----------------+-----------------+--------------------+--------------+
|           RoBERTa           |       ✅       |       ✅       |       ✅        |         ✅         |      ✅      |
+-----------------------------+----------------+----------------+-----------------+--------------------+--------------+
Suraj Patil's avatar
Suraj Patil committed
353
354
|         Speech2Text         |       ✅       |       ❌       |       ✅        |         ❌         |      ❌      |
+-----------------------------+----------------+----------------+-----------------+--------------------+--------------+
Sylvain Gugger's avatar
Sylvain Gugger committed
355
356
357
358
|         SqueezeBERT         |       ✅       |       ✅       |       ✅        |         ❌         |      ❌      |
+-----------------------------+----------------+----------------+-----------------+--------------------+--------------+
|             T5              |       ✅       |       ✅       |       ✅        |         ✅         |      ❌      |
+-----------------------------+----------------+----------------+-----------------+--------------------+--------------+
NielsRogge's avatar
NielsRogge committed
359
360
|            TAPAS            |       ✅       |       ❌       |       ✅        |         ❌         |      ❌      |
+-----------------------------+----------------+----------------+-----------------+--------------------+--------------+
Sylvain Gugger's avatar
Sylvain Gugger committed
361
362
|       Transformer-XL        |       ✅       |       ❌       |       ✅        |         ✅         |      ❌      |
+-----------------------------+----------------+----------------+-----------------+--------------------+--------------+
363
364
|             ViT             |       ❌       |       ❌       |       ✅        |         ❌         |      ❌      |
+-----------------------------+----------------+----------------+-----------------+--------------------+--------------+
Patrick von Platen's avatar
Patrick von Platen committed
365
366
|          Wav2Vec2           |       ✅       |       ❌       |       ✅        |         ❌         |      ❌      |
+-----------------------------+----------------+----------------+-----------------+--------------------+--------------+
Sylvain Gugger's avatar
Sylvain Gugger committed
367
368
369
370
371
372
373
374
375
376
377
378
379
|             XLM             |       ✅       |       ❌       |       ✅        |         ✅         |      ❌      |
+-----------------------------+----------------+----------------+-----------------+--------------------+--------------+
|         XLM-RoBERTa         |       ✅       |       ✅       |       ✅        |         ✅         |      ❌      |
+-----------------------------+----------------+----------------+-----------------+--------------------+--------------+
|        XLMProphetNet        |       ✅       |       ❌       |       ✅        |         ❌         |      ❌      |
+-----------------------------+----------------+----------------+-----------------+--------------------+--------------+
|            XLNet            |       ✅       |       ✅       |       ✅        |         ✅         |      ❌      |
+-----------------------------+----------------+----------------+-----------------+--------------------+--------------+
|            mBART            |       ✅       |       ✅       |       ✅        |         ✅         |      ❌      |
+-----------------------------+----------------+----------------+-----------------+--------------------+--------------+
|             mT5             |       ✅       |       ✅       |       ✅        |         ✅         |      ❌      |
+-----------------------------+----------------+----------------+-----------------+--------------------+--------------+

380
381
.. toctree::
    :maxdepth: 2
382
    :caption: Get started
383

Sylvain Gugger's avatar
Sylvain Gugger committed
384
    quicktour
385
    installation
Sylvain Gugger's avatar
Sylvain Gugger committed
386
    philosophy
Lysandre's avatar
Lysandre committed
387
    glossary
388
389
390

.. toctree::
    :maxdepth: 2
Sylvain Gugger's avatar
Sylvain Gugger committed
391
    :caption: Using 🤗 Transformers
392

Sylvain Gugger's avatar
Sylvain Gugger committed
393
394
    task_summary
    model_summary
Sylvain Gugger's avatar
Sylvain Gugger committed
395
    preprocessing
396
    training
397
    model_sharing
Sylvain Gugger's avatar
Sylvain Gugger committed
398
    tokenizer_summary
399
400
401
402
403
404
405
    multilingual

.. toctree::
    :maxdepth: 2
    :caption: Advanced guides

    pretrained_models
406
    examples
407
    troubleshooting
408
    custom_datasets
409
    notebooks
410
    sagemaker
411
    community
412
    converting_tensorflow_models
413
    migration
414
    contributing
415
    add_new_model
416
    fast_tokenizers
417
    testing
418
    debugging
Funtowicz Morgan's avatar
Funtowicz Morgan committed
419
    serialization
420
421
422
423
424
425

.. toctree::
    :maxdepth: 2
    :caption: Research

    bertology
426
    perplexity
427
    benchmarks
428

thomwolf's avatar
thomwolf committed
429
430
.. toctree::
    :maxdepth: 2
431
    :caption: Main Classes
thomwolf's avatar
thomwolf committed
432

Sylvain Gugger's avatar
Sylvain Gugger committed
433
    main_classes/callback
thomwolf's avatar
thomwolf committed
434
    main_classes/configuration
435
    main_classes/data_collator
436
    main_classes/logging
thomwolf's avatar
thomwolf committed
437
438
    main_classes/model
    main_classes/optimizer_schedules
439
440
    main_classes/output
    main_classes/pipelines
LysandreJik's avatar
LysandreJik committed
441
    main_classes/processors
442
443
    main_classes/tokenizer
    main_classes/trainer
444
    main_classes/feature_extractor
445
446
447
448
449
450

.. toctree::
    :maxdepth: 2
    :caption: Models

    model_doc/albert
thomwolf's avatar
thomwolf committed
451
    model_doc/auto
452
    model_doc/bart
453
    model_doc/barthez
454
    model_doc/bert
455
    model_doc/bertweet
456
    model_doc/bertgeneration
457
    model_doc/bert_japanese
Vasudev Gupta's avatar
Vasudev Gupta committed
458
    model_doc/bigbird
Vasudev Gupta's avatar
Vasudev Gupta committed
459
    model_doc/bigbird_pegasus
Sam Shleifer's avatar
Sam Shleifer committed
460
    model_doc/blenderbot
461
    model_doc/blenderbot_small
Stefan Schweter's avatar
Stefan Schweter committed
462
    model_doc/bort
Lysandre's avatar
Lysandre committed
463
    model_doc/camembert
abhishek thakur's avatar
abhishek thakur committed
464
    model_doc/convbert
465
    model_doc/cpm
466
    model_doc/ctrl
Pengcheng He's avatar
Pengcheng He committed
467
    model_doc/deberta
468
    model_doc/deberta_v2
NielsRogge's avatar
NielsRogge committed
469
    model_doc/deit
470
    model_doc/dialogpt
471
    model_doc/distilbert
Quentin Lhoest's avatar
Quentin Lhoest committed
472
    model_doc/dpr
473
474
475
    model_doc/electra
    model_doc/encoderdecoder
    model_doc/flaubert
476
    model_doc/fsmt
Sylvain Gugger's avatar
Sylvain Gugger committed
477
    model_doc/funnel
478
    model_doc/herbert
Sehoon Kim's avatar
Sehoon Kim committed
479
    model_doc/ibert
Minghao Li's avatar
Minghao Li committed
480
    model_doc/layoutlm
Patrick von Platen's avatar
Patrick von Platen committed
481
    model_doc/led
482
    model_doc/longformer
NielsRogge's avatar
NielsRogge committed
483
    model_doc/luke
484
485
    model_doc/lxmert
    model_doc/marian
Suraj Patil's avatar
Suraj Patil committed
486
    model_doc/m2m_100
487
    model_doc/mbart
488
489
    model_doc/megatron_bert
    model_doc/megatron_gpt2
490
    model_doc/mobilebert
StillKeepTry's avatar
StillKeepTry committed
491
    model_doc/mpnet
Patrick von Platen's avatar
Patrick von Platen committed
492
    model_doc/mt5
493
494
    model_doc/gpt
    model_doc/gpt2
Suraj Patil's avatar
Suraj Patil committed
495
    model_doc/gpt_neo
496
    model_doc/pegasus
497
    model_doc/phobert
Weizhen's avatar
Weizhen committed
498
    model_doc/prophetnet
Sylvain Gugger's avatar
Sylvain Gugger committed
499
    model_doc/rag
500
501
502
    model_doc/reformer
    model_doc/retribert
    model_doc/roberta
Suraj Patil's avatar
Suraj Patil committed
503
    model_doc/speech_to_text
504
    model_doc/squeezebert
505
    model_doc/t5
NielsRogge's avatar
NielsRogge committed
506
    model_doc/tapas
507
    model_doc/transformerxl
508
    model_doc/vit
Patrick von Platen's avatar
Patrick von Platen committed
509
    model_doc/wav2vec2
510
    model_doc/xlm
Weizhen's avatar
Weizhen committed
511
    model_doc/xlmprophetnet
512
513
    model_doc/xlmroberta
    model_doc/xlnet
514
    model_doc/xlsr_wav2vec2
515
516
517
518
519

.. toctree::
    :maxdepth: 2
    :caption: Internal Helpers

Sylvain Gugger's avatar
Sylvain Gugger committed
520
    internal/modeling_utils
521
    internal/pipelines_utils
522
    internal/tokenization_utils
Sylvain Gugger's avatar
Sylvain Gugger committed
523
    internal/trainer_utils
524
    internal/generation_utils
525
    internal/file_utils