PyTorch-Transformers
================================================================================================================================================

PyTorch-Transformers is a library of state-of-the-art pre-trained models for Natural Language Processing (NLP).

The library currently contains PyTorch implementations, pre-trained model weights, usage scripts and conversion utilities for the following models:

1. `BERT <https://github.com/google-research/bert>`_ (from Google) released with the paper `BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding <https://arxiv.org/abs/1810.04805>`_ by Jacob Devlin, Ming-Wei Chang, Kenton Lee and Kristina Toutanova.
2. `GPT <https://github.com/openai/finetune-transformer-lm>`_ (from OpenAI) released with the paper `Improving Language Understanding by Generative Pre-Training <https://blog.openai.com/language-unsupervised>`_ by Alec Radford, Karthik Narasimhan, Tim Salimans and Ilya Sutskever.
3. `GPT-2 <https://blog.openai.com/better-language-models>`_ (from OpenAI) released with the paper `Language Models are Unsupervised Multitask Learners <https://blog.openai.com/better-language-models>`_ by Alec Radford\*, Jeffrey Wu\*, Rewon Child, David Luan, Dario Amodei\*\* and Ilya Sutskever\*\*.
4. `Transformer-XL <https://github.com/kimiyoung/transformer-xl>`_ (from Google/CMU) released with the paper `Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context <https://arxiv.org/abs/1901.02860>`_ by Zihang Dai\*, Zhilin Yang\*, Yiming Yang, Jaime Carbonell, Quoc V. Le, Ruslan Salakhutdinov.
5. `XLNet <https://github.com/zihangdai/xlnet>`_ (from Google/CMU) released with the paper `XLNet: Generalized Autoregressive Pretraining for Language Understanding <https://arxiv.org/abs/1906.08237>`_ by Zhilin Yang\*, Zihang Dai\*, Yiming Yang, Jaime Carbonell, Ruslan Salakhutdinov, Quoc V. Le.
6. `XLM <https://github.com/facebookresearch/XLM>`_ (from Facebook) released together with the paper `Cross-lingual Language Model Pretraining <https://arxiv.org/abs/1901.07291>`_ by Guillaume Lample and Alexis Conneau.
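All of these models share the same high-level interface: a tokenizer and a model class, each instantiated with a ``from_pretrained()`` method that downloads and caches the pretrained weights. A minimal sketch using BERT (assumes ``pytorch-transformers`` is installed, e.g. via ``pip install pytorch-transformers``; the first call downloads the weights):

.. code-block:: python

    import torch
    from pytorch_transformers import BertModel, BertTokenizer

    # Load pretrained tokenizer and model (weights are downloaded and cached
    # on first use)
    tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
    model = BertModel.from_pretrained('bert-base-uncased')
    model.eval()  # disable dropout for deterministic inference

    # Encode a sentence into a batch of token ids (batch size 1)
    input_ids = torch.tensor([tokenizer.encode("Hello, my dog is cute")])

    with torch.no_grad():
        outputs = model(input_ids)

    # Models return tuples; the first element is the last hidden state,
    # of shape (batch_size, sequence_length, hidden_size)
    last_hidden_states = outputs[0]

The same pattern (swap ``Bert`` for ``Gpt2``, ``Xlnet``, etc. and change the pretrained model name) applies to the other architectures listed above; see the quickstart for details.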

.. toctree::
    :maxdepth: 2
    :caption: Notes

    installation
    quickstart
    pretrained_models
    examples
    notebooks
    serialization
    converting_tensorflow_models
    migration
    bertology
    torchscript

.. toctree::
    :maxdepth: 2
    :caption: Main classes

    main_classes/configuration
    main_classes/model
    main_classes/tokenizer
    main_classes/optimizer_schedules

.. toctree::
    :maxdepth: 2
    :caption: Package Reference

    model_doc/auto
    model_doc/bert
    model_doc/gpt
    model_doc/transformerxl
    model_doc/gpt2
    model_doc/xlm
    model_doc/xlnet