BART
-----------------------------------------------------------------------------------------------------------------------

**DISCLAIMER:** If you see something strange, file a `GitHub Issue
<https://github.com/huggingface/transformers/issues/new?assignees=&labels=&template=bug-report.md&title>`__ and assign
@sshleifer

Overview
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The Bart model was proposed in `BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation,
Translation, and Comprehension <https://arxiv.org/abs/1910.13461>`__ by Mike Lewis, Yinhan Liu, Naman Goyal, Marjan
Ghazvininejad, Abdelrahman Mohamed, Omer Levy, Ves Stoyanov and Luke Zettlemoyer on 29 Oct, 2019.

According to the abstract:

- Bart uses a standard seq2seq/machine translation architecture with a bidirectional encoder (like BERT) and a
  left-to-right decoder (like GPT).
- The pretraining task involves randomly shuffling the order of the original sentences and a novel in-filling scheme,
  where spans of text are replaced with a single mask token.
- BART is particularly effective when fine-tuned for text generation but also works well for comprehension tasks. It
  matches the performance of RoBERTa with comparable training resources on GLUE and SQuAD, and achieves new
  state-of-the-art results on a range of abstractive dialogue, question answering, and summarization tasks, with gains
  of up to 6 ROUGE.

The authors' code can be found `here <https://github.com/pytorch/fairseq/tree/master/examples/bart>`__.


Examples
_______________________________________________________________________________________________________________________

- Examples and scripts for fine-tuning BART and other models for sequence to sequence tasks can be found in
  `examples/seq2seq/ <https://github.com/huggingface/transformers/blob/master/examples/seq2seq/README.md>`__.
- An example of how to train :class:`~transformers.BartForConditionalGeneration` with a Hugging Face :obj:`datasets`
  object can be found in this `forum discussion
  <https://discuss.huggingface.co/t/train-bart-for-conditional-generation-e-g-summarization/1904>`__.


Implementation Notes
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

- Bart doesn't use :obj:`token_type_ids` for sequence classification. Use :class:`~transformers.BartTokenizer` or
  :meth:`~transformers.BartTokenizer.encode` to get the proper splitting.
- The forward pass of :class:`~transformers.BartModel` will create decoder inputs (using the helper function
  :func:`transformers.modeling_bart._prepare_bart_decoder_inputs`) if they are not passed. This is different from some
  other modeling APIs.
- Model predictions are intended to be identical to the original implementation. This only works, however, if the
  string you pass to :func:`fairseq.encode` starts with a space.
- :meth:`~transformers.BartForConditionalGeneration.generate` should be used for conditional generation tasks like
  summarization; see the example in its docstring.
- Models that load the ``facebook/bart-large-cnn`` weights will not have a :obj:`mask_token_id` and cannot perform
  mask-filling tasks.
- For training/forward passes that don't involve beam search, pass :obj:`use_cache=False`.

BartConfig
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.BartConfig
    :members:


BartTokenizer
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.BartTokenizer
    :members:


BartModel
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.BartModel
    :members: forward

.. autofunction:: transformers.modeling_bart._prepare_bart_decoder_inputs


BartForConditionalGeneration
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.BartForConditionalGeneration
    :members: forward


BartForSequenceClassification
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.BartForSequenceClassification
    :members: forward


BartForQuestionAnswering
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.BartForQuestionAnswering
    :members: forward



TFBartModel
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFBartModel
    :members: call


TFBartForConditionalGeneration
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFBartForConditionalGeneration
    :members: call