BART
-----------------------------------------------------------------------------------------------------------------------

**DISCLAIMER:** If you see something strange, file a `Github Issue
<https://github.com/huggingface/transformers/issues/new?assignees=&labels=&template=bug-report.md&title>`__ and assign
@sshleifer

Overview
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The Bart model was proposed in `BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation,
Translation, and Comprehension <https://arxiv.org/abs/1910.13461>`__ by Mike Lewis, Yinhan Liu, Naman Goyal, Marjan
Ghazvininejad, Abdelrahman Mohamed, Omer Levy, Ves Stoyanov and Luke Zettlemoyer on 29 Oct, 2019.

According to the abstract,

- Bart uses a standard seq2seq/machine translation architecture with a bidirectional encoder (like BERT) and a
  left-to-right decoder (like GPT).
- The pretraining task involves randomly shuffling the order of the original sentences and a novel in-filling scheme,
  where spans of text are replaced with a single mask token (a short mask-filling sketch follows this list).
- BART is particularly effective when fine-tuned for text generation but also works well for comprehension tasks. It
  matches the performance of RoBERTa with comparable training resources on GLUE and SQuAD, and achieves new
  state-of-the-art results on a range of abstractive dialogue, question answering, and summarization tasks, with gains
  of up to 6 ROUGE.
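
The text in-filling objective means the pretrained checkpoint can fill a :obj:`<mask>` token directly. Below is a
minimal mask-filling sketch; it assumes the `facebook/bart-large` weights and an illustrative input sentence, and the
exact predictions depend on the checkpoint:

.. code-block:: python

    from transformers import BartForConditionalGeneration, BartTokenizer

    # Mask-filling sketch (assumes the facebook/bart-large checkpoint).
    tokenizer = BartTokenizer.from_pretrained("facebook/bart-large")
    model = BartForConditionalGeneration.from_pretrained("facebook/bart-large")

    text = "My friends are <mask> but they eat too many carbs."
    input_ids = tokenizer([text], return_tensors="pt")["input_ids"]

    # The first element of the model output holds the language-modeling logits.
    logits = model(input_ids)[0]

    # Locate the <mask> position and take the 5 most likely fillers for it.
    masked_index = (input_ids[0] == tokenizer.mask_token_id).nonzero()[0].item()
    probs = logits[0, masked_index].softmax(dim=0)
    values, predictions = probs.topk(5)
    print(tokenizer.decode(predictions).split())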

The Authors' code can be found `here <https://github.com/pytorch/fairseq/tree/master/examples/bart>`__.


Implementation Notes
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

- Bart doesn't use :obj:`token_type_ids` for sequence classification. Use :class:`~transformers.BartTokenizer` 
  or :meth:`~transformers.BartTokenizer.encode` to get the proper splitting.
- The forward pass of :class:`~transformers.BartModel` will create decoder inputs (using the helper function
  :func:`transformers.modeling_bart._prepare_bart_decoder_inputs`) if they are not passed. This is different from some
  other modeling APIs.
- Model predictions are intended to be identical to the original implementation. This only works, however, if the
  string you pass to :func:`fairseq.encode` starts with a space.
- :meth:`~transformers.BartForConditionalGeneration.generate` should be used for conditional generation tasks like
  summarization; see the example in its docstring and the short sketch after this list.
- Models that load the `facebook/bart-large-cnn` weights will not have a :obj:`mask_token_id` and therefore cannot
  perform mask-filling tasks.
- For training/forward passes that don't involve beam search, pass :obj:`use_cache=False`.
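
Putting the notes above together, here is a short summarization sketch. It assumes the `facebook/bart-large-cnn`
checkpoint, and the generation hyperparameters (:obj:`num_beams`, :obj:`max_length`) are illustrative values rather
than recommendations:

.. code-block:: python

    from transformers import BartForConditionalGeneration, BartTokenizer

    # Summarization sketch (assumes the facebook/bart-large-cnn checkpoint).
    tokenizer = BartTokenizer.from_pretrained("facebook/bart-large-cnn")
    model = BartForConditionalGeneration.from_pretrained("facebook/bart-large-cnn")

    article = (
        "PG&E stated it scheduled the blackouts in response to forecasts for high "
        "winds amid dry conditions. The aim is to reduce the risk of wildfires."
    )
    inputs = tokenizer([article], max_length=1024, truncation=True, return_tensors="pt")

    # generate() performs beam search over the decoder; the hyperparameters are illustrative.
    summary_ids = model.generate(
        inputs["input_ids"],
        num_beams=4,
        max_length=60,
        early_stopping=True,
    )
    print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))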


BartConfig
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.BartConfig
    :members:


BartTokenizer
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.BartTokenizer
    :members:



BartModel
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.BartModel
    :members: forward

.. autofunction:: transformers.modeling_bart._prepare_bart_decoder_inputs


BartForConditionalGeneration
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.BartForConditionalGeneration
    :members: forward


BartForSequenceClassification
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.BartForSequenceClassification
    :members: forward


BartForQuestionAnswering
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.BartForQuestionAnswering
    :members: forward



TFBartModel
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFBartModel
    :members: call


TFBartForConditionalGeneration
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.TFBartForConditionalGeneration
    :members: call