Bart
-----------------------------------------------------------------------------------------------------------------------

**DISCLAIMER:** If you see something strange, file a `GitHub Issue
<https://github.com/huggingface/transformers/issues/new?assignees=&labels=&template=bug-report.md&title>`__
and assign @sshleifer.

Overview
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The Bart model was proposed in `BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension <https://arxiv.org/abs/1910.13461>`_ by Mike Lewis, Yinhan Liu, Naman Goyal, Marjan Ghazvininejad, Abdelrahman Mohamed, Omer Levy, Ves Stoyanov and Luke Zettlemoyer on 29 Oct, 2019.
According to the abstract,

- Bart uses a standard seq2seq/machine translation architecture with a bidirectional encoder (like BERT) and a left-to-right decoder (like GPT).
- The pretraining task involves randomly shuffling the order of the original sentences and a novel in-filling scheme, where spans of text are replaced with a single mask token.
- BART is particularly effective when fine-tuned for text generation but also works well for comprehension tasks. It matches the performance of RoBERTa with comparable training resources on GLUE and SQuAD, and achieves new state-of-the-art results on a range of abstractive dialogue, question answering, and summarization tasks, with gains of up to 6 ROUGE.

The authors' code can be found `here <https://github.com/pytorch/fairseq/tree/master/examples/bart>`_.


Implementation Notes
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

- Bart doesn't use :obj:`token_type_ids` for sequence classification. Use ``BartTokenizer.encode`` to get the proper splitting.
- The forward pass of ``BartModel`` will create decoder inputs (using the helper function ``transformers.modeling_bart._prepare_bart_decoder_inputs``) if they are not passed. This is different from some other modeling APIs.
- Model predictions are intended to be identical to the original implementation. This only works, however, if the string you pass to ``fairseq.encode`` starts with a space.
- ``BartForConditionalGeneration.generate`` should be used for conditional generation tasks like summarization; see the example in its docstring and the summarization sketch after this list.
- Models that load the ``"facebook/bart-large-cnn"`` weights will not have a ``mask_token_id`` and cannot perform mask-filling tasks. Checkpoints such as ``"facebook/bart-large"`` keep the mask token; see the mask-filling sketch below.
- For training/forward passes that don't involve beam search, pass ``use_cache=False``, as in the training sketch below.

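Below is a minimal summarization sketch using ``BartForConditionalGeneration.generate``. It assumes the ``facebook/bart-large-cnn`` checkpoint and PyTorch; the ``num_beams`` and ``max_length`` values are illustrative, not the only sensible choices.

.. code-block:: python

    from transformers import BartForConditionalGeneration, BartTokenizer

    tokenizer = BartTokenizer.from_pretrained("facebook/bart-large-cnn")
    model = BartForConditionalGeneration.from_pretrained("facebook/bart-large-cnn")

    ARTICLE = "Replace me with a long news article."

    # Encode the source document; 1024 is the model's maximum input length.
    input_ids = tokenizer.encode(ARTICLE, max_length=1024, return_tensors="pt")

    # generate() runs beam search over the decoder and returns token ids.
    summary_ids = model.generate(input_ids, num_beams=4, max_length=60, early_stopping=True)
    print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))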

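Mask filling only works with checkpoints that keep the ``<mask>`` token. A sketch, assuming ``facebook/bart-large`` and that the first element of the model output is the LM logits:

.. code-block:: python

    from transformers import BartForConditionalGeneration, BartTokenizer

    tokenizer = BartTokenizer.from_pretrained("facebook/bart-large")
    model = BartForConditionalGeneration.from_pretrained("facebook/bart-large")

    text = "My friends are <mask> but they eat too many carbs."
    input_ids = tokenizer.encode(text, return_tensors="pt")

    # Without labels, the first element of the output is the LM logits,
    # shaped (batch_size, sequence_length, vocab_size).
    logits = model(input_ids)[0]

    # Inspect the model's top candidates for the masked position.
    masked_index = (input_ids[0] == tokenizer.mask_token_id).nonzero()[0].item()
    probs = logits[0, masked_index].softmax(dim=0)
    values, predictions = probs.topk(5)
    print(tokenizer.decode(predictions))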

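And a hedged sketch of a teacher-forced training step with ``use_cache=False``; the ``labels`` keyword and the loss-first output convention are assumptions about the library version in use:

.. code-block:: python

    from transformers import BartForConditionalGeneration, BartTokenizer

    tokenizer = BartTokenizer.from_pretrained("facebook/bart-large-cnn")
    model = BartForConditionalGeneration.from_pretrained("facebook/bart-large-cnn")

    input_ids = tokenizer.encode("A long source document.", return_tensors="pt")
    labels = tokenizer.encode("A short summary.", return_tensors="pt")

    # Teacher-forced forward pass: decoder inputs are derived from the labels,
    # and caching is disabled because we are not decoding step by step.
    outputs = model(input_ids, labels=labels, use_cache=False)
    loss = outputs[0]  # with labels, the loss comes first
    loss.backward()
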
BartForConditionalGeneration
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.BartForConditionalGeneration
    :members: forward


BartConfig
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.BartConfig
    :members:


BartTokenizer
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.BartTokenizer
    :members:



BartModel
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.BartModel
    :members: forward

.. autofunction:: transformers.modeling_bart._prepare_bart_decoder_inputs


BartForSequenceClassification
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.BartForSequenceClassification
    :members: forward


BartForQuestionAnswering
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.BartForQuestionAnswering
    :members: forward