Bart
----------------------------------------------------
**DISCLAIMER:** This model is still a work in progress; if you see something strange,
file a `Github Issue <https://github.com/huggingface/transformers/issues/new?assignees=&labels=&template=bug-report.md&title>`__ and assign
@sshleifer.

Paper
~~~~~
The Bart model was `proposed <https://arxiv.org/abs/1910.13461>`_ by Mike Lewis, Yinhan Liu, Naman Goyal, Marjan Ghazvininejad, Abdelrahman Mohamed, Omer Levy, Ves Stoyanov and Luke Zettlemoyer on 29 Oct, 2019.
According to the abstract:

- Bart uses a standard seq2seq/machine translation architecture with a bidirectional encoder (like BERT) and a left-to-right decoder (like GPT).
- The pretraining task involves randomly shuffling the order of the original sentences and a novel in-filling scheme, where spans of text are replaced with a single mask token.
- BART is particularly effective when fine-tuned for text generation but also works well for comprehension tasks. It matches the performance of RoBERTa on GLUE and SQuAD with comparable training resources, and achieves new state-of-the-art results on a range of abstractive dialogue, question answering, and summarization tasks, with gains of up to 6 ROUGE.

The authors' code can be found `here <https://github.com/pytorch/fairseq/tree/master/examples/bart>`_.


Implementation Notes
~~~~~~~~~~~~~~~~~~~~
- Bart doesn't use :obj:`token_type_ids`; for sequence classification, just use ``BartTokenizer.encode`` to get the proper splitting.
- Inputs to the decoder are created by ``BartModel.forward`` if they are not passed. This is different from some other model APIs.
- Model predictions are intended to be identical to the original implementation. This only holds, however, if the string you pass to ``fairseq.encode`` starts with a space.
- Decoder inputs are created automatically by the helper function ``transformers.modeling_bart._prepare_bart_decoder_inputs``, called from ``BartModel``.
- ``BartForMaskedLM.generate`` should be used for summarization; see the example in that method's docstring and the sketch below.

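A minimal summarization sketch of the points above. The checkpoint name ``facebook/bart-large-cnn`` and the generation settings are illustrative assumptions, not prescriptions from this document:

.. code-block:: python

    from transformers import BartForMaskedLM, BartTokenizer

    # Illustrative checkpoint name; substitute whichever BART weights you use.
    tokenizer = BartTokenizer.from_pretrained('facebook/bart-large-cnn')
    model = BartForMaskedLM.from_pretrained('facebook/bart-large-cnn')

    ARTICLE = "New York (CNN) -- The full article to summarize goes here."
    input_ids = tokenizer.encode(ARTICLE, return_tensors='pt')

    # No decoder inputs are passed; the model's forward pass builds them internally.
    summary_ids = model.generate(input_ids, num_beams=4, max_length=60)
    print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))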

BartModel
~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.BartModel
    :members: forward


BartForMaskedLM
~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.BartForMaskedLM
    :members: forward, generate


BartForSequenceClassification
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.BartForSequenceClassification
    :members: forward

BartConfig
~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.BartConfig
    :members:

Automatic Creation of Decoder Inputs
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
This is enabled by default and runs automatically inside ``BartModel.forward`` whenever decoder inputs are not passed explicitly.

.. autofunction:: transformers.modeling_bart._prepare_bart_decoder_inputs
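
A small sketch of that behavior; the ``facebook/bart-large`` checkpoint name is an illustrative assumption. Only :obj:`input_ids` are passed, and the forward pass creates the decoder inputs itself:

.. code-block:: python

    import torch
    from transformers import BartModel, BartTokenizer

    # Illustrative checkpoint name; substitute whichever BART weights you use.
    tokenizer = BartTokenizer.from_pretrained('facebook/bart-large')
    model = BartModel.from_pretrained('facebook/bart-large')

    input_ids = tokenizer.encode("Hello, my dog is cute", return_tensors='pt')

    # decoder_input_ids are not supplied, so _prepare_bart_decoder_inputs
    # derives them from input_ids during the forward pass.
    with torch.no_grad():
        outputs = model(input_ids)

    decoder_hidden_states = outputs[0]  # (batch_size, sequence_length, hidden_size)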