MBart
-----------------------------------------------------------------------------------------------------------------------

**DISCLAIMER:** If you see something strange, file a `Github Issue
<https://github.com/huggingface/transformers/issues/new?assignees=&labels=&template=bug-report.md&title>`__ and assign
@sshleifer

Overview
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The MBart model was presented in `Multilingual Denoising Pre-training for Neural Machine Translation
<https://arxiv.org/abs/2001.08210>`_ by Yinhan Liu, Jiatao Gu, Naman Goyal, Xian Li, Sergey Edunov,
Marjan Ghazvininejad, Mike Lewis, and Luke Zettlemoyer.

According to the abstract, mBART is a sequence-to-sequence denoising auto-encoder pretrained on large-scale monolingual
corpora in many languages using the BART objective. mBART is one of the first methods for pre-training a complete
sequence-to-sequence model by denoising full texts in multiple languages, while previous approaches have focused only
on the encoder, decoder, or reconstructing parts of the text.

The authors' code can be found `here <https://github.com/pytorch/fairseq/tree/master/examples/mbart>`__.


Training
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
MBart is a multilingual encoder-decoder (sequence-to-sequence) model primarily intended for translation tasks. Because
the model is multilingual, it expects sequences in a different format: a special language id token is added to both
the source and the target text. The source text format is :obj:`X [eos, src_lang_code]`, where :obj:`X` is the source
text. The target text format is :obj:`[tgt_lang_code] X [eos]`. :obj:`bos` is never used.

The :meth:`~transformers.MBartTokenizer.prepare_seq2seq_batch` method handles this automatically and should be used to
encode the sequences for sequence-to-sequence fine-tuning.
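
The two layouts described above can be illustrated without loading a tokenizer. The following is a minimal sketch on
plain string tokens. The helpers ``format_source`` and ``format_target`` and the ``EOS`` placeholder are hypothetical
names used only for illustration; they are not part of the library, which works on token ids rather than strings.

```python
from typing import List

EOS = "</s>"  # placeholder string standing in for the eos token (assumption for illustration)


def format_source(tokens: List[str], src_lang: str) -> List[str]:
    """Source layout: X [eos, src_lang_code]."""
    return tokens + [EOS, src_lang]


def format_target(tokens: List[str], tgt_lang: str) -> List[str]:
    """Target layout: [tgt_lang_code] X [eos]."""
    return [tgt_lang] + tokens + [EOS]


source = format_source(["UN", "Chief"], "en_XX")
target = format_target(["Şeful", "ONU"], "ro_RO")
print(source)  # language code comes last, after eos
print(target)  # language code comes first, eos closes the sequence
```

Note that neither sequence starts with a ``bos`` token: on the target side, the language code takes that position.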

- Supervised training

.. code-block::

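    from transformers import MBartForConditionalGeneration, MBartTokenizer

    model = MBartForConditionalGeneration.from_pretrained("facebook/mbart-large-en-ro")
    tokenizer = MBartTokenizer.from_pretrained("facebook/mbart-large-en-ro")
    example_english_phrase = "UN Chief Says There Is No Military Solution in Syria"
    expected_translation_romanian = "Şeful ONU declară că nu există o soluţie militară în Siria"
    batch = tokenizer.prepare_seq2seq_batch(
        example_english_phrase, src_lang="en_XX", tgt_lang="ro_RO", tgt_texts=expected_translation_romanian
    )
    input_ids = batch["input_ids"]
    target_ids = batch["decoder_input_ids"]
    # Shift the target by one position: the decoder reads every token but the last and predicts every token but the first
    decoder_input_ids = target_ids[:, :-1].contiguous()
    labels = target_ids[:, 1:].clone()
    model(input_ids=input_ids, decoder_input_ids=decoder_input_ids, labels=labels)  # forward pass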
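
The slicing in the supervised example above shifts the target by one position, so that at each step the decoder sees
the tokens up to position ``i`` and is trained to predict the token at position ``i + 1``. Here is a small sketch of
that alignment on a plain Python list. The string tokens are stand-ins for the ids a real tokenizer would produce, and
``"</s>"`` is used as an assumed placeholder for ``eos``.

```python
# Target in the MBart layout: [tgt_lang_code] X [eos]
target_ids = ["ro_RO", "Şeful", "ONU", "</s>"]

# Same slicing as in the training example, applied to a single sequence
decoder_input_ids = target_ids[:-1]  # everything except the last token
labels = target_ids[1:]              # everything except the first token

# Each decoder input is paired with the token that follows it in the target
for seen, predicted in zip(decoder_input_ids, labels):
    print(f"{seen!r} -> {predicted!r}")
```

The first pair shows the language code acting as the start token: the decoder reads ``ro_RO`` and learns to emit the
first word of the translation.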

- Generation

    While generating the target text, set the :obj:`decoder_start_token_id` to the target language id.
    The following example shows how to translate English to Romanian using the `facebook/mbart-large-en-ro` model.

.. code-block::

    from transformers import MBartForConditionalGeneration, MBartTokenizer
    model = MBartForConditionalGeneration.from_pretrained("facebook/mbart-large-en-ro")
    tokenizer = MBartTokenizer.from_pretrained("facebook/mbart-large-en-ro")
    article = "UN Chief Says There Is No Military Solution in Syria"
    batch = tokenizer.prepare_seq2seq_batch(src_texts=[article], src_lang="en_XX")
    translated_tokens = model.generate(**batch, decoder_start_token_id=tokenizer.lang_code_to_id["ro_RO"])
    translation = tokenizer.batch_decode(translated_tokens, skip_special_tokens=True)[0]
    assert translation == "Şeful ONU declară că nu există o soluţie militară în Siria"
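
To see why the decoder start token matters, consider a toy greedy decoder whose first token is the target language id.
This is only a sketch under stated assumptions: ``greedy_decode``, ``toy_next_token``, and every id below are
hypothetical stand-ins, not the library's generation code or real vocabulary ids.

```python
from typing import Callable, List

EOS_ID = 2  # hypothetical id for the end-of-sequence token


def greedy_decode(next_token: Callable[[List[int]], int], start_id: int, max_len: int = 10) -> List[int]:
    """Greedy decoding seeded with `start_id` (for MBart, the target language id)."""
    output = [start_id]
    while len(output) < max_len:
        token = next_token(output)
        output.append(token)
        if token == EOS_ID:
            break
    return output


# Toy stand-in for a model: the continuation depends only on the language id that seeded decoding.
TOY_LANG_IDS = {"ro_RO": 100, "fr_XX": 101}  # hypothetical ids
TOY_OUTPUTS = {100: [11, 12, EOS_ID], 101: [21, 22, 23, EOS_ID]}


def toy_next_token(prefix: List[int]) -> int:
    continuation = TOY_OUTPUTS[prefix[0]]
    return continuation[len(prefix) - 1]


romanian = greedy_decode(toy_next_token, TOY_LANG_IDS["ro_RO"])
french = greedy_decode(toy_next_token, TOY_LANG_IDS["fr_XX"])
print(romanian)
print(french)
```

In this toy setup, changing only the start id changes the whole output. This mirrors the role of
:obj:`decoder_start_token_id` in the generation example above.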


MBartConfig
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.MBartConfig
    :members:


MBartTokenizer
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.MBartTokenizer
    :members: build_inputs_with_special_tokens, prepare_seq2seq_batch


MBartForConditionalGeneration
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.MBartForConditionalGeneration
    :members: forward