Blenderbot
-----------------------------------------------------------------------------------------------------------------------

**DISCLAIMER:** If you see something strange, file a `Github Issue
<https://github.com/huggingface/transformers/issues/new?assignees=&labels=&template=bug-report.md&title>`__.

Overview
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The Blender chatbot model was proposed in `Recipes for building an open-domain chatbot
<https://arxiv.org/pdf/2004.13637.pdf>`__ by Stephen Roller, Emily Dinan, Naman Goyal, Da Ju, Mary Williamson,
Yinhan Liu, Jing Xu, Myle Ott, Kurt Shuster, Eric M. Smith, Y-Lan Boureau and Jason Weston on 30 Apr 2020.

The abstract of the paper is the following:

*Building open-domain chatbots is a challenging area for machine learning research. While prior work has shown that
scaling neural models in the number of parameters and the size of the data they are trained on gives improved results,
we show that other ingredients are important for a high-performing chatbot. Good conversation requires a number of
skills that an expert conversationalist blends in a seamless way: providing engaging talking points and listening to
their partners, and displaying knowledge, empathy and personality appropriately, while maintaining a consistent
persona. We show that large scale models can learn these skills when given appropriate training data and choice of
generation strategy. We build variants of these recipes with 90M, 2.7B and 9.4B parameter models, and make our models
and code publicly available. Human evaluations show our best models are superior to existing approaches in multi-turn
dialogue in terms of engagingness and humanness measurements. We then discuss the limitations of this work by analyzing
failure cases of our models.*

The authors' code can be found `here <https://github.com/facebookresearch/ParlAI>`__.


Implementation Notes
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

- Blenderbot uses a standard `seq2seq transformer <https://arxiv.org/pdf/1706.03762.pdf>`__ based architecture.
- It inherits completely from :class:`~transformers.BartForConditionalGeneration`.
- Even though Blenderbot is one model, it uses two tokenizers: :class:`~transformers.BlenderbotSmallTokenizer` for the
  90M checkpoint and :class:`~transformers.BlenderbotTokenizer` for all other checkpoints.
- :meth:`~transformers.BlenderbotSmallTokenizer.from_pretrained` will always return a
  :class:`~transformers.BlenderbotSmallTokenizer`, regardless of which checkpoint name it is given. To use the 3B
  parameter checkpoint, you must call :class:`~transformers.BlenderbotTokenizer` directly, as sketched after this
  list.
- Available checkpoints can be found in the `model hub <https://huggingface.co/models?search=blenderbot>`__.
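
A minimal sketch of the tokenizer pairing described above (the variable names are illustrative; the checkpoint names
come from the model hub):

.. code-block::

        >>> from transformers import BlenderbotSmallTokenizer, BlenderbotTokenizer
        >>> # the 90M checkpoint pairs with the small tokenizer
        >>> small_tokenizer = BlenderbotSmallTokenizer.from_pretrained("facebook/blenderbot-90M")
        >>> # larger checkpoints such as 3B pair with BlenderbotTokenizer,
        >>> # which you must instantiate explicitly
        >>> large_tokenizer = BlenderbotTokenizer.from_pretrained("facebook/blenderbot-3B")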


Usage
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Here is an example of model usage:

.. code-block::

        >>> from transformers import BlenderbotSmallTokenizer, BlenderbotForConditionalGeneration
        >>> mname = 'facebook/blenderbot-90M'
        >>> model = BlenderbotForConditionalGeneration.from_pretrained(mname)
        >>> tokenizer = BlenderbotSmallTokenizer.from_pretrained(mname)
        >>> UTTERANCE = "My friends are cool but they eat too many carbs."
        >>> inputs = tokenizer([UTTERANCE], return_tensors='pt')
        >>> reply_ids = model.generate(**inputs)  # generate a reply to the utterance
        >>> print([tokenizer.decode(g, skip_special_tokens=True, clean_up_tokenization_spaces=False) for g in reply_ids])


Here is how you can check out config values:

.. code-block::

        >>> from transformers import BlenderbotConfig
        >>> config_90 = BlenderbotConfig.from_pretrained("facebook/blenderbot-90M")
        >>> config_90.to_diff_dict()  # show the values that differ from the defaults
        >>> configuration_3B = BlenderbotConfig.from_pretrained("facebook/blenderbot-3B")
        >>> configuration_3B.to_diff_dict()
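
Note that :meth:`to_diff_dict` serializes only the attributes whose values differ from the library defaults, which
keeps the output readable compared to the full :meth:`to_dict`.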


BlenderbotConfig
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.BlenderbotConfig
    :members:

BlenderbotTokenizer
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.BlenderbotTokenizer
    :members: build_inputs_with_special_tokens

BlenderbotSmallTokenizer
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autoclass:: transformers.BlenderbotSmallTokenizer
    :members:


BlenderbotForConditionalGeneration
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

See :class:`~transformers.BartForConditionalGeneration` for arguments to :meth:`forward` and :meth:`generate`.
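
Generation keyword arguments pass straight through to :meth:`generate`. Here is a minimal sketch, reusing ``model``,
``tokenizer`` and ``inputs`` from the usage example above, with beam-search settings chosen purely for illustration:

.. code-block::

        >>> # num_beams and max_length are standard generate() arguments
        >>> reply_ids = model.generate(**inputs, num_beams=4, max_length=60)
        >>> print(tokenizer.decode(reply_ids[0], skip_special_tokens=True))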

.. autoclass:: transformers.BlenderbotForConditionalGeneration
    :members: