The model only requires a single token as input because the key/value pairs of all the previous tokens are already contained in `past`.
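To make the cache mechanics concrete, here is a minimal sketch of one incremental decoding step. It assumes GPT-2 with the tuple-returning API of this version of the library, where the forward pass yields `(logits, past)`; the variable names are illustrative.

```python
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained('gpt2')
model = GPT2LMHeadModel.from_pretrained('gpt2')
model.eval()

context = torch.tensor([tokenizer.encode("Who was Jim Henson?")])

with torch.no_grad():
    # First pass: feed the full prompt and receive the key/value cache
    logits, past = model(context)
    token = torch.argmax(logits[..., -1, :]).unsqueeze(0).unsqueeze(0)

    # Subsequent passes: feed only the newest token together with the cache,
    # instead of re-encoding the whole sequence at every step
    logits, past = model(token, past=past)
```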
### Model2Model example
Encoder-decoder architectures require two tokenized inputs: one for the encoder and the other one for the decoder. Let's assume that we want to use `Model2Model` for generative question answering, and start by tokenizing the question and answer that will be fed to the model.
```python
import torch
from transformers import BertTokenizer, Model2Model

# OPTIONAL: if you want to have more information on what's happening under the hood, activate the logger as follows
import logging
logging.basicConfig(level=logging.INFO)

# Encode the question (encoder input) and the answer (decoder input)
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
question_tensor = torch.tensor([tokenizer.encode("Who was Jim Henson?")])
answer_tensor = torch.tensor([tokenizer.encode("Jim Henson was a puppeteer")])

# The answer token ids also serve as the decoder's language modeling labels
model = Model2Model.from_pretrained('bert-base-uncased')
outputs = model(question_tensor, answer_tensor, decoder_lm_labels=answer_tensor)

# See the models docstrings for the detail of all the outputs
# In our case, the first element is the value of the LM loss
lm_loss = outputs[0]
```
This loss can be used to fine-tune `Model2Model` on the question answering task.
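As a minimal sketch of what a single fine-tuning step could look like (the optimizer choice and learning rate below are illustrative assumptions, not part of the original example):

```python
from torch.optim import Adam

# Hypothetical hyperparameters, shown for illustration only
optimizer = Adam(model.parameters(), lr=1e-5)

model.train()
outputs = model(question_tensor, answer_tensor, decoder_lm_labels=answer_tensor)
lm_loss = outputs[0]

# Standard PyTorch update: backpropagate the LM loss and step the optimizer
optimizer.zero_grad()
lm_loss.backward()
optimizer.step()
```

Assuming that we have fine-tuned the model, let us now see how to generate an answer: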
```python
# Let's re-use the previous question
question="Who was Jim Henson?"
encoded_question=tokenizer.encode(question)
question_tensor=torch.tensor([encoded_question])
# This time we try to generate the answer, so we start with an empty sequence