Here is a quick-start example using the `OpenAIGPTTokenizer`, `OpenAIGPTModel` and `OpenAIGPTLMHeadModel` classes with OpenAI's pre-trained model. See the [doc section](#doc) below for all the details on these classes.

First let's prepare a tokenized input with `OpenAIGPTTokenizer`
```python
import torch
...
```
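
The tokenization code itself is elided above. As a rough sketch rather than the verbatim snippet, preparing an input tensor with `OpenAIGPTTokenizer` could look like the following, assuming the `pytorch_pretrained_bert` package and the `openai-gpt` pre-trained shortcut; the example sentence is illustrative only:

```python
import torch
from pytorch_pretrained_bert import OpenAIGPTTokenizer

# Load the pre-trained BPE tokenizer (vocabulary is downloaded on first use)
tokenizer = OpenAIGPTTokenizer.from_pretrained('openai-gpt')

# Tokenize an example sentence and map the tokens to vocabulary indices
text = "Who was Jim Henson ? Jim Henson was a puppeteer"
tokenized_text = tokenizer.tokenize(text)
indexed_tokens = tokenizer.convert_tokens_to_ids(tokenized_text)

# Wrap in a batch of size 1 for the model
tokens_tensor = torch.tensor([indexed_tokens])
```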
Here is a quick-start example using the `TransfoXLTokenizer`, `TransfoXLModel` and `TransfoXLLMHeadModel` classes with the Transformer-XL model pre-trained on WikiText-103. See the [doc section](#doc) below for all the details on these classes.

First let's prepare a tokenized input with `TransfoXLTokenizer`

```python
import torch
...
```
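
Again the tokenization code is elided. A minimal sketch that defines the `tokens_tensor_1` and `tokens_tensor_2` used below might look like this, assuming the `pytorch_pretrained_bert` package; the two sentences are illustrative only:

```python
import torch
from pytorch_pretrained_bert import TransfoXLTokenizer

# Load the pre-trained WikiText-103 vocabulary (downloaded on first use)
tokenizer = TransfoXLTokenizer.from_pretrained('transfo-xl-wt103')

# Two consecutive segments; the second one will re-use the memory of the first
text_1 = "Who was Jim Henson ?"
text_2 = "Jim Henson was a puppeteer"

indexed_tokens_1 = tokenizer.convert_tokens_to_ids(tokenizer.tokenize(text_1))
indexed_tokens_2 = tokenizer.convert_tokens_to_ids(tokenizer.tokenize(text_2))

# Wrap each segment in a batch of size 1
tokens_tensor_1 = torch.tensor([indexed_tokens_1])
tokens_tensor_2 = torch.tensor([indexed_tokens_2])
```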
Let's see how to use `TransfoXLModel` to get hidden states

```python
# Load pre-trained model (weights)
model = TransfoXLModel.from_pretrained('transfo-xl-wt103')
model.eval()

# If you have a GPU, put everything on cuda
tokens_tensor_1 = tokens_tensor_1.to('cuda')
tokens_tensor_2 = tokens_tensor_2.to('cuda')
model.to('cuda')

# Predict hidden states features for each layer
hidden_states_1, mems_1 = model(tokens_tensor_1)
# We can re-use the memory cells in a subsequent call to attend a longer context
hidden_states_2, mems_2 = model(tokens_tensor_2, mems=mems_1)
```
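
`TransfoXLLMHeadModel` is mentioned above but not demonstrated. Here is a minimal sketch of next-token prediction with it, assuming the same `transfo-xl-wt103` weights and the tokenizer and tensors prepared earlier; treat it as an illustration rather than the verbatim snippet:

```python
import torch
from pytorch_pretrained_bert import TransfoXLLMHeadModel

# Load the model with the language-modeling head (weights tied to the base model)
lm_model = TransfoXLLMHeadModel.from_pretrained('transfo-xl-wt103')
lm_model.eval()
# If the tensors were moved to CUDA above, also call lm_model.to('cuda')

with torch.no_grad():
    # Prediction scores over the vocabulary for each position, plus updated memory
    predictions_1, mems_1 = lm_model(tokens_tensor_1)
    predictions_2, mems_2 = lm_model(tokens_tensor_2, mems=mems_1)

# Most likely token following the second segment
predicted_index = torch.argmax(predictions_2[0, -1, :]).item()
predicted_token = tokenizer.convert_ids_to_tokens([predicted_index])[0]
```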