@@ -23,7 +23,7 @@ The abstract from the paper is the following:
...
Tips:
- OPT has the same architecture as [`BartDecoder`].
- Unlike GPT2, OPT prepends the EOS token `</s>` to every prompt.
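The prompt-formatting tip above can be sketched as follows. This is a hypothetical illustration of the prepending behavior, not the actual OPT tokenizer implementation (`opt_style_prompt` and the raw string handling are assumptions for the sketch; the real tokenizer works on token ids):

```python
# Hypothetical sketch: OPT, unlike GPT2, prepends the EOS token "</s>"
# to every prompt before encoding.
EOS_TOKEN = "</s>"

def opt_style_prompt(prompt: str) -> str:
    """Return the prompt with the EOS token prepended, OPT-style."""
    return EOS_TOKEN + prompt

print(opt_style_prompt("Hello world"))  # </s>Hello world
```

With the real library, the same effect is visible when encoding a prompt with OPT's tokenizer: the first token id in `input_ids` corresponds to `</s>`.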
This model was contributed by [Arthur Zucker](https://huggingface.co/ArthurZ), [Younes Belkada](https://huggingface.co/ybelkada), and [Patrick von Platen](https://huggingface.co/patrickvonplaten).
The original code can be found [here](https://github.com/facebookresearch/metaseq).