chenpangpang / transformers · Commits

Commit 3ff5e895 (unverified)
Authored Dec 18, 2020 by Stas Bekman; committed by GitHub on Dec 18, 2020
Parent: 291974c6

[t5 doc] typos (#9199)

* [t5 doc] typos: a few runaway backticks (@sgugger)
* style
Showing 1 changed file with 4 additions and 4 deletions.
docs/source/model_doc/t5.rst (+4, -4)
@@ -44,9 +44,9 @@ Tips:
   For more information about which prefix to use, it is easiest to look into Appendix D of the `paper
   <https://arxiv.org/pdf/1910.10683.pdf>`__. - For sequence-to-sequence generation, it is recommended to use
-  :obj:`T5ForConditionalGeneration.generate()``. This method takes care of feeding the encoded input via
-  cross-attention layers to the decoder and auto-regressively generates the decoder output. - T5 uses relative scalar
-  embeddings. Encoder input padding can be done on the left and on the right.
+  :obj:`T5ForConditionalGeneration.generate()`. This method takes care of feeding the encoded input via
+  cross-attention layers to the decoder and auto-regressively generates the decoder output. - T5 uses relative scalar
+  embeddings. Encoder input padding can be done on the left and on the right.

 The original code can be found `here <https://github.com/google-research/text-to-text-transfer-transformer>`__.
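The passage fixed above documents sequence-to-sequence generation via :obj:`T5ForConditionalGeneration.generate()`. As a minimal sketch of that usage (the `t5-small` checkpoint and the task prefix are illustrative choices, not part of this commit):

```python
# Minimal sketch: seq2seq generation with T5 via the transformers library.
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

# T5 expects a task prefix on the encoder input.
input_ids = tokenizer(
    "translate English to German: The house is wonderful.",
    return_tensors="pt",
).input_ids

# generate() feeds the encoded input to the decoder through the
# cross-attention layers and decodes auto-regressively.
output_ids = model.generate(input_ids)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```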
@@ -55,7 +55,7 @@ Training

 T5 is an encoder-decoder model and converts all NLP problems into a text-to-text format. It is trained using teacher
 forcing. This means that for training we always need an input sequence and a target sequence. The input sequence is fed
-to the model using :obj:`input_ids``. The target sequence is shifted to the right, i.e., prepended by a start-sequence
+to the model using :obj:`input_ids`. The target sequence is shifted to the right, i.e., prepended by a start-sequence
 token and fed to the decoder using the :obj:`decoder_input_ids`. In teacher-forcing style, the target sequence is then
 appended by the EOS token and corresponds to the :obj:`labels`. The PAD token is hereby used as the start-sequence
 token. T5 can be trained / fine-tuned both in a supervised and unsupervised fashion.
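The training paragraph describes teacher forcing: :obj:`input_ids` feed the encoder, and the target sequence, shifted right and prepended with the PAD start-sequence token, becomes the :obj:`decoder_input_ids`. A minimal sketch of one supervised training step under the same illustrative assumptions (`t5-small`, standard `transformers` API):

```python
# Minimal sketch: one teacher-forcing training step for T5.
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

input_ids = tokenizer(
    "translate English to German: The house is wonderful.",
    return_tensors="pt",
).input_ids
labels = tokenizer("Das Haus ist wunderbar.", return_tensors="pt").input_ids

# When labels are passed, the model builds decoder_input_ids internally by
# shifting labels one position to the right and prepending the PAD token,
# exactly the teacher-forcing setup the doc text describes.
outputs = model(input_ids=input_ids, labels=labels)
outputs.loss.backward()
```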