"...git@developer.sourcefind.cn:chenpangpang/open-webui.git" did not exist on "2b84af878a2bd0deab5423761a48705dcd8cb984"
Unverified Commit 12bb7fe7 authored by Lorenzo Ampil, committed by GitHub

Fix t5 doc typos (#3978)

* Fix typo in intro and add line under

* Add missing blank line under

* Correct typos under
parent 97a37548
@@ -20,13 +20,14 @@ Training
~~~~~~~~~~~~~~~~~~~~
T5 is an encoder-decoder model and converts all NLP problems into a text-to-text format. It is trained using teacher forcing.
This means that for training we always need an input sequence and a target sequence.
The input sequence is fed to the model using ``input_ids``. The target sequence is shifted to the right, *i.e.* prepended by a start-sequence token and fed to the decoder using the ``decoder_input_ids``. In teacher-forcing style, the target sequence is then appended by the EOS token and corresponds to the ``lm_labels``. The PAD token is hereby used as the start-sequence token.
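A minimal sketch of that right shift (not part of this diff; the helper name is hypothetical, and the use of the PAD token id as the start-sequence token follows the sentence above):

::

    import torch

    def shift_right(lm_labels, pad_token_id):
        # Prepend the start-sequence token (T5 uses the PAD token) and drop
        # the last position, so the decoder input is the target shifted right.
        start = torch.full((lm_labels.size(0), 1), pad_token_id, dtype=lm_labels.dtype)
        return torch.cat([start, lm_labels[:, :-1]], dim=1)

    # e.g. labels [[a, b, c, EOS]] become decoder inputs [[PAD, a, b, c]]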
T5 can be trained / fine-tuned both in a supervised and unsupervised fashion.
- Unsupervised denoising training
In this setup spans of the input sequence are masked by so-called sentinel tokens (*a.k.a* unique mask tokens)
and the output sequence is formed as a concatenation of the same sentinel tokens and the *real* masked tokens.
Each sentinel token represents a unique mask token for this sentence and should start with ``<extra_id_1>``, ``<extra_id_2>``, ... up to ``<extra_id_100>``. As a default 100 sentinel tokens are available in ``T5Tokenizer``.
*E.g.* the sentence "The cute dog walks in the park" with the masks put on "cute dog" and "the" should be processed as follows:
::
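    # NOTE: the diff view elides the snippet that belongs under this "::".
    # What follows is a minimal sketch consistent with the surrounding text,
    # using the lm_labels keyword visible in the next hunk; the model and
    # tokenizer setup is an assumption, not part of the original diff.
    from transformers import T5Tokenizer, T5ForConditionalGeneration

    tokenizer = T5Tokenizer.from_pretrained('t5-small')
    model = T5ForConditionalGeneration.from_pretrained('t5-small')

    # "The cute dog walks in the park" with "cute dog" and "the" masked:
    # sentinel tokens replace the masked spans in the input, and the target
    # lists each sentinel followed by the real tokens it replaced.
    input_ids = tokenizer.encode('The <extra_id_1> walks in <extra_id_2> park </s>', return_tensors='pt')
    lm_labels = tokenizer.encode('<extra_id_1> cute dog <extra_id_2> the <extra_id_3> </s>', return_tensors='pt')
    model(input_ids=input_ids, lm_labels=lm_labels)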
@@ -37,6 +38,7 @@ T5 can be trained / fine-tuned both in a supervised and unsupervised fashion.
model(input_ids=input_ids, lm_labels=lm_labels)
- Supervised training
In this setup the input sequence and output sequence are standard sequence to sequence input output mapping.
In translation, *e.g.* the input sequence "The house is wonderful." and output sequence "Das Haus ist wunderbar." should
be processed as follows:
...
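The diff is truncated here; a minimal sketch of the supervised example in the same style, reusing the tokenizer and model from the snippet above (the ``translate English to German:`` task prefix is the usual T5 convention, assumed rather than taken from this diff):

::

    # Plain sequence-to-sequence mapping: the German sentence is the target.
    input_ids = tokenizer.encode('translate English to German: The house is wonderful. </s>', return_tensors='pt')
    lm_labels = tokenizer.encode('Das Haus ist wunderbar. </s>', return_tensors='pt')
    model(input_ids=input_ids, lm_labels=lm_labels)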