Tokenization tutorial (#5257)

* All done
* Link to the tutorial
* Typo fixes

Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>

* Add mention of the return_xxx args

Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>

Commit d12ceb48 (unverified), authored Jun 24, 2020 by Sylvain Gugger, committed by GitHub Jun 24, 2020
Showing with 375 additions and 1 deletion

docs/source/index.rst +1 -0

docs/source/preprocessing.rst +373 -0

docs/source/quicktour.rst +1 -1

--- a/docs/source/index.rst
+++ b/docs/source/index.rst
@@ -139,6 +139,7 @@ conversion utilities for the following models:
    task_summary
    model_summary
+    preprocessing
    serialization
    model_sharing
    multilingual

--- a/docs/source/preprocessing.rst
+++ b/docs/source/preprocessing.rst
--- a/docs/source/quicktour.rst
+++ b/docs/source/quicktour.rst
@@ -204,7 +204,7 @@ padding token the model was pretrained with. The attention mask is also adapted
     'attention_mask': tensor([[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
                               [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0]])}
-You can learn more about tokenizers on their :doc:`doc page <main_classes/tokenizer>` (tutorial coming soon).
+You can learn more about tokenizers :doc:`here <preprocessing>`.
 Using the model
 ^^^^^^^^^^^^^^^