Commit d12ceb48 (unverified), authored by Sylvain Gugger, committed by GitHub

Tokenization tutorial (#5257)



* All done

* Link to the tutorial

* Typo fixes
Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>

* Add mention of the return_xxx args
Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>
parent 7ac91107
@@ -139,6 +139,7 @@ conversion utilities for the following models:
 task_summary
 model_summary
+preprocessing
 serialization
 model_sharing
 multilingual
......
This diff is collapsed.
@@ -204,7 +204,7 @@ padding token the model was pretrained with. The attention mask is also adapted
 'attention_mask': tensor([[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
 [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0]])}
-You can learn more about tokenizers on their :doc:`doc page <main_classes/tokenizer>` (tutorial coming soon).
+You can learn more about tokenizers :doc:`here <preprocessing>`.
 Using the model
 ^^^^^^^^^^^^^^^
......
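The context of the hunk above shows an attention mask in which the second row ends in three zeros, marking padded positions. As a rough illustration of that idea only, here is a minimal pure-Python sketch. It is not the library's implementation: the helper name `pad_batch`, the default `pad_id` of 0, and the placeholder token ids are all assumptions made for the example.

```python
from typing import Dict, List


def pad_batch(sequences: List[List[int]], pad_id: int = 0) -> Dict[str, List[List[int]]]:
    """Right-pad every sequence to the longest one and build a matching mask.

    Hypothetical helper for illustration; `pad_id` is an assumed value, not
    the padding token of any particular pretrained model.
    """
    max_len = max(len(seq) for seq in sequences)
    # Append pad_id until each sequence reaches max_len.
    input_ids = [seq + [pad_id] * (max_len - len(seq)) for seq in sequences]
    # 1 marks a real token, 0 marks a padded position.
    attention_mask = [[1] * len(seq) + [0] * (max_len - len(seq)) for seq in sequences]
    return {"input_ids": input_ids, "attention_mask": attention_mask}


# Two placeholder sequences of 14 and 11 token ids, mirroring the shapes in the hunk.
batch = pad_batch([list(range(1, 15)), list(range(1, 12))])
print(batch["attention_mask"])
```

With these two inputs the second mask row has eleven ones followed by three zeros, the same pattern as in the diff context. In practice the tokenizer produces both the padded ids and the mask, so a helper like this is only useful for understanding what the mask encodes.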