Unverified commit b9418a1d authored by Steven Liu, committed by GitHub

Update tutorial docs (#15165)

* first draft of pipeline, autoclass, preprocess tutorials

* apply review feedback

* 🖍 apply feedback from patrick/niels

* 📝 add output image to preprocessed image

* 🖍 apply feedback from patrick
parent c157c7e3
```diff
@@ -11,12 +11,16 @@
     title: Glossary
   title: Get started
 - sections:
+  - local: pipeline_tutorial
+    title: Pipelines for inference
+  - local: autoclass_tutorial
+    title: Load pretrained instances with an AutoClass
+  - local: preprocessing
+    title: Preprocess
   - local: task_summary
     title: Summary of the tasks
   - local: model_summary
     title: Summary of the models
-  - local: preprocessing
-    title: Preprocessing data
   - local: training
     title: Fine-tuning a pretrained model
   - local: accelerate
@@ -27,7 +31,7 @@
     title: Summary of the tokenizers
   - local: multilingual
     title: Multi-lingual models
-  title: "Using 🤗 Transformers"
+  title: Tutorials
 - sections:
   - local: examples
     title: Examples
```
...
<!--Copyright 2022 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->
# Load pretrained instances with an AutoClass
With so many different Transformer architectures, it can be challenging to create one for your checkpoint. As part of 🤗 Transformers' core philosophy to make the library easy, simple, and flexible to use, an `AutoClass` automatically infers and loads the correct architecture from a given checkpoint. The `from_pretrained` method lets you quickly load a pretrained model for any architecture so you don't have to devote time and resources to training a model from scratch. Producing this type of checkpoint-agnostic code means that if your code works for one checkpoint, it will work with another checkpoint, as long as it was trained for a similar task, even if the architecture is different.
<Tip>
Remember, architecture refers to the skeleton of the model and checkpoints are the weights for a given architecture. For example, [BERT](https://huggingface.co/bert-base-uncased) is an architecture, while `bert-base-uncased` is a checkpoint. Model is a general term that can mean either architecture or checkpoint.
</Tip>
In this tutorial, learn to:
* Load a pretrained tokenizer.
* Load a pretrained feature extractor.
* Load a pretrained processor.
* Load a pretrained model.
## AutoTokenizer
Nearly every NLP task begins with a tokenizer. A tokenizer converts your input into a format that can be processed by the model.
Load a tokenizer with [`AutoTokenizer.from_pretrained`]:
```py
>>> from transformers import AutoTokenizer
>>> tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
```
Then tokenize your input as shown below:
```py
>>> sequence = "In a hole in the ground there lived a hobbit."
>>> print(tokenizer(sequence))
{'input_ids': [101, 1999, 1037, 4920, 1999, 1996, 2598, 2045, 2973, 1037, 7570, 10322, 4183, 1012, 102],
'token_type_ids': [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
'attention_mask': [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]}
```
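The tokenizer returns plain Python lists by default. If you plan to pass the encodings straight to a model, you can request framework tensors instead. Below is a minimal sketch, assuming PyTorch is installed; the second sentence is a made-up example input:
```py
>>> encoded = tokenizer(
...     ["In a hole in the ground there lived a hobbit.", "Not a nasty, dirty, wet hole."],
...     padding=True,  # pad the shorter sequence to the length of the longest one
...     truncation=True,  # truncate sequences longer than the model's maximum length
...     return_tensors="pt",  # return PyTorch tensors instead of Python lists
... )
>>> encoded["input_ids"].shape  # (batch_size, padded_sequence_length)
torch.Size([2, 15])
```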
## AutoFeatureExtractor
For audio and vision tasks, a feature extractor processes the audio signal or image into the correct input format.
Load a feature extractor with [`AutoFeatureExtractor.from_pretrained`]:
```py
>>> from transformers import AutoFeatureExtractor
>>> feature_extractor = AutoFeatureExtractor.from_pretrained(
... "ehcalabres/wav2vec2-lg-xlsr-en-speech-emotion-recognition"
... )
```
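The loaded feature extractor can then be called on raw audio. Here is a minimal sketch, using one second of silence as a stand-in for a real recording; this Wav2Vec2-based checkpoint expects 16kHz audio:
```py
>>> import numpy as np

>>> raw_audio = np.zeros(16000, dtype=np.float32)  # one second of 16kHz "audio"
>>> inputs = feature_extractor(raw_audio, sampling_rate=16000, return_tensors="pt")
>>> inputs["input_values"].shape  # (batch_size, num_samples)
torch.Size([1, 16000])
```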
## AutoProcessor
Multimodal tasks require a processor that combines two types of preprocessing tools. For example, the [LayoutLMV2](model_doc/layoutlmv2) model requires a feature extractor to handle images and a tokenizer to handle text; a processor combines both of them.
Load a processor with [`AutoProcessor.from_pretrained`]:
```py
>>> from transformers import AutoProcessor
>>> processor = AutoProcessor.from_pretrained("microsoft/layoutlmv2-base-uncased")
```
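The processor can then be called like either of its parts. Below is a minimal sketch, assuming a hypothetical local scan `document.png`; note that LayoutLMV2's processor applies OCR to the image by default, which requires the pytesseract package:
```py
>>> from PIL import Image

>>> image = Image.open("document.png").convert("RGB")  # hypothetical scanned page
>>> encoding = processor(image, return_tensors="pt")  # runs OCR, then tokenizes the recognized words
>>> sorted(encoding.keys())  # token ids plus word bounding boxes and the resized image
```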
## AutoModel
Finally, the `AutoModelFor` classes let you load a pretrained model for a given task (see [here](model_doc/auto) for a complete list of available tasks). For example, load a model for sequence classification with [`AutoModelForSequenceClassification.from_pretrained`]:
```py
>>> from transformers import AutoModelForSequenceClassification
>>> model = AutoModelForSequenceClassification.from_pretrained("distilbert-base-uncased")
===PT-TF-SPLIT===
>>> from transformers import TFAutoModelForSequenceClassification
>>> model = TFAutoModelForSequenceClassification.from_pretrained("distilbert-base-uncased")
```
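The loaded model can be used right away with a tokenizer from the same checkpoint. Here is a minimal sketch, assuming the PyTorch model loaded above; since `distilbert-base-uncased` was not fine-tuned for classification, the classification head is freshly initialized and the logits are not yet meaningful:
```py
>>> from transformers import AutoTokenizer

>>> tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
>>> inputs = tokenizer("In a hole in the ground there lived a hobbit.", return_tensors="pt")
>>> outputs = model(**inputs)
>>> outputs.logits.shape  # one row of logits per sequence, one column per label
torch.Size([1, 2])
```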
Easily reuse the same checkpoint to load an architecture for a different task:
```py
>>> from transformers import AutoModelForTokenClassification
>>> model = AutoModelForTokenClassification.from_pretrained("distilbert-base-uncased")
===PT-TF-SPLIT===
>>> from transformers import TFAutoModelForTokenClassification
>>> model = TFAutoModelForTokenClassification.from_pretrained("distilbert-base-uncased")
```
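Whichever `AutoModelFor` class you use, the loaded instance can be written back to disk with `save_pretrained` and reloaded from the local directory later. A minimal sketch for the PyTorch model above, using a hypothetical `./saved_model` directory:
```py
>>> model.save_pretrained("./saved_model")  # writes the config and weights to disk
>>> model = AutoModelForTokenClassification.from_pretrained("./saved_model")
```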
Generally, we recommend using the `AutoTokenizer` class and the `AutoModelFor` class to load pretrained instances of models. This ensures you load the correct architecture every time. In the next [tutorial](preprocessing), learn how to use your newly loaded tokenizer, feature extractor, and processor to preprocess a dataset for fine-tuning.
<!--Copyright 2022 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->
# Pipelines for inference
The [`pipeline`] makes it simple to use any model from the [Model Hub](https://huggingface.co/models) for inference on a variety of tasks such as text generation, image segmentation, and audio classification. Even if you don't have experience with a specific modality or aren't familiar with the code powering the models, you can still use them with the [`pipeline`]! This tutorial will teach you to:
* Use a [`pipeline`] for inference.
* Use a specific tokenizer or model.
* Use a [`pipeline`] for audio and vision tasks.
<Tip>
Take a look at the [`pipeline`] documentation for a complete list of supported tasks.
</Tip>
## Pipeline usage
While each task has an associated [`pipeline`], it is simpler to use the general [`pipeline`] abstraction which contains all the specific task pipelines. The [`pipeline`] automatically loads a default model and tokenizer capable of inference for your task.
1. Start by creating a [`pipeline`] and specify an inference task:
```py
>>> from transformers import pipeline
>>> generator = pipeline(task="text-generation")
```
2. Pass your input text to the [`pipeline`]:
```py
>>> generator("Three Rings for the Elven-kings under the sky, Seven for the Dwarf-lords in their halls of stone")
[{'generated_text': 'Three Rings for the Elven-kings under the sky, Seven for the Dwarf-lords in their halls of stone, Seven for the Iron-priests at the door to the east, and thirteen for the Lord Kings at the end of the mountain'}]
```
If you have more than one input, pass your input as a list:
```py
>>> generator(
... [
... "Three Rings for the Elven-kings under the sky, Seven for the Dwarf-lords in their halls of stone",
... "Nine for Mortal Men, doomed to die, One for the Dark Lord on his dark throne",
... ]
... )
```
Any additional parameters for your task can also be included in the [`pipeline`]. The `text-generation` task has a [`~generation_utils.GenerationMixin.generate`] method with several parameters for controlling the output. For example, if you want to generate more than one output, set the `num_return_sequences` parameter:
```py
>>> generator(
... "Three Rings for the Elven-kings under the sky, Seven for the Dwarf-lords in their halls of stone",
... num_return_sequences=2,
... )
```
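Other [`~generation_utils.GenerationMixin.generate`] parameters are passed the same way. For example, a short sketch capping the total output length with `max_length` (the count includes the prompt tokens):
```py
>>> generator(
...     "Three Rings for the Elven-kings under the sky, Seven for the Dwarf-lords in their halls of stone",
...     max_length=50,  # stop generating once the output reaches 50 tokens, prompt included
... )
```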
### Choose a model and tokenizer
The [`pipeline`] accepts any model from the [Model Hub](https://huggingface.co/models). There are tags on the Model Hub that allow you to filter for a model you'd like to use for your task. Once you've picked an appropriate model, load it with the corresponding `AutoModelFor` and [`AutoTokenizer`] classes. For example, load the [`AutoModelForCausalLM`] class for a causal language modeling task:
```py
>>> from transformers import AutoTokenizer, AutoModelForCausalLM
>>> tokenizer = AutoTokenizer.from_pretrained("distilgpt2")
>>> model = AutoModelForCausalLM.from_pretrained("distilgpt2")
```
Create a [`pipeline`] for your task, and specify the model and tokenizer you've loaded:
```py
>>> from transformers import pipeline
>>> generator = pipeline(task="text-generation", model=model, tokenizer=tokenizer)
```
Pass your input text to the [`pipeline`] to generate some text:
```py
>>> generator("Three Rings for the Elven-kings under the sky, Seven for the Dwarf-lords in their halls of stone")
[{'generated_text': 'Three Rings for the Elven-kings under the sky, Seven for the Dwarf-lords in their halls of stone, Seven for the Dragon-lords (for them to rule in a world ruled by their rulers, and all who live within the realm'}]
```
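If you don't need the model and tokenizer objects for anything else, you can also skip the explicit loading step and pass the checkpoint name directly, letting the [`pipeline`] load both for you:
```py
>>> from transformers import pipeline

>>> generator = pipeline(task="text-generation", model="distilgpt2")  # loads the model and its tokenizer
```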
## Audio pipeline
The flexibility of the [`pipeline`] means it can also be extended to audio tasks.
For example, let's classify the emotion from a short clip of John F. Kennedy's famous ["We choose to go to the Moon"](https://en.wikipedia.org/wiki/We_choose_to_go_to_the_Moon) speech. Find an [audio classification](https://huggingface.co/models?pipeline_tag=audio-classification) model on the Model Hub for emotion recognition and load it in the [`pipeline`]:
```py
>>> from transformers import pipeline
>>> audio_classifier = pipeline(
... task="audio-classification", model="ehcalabres/wav2vec2-lg-xlsr-en-speech-emotion-recognition"
... )
```
Pass the audio file to the [`pipeline`]:
```py
>>> audio_classifier("jfk_moon_speech.wav")
[{'label': 'calm', 'score': 0.13856211304664612},
{'label': 'disgust', 'score': 0.13148026168346405},
{'label': 'happy', 'score': 0.12635163962841034},
{'label': 'angry', 'score': 0.12439591437578201},
{'label': 'fearful', 'score': 0.12404385954141617}]
```
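The five entries above are the pipeline's highest-scoring labels. To trim the list, pass the `top_k` parameter:
```py
>>> audio_classifier("jfk_moon_speech.wav", top_k=2)
[{'label': 'calm', 'score': 0.13856211304664612},
 {'label': 'disgust', 'score': 0.13148026168346405}]
```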
## Vision pipeline
Finally, using a [`pipeline`] for vision tasks is practically identical.
Specify your vision task and pass your image to the classifier. The image can be a link or a local path. For example, what species of cat is shown below?
![pipeline-cat-chonk](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/pipeline-cat-chonk.jpeg)
```py
>>> from transformers import pipeline
>>> vision_classifier = pipeline(task="image-classification")
>>> vision_classifier(
... images="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/pipeline-cat-chonk.jpeg"
... )
[{'label': 'lynx, catamount', 'score': 0.4403027892112732},
{'label': 'cougar, puma, catamount, mountain lion, painter, panther, Felis concolor',
'score': 0.03433405980467796},
{'label': 'snow leopard, ounce, Panthera uncia',
'score': 0.032148055732250214},
{'label': 'Egyptian cat', 'score': 0.02353910356760025},
{'label': 'tiger cat', 'score': 0.023034192621707916}]
```
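The classifier also accepts a list of images and returns one list of predictions per image. A sketch mixing the URL above with a hypothetical local file:
```py
>>> results = vision_classifier(
...     images=[
...         "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/pipeline-cat-chonk.jpeg",
...         "path/to/local_cat.jpeg",  # hypothetical local file; URLs and paths can be mixed
...     ]
... )
>>> len(results)  # one prediction list per input image
2
```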