Unverified Commit d896029e authored by Steven Liu's avatar Steven Liu Committed by GitHub

Add inference section to task guides (#18781)

* 📝 start adding inference section to task guides

*  make style

* 📝 add multiple choice

* add rest of inference sections

* make style

* add compute_metric, push_to_hub, pipeline

* make style

* add updated sequence and token classification

* make style

* make edits in token classification

* add audio classification

* make style

* add asr

* make style

* add image classification

* make style

* add summarization

* make style

* add translation

* make style

* add multiple choice

* add language modeling

* add qa

* make style

* review and edits

* apply reviews

* make style

* fix call to processor

* apply audio reviews

* update to better asr model

* make style
parent 4973d2a0
......@@ -14,9 +14,12 @@ specific language governing permissions and limitations under the License.
<Youtube id="TksaY_FDgnk"/>
Automatic speech recognition (ASR) converts a speech signal to text. It is an example of a sequence-to-sequence task, going from a sequence of audio inputs to textual outputs. Voice assistants like Siri and Alexa utilize ASR models to assist users.
Automatic speech recognition (ASR) converts a speech signal to text, mapping a sequence of audio inputs to text outputs. Virtual assistants like Siri and Alexa use ASR models to help users every day, and there are many other useful user-facing applications like live captioning and note-taking during meetings.
This guide will show you how to fine-tune [Wav2Vec2](https://huggingface.co/facebook/wav2vec2-base) on the [MInDS-14](https://huggingface.co/datasets/PolyAI/minds14) dataset to transcribe audio to text.
This guide will show you how to:
1. Finetune [Wav2Vec2](https://huggingface.co/facebook/wav2vec2-base) on the [MInDS-14](https://huggingface.co/datasets/PolyAI/minds14) dataset to transcribe audio to text.
2. Use your finetuned model for inference.
<Tip>
......@@ -24,17 +27,31 @@ See the automatic speech recognition [task page](https://huggingface.co/tasks/au
</Tip>
Before you begin, make sure you have all the necessary libraries installed:
```bash
pip install transformers datasets evaluate jiwer
```
We encourage you to login to your Hugging Face account so you can upload and share your model with the community. When prompted, enter your token to login:
```py
>>> from huggingface_hub import notebook_login
>>> notebook_login()
```
## Load MInDS-14 dataset
Load the [MInDS-14](https://huggingface.co/datasets/PolyAI/minds14) from the 🤗 Datasets library:
Start by loading a smaller subset of the [MInDS-14](https://huggingface.co/datasets/PolyAI/minds14) dataset from the 🤗 Datasets library. This'll give you a chance to experiment and make sure everything works before spending more time training on the full dataset.
```py
>>> from datasets import load_dataset, Audio
>>> minds = load_dataset("PolyAI/minds14", name="en-US", split="train")
>>> minds = load_dataset("PolyAI/minds14", name="en-US", split="train[:100]")
```
Split this dataset into a train and test set:
Split the dataset's `train` split into a train and test set with the [`~datasets.Dataset.train_test_split`] method:
```py
>>> minds = minds.train_test_split(test_size=0.2)
......@@ -47,16 +64,16 @@ Then take a look at the dataset:
DatasetDict({
train: Dataset({
features: ['path', 'audio', 'transcription', 'english_transcription', 'intent_class', 'lang_id'],
num_rows: 450
num_rows: 16
})
test: Dataset({
features: ['path', 'audio', 'transcription', 'english_transcription', 'intent_class', 'lang_id'],
num_rows: 113
num_rows: 4
})
})
```
While the dataset contains a lot of helpful information, like `lang_id` and `intent_class`, you will focus on the `audio` and `transcription` columns in this guide. Remove the other columns:
While the dataset contains a lot of useful information, like `lang_id` and `english_transcription`, you'll focus on the `audio` and `transcription` in this guide. Remove the other columns with the [`~datasets.Dataset.remove_columns`] method:
```py
>>> minds = minds.remove_columns(["english_transcription", "intent_class", "lang_id"])
......@@ -74,11 +91,14 @@ Take a look at the example again:
'transcription': "hi I'm trying to use the banking app on my phone and currently my checking and savings account balance is not refreshing"}
```
The `audio` column contains a 1-dimensional `array` of the speech signal that must be called to load and resample the audio file.
There are two fields:
- `audio`: a 1-dimensional `array` of the speech signal that must be called to load and resample the audio file.
- `transcription`: the target text.
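If you want to double-check what these fields hold, a quick sanity check could look like the following (a minimal sketch; the exact array length depends on the clip):
```py
>>> # accessing the audio column decodes and loads the file on the fly
>>> example = minds["train"][0]
>>> example["audio"]["array"].shape, example["audio"]["sampling_rate"]
>>> example["transcription"][:50]
```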
## Preprocess
Load the Wav2Vec2 processor to process the audio signal and transcribed text:
The next step is to load a Wav2Vec2 processor to process the audio signal:
```py
>>> from transformers import AutoProcessor
......@@ -86,7 +106,7 @@ Load the Wav2Vec2 processor to process the audio signal and transcribed text:
>>> processor = AutoProcessor.from_pretrained("facebook/wav2vec2-base")
```
The [MInDS-14](https://huggingface.co/datasets/PolyAI/minds14) dataset has a sampling rate of 8000khz. You will need to resample the dataset to use the pretrained Wav2Vec2 model:
The MInDS-14 dataset has a sampling rate of 8000Hz (you can find this information in its [dataset card](https://huggingface.co/datasets/PolyAI/minds14)), which means you'll need to resample the dataset to 16000Hz to use the pretrained Wav2Vec2 model:
```py
>>> minds = minds.cast_column("audio", Audio(sampling_rate=16_000))
......@@ -99,32 +119,38 @@ The [MInDS-14](https://huggingface.co/datasets/PolyAI/minds14) dataset has a sam
'transcription': "hi I'm trying to use the banking app on my phone and currently my checking and savings account balance is not refreshing"}
```
The preprocessing function needs to:
As you can see in the `transcription` above, the text contains a mix of upper and lowercase characters. The Wav2Vec2 tokenizer is only trained on uppercase characters so you'll need to make sure the text matches the tokenizer's vocabulary:
```py
>>> def uppercase(example):
... return {"transcription": example["transcription"].upper()}
>>> minds = minds.map(uppercase)
```
Now create a preprocessing function that:
1. Call the `audio` column to load and resample the audio file.
2. Extract the `input_values` from the audio file.
3. Typically, when you call the processor, you call the feature extractor. Since you also want to tokenize text, instruct the processor to call the tokenizer instead with a context manager.
1. Calls the `audio` column to load and resample the audio file.
2. Extracts the `input_values` from the audio file and tokenizes the `transcription` column with the processor.
```py
>>> def prepare_dataset(batch):
... audio = batch["audio"]
... batch = processor(audio=audio["array"], sampling_rate=audio["sampling_rate"]).input_values[0]
... batch["input_length"] = len(batch["input_values"])
... batch["labels"] = processor(text=batch["transcription"]).input_ids
... batch = processor(audio["array"], sampling_rate=audio["sampling_rate"], text=batch["transcription"])
... batch["input_length"] = len(batch["input_values"][0])
... return batch
```
Use 🤗 Datasets [`~datasets.Dataset.map`] function to apply the preprocessing function over the entire dataset. You can speed up the map function by increasing the number of processes with `num_proc`. Remove the columns you don't need:
To apply the preprocessing function over the entire dataset, use 🤗 Datasets [`~datasets.Dataset.map`] function. You can speed up `map` by increasing the number of processes with the `num_proc` parameter. Remove the columns you don't need with the [`~datasets.Dataset.remove_columns`] method:
```py
>>> encoded_minds = minds.map(prepare_dataset, remove_columns=minds.column_names["train"], num_proc=4)
```
🤗 Transformers doesn't have a data collator for automatic speech recognition, so you will need to create one. You can adapt the [`DataCollatorWithPadding`] to create a batch of examples for automatic speech recognition. It will also dynamically pad your text and labels to the length of the longest element in its batch, so they are a uniform length. While it is possible to pad your text in the `tokenizer` function by setting `padding=True`, dynamic padding is more efficient.
🤗 Transformers doesn't have a data collator for ASR, so you'll need to adapt the [`DataCollatorWithPadding`] to create a batch of examples. It'll also dynamically pad your text and labels to the length of the longest element in its batch (instead of the entire dataset) so they are a uniform length. While it is possible to pad your text in the `tokenizer` function by setting `padding=True`, dynamic padding is more efficient.
Unlike other data collators, this specific data collator needs to apply a different padding method to `input_values` and `labels`. You can apply a different padding method with a context manager:
Unlike other data collators, this specific data collator needs to apply a different padding method to `input_values` and `labels`:
```py
>>> import torch
......@@ -137,12 +163,12 @@ Unlike other data collators, this specific data collator needs to apply a differ
... class DataCollatorCTCWithPadding:
... processor: AutoProcessor
... padding: Union[bool, str] = True
... padding: Union[bool, str] = "longest"
... def __call__(self, features: List[Dict[str, Union[List[int], torch.Tensor]]]) -> Dict[str, torch.Tensor]:
... # split inputs and labels since they have to be of different lengths and need
... # different padding methods
... input_features = [{"input_values": feature["input_values"]} for feature in features]
... input_features = [{"input_values": feature["input_values"][0]} for feature in features]
... label_features = [{"input_ids": feature["labels"]} for feature in features]
... batch = self.processor.pad(input_features, padding=self.padding, return_tensors="pt")
......@@ -157,17 +183,55 @@ Unlike other data collators, this specific data collator needs to apply a differ
... return batch
```
Create a batch of examples and dynamically pad them with `DataCollatorForCTCWithPadding`:
Now instantiate your `DataCollatorCTCWithPadding`:
```py
>>> data_collator = DataCollatorCTCWithPadding(processor=processor, padding=True)
>>> data_collator = DataCollatorCTCWithPadding(processor=processor, padding="longest")
```
## Evaluate
Including a metric during training is often helpful for evaluating your model's performance. You can quickly load an evaluation method with the 🤗 [Evaluate](https://huggingface.co/docs/evaluate/index) library. For this task, load the [word error rate](https://huggingface.co/spaces/evaluate-metric/wer) (WER) metric (see the 🤗 Evaluate [quick tour](https://huggingface.co/docs/evaluate/a_quick_tour) to learn more about how to load and compute a metric):
```py
>>> import evaluate
>>> wer = evaluate.load("wer")
```
Then create a function that passes your predictions and labels to [`~evaluate.EvaluationModule.compute`] to calculate the WER:
```py
>>> import numpy as np
>>> def compute_metrics(pred):
...     pred_logits = pred.predictions
...     pred_ids = np.argmax(pred_logits, axis=-1)
...     pred.label_ids[pred.label_ids == -100] = processor.tokenizer.pad_token_id
...     pred_str = processor.batch_decode(pred_ids)
...     label_str = processor.batch_decode(pred.label_ids, group_tokens=False)
...     # use a new name so the `wer` metric loaded above isn't shadowed inside the function
...     wer_score = wer.compute(predictions=pred_str, references=label_str)
...     return {"wer": wer_score}
```
Your `compute_metrics` function is ready to go now, and you'll return to it when you set up your training.
## Train
<frameworkcontent>
<pt>
Load Wav2Vec2 with [`AutoModelForCTC`]. For `ctc_loss_reduction`, it is often better to use the average instead of the default summation:
<Tip>
If you aren't familiar with finetuning a model with the [`Trainer`], take a look at the basic tutorial [here](../training#train-with-pytorch-trainer)!
</Tip>
You're ready to start training your model now! Load Wav2Vec2 with [`AutoModelForCTC`]. Specify the reduction to apply with the `ctc_loss_reduction` parameter. It is often better to use the average instead of the default summation:
```py
>>> from transformers import AutoModelForCTC, TrainingArguments, Trainer
......@@ -179,30 +243,32 @@ Load Wav2Vec2 with [`AutoModelForCTC`]. For `ctc_loss_reduction`, it is often be
... )
```
<Tip>
If you aren't familiar with fine-tuning a model with the [`Trainer`], take a look at the basic tutorial [here](../training#finetune-with-trainer)!
</Tip>
At this point, only three steps remain:
1. Define your training hyperparameters in [`TrainingArguments`].
2. Pass the training arguments to [`Trainer`] along with the model, datasets, tokenizer, and data collator.
3. Call [`~Trainer.train`] to fine-tune your model.
1. Define your training hyperparameters in [`TrainingArguments`]. The only required parameter is `output_dir` which specifies where to save your model. You'll push this model to the Hub by setting `push_to_hub=True` (you need to be signed in to Hugging Face to upload your model). At the end of each epoch, the [`Trainer`] will evaluate the WER and save the training checkpoint.
2. Pass the training arguments to [`Trainer`] along with the model, dataset, tokenizer, data collator, and `compute_metrics` function.
3. Call [`~Trainer.train`] to finetune your model.
```py
>>> training_args = TrainingArguments(
... output_dir="./results",
... output_dir="my_awesome_asr_mind_model",
... per_device_train_batch_size=8,
... gradient_accumulation_steps=2,
... learning_rate=1e-5,
... warmup_steps=500,
... max_steps=2000,
... gradient_checkpointing=True,
... fp16=True,
... group_by_length=True,
... per_device_train_batch_size=16,
... evaluation_strategy="steps",
... num_train_epochs=3,
... fp16=True,
... gradient_checkpointing=True,
... learning_rate=1e-4,
... weight_decay=0.005,
... save_total_limit=2,
... per_device_eval_batch_size=8,
... save_steps=1000,
... eval_steps=1000,
... logging_steps=25,
... load_best_model_at_end=True,
... metric_for_best_model="wer",
... greater_is_better=False,
... push_to_hub=True,
... )
>>> trainer = Trainer(
......@@ -212,15 +278,89 @@ At this point, only three steps remain:
... eval_dataset=encoded_minds["test"],
... tokenizer=processor.feature_extractor,
... data_collator=data_collator,
... compute_metrics=compute_metrics,
... )
>>> trainer.train()
```
Once training is completed, share your model to the Hub with the [`~transformers.Trainer.push_to_hub`] method so everyone can use your model:
```py
>>> trainer.push_to_hub()
```
</pt>
</frameworkcontent>
<Tip>
For a more in-depth example of how to fine-tune a model for automatic speech recognition, take a look at this blog [post](https://huggingface.co/blog/fine-tune-wav2vec2-english) for English ASR and this [post](https://huggingface.co/blog/fine-tune-xlsr-wav2vec2) for multilingual ASR.
For a more in-depth example of how to finetune a model for automatic speech recognition, take a look at this blog [post](https://huggingface.co/blog/fine-tune-wav2vec2-english) for English ASR and this [post](https://huggingface.co/blog/fine-tune-xlsr-wav2vec2) for multilingual ASR.
</Tip>
## Inference
Great, now that you've finetuned a model, you can use it for inference!
Load an audio file you'd like to run inference on. Remember to resample the audio file to match the model's sampling rate if you need to!
```py
>>> from datasets import load_dataset, Audio
>>> dataset = load_dataset("PolyAI/minds14", "en-US", split="train")
>>> dataset = dataset.cast_column("audio", Audio(sampling_rate=16000))
>>> sampling_rate = dataset.features["audio"].sampling_rate
>>> audio_file = dataset[0]["audio"]["path"]
```
The simplest way to try out your finetuned model for inference is to use it in a [`pipeline`]. Instantiate a `pipeline` for automatic speech recognition with your model, and pass your audio file to it:
```py
>>> from transformers import pipeline
>>> transcriber = pipeline("automatic-speech-recognition", model="stevhliu/my_awesome_asr_minds_model")
>>> transcriber(audio_file)
{'text': 'I WOUD LIKE O SET UP JOINT ACOUNT WTH Y PARTNER'}
```
<Tip>
The transcription is decent, but it could be better! Try finetuning your model on more examples to get even better results!
</Tip>
You can also manually replicate the results of the `pipeline` if you'd like:
<frameworkcontent>
<pt>
Load a processor to preprocess the audio file and return the inputs as PyTorch tensors:
```py
>>> from transformers import AutoProcessor
>>> processor = AutoProcessor.from_pretrained("stevhliu/my_awesome_asr_mind_model")
>>> inputs = processor(dataset[0]["audio"]["array"], sampling_rate=sampling_rate, return_tensors="pt")
```
Pass your inputs to the model and return the logits:
```py
>>> import torch
>>> from transformers import AutoModelForCTC
>>> model = AutoModelForCTC.from_pretrained("stevhliu/my_awesome_asr_mind_model")
>>> with torch.no_grad():
... logits = model(**inputs).logits
```
Get the predicted `input_ids` with the highest probability, and use the processor to decode the predicted `input_ids` back into text:
```py
>>> import torch
>>> predicted_ids = torch.argmax(logits, dim=-1)
>>> transcription = processor.batch_decode(predicted_ids)
>>> transcription
['I WOUL LIKE O SET UP JOINT ACOUNT WTH Y PARTNER']
```
</pt>
</frameworkcontent>
\ No newline at end of file
......@@ -14,9 +14,12 @@ specific language governing permissions and limitations under the License.
<Youtube id="KWwzcmG98Ds"/>
Audio classification assigns a label or class to audio data. It is similar to text classification, except an audio input is continuous and must be discretized, whereas text can be split into tokens. Some practical applications of audio classification include identifying intent, speakers, and even animal species by their sounds.
Audio classification, just like text classification, assigns a class label to the input data. The only difference is that instead of text inputs, you have raw audio waveforms. Some practical applications of audio classification include identifying speaker intent, language classification, and even animal species by their sounds.
This guide will show you how to fine-tune [Wav2Vec2](https://huggingface.co/facebook/wav2vec2-base) on the [MInDS-14](https://huggingface.co/datasets/PolyAI/minds14) to classify intent.
This guide will show you how to:
1. Finetune [Wav2Vec2](https://huggingface.co/facebook/wav2vec2-base) on the [MInDS-14](https://huggingface.co/datasets/PolyAI/minds14) dataset to classify speaker intent.
2. Use your finetuned model for inference.
<Tip>
......@@ -24,9 +27,23 @@ See the audio classification [task page](https://huggingface.co/tasks/audio-clas
</Tip>
Before you begin, make sure you have all the necessary libraries installed:
```bash
pip install transformers datasets evaluate
```
We encourage you to login to your Hugging Face account so you can upload and share your model with the community. When prompted, enter your token to login:
```py
>>> from huggingface_hub import notebook_login
>>> notebook_login()
```
## Load MInDS-14 dataset
Load the [MInDS-14](https://huggingface.co/datasets/PolyAI/minds14) from the 🤗 Datasets library:
Start by loading the MInDS-14 dataset from the 🤗 Datasets library:
```py
>>> from datasets import load_dataset, Audio
......@@ -34,7 +51,7 @@ Load the [MInDS-14](https://huggingface.co/datasets/PolyAI/minds14) from the
>>> minds = load_dataset("PolyAI/minds14", name="en-US", split="train")
```
Split this dataset into a train and test set:
Split the dataset's `train` split into a smaller train and test set with the [`~datasets.Dataset.train_test_split`] method. This'll give you a chance to experiment and make sure everything works before spending more time on the full dataset.
```py
>>> minds = minds.train_test_split(test_size=0.2)
......@@ -56,7 +73,7 @@ DatasetDict({
})
```
While the dataset contains a lot of other useful information, like `lang_id` and `english_transcription`, you will focus on the `audio` and `intent_class` in this guide. Remove the other columns:
While the dataset contains a lot of useful information, like `lang_id` and `english_transcription`, you'll focus on the `audio` and `intent_class` in this guide. Remove the other columns with the [`~datasets.Dataset.remove_columns`] method:
```py
>>> minds = minds.remove_columns(["path", "transcription", "english_transcription", "lang_id"])
......@@ -73,7 +90,12 @@ Take a look at an example now:
'intent_class': 2}
```
The `audio` column contains a 1-dimensional `array` of the speech signal that must be called to load and resample the audio file. The `intent_class` column is an integer that represents the class id of intent. Create a dictionary that maps a label name to an integer and vice versa. The mapping will help the model recover the label name from the label number:
There are two fields:
- `audio`: a 1-dimensional `array` of the speech signal that must be called to load and resample the audio file.
- `intent_class`: represents the class id of the speaker's intent.
To make it easier for the model to get the label name from the label id, create a dictionary that maps the label name to an integer and vice versa:
```py
>>> labels = minds["train"].features["intent_class"].names
......@@ -83,18 +105,16 @@ The `audio` column contains a 1-dimensional `array` of the speech signal that mu
... id2label[str(i)] = label
```
Now you can convert the label number to a label name for more information:
Now you can convert the label id to a label name:
```py
>>> id2label[str(2)]
'app_error'
```
Each keyword - or label - corresponds to a number; `2` indicates `app_error` in the example above.
## Preprocess
Load the Wav2Vec2 feature extractor to process the audio signal:
The next step is to load a Wav2Vec2 feature extractor to process the audio signal:
```py
>>> from transformers import AutoFeatureExtractor
......@@ -102,7 +122,7 @@ Load the Wav2Vec2 feature extractor to process the audio signal:
>>> feature_extractor = AutoFeatureExtractor.from_pretrained("facebook/wav2vec2-base")
```
The [MInDS-14](https://huggingface.co/datasets/PolyAI/minds14) dataset has a sampling rate of 8000khz. You will need to resample the dataset to use the pretrained Wav2Vec2 model:
The MInDS-14 dataset has a sampling rate of 8000Hz (you can find this information in its [dataset card](https://huggingface.co/datasets/PolyAI/minds14)), which means you'll need to resample the dataset to 16000Hz to use the pretrained Wav2Vec2 model:
```py
>>> minds = minds.cast_column("audio", Audio(sampling_rate=16_000))
......@@ -114,11 +134,11 @@ The [MInDS-14](https://huggingface.co/datasets/PolyAI/minds14) dataset has a sam
'intent_class': 2}
```
The preprocessing function needs to:
Now create a preprocessing function that:
1. Call the `audio` column to load and if necessary resample the audio file.
2. Check the sampling rate of the audio file matches the sampling rate of the audio data a model was pretrained with. You can find this information on the Wav2Vec2 [model card](https://huggingface.co/facebook/wav2vec2-base).
3. Set a maximum input length so longer inputs are batched without being truncated.
1. Calls the `audio` column to load, and if necessary, resample the audio file.
2. Checks if the sampling rate of the audio file matches the sampling rate of the audio data a model was pretrained with. You can find this information in the Wav2Vec2 [model card](https://huggingface.co/facebook/wav2vec2-base).
3. Sets a maximum input length to batch longer inputs without truncating them.
```py
>>> def preprocess_function(examples):
......@@ -129,18 +149,46 @@ The preprocessing function needs to:
... return inputs
```
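For reference, a preprocessing function along these lines could look like the sketch below (the 16000-sample cap, one second of 16kHz audio, is an illustrative choice rather than a requirement):
```py
>>> def preprocess_function(examples):
...     # load the resampled audio arrays from the audio column
...     audio_arrays = [x["array"] for x in examples["audio"]]
...     # extract input values, capping each input at 16000 samples (an illustrative maximum length)
...     inputs = feature_extractor(
...         audio_arrays, sampling_rate=feature_extractor.sampling_rate, max_length=16000, truncation=True
...     )
...     return inputs
```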
Use 🤗 Datasets [`~datasets.Dataset.map`] function to apply the preprocessing function over the entire dataset. You can speed up the `map` function by setting `batched=True` to process multiple elements of the dataset at once. Remove the columns you don't need, and rename `intent_class` to `label` because that is what the model expects:
To apply the preprocessing function over the entire dataset, use 🤗 Datasets [`~datasets.Dataset.map`] function. You can speed up `map` by setting `batched=True` to process multiple elements of the dataset at once. Remove the columns you don't need, and rename `intent_class` to `label` because that's the name the model expects:
```py
>>> encoded_minds = minds.map(preprocess_function, remove_columns="audio", batched=True)
>>> encoded_minds = encoded_minds.rename_column("intent_class", "label")
```
## Evaluate
Including a metric during training is often helpful for evaluating your model's performance. You can quickly load an evaluation method with the 🤗 [Evaluate](https://huggingface.co/docs/evaluate/index) library. For this task, load the [accuracy](https://huggingface.co/spaces/evaluate-metric/accuracy) metric (see the 🤗 Evaluate [quick tour](https://huggingface.co/docs/evaluate/a_quick_tour) to learn more about how to load and compute a metric):
```py
>>> import evaluate
>>> accuracy = evaluate.load("accuracy")
```
Then create a function that passes your predictions and labels to [`~evaluate.EvaluationModule.compute`] to calculate the accuracy:
```py
>>> import numpy as np
>>> def compute_metrics(eval_pred):
... predictions = np.argmax(eval_pred.predictions, axis=1)
... return accuracy.compute(predictions=predictions, references=eval_pred.label_ids)
```
Your `compute_metrics` function is ready to go now, and you'll return to it when you set up your training.
## Train
<frameworkcontent>
<pt>
Load Wav2Vec2 with [`AutoModelForAudioClassification`]. Specify the number of labels, and pass the model the mapping between label number and label class:
<Tip>
If you aren't familiar with finetuning a model with the [`Trainer`], take a look at the basic tutorial [here](../training#train-with-pytorch-trainer)!
</Tip>
You're ready to start training your model now! Load Wav2Vec2 with [`AutoModelForAudioClassification`] along with the number of expected labels, and the label mappings:
```py
>>> from transformers import AutoModelForAudioClassification, TrainingArguments, Trainer
......@@ -151,25 +199,28 @@ Load Wav2Vec2 with [`AutoModelForAudioClassification`]. Specify the number of la
... )
```
<Tip>
If you aren't familiar with fine-tuning a model with the [`Trainer`], take a look at the basic tutorial [here](../training#finetune-with-trainer)!
</Tip>
At this point, only three steps remain:
1. Define your training hyperparameters in [`TrainingArguments`].
2. Pass the training arguments to [`Trainer`] along with the model, datasets, and feature extractor.
3. Call [`~Trainer.train`] to fine-tune your model.
1. Define your training hyperparameters in [`TrainingArguments`]. The only required parameter is `output_dir` which specifies where to save your model. You'll push this model to the Hub by setting `push_to_hub=True` (you need to be signed in to Hugging Face to upload your model). At the end of each epoch, the [`Trainer`] will evaluate the accuracy and save the training checkpoint.
2. Pass the training arguments to [`Trainer`] along with the model, dataset, feature extractor, and `compute_metrics` function.
3. Call [`~Trainer.train`] to finetune your model.
```py
>>> training_args = TrainingArguments(
... output_dir="./results",
... output_dir="my_awesome_mind_model",
... evaluation_strategy="epoch",
... save_strategy="epoch",
... learning_rate=3e-5,
... num_train_epochs=5,
... per_device_train_batch_size=32,
... gradient_accumulation_steps=4,
... per_device_eval_batch_size=32,
... num_train_epochs=10,
... warmup_ratio=0.1,
... logging_steps=10,
... load_best_model_at_end=True,
... metric_for_best_model="accuracy",
... push_to_hub=True,
... )
>>> trainer = Trainer(
......@@ -178,15 +229,89 @@ At this point, only three steps remain:
... train_dataset=encoded_minds["train"],
... eval_dataset=encoded_minds["test"],
... tokenizer=feature_extractor,
... compute_metrics=compute_metrics,
... )
>>> trainer.train()
```
Once training is completed, share your model to the Hub with the [`~transformers.Trainer.push_to_hub`] method so everyone can use your model:
```py
>>> trainer.push_to_hub()
```
</pt>
</frameworkcontent>
<Tip>
For a more in-depth example of how to fine-tune a model for audio classification, take a look at the corresponding [PyTorch notebook](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/audio_classification.ipynb).
For a more in-depth example of how to finetune a model for audio classification, take a look at the corresponding [PyTorch notebook](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/audio_classification.ipynb).
</Tip>
## Inference
Great, now that you've finetuned a model, you can use it for inference!
Load an audio file you'd like to run inference on. Remember to resample the audio file to match the model's sampling rate if you need to!
```py
>>> from datasets import load_dataset, Audio
>>> dataset = load_dataset("PolyAI/minds14", name="en-US", split="train")
>>> dataset = dataset.cast_column("audio", Audio(sampling_rate=16000))
>>> sampling_rate = dataset.features["audio"].sampling_rate
>>> audio_file = dataset[0]["audio"]["path"]
```
The simplest way to try out your finetuned model for inference is to use it in a [`pipeline`]. Instantiate a `pipeline` for audio classification with your model, and pass your audio file to it:
```py
>>> from transformers import pipeline
>>> classifier = pipeline("audio-classification", model="stevhliu/my_awesome_minds_model")
>>> classifier(audio_file)
[
{'score': 0.09766869246959686, 'label': 'cash_deposit'},
{'score': 0.07998877018690109, 'label': 'app_error'},
{'score': 0.0781070664525032, 'label': 'joint_account'},
{'score': 0.07667109370231628, 'label': 'pay_bill'},
{'score': 0.0755252093076706, 'label': 'balance'}
]
```
You can also manually replicate the results of the `pipeline` if you'd like:
<frameworkcontent>
<pt>
Load a feature extractor to preprocess the audio file and return the inputs as PyTorch tensors:
```py
>>> from transformers import AutoFeatureExtractor
>>> feature_extractor = AutoFeatureExtractor.from_pretrained("stevhliu/my_awesome_minds_model")
>>> inputs = feature_extractor(dataset[0]["audio"]["array"], sampling_rate=sampling_rate, return_tensors="pt")
```
Pass your inputs to the model and return the logits:
```py
>>> import torch
>>> from transformers import AutoModelForAudioClassification
>>> model = AutoModelForAudioClassification.from_pretrained("stevhliu/my_awesome_minds_model")
>>> with torch.no_grad():
... logits = model(**inputs).logits
```
Get the class with the highest probability, and use the model's `id2label` mapping to convert it to a label:
```py
>>> import torch
>>> predicted_class_ids = torch.argmax(logits).item()
>>> predicted_label = model.config.id2label[predicted_class_ids]
>>> predicted_label
'cash_deposit'
```
</pt>
</frameworkcontent>
\ No newline at end of file
......@@ -12,11 +12,16 @@ specific language governing permissions and limitations under the License.
# Image classification
[[open-in-colab]]
<Youtube id="tjAIM7BOYhw"/>
Image classification assigns a label or class to an image. Unlike text or audio classification, the inputs are the pixel values that represent an image. There are many uses for image classification, like detecting damage after a disaster, monitoring crop health, or helping screen medical images for signs of disease.
Image classification assigns a label or class to an image. Unlike text or audio classification, the inputs are the pixel values that comprise an image. There are many applications for image classification such as detecting damage after a natural disaster, monitoring crop health, or helping screen medical images for signs of disease.
This guide will show you how to:
This guide will show you how to fine-tune [ViT](https://huggingface.co/docs/transformers/v4.16.2/en/model_doc/vit) on the [Food-101](https://huggingface.co/datasets/food101) dataset to classify a food item in an image.
1. Finetune [ViT](https://huggingface.co/docs/transformers/v4.16.2/en/model_doc/vit) on the [Food-101](https://huggingface.co/datasets/food101) dataset to classify a food item in an image.
2. Use your finetuned model for inference.
<Tip>
......@@ -24,9 +29,23 @@ See the image classification [task page](https://huggingface.co/tasks/image-clas
</Tip>
Before you begin, make sure you have all the necessary libraries installed:
```bash
pip install transformers datasets evaluate
```
We encourage you to login to your Hugging Face account so you can upload and share your model with the community. When prompted, enter your token to login:
```py
>>> from huggingface_hub import notebook_login
>>> notebook_login()
```
## Load Food-101 dataset
Load only the first 5000 images of the Food-101 dataset from the 🤗 Datasets library since it is pretty large:
Start by loading a smaller subset of the Food-101 dataset from the 🤗 Datasets library. This'll give you a chance to experiment and make sure everything works before spending more time training on the full dataset.
```py
>>> from datasets import load_dataset
......@@ -34,7 +53,7 @@ Load only the first 5000 images of the Food-101 dataset from the 🤗 Datasets l
>>> food = load_dataset("food101", split="train[:5000]")
```
Split this dataset into a train and test set:
Split the dataset's `train` split into a train and test set with the [`~datasets.Dataset.train_test_split`] method:
```py
>>> food = food.train_test_split(test_size=0.2)
......@@ -48,7 +67,12 @@ Then take a look at an example:
'label': 79}
```
The `image` field contains a PIL image, and each `label` is an integer that represents a class. Create a dictionary that maps a label name to an integer and vice versa. The mapping will help the model recover the label name from the label number:
There are two fields:
- `image`: a PIL image of the food item.
- `label`: the label class of the food item.
To make it easier for the model to get the label name from the label id, create a dictionary that maps the label name to an integer and vice versa:
```py
>>> labels = food["train"].features["label"].names
......@@ -58,18 +82,16 @@ The `image` field contains a PIL image, and each `label` is an integer that repr
... id2label[str(i)] = label
```
Now you can convert the label number to a label name for more information:
Now you can convert the label id to a label name:
```py
>>> id2label[str(79)]
'prime_rib'
```
Each food class - or label - corresponds to a number; `79` indicates a prime rib in the example above.
## Preprocess
Load the ViT feature extractor to process the image into a tensor:
The next step is to load a ViT feature extractor to process the image into a tensor:
```py
>>> from transformers import AutoFeatureExtractor
......@@ -77,7 +99,9 @@ Load the ViT feature extractor to process the image into a tensor:
>>> feature_extractor = AutoFeatureExtractor.from_pretrained("google/vit-base-patch16-224-in21k")
```
Apply several image transformations to the dataset to make the model more robust against overfitting. Here you'll use torchvision's [`transforms`](https://pytorch.org/vision/stable/transforms.html) module. Crop a random part of the image, resize it, and normalize it with the image mean and standard deviation:
Apply some image transformations to the images to make the model more robust against overfitting. Here you'll use torchvision's [`transforms`](https://pytorch.org/vision/stable/transforms.html) module, but you can also use any image library you like.
Crop a random part of the image, resize it, and normalize it with the image mean and standard deviation:
```py
>>> from torchvision.transforms import RandomResizedCrop, Compose, Normalize, ToTensor
......@@ -91,7 +115,7 @@ Apply several image transformations to the dataset to make the model more robust
>>> _transforms = Compose([RandomResizedCrop(size), ToTensor(), normalize])
```
Create a preprocessing function that will apply the transforms and return the `pixel_values` - the inputs to the model - of the image:
Then create a preprocessing function to apply the transforms and return the `pixel_values` - the inputs to the model - of the image:
```py
>>> def transforms(examples):
......@@ -100,13 +124,13 @@ Create a preprocessing function that will apply the transforms and return the `p
... return examples
```
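As a rough sketch, such a function might apply `_transforms` to each image and drop the original `image` column (an illustration only, not necessarily the guide's exact implementation):
```py
>>> def transforms(examples):
...     # apply the torchvision transforms to each image and store the result as pixel_values
...     examples["pixel_values"] = [_transforms(img.convert("RGB")) for img in examples["image"]]
...     # drop the original PIL images since the model only needs pixel_values
...     del examples["image"]
...     return examples
```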
Use 🤗 Dataset's [`~datasets.Dataset.with_transform`] method to apply the transforms over the entire dataset. The transforms are applied on-the-fly when you load an element of the dataset:
To apply the preprocessing function over the entire dataset, use 🤗 Datasets [`~datasets.Dataset.with_transform`] method. The transforms are applied on the fly when you load an element of the dataset:
```py
>>> food = food.with_transform(transforms)
```
Use [`DefaultDataCollator`] to create a batch of examples. Unlike other data collators in 🤗 Transformers, the DefaultDataCollator does not apply additional preprocessing such as padding.
Now create a batch of examples using [`DefaultDataCollator`]. Unlike other data collators in 🤗 Transformers, the `DefaultDataCollator` does not apply additional preprocessing such as padding.
```py
>>> from transformers import DefaultDataCollator
......@@ -114,11 +138,39 @@ Use [`DefaultDataCollator`] to create a batch of examples. Unlike other data col
>>> data_collator = DefaultDataCollator()
```
## Evaluate
Including a metric during training is often helpful for evaluating your model's performance. You can quickly load an evaluation method with the 🤗 [Evaluate](https://huggingface.co/docs/evaluate/index) library. For this task, load the [accuracy](https://huggingface.co/spaces/evaluate-metric/accuracy) metric (see the 🤗 Evaluate [quick tour](https://huggingface.co/docs/evaluate/a_quick_tour) to learn more about how to load and compute a metric):
```py
>>> import evaluate
>>> accuracy = evaluate.load("accuracy")
```
Then create a function that passes your predictions and labels to [`~evaluate.EvaluationModule.compute`] to calculate the accuracy:
```py
>>> import numpy as np
>>> def compute_metrics(eval_pred):
... predictions = np.argmax(eval_pred.predictions, axis=1)
... return accuracy.compute(predictions=predictions, references=eval_pred.label_ids)
```
Your `compute_metrics` function is ready to go now, and you'll return to it when you set up your training.
## Train
<frameworkcontent>
<pt>
Load ViT with [`AutoModelForImageClassification`]. Specify the number of labels, and pass the model the mapping between label number and label class:
<Tip>
If you aren't familiar with finetuning a model with the [`Trainer`], take a look at the basic tutorial [here](../training#train-with-pytorch-trainer)!
</Tip>
You're ready to start training your model now! Load ViT with [`AutoModelForImageClassification`] along with the number of expected labels, and the label mappings:
```py
>>> from transformers import AutoModelForImageClassification, TrainingArguments, Trainer
......@@ -131,31 +183,28 @@ Load ViT with [`AutoModelForImageClassification`]. Specify the number of labels,
... )
```
<Tip>
If you aren't familiar with fine-tuning a model with the [`Trainer`], take a look at the basic tutorial [here](../training#finetune-with-trainer)!
</Tip>
At this point, only three steps remain:
1. Define your training hyperparameters in [`TrainingArguments`]. It is important you don't remove unused columns because this will drop the `image` column. Without the `image` column, you can't create `pixel_values`. Set `remove_unused_columns=False` to prevent this behavior!
2. Pass the training arguments to [`Trainer`] along with the model, datasets, tokenizer, and data collator.
3. Call [`~Trainer.train`] to fine-tune your model.
1. Define your training hyperparameters in [`TrainingArguments`]. It is important you don't remove unused columns because this'll drop the `image` column. Without the `image` column, you can't create `pixel_values`. Set `remove_unused_columns=False` to prevent this behavior! The only other required parameter is `output_dir` which specifies where to save your model. You'll push this model to the Hub by setting `push_to_hub=True` (you need to be signed in to Hugging Face to upload your model). At the end of each epoch, the [`Trainer`] will evaluate the accuracy and save the training checkpoint.
2. Pass the training arguments to [`Trainer`] along with the model, dataset, tokenizer, data collator, and `compute_metrics` function.
3. Call [`~Trainer.train`] to finetune your model.
```py
>>> training_args = TrainingArguments(
... output_dir="./results",
... output_dir="my_awesome_food_model",
... remove_unused_columns=False,
... evaluation_strategy="epoch",
... save_strategy="epoch",
... learning_rate=5e-5,
... per_device_train_batch_size=16,
... evaluation_strategy="steps",
... num_train_epochs=4,
... fp16=True,
... save_steps=100,
... eval_steps=100,
... gradient_accumulation_steps=4,
... per_device_eval_batch_size=16,
... num_train_epochs=3,
... warmup_ratio=0.1,
... logging_steps=10,
... learning_rate=2e-4,
... save_total_limit=2,
... remove_unused_columns=False,
... load_best_model_at_end=True,
... metric_for_best_model="accuracy",
... push_to_hub=True,
... )
>>> trainer = Trainer(
......@@ -165,15 +214,84 @@ At this point, only three steps remain:
... train_dataset=food["train"],
... eval_dataset=food["test"],
... tokenizer=feature_extractor,
... compute_metrics=compute_metrics,
... )
>>> trainer.train()
```
Once training is completed, share your model to the Hub with the [`~transformers.Trainer.push_to_hub`] method so everyone can use your model:
```py
>>> trainer.push_to_hub()
```
</pt>
</frameworkcontent>
<Tip>
For a more in-depth example of how to fine-tune a model for image classification, take a look at the corresponding [PyTorch notebook](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/image_classification.ipynb).
For a more in-depth example of how to finetune a model for image classification, take a look at the corresponding [PyTorch notebook](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/image_classification.ipynb).
</Tip>
## Inference
Great, now that you've finetuned a model, you can use it for inference!
Load an image you'd like to run inference on:
```py
>>> ds = load_dataset("food101", split="validation[:10]")
>>> image = ds["image"][0]
```
<div class="flex justify-center">
<img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/beignets-task-guide.png" alt="image of beignets"/>
</div>
The simplest way to try out your finetuned model for inference is to use it in a [`pipeline`]. Instantiate a `pipeline` for image classification with your model, and pass your image to it:
```py
>>> from transformers import pipeline
>>> classifier = pipeline("image-classification", model="my_awesome_food_model")
>>> classifier(image)
[{'score': 0.35574808716773987, 'label': 'beignets'},
{'score': 0.018057454377412796, 'label': 'chicken_wings'},
{'score': 0.017733804881572723, 'label': 'prime_rib'},
{'score': 0.016335085034370422, 'label': 'bruschetta'},
{'score': 0.0160061065107584, 'label': 'ramen'}]
```
You can also manually replicate the results of the `pipeline` if you'd like:
<frameworkcontent>
<pt>
Load a feature extractor to preprocess the image and return the inputs as PyTorch tensors:
```py
>>> from transformers import AutoFeatureExtractor
>>> import torch
>>> feature_extractor = AutoFeatureExtractor.from_pretrained("my_awesome_food_model")
>>> inputs = feature_extractor(image, return_tensors="pt")
```
Pass your inputs to the model and return the logits:
```py
>>> from transformers import AutoModelForImageClassification
>>> model = AutoModelForImageClassification.from_pretrained("my_awesome_food_model")
>>> with torch.no_grad():
... logits = model(**inputs).logits
```
Get the predicted label with the highest probability, and use the model's `id2label` mapping to convert it to a label:
```py
>>> predicted_label = logits.argmax(-1).item()
>>> model.config.id2label[predicted_label]
'beignets'
```
</pt>
</frameworkcontent>
\ No newline at end of file
......@@ -12,13 +12,30 @@ specific language governing permissions and limitations under the License.
# Multiple choice
A multiple choice task is similar to question answering, except several candidate answers are provided along with a context. The model is trained to select the correct answer from multiple inputs given a context.
A multiple choice task is similar to question answering, except several candidate answers are provided along with a context and the model is trained to select the correct answer.
This guide will show you how to fine-tune [BERT](https://huggingface.co/bert-base-uncased) on the `regular` configuration of the [SWAG](https://huggingface.co/datasets/swag) dataset to select the best answer given multiple options and some context.
This guide will show you how to:
1. Finetune [BERT](https://huggingface.co/bert-base-uncased) on the `regular` configuration of the [SWAG](https://huggingface.co/datasets/swag) dataset to select the best answer given multiple options and some context.
2. Use your finetuned model for inference.
Before you begin, make sure you have all the necessary libraries installed:
```bash
pip install transformers datasets evaluate
```
We encourage you to login to your Hugging Face account so you can upload and share your model with the community. When prompted, enter your token to login:
```py
>>> from huggingface_hub import notebook_login
>>> notebook_login()
```
## Load SWAG dataset
Load the SWAG dataset from the 🤗 Datasets library:
Start by loading the `regular` configuration of the SWAG dataset from the 🤗 Datasets library:
```py
>>> from datasets import load_dataset
......@@ -43,11 +60,15 @@ Then take a look at an example:
'video-id': 'anetv_jkn6uvmqwh4'}
```
The `sent1` and `sent2` fields show how a sentence begins, and each `ending` field shows how a sentence could end. Given the sentence beginning, the model must pick the correct sentence ending as indicated by the `label` field.
While it looks like there are a lot of fields here, it is actually pretty straightforward:
- `sent1` and `sent2`: these fields show how a sentence starts, and if you put the two together, you get the `startphrase` field.
- `ending`: suggests a possible ending for the sentence, but only one of them is correct.
- `label`: identifies the correct sentence ending.
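To see how these fields fit together, you could combine the sentence start with each candidate ending (a minimal sketch, assuming the `swag` dataset loaded above):
```py
>>> example = swag["train"][0]
>>> # combine the shared sentence start with each of the four candidate endings
>>> endings = [example[f"ending{i}"] for i in range(4)]
>>> candidates = [example["sent2"] + " " + ending for ending in endings]
>>> candidates[example["label"]]  # the ending marked as correct
```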
## Preprocess
Load the BERT tokenizer to process the start of each sentence and the four possible endings:
The next step is to load a BERT tokenizer to process the sentence starts and the four possible endings:
```py
>>> from transformers import AutoTokenizer
......@@ -55,9 +76,9 @@ Load the BERT tokenizer to process the start of each sentence and the four possi
>>> tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
```
The preprocessing function needs to do:
The preprocessing function you want to create needs to:
1. Make four copies of the `sent1` field so you can combine each of them with `sent2` to recreate how a sentence starts.
1. Make four copies of the `sent1` field and combine each of them with `sent2` to recreate how a sentence starts.
2. Combine `sent2` with each of the four possible sentence endings.
3. Flatten these two lists so you can tokenize them, and then unflatten them afterward so each example has a corresponding `input_ids`, `attention_mask`, and `labels` field.
......@@ -79,15 +100,15 @@ The preprocessing function needs to do:
... return {k: [v[i : i + 4] for i in range(0, len(v), 4)] for k, v in tokenized_examples.items()}
```
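Put together, one way such a preprocessing function could look is sketched below (an illustration consistent with the steps above; `ending0` through `ending3` are the dataset's ending column names):
```py
>>> ending_names = ["ending0", "ending1", "ending2", "ending3"]
>>> def preprocess_function(examples):
...     # 1. repeat each sentence start four times, once per candidate ending
...     first_sentences = [[context] * 4 for context in examples["sent1"]]
...     # 2. combine sent2 with each possible ending
...     second_sentences = [
...         [f"{header} {examples[end][i]}" for end in ending_names] for i, header in enumerate(examples["sent2"])
...     ]
...     # 3. flatten both lists, tokenize, then unflatten so each example keeps its four choices
...     first_sentences = sum(first_sentences, [])
...     second_sentences = sum(second_sentences, [])
...     tokenized_examples = tokenizer(first_sentences, second_sentences, truncation=True)
...     return {k: [v[i : i + 4] for i in range(0, len(v), 4)] for k, v in tokenized_examples.items()}
```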
Use 🤗 Datasets [`~datasets.Dataset.map`] function to apply the preprocessing function over the entire dataset. You can speed up the `map` function by setting `batched=True` to process multiple elements of the dataset at once:
To apply the preprocessing function over the entire dataset, use 🤗 Datasets [`~datasets.Dataset.map`] method. You can speed up the `map` function by setting `batched=True` to process multiple elements of the dataset at once:
```py
tokenized_swag = swag.map(preprocess_function, batched=True)
```
🤗 Transformers doesn't have a data collator for multiple choice, so you will need to create one. You can adapt the [`DataCollatorWithPadding`] to create a batch of examples for multiple choice. It will also *dynamically pad* your text and labels to the length of the longest element in its batch, so they are a uniform length. While it is possible to pad your text in the `tokenizer` function by setting `padding=True`, dynamic padding is more efficient.
🤗 Transformers doesn't have a data collator for multiple choice, so you'll need to adapt the [`DataCollatorWithPadding`] to create a batch of examples. It's more efficient to *dynamically pad* the sentences to the longest length in a batch during collation, instead of padding the whole dataset to the maximum length.
`DataCollatorForMultipleChoice` will flatten all the model inputs, apply padding, and then unflatten the results:
`DataCollatorForMultipleChoice` flattens all the model inputs, applies padding, and then unflattens the results:
<frameworkcontent>
<pt>
......@@ -176,39 +197,65 @@ tokenized_swag = swag.map(preprocess_function, batched=True)
</tf>
</frameworkcontent>
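For reference, here is a rough PyTorch sketch of the flatten, pad, then unflatten pattern such a collator implements (a simplified, hypothetical `SimpleMultipleChoiceCollator`, not necessarily identical to the guide's `DataCollatorForMultipleChoice`):
```py
>>> from dataclasses import dataclass
>>> from typing import Union
>>> import torch
>>> from transformers.tokenization_utils_base import PreTrainedTokenizerBase
>>> @dataclass
... class SimpleMultipleChoiceCollator:
...     # flatten (batch, num_choices) inputs, pad them together, then restore the choice dimension
...     tokenizer: PreTrainedTokenizerBase
...     padding: Union[bool, str] = True
...
...     def __call__(self, features):
...         label_name = "label" if "label" in features[0] else "labels"
...         labels = [feature.pop(label_name) for feature in features]
...         batch_size = len(features)
...         num_choices = len(features[0]["input_ids"])
...         # flatten: one entry per (example, choice) pair
...         flattened = [{k: v[i] for k, v in f.items()} for f in features for i in range(num_choices)]
...         # pad every flattened entry to the longest sequence in the batch
...         batch = self.tokenizer.pad(flattened, padding=self.padding, return_tensors="pt")
...         # unflatten back to (batch_size, num_choices, seq_len) and reattach the labels
...         batch = {k: v.view(batch_size, num_choices, -1) for k, v in batch.items()}
...         batch["labels"] = torch.tensor(labels, dtype=torch.int64)
...         return batch
```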
## Train
## Evaluate
<frameworkcontent>
<pt>
Load BERT with [`AutoModelForMultipleChoice`]:
Including a metric during training is often helpful for evaluating your model's performance. You can quickly load an evaluation method with the 🤗 [Evaluate](https://huggingface.co/docs/evaluate/index) library. For this task, load the [accuracy](https://huggingface.co/spaces/evaluate-metric/accuracy) metric (see the 🤗 Evaluate [quick tour](https://huggingface.co/docs/evaluate/a_quick_tour) to learn more about how to load and compute a metric):
```py
>>> from transformers import AutoModelForMultipleChoice, TrainingArguments, Trainer
>>> import evaluate
>>> model = AutoModelForMultipleChoice.from_pretrained("bert-base-uncased")
>>> accuracy = evaluate.load("accuracy")
```
Then create a function that passes your predictions and labels to [`~evaluate.EvaluationModule.compute`] to calculate the accuracy:
```py
>>> import numpy as np
>>> def compute_metrics(eval_pred):
... predictions, labels = eval_pred
... predictions = np.argmax(predictions, axis=1)
... return accuracy.compute(predictions=predictions, references=labels)
```
Your `compute_metrics` function is ready to go now, and you'll return to it when you set up your training.
## Train
<frameworkcontent>
<pt>
<Tip>
If you aren't familiar with fine-tuning a model with Trainer, take a look at the basic tutorial [here](../training#finetune-with-trainer)!
If you aren't familiar with finetuning a model with the [`Trainer`], take a look at the basic tutorial [here](../training#train-with-pytorch-trainer)!
</Tip>
You're ready to start training your model now! Load BERT with [`AutoModelForMultipleChoice`]:
```py
>>> from transformers import AutoModelForMultipleChoice, TrainingArguments, Trainer
>>> model = AutoModelForMultipleChoice.from_pretrained("bert-base-uncased")
```
At this point, only three steps remain:
1. Define your training hyperparameters in [`TrainingArguments`].
2. Pass the training arguments to [`Trainer`] along with the model, dataset, tokenizer, and data collator.
3. Call [`~Trainer.train`] to fine-tune your model.
1. Define your training hyperparameters in [`TrainingArguments`]. The only required parameter is `output_dir` which specifies where to save your model. You'll push this model to the Hub by setting `push_to_hub=True` (you need to be signed in to Hugging Face to upload your model). At the end of each epoch, the [`Trainer`] will evaluate the accuracy and save the training checkpoint.
2. Pass the training arguments to [`Trainer`] along with the model, dataset, tokenizer, data collator, and `compute_metrics` function.
3. Call [`~Trainer.train`] to finetune your model.
```py
>>> training_args = TrainingArguments(
... output_dir="./results",
... output_dir="my_awesome_swag_model",
... evaluation_strategy="epoch",
... save_strategy="epoch",
... load_best_model_at_end=True,
... learning_rate=5e-5,
... per_device_train_batch_size=16,
... per_device_eval_batch_size=16,
... num_train_epochs=3,
... weight_decay=0.01,
... push_to_hub=True,
... )
>>> trainer = Trainer(
......@@ -218,13 +265,44 @@ At this point, only three steps remain:
... eval_dataset=tokenized_swag["validation"],
... tokenizer=tokenizer,
... data_collator=DataCollatorForMultipleChoice(tokenizer=tokenizer),
... compute_metrics=compute_metrics,
... )
>>> trainer.train()
```
Once training is completed, share your model to the Hub with the [`~transformers.Trainer.push_to_hub`] method so everyone can use your model:
```py
>>> trainer.push_to_hub()
```
</pt>
<tf>
To fine-tune a model in TensorFlow, start by converting your datasets to the `tf.data.Dataset` format with [`~TFPreTrainedModel.prepare_tf_dataset`].
<Tip>
If you aren't familiar with finetuning a model with Keras, take a look at the basic tutorial [here](../training#train-a-tensorflow-model-with-keras)!
</Tip>
To finetune a model in TensorFlow, start by setting up an optimizer function, learning rate schedule, and some training hyperparameters:
```py
>>> from transformers import create_optimizer
>>> batch_size = 16
>>> num_train_epochs = 2
>>> total_train_steps = (len(tokenized_swag["train"]) // batch_size) * num_train_epochs
>>> optimizer, schedule = create_optimizer(init_lr=5e-5, num_warmup_steps=0, num_train_steps=total_train_steps)
```
Then you can load BERT with [`TFAutoModelForMultipleChoice`]:
```py
>>> from transformers import TFAutoModelForMultipleChoice
>>> model = TFAutoModelForMultipleChoice.from_pretrained("bert-base-uncased")
```
Convert your datasets to the `tf.data.Dataset` format with [`~transformers.TFPreTrainedModel.prepare_tf_dataset`]:
```py
>>> data_collator = DataCollatorForMultipleChoice(tokenizer=tokenizer)
......@@ -243,41 +321,127 @@ To fine-tune a model in TensorFlow, start by converting your datasets to the `tf
... )
```
Configure the model for training with [`compile`](https://keras.io/api/models/model_training_apis/#compile-method):
```py
>>> model.compile(optimizer=optimizer)
```
The last two things to set up before you start training are to compute the accuracy from the predictions, and to provide a way to push your model to the Hub. Both are done by using [Keras callbacks](./main_classes/keras_callbacks).
Pass your `compute_metrics` function to [`~transformers.KerasMetricCallback`]:
```py
>>> from transformers.keras_callbacks import KerasMetricCallback
>>> metric_callback = KerasMetricCallback(metric_fn=compute_metrics, eval_dataset=tf_validation_set)
```
Specify where to push your model and tokenizer in the [`~transformers.PushToHubCallback`]:
```py
>>> from transformers.keras_callbacks import PushToHubCallback
>>> push_to_hub_callback = PushToHubCallback(
... output_dir="my_awesome_model",
... tokenizer=tokenizer,
... )
```
Then bundle your callbacks together:
```py
>>> callbacks = [metric_callback, push_to_hub_callback]
```
Finally, you're ready to start training your model! Call [`fit`](https://keras.io/api/models/model_training_apis/#fit-method) with your training and validation datasets, the number of epochs, and your callbacks to finetune the model:
```py
>>> model.fit(x=tf_train_set, validation_data=tf_validation_set, epochs=2, callbacks=callbacks)
```
Once training is completed, your model is automatically uploaded to the Hub so everyone can use it!
</tf>
</frameworkcontent>
<Tip>
If you aren't familiar with fine-tuning a model with Keras, take a look at the basic tutorial [here](training#finetune-with-keras)!
For a more in-depth example of how to finetune a model for multiple choice, take a look at the corresponding
[PyTorch notebook](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/multiple_choice.ipynb)
or [TensorFlow notebook](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/multiple_choice-tf.ipynb).
</Tip>
Set up an optimizer function, learning rate schedule, and some training hyperparameters:
## Inference
Great, now that you've finetuned a model, you can use it for inference!
Come up with some text and two candidate answers:
```py
>>> from transformers import create_optimizer
>>> prompt = "France has a bread law, Le Décret Pain, with strict rules on what is allowed in a traditional baguette."
>>> candidate1 = "The law does not apply to croissants and brioche."
>>> candidate2 = "The law applies to baguettes."
```
>>> batch_size = 16
>>> num_train_epochs = 2
>>> total_train_steps = (len(tokenized_swag["train"]) // batch_size) * num_train_epochs
>>> optimizer, schedule = create_optimizer(init_lr=5e-5, num_warmup_steps=0, num_train_steps=total_train_steps)
<frameworkcontent>
<pt>
Tokenize each prompt and candidate answer pair and return PyTorch tensors. You should also create some `labels`:
```py
>>> from transformers import AutoTokenizer
>>> import torch
>>> tokenizer = AutoTokenizer.from_pretrained("my_awesome_swag_model")
>>> inputs = tokenizer([[prompt, candidate1], [prompt, candidate2]], return_tensors="pt", padding=True)
>>> labels = torch.tensor(0).unsqueeze(0)  # label 0 marks the first candidate as the correct answer
```
Load BERT with [`TFAutoModelForMultipleChoice`]:
Pass your inputs and labels to the model and return the `logits`:
```py
>>> from transformers import TFAutoModelForMultipleChoice
>>> from transformers import AutoModelForMultipleChoice
>>> model = TFAutoModelForMultipleChoice.from_pretrained("bert-base-uncased")
>>> model = AutoModelForMultipleChoice.from_pretrained("my_awesome_swag_model")
>>> outputs = model(**{k: v.unsqueeze(0) for k, v in inputs.items()}, labels=labels)
>>> logits = outputs.logits
```
Configure the model for training with [`compile`](https://keras.io/api/models/model_training_apis/#compile-method):
Get the class with the highest probability:
```py
>>> model.compile(optimizer=optimizer)
>>> predicted_class = logits.argmax().item()
>>> predicted_class
0
```
</pt>
<tf>
Tokenize each prompt and candidate answer pair and return TensorFlow tensors:
```py
>>> from transformers import AutoTokenizer
>>> tokenizer = AutoTokenizer.from_pretrained("my_awesome_swag_model")
>>> inputs = tokenizer([[prompt, candidate1], [prompt, candidate2]], return_tensors="tf", padding=True)
```
Pass your inputs to the model and return the `logits`:
```py
>>> import tensorflow as tf
>>> from transformers import TFAutoModelForMultipleChoice
>>> model = TFAutoModelForMultipleChoice.from_pretrained("my_awesome_swag_model")
>>> inputs = {k: tf.expand_dims(v, 0) for k, v in inputs.items()}
>>> outputs = model(inputs)
>>> logits = outputs.logits
```
Call [`fit`](https://keras.io/api/models/model_training_apis/#fit-method) to fine-tune the model:
Get the class with the highest probability:
```py
>>> model.fit(x=tf_train_set, validation_data=tf_validation_set, epochs=2)
>>> predicted_class = int(tf.math.argmax(logits, axis=-1)[0])
>>> predicted_class
0
```
</tf>
</frameworkcontent>
\ No newline at end of file
......@@ -12,14 +12,19 @@ specific language governing permissions and limitations under the License.
# Question answering
[[open-in-colab]]
<Youtube id="ajPx5LwJD-I"/>
Question answering tasks return an answer given a question. There are two common forms of question answering:
Question answering tasks return an answer given a question. If you've ever asked a virtual assistant like Alexa, Siri or Google what the weather is, then you've used a question answering model before. There are two common types of question answering tasks:
- Extractive: extract the answer from the given context.
- Abstractive: generate an answer from the context that correctly answers the question.
This guide will show you how to fine-tune [DistilBERT](https://huggingface.co/distilbert-base-uncased) on the [SQuAD](https://huggingface.co/datasets/squad) dataset for extractive question answering.
This guide will show you how to:
1. Finetune [DistilBERT](https://huggingface.co/distilbert-base-uncased) on the [SQuAD](https://huggingface.co/datasets/squad) dataset for extractive question answering.
2. Use your finetuned model for inference.
<Tip>
......@@ -27,14 +32,34 @@ See the question answering [task page](https://huggingface.co/tasks/question-ans
</Tip>
Before you begin, make sure you have all the necessary libraries installed:
```bash
pip install transformers datasets evaluate
```
We encourage you to login to your Hugging Face account so you can upload and share your model with the community. When prompted, enter your token to login:
```py
>>> from huggingface_hub import notebook_login
>>> notebook_login()
```
## Load SQuAD dataset
Load the SQuAD dataset from the 🤗 Datasets library:
Start by loading a smaller subset of the SQuAD dataset from the 🤗 Datasets library. This'll give you a chance to experiment and make sure everything works before spending more time training on the full dataset.
```py
>>> from datasets import load_dataset
>>> squad = load_dataset("squad")
>>> squad = load_dataset("squad", split="train[:5000]")
```
Split the dataset's `train` split into a train and test set with the [`~datasets.Dataset.train_test_split`] method:
```py
>>> squad = squad.train_test_split(test_size=0.2)
```
Then take a look at an example:
......@@ -49,13 +74,17 @@ Then take a look at an example:
}
```
The `answers` field is a dictionary containing the starting position of the answer and the `text` of the answer.
There are several important fields here:
- `answers`: the starting character position of the answer in the `context`, and the text of the answer (see the quick check below).
- `context`: background information from which the model needs to extract the answer.
- `question`: the question a model should answer.
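For example, you can check that `answer_start` indexes directly into `context` (a quick sanity check on one training example from the dataset you loaded above):

```py
>>> example = squad["train"][0]
>>> answers = example["answers"]
>>> start = answers["answer_start"][0]
>>> example["context"][start : start + len(answers["text"][0])]  # identical to answers["text"][0]
```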
## Preprocess
<Youtube id="qgaM0weJHpA"/>
Load the DistilBERT tokenizer to process the `question` and `context` fields:
The next step is to load a DistilBERT tokenizer to process the `question` and `context` fields:
```py
>>> from transformers import AutoTokenizer
......@@ -63,15 +92,15 @@ Load the DistilBERT tokenizer to process the `question` and `context` fields:
>>> tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
```
There are a few preprocessing steps particular to question answering that you should be aware of:
There are a few preprocessing steps particular to question answering tasks you should be aware of:
1. Some examples in a dataset may have a very long `context` that exceeds the maximum input length of the model. Truncate only the `context` by setting `truncation="only_second"`.
1. Some examples in a dataset may have a very long `context` that exceeds the maximum input length of the model. To deal with longer sequences, truncate only the `context` by setting `truncation="only_second"`.
2. Next, map the start and end positions of the answer to the original `context` by setting
`return_offsets_mapping=True`.
3. With the mapping in hand, you can find the start and end tokens of the answer. Use the [`sequence_ids`](https://huggingface.co/docs/tokenizers/python/latest/api/reference.html#tokenizers.Encoding.sequence_ids) method to
3. With the mapping in hand, now you can find the start and end tokens of the answer. Use the [`~tokenizers.Encoding.sequence_ids`] method to
find which part of the offset corresponds to the `question` and which corresponds to the `context` (see the short sketch below).
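Before writing the full preprocessing function, it can help to look at what these options return for a single example (a short sketch using the tokenizer and dataset loaded above; the slice sizes are arbitrary):

```py
>>> example = squad["train"][0]
>>> encoded = tokenizer(
...     example["question"],
...     example["context"],
...     truncation="only_second",
...     return_offsets_mapping=True,
... )
>>> encoded.sequence_ids()[:8]  # None for special tokens, 0 for `question` tokens, 1 for `context` tokens
>>> encoded["offset_mapping"][:8]  # (start, end) character positions in the original text
```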
Here is how you can create a function to truncate and map the start and end tokens of the answer to the `context`:
Here is how you can create a function to truncate and map the start and end tokens of the `answer` to the `context`:
```py
>>> def preprocess_function(examples):
......@@ -126,13 +155,13 @@ Here is how you can create a function to truncate and map the start and end toke
... return inputs
```
Use 🤗 Datasets [`~datasets.Dataset.map`] function to apply the preprocessing function over the entire dataset. You can speed up the `map` function by setting `batched=True` to process multiple elements of the dataset at once. Remove the columns you don't need:
To apply the preprocessing function over the entire dataset, use 🤗 Datasets [`~datasets.Dataset.map`] function. You can speed up the `map` function by setting `batched=True` to process multiple elements of the dataset at once. Remove any columns you don't need:
```py
>>> tokenized_squad = squad.map(preprocess_function, batched=True, remove_columns=squad["train"].column_names)
```
Use [`DefaultDataCollator`] to create a batch of examples. Unlike other data collators in 🤗 Transformers, the `DefaultDataCollator` does not apply additional preprocessing such as padding.
Now create a batch of examples using [`DefaultDataCollator`]. Unlike other data collators in 🤗 Transformers, the [`DefaultDataCollator`] does not apply any additional preprocessing such as padding.
<frameworkcontent>
<pt>
......@@ -155,7 +184,12 @@ Use [`DefaultDataCollator`] to create a batch of examples. Unlike other data col
<frameworkcontent>
<pt>
Load DistilBERT with [`AutoModelForQuestionAnswering`]:
<Tip>
If you aren't familiar with finetuning a model with the [`Trainer`], take a look at the basic tutorial [here](../training#train-with-pytorch-trainer)!
</Tip>
You're ready to start training your model now! Load DistilBERT with [`AutoModelForQuestionAnswering`]:
```py
>>> from transformers import AutoModelForQuestionAnswering, TrainingArguments, Trainer
......@@ -163,67 +197,49 @@ Load DistilBERT with [`AutoModelForQuestionAnswering`]:
>>> model = AutoModelForQuestionAnswering.from_pretrained("distilbert-base-uncased")
```
<Tip>
If you aren't familiar with fine-tuning a model with the [`Trainer`], take a look at the basic tutorial [here](../training#finetune-with-trainer)!
</Tip>
At this point, only three steps remain:
1. Define your training hyperparameters in [`TrainingArguments`].
1. Define your training hyperparameters in [`TrainingArguments`]. The only required parameter is `output_dir` which specifies where to save your model. You'll push this model to the Hub by setting `push_to_hub=True` (you need to be signed in to Hugging Face to upload your model).
2. Pass the training arguments to [`Trainer`] along with the model, dataset, tokenizer, and data collator.
3. Call [`~Trainer.train`] to fine-tune your model.
3. Call [`~Trainer.train`] to finetune your model.
```py
>>> training_args = TrainingArguments(
... output_dir="./results",
... output_dir="my_awesome_qa_model",
... evaluation_strategy="epoch",
... learning_rate=2e-5,
... per_device_train_batch_size=16,
... per_device_eval_batch_size=16,
... num_train_epochs=3,
... weight_decay=0.01,
... push_to_hub=True,
... )
>>> trainer = Trainer(
... model=model,
... args=training_args,
... train_dataset=tokenized_squad["train"],
... eval_dataset=tokenized_squad["validation"],
... eval_dataset=tokenized_squad["test"],
... tokenizer=tokenizer,
... data_collator=data_collator,
... )
>>> trainer.train()
```
</pt>
<tf>
To fine-tune a model in TensorFlow, start by converting your datasets to the `tf.data.Dataset` format with [`~TFPreTrainedModel.prepare_tf_dataset`].
```py
>>> tf_train_set = model.prepare_tf_dataset(
... tokenized_squad["train"],
... shuffle=True,
... batch_size=16,
... collate_fn=data_collator,
... )
Once training is completed, share your model to the Hub with the [`~transformers.Trainer.push_to_hub`] method so everyone can use your model:
>>> tf_validation_set = model.prepare_tf_dataset(
... tokenized_squad["validation"],
... shuffle=False,
... batch_size=16,
... collate_fn=data_collator,
... )
```py
>>> trainer.push_to_hub()
```
</pt>
<tf>
<Tip>
If you aren't familiar with fine-tuning a model with Keras, take a look at the basic tutorial [here](training#finetune-with-keras)!
If you aren't familiar with finetuning a model with Keras, take a look at the basic tutorial [here](../training#train-a-tensorflow-model-with-keras)!
</Tip>
Set up an optimizer function, learning rate schedule, and some training hyperparameters:
To finetune a model in TensorFlow, start by setting up an optimizer function, learning rate schedule, and some training hyperparameters:
```py
>>> from transformers import create_optimizer
......@@ -238,7 +254,7 @@ Set up an optimizer function, learning rate schedule, and some training hyperpar
... )
```
Load DistilBERT with [`TFAutoModelForQuestionAnswering`]:
Then you can load DistilBERT with [`TFAutoModelForQuestionAnswering`]:
```py
>>> from transformers import TFAutoModelForQuestionAnswering
......@@ -246,6 +262,24 @@ Load DistilBERT with [`TFAutoModelForQuestionAnswering`]:
>>> model = TFAutoModelForQuestionAnswering.from_pretrained("distilbert-base-uncased")
```
Convert your datasets to the `tf.data.Dataset` format with [`~transformers.TFPreTrainedModel.prepare_tf_dataset`]:
```py
>>> tf_train_set = model.prepare_tf_dataset(
... tokenized_squad["train"],
... shuffle=True,
... batch_size=16,
... collate_fn=data_collator,
... )
>>> tf_validation_set = model.prepare_tf_dataset(
... tokenized_squad["test"],
... shuffle=False,
... batch_size=16,
... collate_fn=data_collator,
... )
```
Configure the model for training with [`compile`](https://keras.io/api/models/model_training_apis/#compile-method):
```py
......@@ -254,18 +288,134 @@ Configure the model for training with [`compile`](https://keras.io/api/models/mo
>>> model.compile(optimizer=optimizer)
```
Call [`fit`](https://keras.io/api/models/model_training_apis/#fit-method) to fine-tune the model:
The last thing to set up before you start training is to provide a way to push your model to the Hub. This can be done by specifying where to push your model and tokenizer in the [`~transformers.PushToHubCallback`]:
```py
>>> from transformers.keras_callbacks import PushToHubCallback
>>> callback = PushToHubCallback(
... output_dir="my_awesome_qa_model",
... tokenizer=tokenizer,
... )
```
Finally, you're ready to start training your model! Call [`fit`](https://keras.io/api/models/model_training_apis/#fit-method) with your training and validation datasets, the number of epochs, and your callback to finetune the model:
```py
>>> model.fit(x=tf_train_set, validation_data=tf_validation_set, epochs=3)
>>> model.fit(x=tf_train_set, validation_data=tf_validation_set, epochs=3, callbacks=[callback])
```
Once training is completed, your model is automatically uploaded to the Hub so everyone can use it!
</tf>
</frameworkcontent>
<Tip>
For a more in-depth example of how to fine-tune a model for question answering, take a look at the corresponding
For a more in-depth example of how to finetune a model for question answering, take a look at the corresponding
[PyTorch notebook](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/question_answering.ipynb)
or [TensorFlow notebook](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/question_answering-tf.ipynb).
</Tip>
## Evaluate
Evaluation for question answering requires a significant amount of postprocessing. To avoid taking up too much of your time, this guide skips the evaluation step. The [`Trainer`] still calculates the evaluation loss during training so you're not completely in the dark about your model's performance.
If you have more time and you're interested in how to evaluate your model for question answering, take a look at the [Question answering](https://huggingface.co/course/chapter7/7?fw=pt#postprocessing) chapter from the 🤗 Hugging Face Course!
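If you'd still like a rough idea of what that evaluation looks like, the [SQuAD metric](https://huggingface.co/spaces/evaluate-metric/squad) from 🤗 Evaluate compares predicted answer strings against the reference answers. Here is a minimal sketch with a made-up id and a hand-written prediction instead of real model outputs:

```py
>>> import evaluate
>>> squad_metric = evaluate.load("squad")
>>> predictions = [{"id": "example-1", "prediction_text": "Denver Broncos"}]
>>> references = [{"id": "example-1", "answers": {"text": ["Denver Broncos"], "answer_start": [177]}}]
>>> squad_metric.compute(predictions=predictions, references=references)
{'exact_match': 100.0, 'f1': 100.0}
```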
## Inference
Great, now that you've finetuned a model, you can use it for inference!
Come up with a question and some context you'd like the model to predict:
```py
>>> question = "How many programming languages does BLOOM support?"
>>> context = "BLOOM has 176 billion parameters and can generate text in 46 languages natural languages and 13 programming languages."
```
The simplest way to try out your finetuned model for inference is to use it in a [`pipeline`]. Instantiate a `pipeline` for question answering with your model, and pass your text to it:
```py
>>> from transformers import pipeline
>>> question_answerer = pipeline("question-answering", model="my_awesome_qa_model")
>>> question_answerer(question=question, context=context)
{'score': 0.2058267742395401,
'start': 10,
'end': 95,
'answer': '176 billion parameters and can generate text in 46 languages natural languages and 13'}
```
You can also manually replicate the results of the `pipeline` if you'd like:
<frameworkcontent>
<pt>
Tokenize the text and return PyTorch tensors:
```py
>>> from transformers import AutoTokenizer
>>> tokenizer = AutoTokenizer.from_pretrained("my_awesome_qa_model")
>>> inputs = tokenizer(question, context, return_tensors="pt")
```
Pass your inputs to the model and return the `logits`:
```py
>>> import torch
>>> from transformers import AutoModelForQuestionAnswering
>>> model = AutoModelForQuestionAnswering.from_pretrained("my_awesome_qa_model")
>>> with torch.no_grad():
... outputs = model(**inputs)
```
Get the highest probability from the model output for the start and end positions:
```py
>>> answer_start_index = outputs.start_logits.argmax()
>>> answer_end_index = outputs.end_logits.argmax()
```
Decode the predicted tokens to get the answer:
```py
>>> predict_answer_tokens = inputs.input_ids[0, answer_start_index : answer_end_index + 1]
>>> tokenizer.decode(predict_answer_tokens)
'176 billion parameters and can generate text in 46 languages natural languages and 13'
```
</pt>
<tf>
Tokenize the text and return TensorFlow tensors:
```py
>>> from transformers import AutoTokenizer
>>> tokenizer = AutoTokenizer.from_pretrained("my_awesome_qa_model")
>>> inputs = tokenizer(question, context, return_tensors="tf")
```
Pass your inputs to the model and return the `logits`:
```py
>>> from transformers import TFAutoModelForQuestionAnswering
>>> model = TFAutoModelForQuestionAnswering.from_pretrained("my_awesome_qa_model")
>>> outputs = model(**inputs)
```
Get the highest probability from the model output for the start and end positions:
```py
>>> import tensorflow as tf
>>> answer_start_index = int(tf.math.argmax(outputs.start_logits, axis=-1)[0])
>>> answer_end_index = int(tf.math.argmax(outputs.end_logits, axis=-1)[0])
```
Decode the predicted tokens to get the answer:
```py
>>> predict_answer_tokens = inputs.input_ids[0, answer_start_index : answer_end_index + 1]
>>> tokenizer.decode(predict_answer_tokens)
'176 billion parameters and can generate text in 46 languages natural languages and 13'
```
</tf>
</frameworkcontent>
\ No newline at end of file
......@@ -18,7 +18,10 @@ specific language governing permissions and limitations under the License.
Semantic segmentation assigns a label or class to each individual pixel of an image. There are several types of segmentation, and in the case of semantic segmentation, no distinction is made between unique instances of the same object. Both objects are given the same label (for example, "car" instead of "car-1" and "car-2"). Common real-world applications of semantic segmentation include training self-driving cars to identify pedestrians and important traffic information, identifying cells and abnormalities in medical imagery, and monitoring environmental changes from satellite imagery.
This guide will show you how to finetune [SegFormer](https://huggingface.co/docs/transformers/main/en/model_doc/segformer#segformer) on the [SceneParse150](https://huggingface.co/datasets/scene_parse_150) dataset.
This guide will show you how to:
1. Finetune [SegFormer](https://huggingface.co/docs/transformers/main/en/model_doc/segformer#segformer) on the [SceneParse150](https://huggingface.co/datasets/scene_parse_150) dataset.
2. Use your finetuned model for inference.
<Tip>
......@@ -32,9 +35,17 @@ Before you begin, make sure you have all the necessary libraries installed:
pip install -q datasets transformers evaluate
```
We encourage you to login to your Hugging Face account so you can upload and share your model with the community. When prompted, enter your token to login:
```py
>>> from huggingface_hub import notebook_login
>>> notebook_login()
```
## Load SceneParse150 dataset
Load the first 50 examples of the SceneParse150 dataset from the 🤗 Datasets library so you can quickly train and test a model:
Start by loading a smaller subset of the SceneParse150 dataset from the 🤗 Datasets library. This'll give you a chance to experiment and make sure everything works before spending more time training on the full dataset.
```py
>>> from datasets import load_dataset
......@@ -42,7 +53,7 @@ Load the first 50 examples of the SceneParse150 dataset from the 🤗 Datasets l
>>> ds = load_dataset("scene_parse_150", split="train[:50]")
```
Split this dataset into a train and test set:
Split the dataset's `train` split into a train and test set with the [`~datasets.Dataset.train_test_split`] method:
```py
>>> ds = ds.train_test_split(test_size=0.2)
......@@ -59,7 +70,9 @@ Then take a look at an example:
'scene_category': 368}
```
There is an `image`, an `annotation` (this is the segmentation map or label), and a `scene_category` field that describes the image scene, like "kitchen" or "office". In this guide, you'll only need `image` and `annotation`, both of which are PIL images.
- `image`: a PIL image of the scene.
- `annotation`: a PIL image of the segmentation map, which is also the model's target.
- `scene_category`: a category id that describes the image scene like "kitchen" or "office". In this guide, you'll only need `image` and `annotation`, both of which are PIL images.
You'll also want to create a dictionary that maps a label id to a label class which will be useful when you set up the model later. Download the mappings from the Hub and create the `id2label` and `label2id` dictionaries:
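One way to build these mappings is sketched below; it assumes the ADE20K label file is available as `ade20k-id2label.json` in the `huggingface/label-files` dataset repository on the Hub:

```py
>>> import json
>>> from huggingface_hub import hf_hub_download
>>> filepath = hf_hub_download(repo_id="huggingface/label-files", filename="ade20k-id2label.json", repo_type="dataset")
>>> id2label = {int(k): v for k, v in json.load(open(filepath)).items()}  # map label ids to class names
>>> label2id = {v: k for k, v in id2label.items()}  # and the reverse mapping
```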
......@@ -77,7 +90,7 @@ You'll also want to create a dictionary that maps a label id to a label class wh
## Preprocess
Next, load a SegFormer feature extractor to prepare the images and annotations for the model. Some datasets, like this one, use the zero-index as the background class. However, the background class isn't included in the 150 classes, so you'll need to set `reduce_labels=True` to subtract one from all the labels. The zero-index is replaced by `255` so it's ignored by SegFormer's loss function:
The next step is to load a SegFormer feature extractor to prepare the images and annotations for the model. Some datasets, like this one, use the zero-index as the background class. However, the background class isn't actually included in the 150 classes, so you'll need to set `reduce_labels=True` to subtract one from all the labels. The zero-index is replaced by `255` so it's ignored by SegFormer's loss function:
```py
>>> from transformers import AutoFeatureExtractor
......@@ -85,7 +98,7 @@ Next, load a SegFormer feature extractor to prepare the images and annotations f
>>> feature_extractor = AutoFeatureExtractor.from_pretrained("nvidia/mit-b0", reduce_labels=True)
```
It is common to apply some data augmentations to an image dataset to make a model more robust against overfitting. In this guide, you'll use the [`ColorJitter`](https://pytorch.org/vision/stable/generated/torchvision.transforms.ColorJitter.html) function from [torchvision](https://pytorch.org/vision/stable/index.html) to randomly change the color properties of an image:
It is common to apply some data augmentations to an image dataset to make a model more robust against overfitting. In this guide, you'll use the [`ColorJitter`](https://pytorch.org/vision/stable/generated/torchvision.transforms.ColorJitter.html) function from [torchvision](https://pytorch.org/vision/stable/index.html) to randomly change the color properties of an image, but you can also use any image library you like.
```py
>>> from torchvision.transforms import ColorJitter
......@@ -117,53 +130,9 @@ To apply the `jitter` over the entire dataset, use the 🤗 Datasets [`~datasets
>>> test_ds.set_transform(val_transforms)
```
## Train
Load SegFormer with [`AutoModelForSemanticSegmentation`], and pass the model the mapping between label ids and label classes:
```py
>>> from transformers import AutoModelForSemanticSegmentation
>>> pretrained_model_name = "nvidia/mit-b0"
>>> model = AutoModelForSemanticSegmentation.from_pretrained(
... pretrained_model_name, id2label=id2label, label2id=label2id
... )
```
<Tip>
If you aren't familiar with finetuning a model with the [`Trainer`], take a look at the basic tutorial [here](../training#finetune-with-trainer)!
</Tip>
Define your training hyperparameters in [`TrainingArguments`]. It is important not to remove unused columns because this will drop the `image` column. Without the `image` column, you can't create `pixel_values`. Set `remove_unused_columns=False` to prevent this behavior!
## Evaluate
To save and push a model under your namespace to the Hub, set `push_to_hub=True`:
```py
>>> from transformers import TrainingArguments
>>> training_args = TrainingArguments(
... output_dir="segformer-b0-scene-parse-150",
... learning_rate=6e-5,
... num_train_epochs=50,
... per_device_train_batch_size=2,
... per_device_eval_batch_size=2,
... save_total_limit=3,
... evaluation_strategy="steps",
... save_strategy="steps",
... save_steps=20,
... eval_steps=20,
... logging_steps=1,
... eval_accumulation_steps=5,
... remove_unused_columns=False,
... push_to_hub=True,
... )
```
To evaluate model performance during training, you'll need to create a function to compute and report metrics. For semantic segmentation, you'll typically compute the [mean Intersection over Union](https://huggingface.co/spaces/evaluate-metric/mean_iou) (IoU). The mean IoU measures the overlapping area between the predicted and ground truth segmentation maps.
Load the mean IoU from the 🤗 Evaluate library:
Including a metric during training is often helpful for evaluating your model's performance. You can quickly load an evaluation method with the 🤗 [Evaluate](https://huggingface.co/docs/evaluate/index) library. For this task, load the [mean Intersection over Union](https://huggingface.co/spaces/evaluate-metric/mean_iou) (IoU) metric (see the 🤗 Evaluate [quick tour](https://huggingface.co/docs/evaluate/a_quick_tour) to learn more about how to load and compute a metric):
```py
>>> import evaluate
......@@ -199,10 +168,50 @@ Then create a function to [`~evaluate.EvaluationModule.compute`] the metrics. Yo
... return metrics
```
Pass your model, training arguments, datasets, and metrics function to the [`Trainer`]:
Your `compute_metrics` function is ready to go now, and you'll return to it when you set up your training.
## Train
<Tip>
If you aren't familiar with finetuning a model with the [`Trainer`], take a look at the basic tutorial [here](../training#train-with-pytorch-trainer)!
</Tip>
You're ready to start training your model now! Load SegFormer with [`AutoModelForSemanticSegmentation`], and pass the model the mapping between label ids and label classes:
```py
>>> from transformers import Trainer
>>> from transformers import AutoModelForSemanticSegmentation, TrainingArguments, Trainer
>>> pretrained_model_name = "nvidia/mit-b0"
>>> model = AutoModelForSemanticSegmentation.from_pretrained(
... pretrained_model_name, id2label=id2label, label2id=label2id
... )
```
At this point, only three steps remain:
1. Define your training hyperparameters in [`TrainingArguments`]. It is important you don't remove unused columns because this'll drop the `image` column. Without the `image` column, you can't create `pixel_values`. Set `remove_unused_columns=False` to prevent this behavior! The only other required parameter is `output_dir` which specifies where to save your model. You'll push this model to the Hub by setting `push_to_hub=True` (you need to be signed in to Hugging Face to upload your model). With the arguments below, the [`Trainer`] evaluates the IoU metric and saves a training checkpoint every 20 steps.
2. Pass the training arguments to [`Trainer`] along with the model, dataset, tokenizer, data collator, and `compute_metrics` function.
3. Call [`~Trainer.train`] to finetune your model.
```py
>>> training_args = TrainingArguments(
... output_dir="segformer-b0-scene-parse-150",
... learning_rate=6e-5,
... num_train_epochs=50,
... per_device_train_batch_size=2,
... per_device_eval_batch_size=2,
... save_total_limit=3,
... evaluation_strategy="steps",
... save_strategy="steps",
... save_steps=20,
... eval_steps=20,
... logging_steps=1,
... eval_accumulation_steps=5,
... remove_unused_columns=False,
... push_to_hub=True,
... )
>>> trainer = Trainer(
... model=model,
......@@ -211,12 +220,14 @@ Pass your model, training arguments, datasets, and metrics function to the [`Tra
... eval_dataset=test_ds,
... compute_metrics=compute_metrics,
... )
>>> trainer.train()
```
Lastly, call [`~Trainer.train`] to finetune your model:
Once training is completed, share your model to the Hub with the [`~transformers.Trainer.push_to_hub`] method so everyone can use your model:
```py
>>> trainer.train()
>>> trainer.push_to_hub()
```
## Inference
......@@ -234,7 +245,43 @@ Load an image for inference:
<img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/semantic-seg-image.png" alt="Image of bedroom"/>
</div>
Process the image with a feature extractor and place the `pixel_values` on a GPU:
The simplest way to try out your finetuned model for inference is to use it in a [`pipeline`]. Instantiate a `pipeline` for image segmentation with your model, and pass your image to it:
```py
>>> from transformers import pipeline
>>> segmenter = pipeline("image-segmentation", model="my_awesome_seg_model")
>>> segmenter(image)
[{'score': None,
'label': 'wall',
'mask': <PIL.Image.Image image mode=L size=640x427 at 0x7FD5B2062690>},
{'score': None,
'label': 'sky',
'mask': <PIL.Image.Image image mode=L size=640x427 at 0x7FD5B2062A50>},
{'score': None,
'label': 'floor',
'mask': <PIL.Image.Image image mode=L size=640x427 at 0x7FD5B2062B50>},
{'score': None,
'label': 'ceiling',
'mask': <PIL.Image.Image image mode=L size=640x427 at 0x7FD5B2062A10>},
{'score': None,
'label': 'bed ',
'mask': <PIL.Image.Image image mode=L size=640x427 at 0x7FD5B2062E90>},
{'score': None,
'label': 'windowpane',
'mask': <PIL.Image.Image image mode=L size=640x427 at 0x7FD5B2062390>},
{'score': None,
'label': 'cabinet',
'mask': <PIL.Image.Image image mode=L size=640x427 at 0x7FD5B2062550>},
{'score': None,
'label': 'chair',
'mask': <PIL.Image.Image image mode=L size=640x427 at 0x7FD5B2062D90>},
{'score': None,
'label': 'armchair',
'mask': <PIL.Image.Image image mode=L size=640x427 at 0x7FD5B2062E10>}]
```
You can also manually replicate the results of the `pipeline` if you'd like. Process the image with a feature extractor and place the `pixel_values` on a GPU:
```py
>>> device = torch.device("cuda" if torch.cuda.is_available() else "cpu") # use GPU if available, otherwise use a CPU
......
......@@ -12,11 +12,16 @@ specific language governing permissions and limitations under the License.
# Text classification
[[open-in-colab]]
<Youtube id="leNG9fN9FQU"/>
Text classification is a common NLP task that assigns a label or class to text. There are many practical applications of text classification widely used in production by some of today's largest companies. One of the most popular forms of text classification is sentiment analysis, which assigns a label like positive, negative, or neutral to a sequence of text.
Text classification is a common NLP task that assigns a label or class to text. Some of the largest companies run text classification in production for a wide range of practical applications. One of the most popular forms of text classification is sentiment analysis, which assigns a label like 🙂 positive, 🙁 negative, or 😐 neutral to a sequence of text.
This guide will show you how to:
This guide will show you how to fine-tune [DistilBERT](https://huggingface.co/distilbert-base-uncased) on the [IMDb](https://huggingface.co/datasets/imdb) dataset to determine whether a movie review is positive or negative.
1. Finetune [DistilBERT](https://huggingface.co/distilbert-base-uncased) on the [IMDb](https://huggingface.co/datasets/imdb) dataset to determine whether a movie review is positive or negative.
2. Use your finetuned model for inference.
<Tip>
......@@ -24,9 +29,23 @@ See the text classification [task page](https://huggingface.co/tasks/text-classi
</Tip>
Before you begin, make sure you have all the necessary libraries installed:
```bash
pip install transformers datasets evaluate
```
We encourage you to login to your Hugging Face account so you can upload and share your model with the community. When prompted, enter your token to login:
```py
>>> from huggingface_hub import notebook_login
>>> notebook_login()
```
## Load IMDb dataset
Load the IMDb dataset from the 🤗 Datasets library:
Start by loading the IMDb dataset from the 🤗 Datasets library:
```py
>>> from datasets import load_dataset
......@@ -46,12 +65,12 @@ Then take a look at an example:
There are two fields in this dataset:
- `text`: a string containing the text of the movie review.
- `label`: a value that can either be `0` for a negative review or `1` for a positive review.
- `text`: the movie review text.
- `label`: a value that is either `0` for a negative review or `1` for a positive review.
## Preprocess
Load the DistilBERT tokenizer to process the `text` field:
The next step is to load a DistilBERT tokenizer to preprocess the `text` field:
```py
>>> from transformers import AutoTokenizer
......@@ -66,13 +85,13 @@ Create a preprocessing function to tokenize `text` and truncate sequences to be
... return tokenizer(examples["text"], truncation=True)
```
Use 🤗 Datasets [`~datasets.Dataset.map`] function to apply the preprocessing function over the entire dataset. You can speed up the `map` function by setting `batched=True` to process multiple elements of the dataset at once:
To apply the preprocessing function over the entire dataset, use 🤗 Datasets [`~datasets.Dataset.map`] function. You can speed up `map` by setting `batched=True` to process multiple elements of the dataset at once:
```py
>>> tokenized_imdb = imdb.map(preprocess_function, batched=True)
```
Use [`DataCollatorWithPadding`] to create a batch of examples. It will also *dynamically pad* your text to the length of the longest element in its batch, so they are a uniform length. While it is possible to pad your text in the `tokenizer` function by setting `padding=True`, dynamic padding is more efficient.
Now create a batch of examples using [`DataCollatorWithPadding`]. It's more efficient to *dynamically pad* the sentences to the longest length in a batch during collation, instead of padding the whole dataset to the maximum length.
<frameworkcontent>
<pt>
......@@ -91,38 +110,74 @@ Use [`DataCollatorWithPadding`] to create a batch of examples. It will also *dyn
</tf>
</frameworkcontent>
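For reference, the PyTorch version of this collator can be created like this (a minimal sketch; for TensorFlow you'd also pass `return_tensors="tf"`):

```py
>>> from transformers import DataCollatorWithPadding
>>> data_collator = DataCollatorWithPadding(tokenizer=tokenizer)  # pads each batch to its longest sequence
```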
## Train
## Evaluate
<frameworkcontent>
<pt>
Load DistilBERT with [`AutoModelForSequenceClassification`] along with the number of expected labels:
Including a metric during training is often helpful for evaluating your model's performance. You can quickly load an evaluation method with the 🤗 [Evaluate](https://huggingface.co/docs/evaluate/index) library. For this task, load the [accuracy](https://huggingface.co/spaces/evaluate-metric/accuracy) metric (see the 🤗 Evaluate [quick tour](https://huggingface.co/docs/evaluate/a_quick_tour) to learn more about how to load and compute a metric):
```py
>>> from transformers import AutoModelForSequenceClassification, TrainingArguments, Trainer
>>> import evaluate
>>> model = AutoModelForSequenceClassification.from_pretrained("distilbert-base-uncased", num_labels=2)
>>> accuracy = evaluate.load("accuracy")
```
Then create a function that passes your predictions and labels to [`~evaluate.EvaluationModule.compute`] to calculate the accuracy:
```py
>>> import numpy as np
>>> def compute_metrics(eval_pred):
... predictions, labels = eval_pred
... predictions = np.argmax(predictions, axis=1)
... return accuracy.compute(predictions=predictions, references=labels)
```
Your `compute_metrics` function is ready to go now, and you'll return to it when you set up your training.
## Train
Before you start training your model, create a map of the expected ids to their labels with `id2label` and `label2id`:
```py
>>> id2label = {0: "NEGATIVE", 1: "POSITIVE"}
>>> label2id = {"NEGATIVE": 0, "POSITIVE": 1}
```
<frameworkcontent>
<pt>
<Tip>
If you aren't familiar with fine-tuning a model with the [`Trainer`], take a look at the basic tutorial [here](../training#finetune-with-trainer)!
If you aren't familiar with finetuning a model with the [`Trainer`], take a look at the basic tutorial [here](../training#train-with-pytorch-trainer)!
</Tip>
You're ready to start training your model now! Load DistilBERT with [`AutoModelForSequenceClassification`] along with the number of expected labels, and the label mappings:
```py
>>> from transformers import AutoModelForSequenceClassification, TrainingArguments, Trainer
>>> model = AutoModelForSequenceClassification.from_pretrained(
... "distilbert-base-uncased", num_labels=2, id2label=id2label, label2id=label2id
... )
```
At this point, only three steps remain:
1. Define your training hyperparameters in [`TrainingArguments`].
2. Pass the training arguments to [`Trainer`] along with the model, dataset, tokenizer, and data collator.
3. Call [`~Trainer.train`] to fine-tune your model.
1. Define your training hyperparameters in [`TrainingArguments`]. The only required parameter is `output_dir` which specifies where to save your model. You'll push this model to the Hub by setting `push_to_hub=True` (you need to be signed in to Hugging Face to upload your model). At the end of each epoch, the [`Trainer`] will evaluate the accuracy and save the training checkpoint.
2. Pass the training arguments to [`Trainer`] along with the model, dataset, tokenizer, data collator, and `compute_metrics` function.
3. Call [`~Trainer.train`] to finetune your model.
```py
>>> training_args = TrainingArguments(
... output_dir="./results",
... output_dir="my_awesome_model",
... learning_rate=2e-5,
... per_device_train_batch_size=16,
... per_device_eval_batch_size=16,
... num_train_epochs=5,
... num_train_epochs=2,
... weight_decay=0.01,
... evaluation_strategy="epoch",
... save_strategy="epoch",
... load_best_model_at_end=True,
... push_to_hub=True,
... )
>>> trainer = Trainer(
......@@ -132,6 +187,7 @@ At this point, only three steps remain:
... eval_dataset=tokenized_imdb["test"],
... tokenizer=tokenizer,
... data_collator=data_collator,
... compute_metrics=compute_metrics,
... )
>>> trainer.train()
......@@ -139,13 +195,46 @@ At this point, only three steps remain:
<Tip>
[`Trainer`] will apply dynamic padding by default when you pass `tokenizer` to it. In this case, you don't need to specify a data collator explicitly.
[`Trainer`] applies dynamic padding by default when you pass `tokenizer` to it. In this case, you don't need to specify a data collator explicitly.
</Tip>
Once training is completed, share your model to the Hub with the [`~transformers.Trainer.push_to_hub`] method so everyone can use your model:
```py
>>> trainer.push_to_hub()
```
</pt>
<tf>
To fine-tune a model in TensorFlow, start by converting your datasets to the `tf.data.Dataset` format with [`~TFPreTrainedModel.prepare_tf_dataset`].
<Tip>
If you aren't familiar with finetuning a model with Keras, take a look at the basic tutorial [here](../training#train-a-tensorflow-model-with-keras)!
</Tip>
To finetune a model in TensorFlow, start by setting up an optimizer function, learning rate schedule, and some training hyperparameters:
```py
>>> from transformers import create_optimizer
>>> import tensorflow as tf
>>> batch_size = 16
>>> num_epochs = 5
>>> batches_per_epoch = len(tokenized_imdb["train"]) // batch_size
>>> total_train_steps = int(batches_per_epoch * num_epochs)
>>> optimizer, schedule = create_optimizer(init_lr=2e-5, num_warmup_steps=0, num_train_steps=total_train_steps)
```
Then you can load DistilBERT with [`TFAutoModelForSequenceClassification`] along with the number of expected labels, and the label mappings:
```py
>>> from transformers import TFAutoModelForSequenceClassification
>>> model = TFAutoModelForSequenceClassification.from_pretrained(
... "distilbert-base-uncased", num_labels=2, id2label=id2label, label2id=label2id
... )
```
Convert your datasets to the `tf.data.Dataset` format with [`~transformers.TFPreTrainedModel.prepare_tf_dataset`]:
```py
>>> tf_train_set = model.prepare_tf_dataset(
......@@ -163,53 +252,135 @@ To fine-tune a model in TensorFlow, start by converting your datasets to the `tf
... )
```
<Tip>
Configure the model for training with [`compile`](https://keras.io/api/models/model_training_apis/#compile-method):
If you aren't familiar with fine-tuning a model with Keras, take a look at the basic tutorial [here](training#finetune-with-keras)!
```py
>>> import tensorflow as tf
</Tip>
>>> model.compile(optimizer=optimizer)
```
The last two things to set up before you start training are to compute the accuracy from the predictions, and to provide a way to push your model to the Hub. Both are done by using [Keras callbacks](./main_classes/keras_callbacks).
Set up an optimizer function, learning rate schedule, and some training hyperparameters:
Pass your `compute_metrics` function to [`~transformers.KerasMetricCallback`]:
```py
>>> from transformers import create_optimizer
>>> import tensorflow as tf
>>> from transformers.keras_callbacks import KerasMetricCallback
>>> batch_size = 16
>>> num_epochs = 5
>>> batches_per_epoch = len(tokenized_imdb["train"]) // batch_size
>>> total_train_steps = int(batches_per_epoch * num_epochs)
>>> optimizer, schedule = create_optimizer(init_lr=2e-5, num_warmup_steps=0, num_train_steps=total_train_steps)
>>> metric_callback = KerasMetricCallback(metric_fn=compute_metrics, eval_dataset=tf_validation_set)
```
Load DistilBERT with [`TFAutoModelForSequenceClassification`] along with the number of expected labels:
Specify where to push your model and tokenizer in the [`~transformers.PushToHubCallback`]:
```py
>>> from transformers import TFAutoModelForSequenceClassification
>>> from transformers.keras_callbacks import PushToHubCallback
>>> model = TFAutoModelForSequenceClassification.from_pretrained("distilbert-base-uncased", num_labels=2)
>>> push_to_hub_callback = PushToHubCallback(
... output_dir="my_awesome_model",
... tokenizer=tokenizer,
... )
```
Configure the model for training with [`compile`](https://keras.io/api/models/model_training_apis/#compile-method):
Then bundle your callbacks together:
```py
>>> import tensorflow as tf
>>> model.compile(optimizer=optimizer)
>>> callbacks = [metric_callback, push_to_hub_callback]
```
Call [`fit`](https://keras.io/api/models/model_training_apis/#fit-method) to fine-tune the model:
Finally, you're ready to start training your model! Call [`fit`](https://keras.io/api/models/model_training_apis/#fit-method) with your training and validation datasets, the number of epochs, and your callbacks to finetune the model:
```py
>>> model.fit(x=tf_train_set, validation_data=tf_validation_set, epochs=3)
>>> model.fit(x=tf_train_set, validation_data=tf_validation_set, epochs=3, callbacks=callbacks)
```
Once training is completed, your model is automatically uploaded to the Hub so everyone can use it!
</tf>
</frameworkcontent>
<Tip>
For a more in-depth example of how to fine-tune a model for text classification, take a look at the corresponding
For a more in-depth example of how to finetune a model for text classification, take a look at the corresponding
[PyTorch notebook](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/text_classification.ipynb)
or [TensorFlow notebook](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/text_classification-tf.ipynb).
</Tip>
## Inference
Great, now that you've finetuned a model, you can use it for inference!
Grab some text you'd like to run inference on:
```py
>>> text = "This was a masterpiece. Not completely faithful to the books, but enthralling from beginning to end. Might be my favorite of the three."
```
The simplest way to try out your finetuned model for inference is to use it in a [`pipeline`]. Instantiate a `pipeline` for sentiment analysis with your model, and pass your text to it:
```py
>>> from transformers import pipeline
>>> classifier = pipeline("sentiment-analysis", model="stevhliu/my_awesome_model")
>>> classifier(text)
[{'label': 'POSITIVE', 'score': 0.9994940757751465}]
```
You can also manually replicate the results of the `pipeline` if you'd like:
<frameworkcontent>
<pt>
Tokenize the text and return PyTorch tensors:
```py
>>> from transformers import AutoTokenizer
>>> tokenizer = AutoTokenizer.from_pretrained("stevhliu/my_awesome_model")
>>> inputs = tokenizer(text, return_tensors="pt")
```
Pass your inputs to the model and return the `logits`:
```py
>>> import torch
>>> from transformers import AutoModelForSequenceClassification
>>> model = AutoModelForSequenceClassification.from_pretrained("stevhliu/my_awesome_model")
>>> with torch.no_grad():
... logits = model(**inputs).logits
```
Get the class with the highest probability, and use the model's `id2label` mapping to convert it to a text label:
```py
>>> predicted_class_id = logits.argmax().item()
>>> model.config.id2label[predicted_class_id]
'POSITIVE'
```
</pt>
<tf>
Tokenize the text and return TensorFlow tensors:
```py
>>> from transformers import AutoTokenizer
>>> tokenizer = AutoTokenizer.from_pretrained("stevhliu/my_awesome_model")
>>> inputs = tokenizer(text, return_tensors="tf")
```
Pass your inputs to the model and return the `logits`:
```py
>>> from transformers import TFAutoModelForSequenceClassification
>>> model = TFAutoModelForSequenceClassification.from_pretrained("stevhliu/my_awesome_model")
>>> logits = model(**inputs).logits
```
Get the class with the highest probability, and use the model's `id2label` mapping to convert it to a text label:
```py
>>> import tensorflow as tf
>>> predicted_class_id = int(tf.math.argmax(logits, axis=-1)[0])
>>> model.config.id2label[predicted_class_id]
'POSITIVE'
```
</tf>
</frameworkcontent>
\ No newline at end of file
......@@ -19,7 +19,10 @@ Summarization creates a shorter version of a document or an article that capture
- Extractive: extract the most relevant information from a document.
- Abstractive: generate new text that captures the most relevant information.
This guide will show you how to fine-tune [T5](https://huggingface.co/t5-small) on the California state bill subset of the [BillSum](https://huggingface.co/datasets/billsum) dataset for abstractive summarization.
This guide will show you how to:
1. Finetune [T5](https://huggingface.co/t5-small) on the California state bill subset of the [BillSum](https://huggingface.co/datasets/billsum) dataset for abstractive summarization.
2. Use your finetuned model for inference.
<Tip>
......@@ -27,9 +30,23 @@ See the summarization [task page](https://huggingface.co/tasks/summarization) fo
</Tip>
Before you begin, make sure you have all the necessary libraries installed:
```bash
pip install transformers datasets evaluate
```
We encourage you to login to your Hugging Face account so you can upload and share your model with the community. When prompted, enter your token to login:
```py
>>> from huggingface_hub import notebook_login
>>> notebook_login()
```
## Load BillSum dataset
Load the BillSum dataset from the 🤗 Datasets library:
Start by loading the smaller California state bill subset of the BillSum dataset from the 🤗 Datasets library:
```py
>>> from datasets import load_dataset
......@@ -37,7 +54,7 @@ Load the BillSum dataset from the 🤗 Datasets library:
>>> billsum = load_dataset("billsum", split="ca_test")
```
Split this dataset into a train and test set:
Split the dataset into a train and test set with the [`~datasets.Dataset.train_test_split`] method:
```py
>>> billsum = billsum.train_test_split(test_size=0.2)
......@@ -52,11 +69,14 @@ Then take a look at an example:
'title': 'An act to add Section 10295.35 to the Public Contract Code, relating to public contracts.'}
```
The `text` field is the input and the `summary` field is the target.
There are two fields that you'll want to use:
- `text`: the text of the bill which'll be the input to the model.
- `summary`: a condensed version of `text` which'll be the model target.
## Preprocess
Load the T5 tokenizer to process `text` and `summary`:
The next step is to load a T5 tokenizer to process `text` and `summary`:
```py
>>> from transformers import AutoTokenizer
......@@ -64,7 +84,7 @@ Load the T5 tokenizer to process `text` and `summary`:
>>> tokenizer = AutoTokenizer.from_pretrained("t5-small")
```
The preprocessing function needs to:
The preprocessing function you want to create needs to:
1. Prefix the input with a prompt so T5 knows this is a summarization task. Some models capable of multiple NLP tasks require prompting for specific tasks.
2. Use the keyword `text_target` argument when tokenizing labels.
......@@ -84,13 +104,13 @@ The preprocessing function needs to:
... return model_inputs
```
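Putting those two requirements together, a minimal version of the preprocessing function might look like the sketch below (the `max_length` values are illustrative choices, not fixed requirements):

```py
>>> prefix = "summarize: "
>>> def preprocess_function(examples):
...     inputs = [prefix + doc for doc in examples["text"]]
...     model_inputs = tokenizer(inputs, max_length=1024, truncation=True)
...     labels = tokenizer(text_target=examples["summary"], max_length=128, truncation=True)
...     model_inputs["labels"] = labels["input_ids"]  # the tokenized summaries become the training targets
...     return model_inputs
```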
Use 🤗 Datasets [`~datasets.Dataset.map`] function to apply the preprocessing function over the entire dataset. You can speed up the `map` function by setting `batched=True` to process multiple elements of the dataset at once:
To apply the preprocessing function over the entire dataset, use 🤗 Datasets [`~datasets.Dataset.map`] method. You can speed up the `map` function by setting `batched=True` to process multiple elements of the dataset at once:
```py
>>> tokenized_billsum = billsum.map(preprocess_function, batched=True)
```
Use [`DataCollatorForSeq2Seq`] to create a batch of examples. It will also *dynamically pad* your text and labels to the length of the longest element in its batch, so they are a uniform length. While it is possible to pad your text in the `tokenizer` function by setting `padding=True`, dynamic padding is more efficient.
Now create a batch of examples using [`DataCollatorForSeq2Seq`]. It's more efficient to *dynamically pad* the sentences to the longest length in a batch during collation, instead of padding the whole dataset to the maximum length.
<frameworkcontent>
<pt>
......@@ -109,41 +129,74 @@ Use [`DataCollatorForSeq2Seq`] to create a batch of examples. It will also *dyna
</tf>
</frameworkcontent>
## Train
## Evaluate
<frameworkcontent>
<pt>
Load T5 with [`AutoModelForSeq2SeqLM`]:
Including a metric during training is often helpful for evaluating your model's performance. You can quickly load an evaluation method with the 🤗 [Evaluate](https://huggingface.co/docs/evaluate/index) library. For this task, load the [ROUGE](https://huggingface.co/spaces/evaluate-metric/rouge) metric (see the 🤗 Evaluate [quick tour](https://huggingface.co/docs/evaluate/a_quick_tour) to learn more about how to load and compute a metric):
```py
>>> from transformers import AutoModelForSeq2SeqLM, Seq2SeqTrainingArguments, Seq2SeqTrainer
>>> import evaluate
>>> model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")
>>> rouge = evaluate.load("rouge")
```
Then create a function that passes your predictions and labels to [`~evaluate.EvaluationModule.compute`] to calculate the ROUGE metric:
```py
>>> import numpy as np
>>> def compute_metrics(eval_pred):
... predictions, labels = eval_pred
... decoded_preds = tokenizer.batch_decode(predictions, skip_special_tokens=True)
... labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
... decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)
... result = rouge.compute(predictions=decoded_preds, references=decoded_labels, use_stemmer=True)
... prediction_lens = [np.count_nonzero(pred != tokenizer.pad_token_id) for pred in predictions]
... result["gen_len"] = np.mean(prediction_lens)
... return {k: round(v, 4) for k, v in result.items()}
```
Your `compute_metrics` function is ready to go now, and you'll return to it when you set up your training.
## Train
<frameworkcontent>
<pt>
<Tip>
If you aren't familiar with fine-tuning a model with the [`Trainer`], take a look at the basic tutorial [here](../training#finetune-with-trainer)!
If you aren't familiar with finetuning a model with the [`Trainer`], take a look at the basic tutorial [here](../training#train-with-pytorch-trainer)!
</Tip>
You're ready to start training your model now! Load T5 with [`AutoModelForSeq2SeqLM`]:
```py
>>> from transformers import AutoModelForSeq2SeqLM, Seq2SeqTrainingArguments, Seq2SeqTrainer
>>> model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")
```
At this point, only three steps remain:
1. Define your training hyperparameters in [`Seq2SeqTrainingArguments`].
2. Pass the training arguments to [`Seq2SeqTrainer`] along with the model, dataset, tokenizer, and data collator.
3. Call [`~Trainer.train`] to fine-tune your model.
1. Define your training hyperparameters in [`Seq2SeqTrainingArguments`]. The only required parameter is `output_dir` which specifies where to save your model. You'll push this model to the Hub by setting `push_to_hub=True` (you need to be signed in to Hugging Face to upload your model). At the end of each epoch, the [`Trainer`] will evaluate the ROUGE metric and save the training checkpoint.
2. Pass the training arguments to [`Seq2SeqTrainer`] along with the model, dataset, tokenizer, data collator, and `compute_metrics` function.
3. Call [`~Trainer.train`] to finetune your model.
```py
>>> training_args = Seq2SeqTrainingArguments(
... output_dir="./results",
... output_dir="my_awesome_billsum_model",
... evaluation_strategy="epoch",
... learning_rate=2e-5,
... per_device_train_batch_size=16,
... per_device_eval_batch_size=16,
... weight_decay=0.01,
... save_total_limit=3,
... num_train_epochs=1,
... num_train_epochs=4,
... predict_with_generate=True,
... fp16=True,
... push_to_hub=True,
... )
>>> trainer = Seq2SeqTrainer(
......@@ -153,13 +206,41 @@ At this point, only three steps remain:
... eval_dataset=tokenized_billsum["test"],
... tokenizer=tokenizer,
... data_collator=data_collator,
... compute_metrics=compute_metrics,
... )
>>> trainer.train()
```
Once training is completed, share your model to the Hub with the [`~transformers.Trainer.push_to_hub`] method so everyone can use your model:
```py
>>> trainer.push_to_hub()
```
</pt>
<tf>
To fine-tune a model in TensorFlow, start by converting your datasets to the `tf.data.Dataset` format with [`~TFPreTrainedModel.prepare_tf_dataset`].
<Tip>
If you aren't familiar with finetuning a model with Keras, take a look at the basic tutorial [here](../training#train-a-tensorflow-model-with-keras)!
</Tip>
To finetune a model in TensorFlow, start by setting up an optimizer function, learning rate schedule, and some training hyperparameters:
```py
>>> from transformers import create_optimizer, AdamWeightDecay
>>> optimizer = AdamWeightDecay(learning_rate=2e-5, weight_decay_rate=0.01)
```
Then you can load T5 with [`TFAutoModelForSeq2SeqLM`]:
```py
>>> from transformers import TFAutoModelForSeq2SeqLM
>>> model = TFAutoModelForSeq2SeqLM.from_pretrained("t5-small")
```
Convert your datasets to the `tf.data.Dataset` format with [`~transformers.TFPreTrainedModel.prepare_tf_dataset`]:
```py
>>> tf_train_set = model.prepare_tf_dataset(
......@@ -177,46 +258,133 @@ To fine-tune a model in TensorFlow, start by converting your datasets to the `tf
... )
```
<Tip>
Configure the model for training with [`compile`](https://keras.io/api/models/model_training_apis/#compile-method):
If you aren't familiar with fine-tuning a model with Keras, take a look at the basic tutorial [here](training#finetune-with-keras)!
```py
>>> import tensorflow as tf
</Tip>
>>> model.compile(optimizer=optimizer)
```
Set up an optimizer function, learning rate schedule, and some training hyperparameters:
The last two things to set up before you start training are to compute the ROUGE score from the predictions, and to provide a way to push your model to the Hub. Both are done by using [Keras callbacks](./main_classes/keras_callbacks).
Pass your `compute_metrics` function to [`~transformers.KerasMetricCallback`]:
```py
>>> from transformers import create_optimizer, AdamWeightDecay
>>> from transformers.keras_callbacks import KerasMetricCallback
>>> optimizer = AdamWeightDecay(learning_rate=2e-5, weight_decay_rate=0.01)
>>> metric_callback = KerasMetricCallback(metric_fn=compute_metrics, eval_dataset=tf_validation_set)
```
Load T5 with [`TFAutoModelForSeq2SeqLM`]:
Specify where to push your model and tokenizer in the [`~transformers.PushToHubCallback`]:
```py
>>> from transformers import TFAutoModelForSeq2SeqLM
>>> from transformers.keras_callbacks import PushToHubCallback
>>> model = TFAutoModelForSeq2SeqLM.from_pretrained("t5-small")
>>> push_to_hub_callback = PushToHubCallback(
... output_dir="my_awesome_billsum_model",
... tokenizer=tokenizer,
... )
```
Configure the model for training with [`compile`](https://keras.io/api/models/model_training_apis/#compile-method):
Then bundle your callbacks together:
```py
>>> model.compile(optimizer=optimizer)
>>> callbacks = [metric_callback, push_to_hub_callback]
```
Call [`fit`](https://keras.io/api/models/model_training_apis/#fit-method) to fine-tune the model:
Finally, you're ready to start training your model! Call [`fit`](https://keras.io/api/models/model_training_apis/#fit-method) with your training and validation datasets, the number of epochs, and your callbacks to finetune the model:
```py
>>> model.fit(x=tf_train_set, validation_data=tf_test_set, epochs=3)
>>> model.fit(x=tf_train_set, validation_data=tf_test_set, epochs=3, callbacks=callbacks)
```
Once training is completed, your model is automatically uploaded to the Hub so everyone can use it!
</tf>
</frameworkcontent>
<Tip>
For a more in-depth example of how to finetune a model for summarization, take a look at the corresponding
[PyTorch notebook](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/summarization.ipynb)
or [TensorFlow notebook](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/summarization-tf.ipynb).
</Tip>
## Inference
Great, now that you've finetuned a model, you can use it for inference!
Come up with some text you'd like to summarize. For T5, you need to prefix your input depending on the task you're working on. For summarization you should prefix your input as shown below:
```py
>>> text = "summarize: The Inflation Reduction Act lowers prescription drug costs, health care costs, and energy costs. It's the most aggressive action on tackling the climate crisis in American history, which will lift up American workers and create good-paying, union jobs across the country. It'll lower the deficit and ask the ultra-wealthy and corporations to pay their fair share. And no one making under $400,000 per year will pay a penny more in taxes."
```
The simplest way to try out your finetuned model for inference is to use it in a [`pipeline`]. Instantiate a `pipeline` for summarization with your model, and pass your text to it:
```py
>>> from transformers import pipeline
>>> summarizer = pipeline("summarization", model="stevhliu/my_awesome_billsum_model")
>>> summarizer(text)
[{"summary_text": "The Inflation Reduction Act lowers prescription drug costs, health care costs, and energy costs. It's the most aggressive action on tackling the climate crisis in American history, which will lift up American workers and create good-paying, union jobs across the country."}]
```
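The pipeline forwards generation keyword arguments, so you can nudge the summary length if you need to; the values below are arbitrary examples rather than recommendations:
```py
>>> summarizer(text, min_length=30, max_length=80)
```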
You can also manually replicate the results of the `pipeline` if you'd like:
<frameworkcontent>
<pt>
Tokenize the text and return the `input_ids` as PyTorch tensors:
```py
>>> from transformers import AutoTokenizer
>>> tokenizer = AutoTokenizer.from_pretrained("stevhliu/my_awesome_billsum_model")
>>> inputs = tokenizer(text, return_tensors="pt").input_ids
```
Use the [`~transformers.generation_utils.GenerationMixin.generate`] method to create the summarization. For more details about the different text generation strategies and parameters for controlling generation, check out the [Text Generation](./main_classes/text_generation) API.
```py
>>> from transformers import AutoModelForSeq2SeqLM
>>> model = AutoModelForSeq2SeqLM.from_pretrained("stevhliu/my_awesome_billsum_model")
>>> outputs = model.generate(inputs, max_new_tokens=100, do_sample=False)
```
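Greedy decoding is only one strategy; as an illustration, you could switch to beam search by passing `num_beams` (the value here is an arbitrary choice):
```py
>>> beam_outputs = model.generate(inputs, max_new_tokens=100, num_beams=4, early_stopping=True)
>>> tokenizer.decode(beam_outputs[0], skip_special_tokens=True)
```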
Decode the generated token ids back into text:
```py
>>> tokenizer.decode(outputs[0], skip_special_tokens=True)
'the inflation reduction act lowers prescription drug costs, health care costs, and energy costs. it's the most aggressive action on tackling the climate crisis in american history. it will ask the ultra-wealthy and corporations to pay their fair share.'
```
</pt>
<tf>
Tokenize the text and return the `input_ids` as TensorFlow tensors:
```py
>>> from transformers import AutoTokenizer
>>> tokenizer = AutoTokenizer.from_pretrained("stevhliu/my_awesome_billsum_model")
>>> inputs = tokenizer(text, return_tensors="tf").input_ids
```
Use the [`~transformers.generation_tf_utils.TFGenerationMixin.generate`] method to create the summarization. For more details about the different text generation strategies and parameters for controlling generation, check out the [Text Generation](./main_classes/text_generation) API.
```py
>>> from transformers import TFAutoModelForSeq2SeqLM
>>> model = TFAutoModelForSeq2SeqLM.from_pretrained("stevhliu/my_awesome_billsum_model")
>>> outputs = model.generate(inputs, max_new_tokens=100, do_sample=False)
```
Decode the generated token ids back into text:
```py
>>> tokenizer.decode(outputs[0], skip_special_tokens=True)
'the inflation reduction act lowers prescription drug costs, health care costs, and energy costs. it's the most aggressive action on tackling the climate crisis in american history. it will ask the ultra-wealthy and corporations to pay their fair share.'
```
</tf>
</frameworkcontent>
......@@ -14,9 +14,12 @@ specific language governing permissions and limitations under the License.
<Youtube id="1JvfrvZgi6c"/>
Translation converts a sequence of text from one language to another. It is one of several tasks you can formulate as a sequence-to-sequence problem, a powerful framework for returning some output from an input, like translation or summarization. Translation systems are commonly used for translation between different language texts, but they can also be used for speech or some combination in between, like text-to-speech or speech-to-text.
This guide will show you how to:
1. Finetune [T5](https://huggingface.co/t5-small) on the English-French subset of the [OPUS Books](https://huggingface.co/datasets/opus_books) dataset to translate English text to French.
2. Use your finetuned model for inference.
<Tip>
See the translation [task page](https://huggingface.co/tasks/translation) for more information about its associated models, datasets, and metrics.
</Tip>
Before you begin, make sure you have all the necessary libraries installed:
```bash
pip install transformers datasets evaluate
```
We encourage you to login to your Hugging Face account so you can upload and share your model with the community. When prompted, enter your token to login:
```py
>>> from huggingface_hub import notebook_login
>>> notebook_login()
```
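If you're working from a script or a terminal rather than a notebook, you can log in with `huggingface_hub`'s `login` function instead (assuming a reasonably recent version of the library):
```py
>>> from huggingface_hub import login

>>> login()  # prompts for your Hugging Face token
```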
## Load OPUS Books dataset
Load the OPUS Books dataset from the 🤗 Datasets library:
Start by loading the English-French subset of the [OPUS Books](https://huggingface.co/datasets/opus_books) dataset from the 🤗 Datasets library:
```py
>>> from datasets import load_dataset
>>> books = load_dataset("opus_books", "en-fr")
```
Split the dataset into a train and test set with the [`~datasets.Dataset.train_test_split`] method:
```py
>>> books = books["train"].train_test_split(test_size=0.2)
```
Then take a look at an example:
```py
>>> books["train"][0]
{'id': '90560',
 'translation': {'en': 'But this lofty plateau measured only a few fathoms, and soon we reentered our element.',
  'fr': 'Mais ce plateau élevé ne mesurait que quelques toises, et bientôt nous fûmes rentrés dans notre élément.'}}
```
`translation`: an English and French translation of the text.
## Preprocess
<Youtube id="XAR8jnZZuUs"/>
The next step is to load a T5 tokenizer to process the English-French language pairs:
```py
>>> from transformers import AutoTokenizer
>>> tokenizer = AutoTokenizer.from_pretrained("t5-small")
```
The preprocessing function you want to create needs to:
1. Prefix the input with a prompt so T5 knows this is a translation task. Some models capable of multiple NLP tasks require prompting for specific tasks.
2. Tokenize the input (English) and target (French) separately because you can't tokenize French text with a tokenizer pretrained on an English vocabulary.
3. Truncate sequences to be no longer than the maximum length set by the `max_length` parameter.
```py
... return model_inputs
```
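Only the tail of the preprocessing function is shown above; a minimal sketch of a full function that satisfies the three requirements, assuming a tokenizer recent enough to accept `text_target` and treating the prefix string and `max_length` value as illustrative choices:
```py
>>> source_lang = "en"
>>> target_lang = "fr"
>>> prefix = "translate English to French: "  # illustrative task prefix for T5


>>> def preprocess_function(examples):
...     inputs = [prefix + example[source_lang] for example in examples["translation"]]
...     targets = [example[target_lang] for example in examples["translation"]]
...     # tokenize inputs and targets separately; text_target routes the French text to the label side
...     model_inputs = tokenizer(inputs, text_target=targets, max_length=128, truncation=True)
...     return model_inputs
```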
To apply the preprocessing function over the entire dataset, use 🤗 Datasets [`~datasets.Dataset.map`] method. You can speed up the `map` function by setting `batched=True` to process multiple elements of the dataset at once:
```py
>>> tokenized_books = books.map(preprocess_function, batched=True)
```
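If you'd like to sanity-check the preprocessing, you can peek at one processed example; the exact token values will differ, but the new fields should be present:
```py
>>> tokenized_books["train"][0].keys()  # should now include input_ids, attention_mask, and labels
```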
Now create a batch of examples using [`DataCollatorForSeq2Seq`]. It's more efficient to *dynamically pad* the sentences to the longest length in a batch during collation, instead of padding the whole dataset to the maximum length.
<frameworkcontent>
<pt>
```py
>>> from transformers import DataCollatorForSeq2Seq
>>> data_collator = DataCollatorForSeq2Seq(tokenizer=tokenizer, model=model)
```
</pt>
<tf>
```py
>>> from transformers import DataCollatorForSeq2Seq
>>> data_collator = DataCollatorForSeq2Seq(tokenizer=tokenizer, model=model, return_tensors="tf")
```
</tf>
</frameworkcontent>
## Evaluate
Including a metric during training is often helpful for evaluating your model's performance. You can quickly load an evaluation method with the 🤗 [Evaluate](https://huggingface.co/docs/evaluate/index) library. For this task, load the [SacreBLEU](https://huggingface.co/spaces/evaluate-metric/sacrebleu) metric (see the 🤗 Evaluate [quick tour](https://huggingface.co/docs/evaluate/a_quick_tour) to learn more about how to load and compute a metric):
```py
>>> import evaluate
>>> sacrebleu = evaluate.load("sacrebleu")
```
Then create a function that passes your predictions and labels to [`~evaluate.EvaluationModule.compute`] to calculate the SacreBLEU score:
```py
>>> import numpy as np
>>> def postprocess_text(preds, labels):
... preds = [pred.strip() for pred in preds]
... labels = [[label.strip()] for label in labels]
... return preds, labels
>>> def compute_metrics(eval_preds):
... preds, labels = eval_preds
... if isinstance(preds, tuple):
... preds = preds[0]
... decoded_preds = tokenizer.batch_decode(preds, skip_special_tokens=True)
... labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
... decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)
... decoded_preds, decoded_labels = postprocess_text(decoded_preds, decoded_labels)
... result = sacrebleu.compute(predictions=decoded_preds, references=decoded_labels)
... result = {"bleu": result["score"]}
... prediction_lens = [np.count_nonzero(pred != tokenizer.pad_token_id) for pred in preds]
... result["gen_len"] = np.mean(prediction_lens)
... result = {k: round(v, 4) for k, v in result.items()}
... return result
```
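If you want to sanity-check the metric itself before training, you can call `sacrebleu.compute` directly on a toy prediction/reference pair; the strings below are made up for illustration:
```py
>>> preds = ["Les légumes partagent des ressources avec des bactéries fixatrices d'azote."]
>>> refs = [["Les légumes partagent des ressources avec des bactéries fixatrices d'azote."]]
>>> sacrebleu.compute(predictions=preds, references=refs)["score"]  # an exact match scores 100
```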
Your `compute_metrics` function is ready to go now, and you'll return to it when you set up your training.
## Train
<frameworkcontent>
<pt>
<Tip>
If you aren't familiar with finetuning a model with the [`Trainer`], take a look at the basic tutorial [here](../training#train-with-pytorch-trainer)!
</Tip>
You're ready to start training your model now! Load T5 with [`AutoModelForSeq2SeqLM`]:
```py
>>> from transformers import AutoModelForSeq2SeqLM, Seq2SeqTrainingArguments, Seq2SeqTrainer
>>> model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")
```
At this point, only three steps remain:
1. Define your training hyperparameters in [`Seq2SeqTrainingArguments`]. The only required parameter is `output_dir` which specifies where to save your model. You'll push this model to the Hub by setting `push_to_hub=True` (you need to be signed in to Hugging Face to upload your model). At the end of each epoch, the [`Trainer`] will evaluate the SacreBLEU metric and save the training checkpoint.
2. Pass the training arguments to [`Seq2SeqTrainer`] along with the model, dataset, tokenizer, data collator, and `compute_metrics` function.
3. Call [`~Trainer.train`] to finetune your model.
```py
>>> from transformers import Seq2SeqTrainingArguments, Seq2SeqTrainer
>>> training_args = Seq2SeqTrainingArguments(
... output_dir="my_awesome_opus_books_model",
... evaluation_strategy="epoch",
... learning_rate=2e-5,
... per_device_train_batch_size=16,
... per_device_eval_batch_size=16,
... weight_decay=0.01,
... save_total_limit=3,
... num_train_epochs=2,
... predict_with_generate=True,
... fp16=True,
... push_to_hub=True,
... )
>>> trainer = Seq2SeqTrainer(
... eval_dataset=tokenized_books["test"],
... tokenizer=tokenizer,
... data_collator=data_collator,
... compute_metrics=compute_metrics,
... )
>>> trainer.train()
```
Once training is completed, share your model to the Hub with the [`~transformers.Trainer.push_to_hub`] method so everyone can use your model:
```py
>>> trainer.push_to_hub()
```
</pt>
<tf>
<Tip>
If you aren't familiar with finetuning a model with Keras, take a look at the basic tutorial [here](../training#train-a-tensorflow-model-with-keras)!
</Tip>
To finetune a model in TensorFlow, start by setting up an optimizer function, learning rate schedule, and some training hyperparameters:
```py
>>> from transformers import AdamWeightDecay
>>> optimizer = AdamWeightDecay(learning_rate=2e-5, weight_decay_rate=0.01)
```
Then you can load T5 with [`TFAutoModelForSeq2SeqLM`]:
```py
>>> from transformers import TFAutoModelForSeq2SeqLM
>>> model = TFAutoModelForSeq2SeqLM.from_pretrained("t5-small")
```
Convert your datasets to the `tf.data.Dataset` format with [`~transformers.TFPreTrainedModel.prepare_tf_dataset`]:
```py
>>> tf_train_set = model.prepare_tf_dataset(
... )
```
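Only the opening of the `prepare_tf_dataset` call is visible above; a minimal sketch of the full train and test calls, assuming the tokenized dataset is stored in `tokenized_books` and using an arbitrary batch size:
```py
>>> tf_train_set = model.prepare_tf_dataset(
...     tokenized_books["train"],  # assumed variable name for the tokenized dataset
...     shuffle=True,
...     batch_size=16,
...     collate_fn=data_collator,
... )

>>> tf_test_set = model.prepare_tf_dataset(
...     tokenized_books["test"],
...     shuffle=False,
...     batch_size=16,
...     collate_fn=data_collator,
... )
```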
Configure the model for training with [`compile`](https://keras.io/api/models/model_training_apis/#compile-method):
```py
>>> import tensorflow as tf
>>> model.compile(optimizer=optimizer)
```
The last two things to set up before you start training are to compute the SacreBLEU metric from the predictions, and to provide a way to push your model to the Hub. Both are done by using [Keras callbacks](./main_classes/keras_callbacks).
Pass your `compute_metrics` function to [`~transformers.KerasMetricCallback`]:
```py
>>> from transformers.keras_callbacks import KerasMetricCallback
>>> metric_callback = KerasMetricCallback(metric_fn=compute_metrics, eval_dataset=tf_test_set)
```
Specify where to push your model and tokenizer in the [`~transformers.PushToHubCallback`]:
```py
>>> from transformers.keras_callbacks import PushToHubCallback
>>> push_to_hub_callback = PushToHubCallback(
... output_dir="my_awesome_opus_books_model",
... tokenizer=tokenizer,
... )
```
Then bundle your callbacks together:
```py
>>> callbacks = [metric_callback, push_to_hub_callback]
```
Finally, you're ready to start training your model! Call [`fit`](https://keras.io/api/models/model_training_apis/#fit-method) with your training and validation datasets, the number of epochs, and your callbacks to finetune the model:
```py
>>> model.fit(x=tf_train_set, validation_data=tf_test_set, epochs=3, callbacks=callbacks)
```
Once training is completed, your model is automatically uploaded to the Hub so everyone can use it!
</tf>
</frameworkcontent>
<Tip>
For a more in-depth example of how to finetune a model for translation, take a look at the corresponding
[PyTorch notebook](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/translation.ipynb)
or [TensorFlow notebook](https://colab.research.google.com/github/huggingface/notebooks/blob/main/examples/translation-tf.ipynb).
</Tip>
## Inference
Great, now that you've finetuned a model, you can use it for inference!
Come up with some text you'd like to translate to another language. For T5, you need to prefix your input depending on the task you're working on. For translation from English to French, you should prefix your input as shown below:
```py
>>> text = "translate English to French: Legumes share resources with nitrogen-fixing bacteria."
```
The simplest way to try out your finetuned model for inference is to use it in a [`pipeline`]. Instantiate a `pipeline` for translation with your model, and pass your text to it:
```py
>>> from transformers import pipeline
>>> translator = pipeline("translation", model="my_awesome_opus_books_model")
>>> translator(text)
[{'translation_text': 'Legumes partagent des ressources avec des bactéries azotantes.'}]
```
You can also manually replicate the results of the `pipeline` if you'd like:
<frameworkcontent>
<pt>
Tokenize the text and return the `input_ids` as PyTorch tensors:
```py
>>> from transformers import AutoTokenizer
>>> tokenizer = AutoTokenizer.from_pretrained("my_awesome_opus_books_model")
>>> inputs = tokenizer(text, return_tensors="pt").input_ids
```
Use the [`~transformers.generation_utils.GenerationMixin.generate`] method to create the translation. For more details about the different text generation strategies and parameters for controlling generation, check out the [Text Generation](./main_classes/text_generation) API.
```py
>>> from transformers import AutoModelForSeq2SeqLM
>>> model = AutoModelForSeq2SeqLM.from_pretrained("my_awesome_opus_books_model")
>>> outputs = model.generate(inputs, max_new_tokens=40, do_sample=True, top_k=30, top_p=0.95)
```
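Because `do_sample=True`, the translation can vary between runs; if you'd rather have deterministic output, beam search is one alternative (the `num_beams` value is an arbitrary choice):
```py
>>> beam_outputs = model.generate(inputs, max_new_tokens=40, num_beams=4, early_stopping=True)
>>> tokenizer.decode(beam_outputs[0], skip_special_tokens=True)
```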
Decode the generated token ids back into text:
```py
>>> tokenizer.decode(outputs[0], skip_special_tokens=True)
'Les lignées partagent des ressources avec des bactéries enfixant l'azote.'
```
</pt>
<tf>
Tokenize the text and return the `input_ids` as TensorFlow tensors:
```py
>>> from transformers import AutoTokenizer
>>> tokenizer = AutoTokenizer.from_pretrained("my_awesome_opus_books_model")
>>> inputs = tokenizer(text, return_tensors="tf").input_ids
```
Use the [`~transformers.generation_tf_utils.TFGenerationMixin.generate`] method to create the translation. For more details about the different text generation strategies and parameters for controlling generation, check out the [Text Generation](./main_classes/text_generation) API.
```py
>>> from transformers import TFAutoModelForSeq2SeqLM
>>> model = TFAutoModelForSeq2SeqLM.from_pretrained("my_awesome_opus_books_model")
>>> outputs = model.generate(inputs, max_new_tokens=40, do_sample=True, top_k=30, top_p=0.95)
```
Decode the generated token ids back into text:
```py
>>> tokenizer.decode(outputs[0], skip_special_tokens=True)
'Les lugumes partagent les ressources avec des bactéries fixatrices d'azote.'
```
</tf>
</frameworkcontent>