Unverified Commit 77321481 authored by Steven Liu, committed by GitHub

Adopt framework-specific blocks for content (#16342)

* refactor code samples with framework-specific blocks

* update training.mdx

* 🖍 apply feedback
parent 62cbd842
@@ -22,7 +22,7 @@
- local: model_summary
  title: Summary of the models
- local: training
  title: Fine-tune a pretrained model
- local: accelerate
  title: Distributed training with 🤗 Accelerate
- local: model_sharing
@@ -75,25 +75,29 @@ To ensure your model can be used by someone working with a different framework,
Converting a checkpoint for another framework is easy. Make sure you have PyTorch and TensorFlow installed (see [here](installation) for installation instructions), and then find the specific model for your task in the other framework. For example, suppose you trained DistilBERT for sequence classification in PyTorch and want to convert it to its TensorFlow equivalent.

<frameworkcontent>
<pt>
Specify `from_tf=True` to convert a checkpoint from TensorFlow to PyTorch:
```py
>>> pt_model = DistilBertForSequenceClassification.from_pretrained("path/to/awesome-name-you-picked", from_tf=True)
>>> pt_model.save_pretrained("path/to/awesome-name-you-picked")
```
</pt>
<tf>
Specify `from_pt=True` to convert a checkpoint from PyTorch to TensorFlow:

```py
>>> tf_model = TFDistilBertForSequenceClassification.from_pretrained("path/to/awesome-name-you-picked", from_pt=True)
```

Then you can save your new TensorFlow model with its new checkpoint:

```py
>>> tf_model.save_pretrained("path/to/awesome-name-you-picked")
```
</tf>
<jax>
If a model is available in Flax, you can also convert a checkpoint from PyTorch to Flax:

```py
@@ -101,9 +105,13 @@ If a model is available in Flax, you can also convert a checkpoint from PyTorch
...     "path/to/awesome-name-you-picked", from_pt=True
... )
```
</jax>
</frameworkcontent>
## Push a model during training
<frameworkcontent>
<pt>
<Youtube id="Z1-XMy-GNLQ"/> <Youtube id="Z1-XMy-GNLQ"/>
Sharing a model to the Hub is as simple as adding an extra parameter or callback. Remember from the [fine-tuning tutorial](training), the [`TrainingArguments`] class is where you specify hyperparameters and additional training options. One of these training options includes the ability to push a model directly to the Hub. Set `push_to_hub=True` in your [`TrainingArguments`]: Sharing a model to the Hub is as simple as adding an extra parameter or callback. Remember from the [fine-tuning tutorial](training), the [`TrainingArguments`] class is where you specify hyperparameters and additional training options. One of these training options includes the ability to push a model directly to the Hub. Set `push_to_hub=True` in your [`TrainingArguments`]:
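The [`TrainingArguments`] snippet itself is collapsed in this hunk; as a rough sketch of what enabling the option looks like (the output directory name here is only a placeholder):

```py
>>> from transformers import TrainingArguments

>>> training_args = TrainingArguments(output_dir="my-awesome-model", push_to_hub=True)
```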
@@ -129,10 +137,9 @@ After you fine-tune your model, call [`~transformers.Trainer.push_to_hub`] on [`

```py
>>> trainer.push_to_hub()
```
</pt>
<tf>

Share a model to the Hub with [`PushToHubCallback`]. In the [`PushToHubCallback`] function, add:
- An output directory for your model.
- A tokenizer.
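As a hedged sketch of wiring up the callback (the directory, `tokenizer` variable, and repository id below are placeholders, not values from the collapsed diff):

```py
>>> from transformers.keras_callbacks import PushToHubCallback

>>> push_to_hub_callback = PushToHubCallback(
...     output_dir="./your_model_save_path", tokenizer=tokenizer, hub_model_id="your-username/my-awesome-model"
... )
```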
@@ -151,6 +158,8 @@ Add the callback to [`fit`](https://keras.io/api/models/model_training_apis/), a

```py
>>> model.fit(tf_train_dataset, validation_data=tf_validation_dataset, epochs=3, callbacks=push_to_hub_callback)
```
</tf>
</frameworkcontent>
## Use the `push_to_hub` function
@@ -155,8 +155,10 @@ Create a batch of examples and dynamically pad them with `DataCollatorForCTCWith

```py
>>> data_collator = DataCollatorCTCWithPadding(processor=processor, padding=True)
```

## Train

<frameworkcontent>
<pt>
Load Wav2Vec2 with [`AutoModelForCTC`]. For `ctc_loss_reduction`, it is often better to use the average instead of the default summation:

```py
@@ -206,6 +208,8 @@ At this point, only three steps remain:
>>> trainer.train()
```
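The loading call sits in the collapsed lines of this hunk; a rough sketch of what the paragraph describes, assuming the checkpoint name and the `processor` object from earlier steps of the guide:

```py
>>> from transformers import AutoModelForCTC

>>> model = AutoModelForCTC.from_pretrained(
...     "facebook/wav2vec2-base",
...     ctc_loss_reduction="mean",  # average the CTC loss instead of summing it
...     pad_token_id=processor.tokenizer.pad_token_id,
... )
```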
</pt>
</frameworkcontent>
<Tip>
@@ -91,8 +91,10 @@ Use 🤗 Datasets [`map`](https://huggingface.co/docs/datasets/package_reference

```py
>>> encoded_ks = ks.map(preprocess_function, remove_columns=["audio", "file"], batched=True)
```

## Train

<frameworkcontent>
<pt>
Load Wav2Vec2 with [`AutoModelForAudioClassification`]. Specify the number of labels, and pass the model the mapping between label number and label class:

```py
@@ -135,6 +137,8 @@ At this point, only three steps remain:
>>> trainer.train()
```
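As a rough sketch of the loading step (the checkpoint name and the `num_labels`/`label2id`/`id2label` objects are assumed to come from the preprocessing steps, not from the collapsed diff):

```py
>>> from transformers import AutoModelForAudioClassification

>>> model = AutoModelForAudioClassification.from_pretrained(
...     "facebook/wav2vec2-base", num_labels=num_labels, label2id=label2id, id2label=id2label
... )
```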
</pt>
</frameworkcontent>
<Tip>
@@ -109,8 +109,10 @@ Use [`DefaultDataCollator`] to create a batch of examples. Unlike other data col

```py
>>> data_collator = DefaultDataCollator()
```

## Train

<frameworkcontent>
<pt>
Load ViT with [`AutoModelForImageClassification`]. Specify the number of labels, and pass the model the mapping between label number and label class:

```py
@@ -162,6 +164,8 @@ At this point, only three steps remain:
>>> trainer.train()
```
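A minimal sketch of the loading step, assuming a ViT checkpoint and label mappings prepared earlier in the guide:

```py
>>> from transformers import AutoModelForImageClassification

>>> model = AutoModelForImageClassification.from_pretrained(
...     "google/vit-base-patch16-224-in21k", num_labels=len(labels), id2label=id2label, label2id=label2id
... )
```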
</pt>
</frameworkcontent>
<Tip>
@@ -200,8 +200,10 @@ For masked language modeling, use the same [`DataCollatorForLanguageModeling`] e

Causal language modeling is frequently used for text generation. This section shows you how to fine-tune [DistilGPT2](https://huggingface.co/distilgpt2) to generate new text.

### Train

<frameworkcontent>
<pt>
Load DistilGPT2 with [`AutoModelForCausalLM`]:

```py
@@ -240,18 +242,9 @@ At this point, only three steps remain:
>>> trainer.train()
```
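For orientation, the loading step boils down to something like:

```py
>>> from transformers import AutoModelForCausalLM

>>> model = AutoModelForCausalLM.from_pretrained("distilgpt2")
```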
</pt>
<tf>
To fine-tune a model in TensorFlow, start by converting your datasets to the `tf.data.Dataset` format with [`to_tf_dataset`](https://huggingface.co/docs/datasets/package_reference/main_classes.html#datasets.Dataset.to_tf_dataset). Specify inputs and labels in `columns`, whether to shuffle the dataset order, batch size, and the data collator:
```py
>>> tf_train_set = lm_dataset["train"].to_tf_dataset(
@@ -271,6 +264,12 @@ Convert your datasets to the `tf.data.Dataset` format with [`to_tf_dataset`](htt
... )
```
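The full call is collapsed in the hunk above; a hedged sketch of the pattern (the column names, batch size, and `data_collator` variable are illustrative, not taken from the hidden lines):

```py
>>> tf_train_set = lm_dataset["train"].to_tf_dataset(
...     columns=["attention_mask", "input_ids", "labels"],
...     shuffle=True,
...     batch_size=16,
...     collate_fn=data_collator,
... )
```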
<Tip>
If you aren't familiar with fine-tuning a model with Keras, take a look at the basic tutorial [here](training#finetune-with-keras)!
</Tip>
Set up an optimizer function, learning rate, and some training hyperparameters:
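The optimizer setup itself is collapsed; a rough sketch of the usual pattern (the checkpoint name and `total_train_steps` value are placeholders):

```py
>>> from transformers import TFAutoModelForCausalLM, create_optimizer

>>> optimizer, schedule = create_optimizer(init_lr=2e-5, num_warmup_steps=0, num_train_steps=total_train_steps)
>>> model = TFAutoModelForCausalLM.from_pretrained("distilgpt2")
>>> model.compile(optimizer=optimizer)  # Transformers TF models can compute their own loss internally
```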
@@ -300,13 +299,17 @@ Call [`fit`](https://keras.io/api/models/model_training_apis/#fit-method) to fin

```py
>>> model.fit(x=tf_train_set, validation_data=tf_test_set, epochs=3)
```
</tf>
</frameworkcontent>
## Masked language modeling

Masked language modeling is also known as a fill-mask task because it predicts a masked token in a sequence. Models for masked language modeling require a good contextual understanding of an entire sequence instead of only the left context. This section shows you how to fine-tune [DistilRoBERTa](https://huggingface.co/distilroberta-base) to predict a masked word.

### Train

<frameworkcontent>
<pt>
Load DistilRoBERTa with [`AutoModelForMaskedLM`]:

```py
@@ -346,18 +349,9 @@ At this point, only three steps remain:
>>> trainer.train()
```
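The collapsed snippet amounts to something like:

```py
>>> from transformers import AutoModelForMaskedLM

>>> model = AutoModelForMaskedLM.from_pretrained("distilroberta-base")
```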
</pt>
<tf>
To fine-tune a model in TensorFlow, start by converting your datasets to the `tf.data.Dataset` format with [`to_tf_dataset`](https://huggingface.co/docs/datasets/package_reference/main_classes.html#datasets.Dataset.to_tf_dataset). Specify inputs and labels in `columns`, whether to shuffle the dataset order, batch size, and the data collator:
```py
>>> tf_train_set = lm_dataset["train"].to_tf_dataset(
@@ -377,6 +371,12 @@ Convert your datasets to the `tf.data.Dataset` format with [`to_tf_dataset`](htt
... )
```
<Tip>
If you aren't familiar with fine-tuning a model with Keras, take a look at the basic tutorial [here](training#finetune-with-keras)!
</Tip>
Set up an optimizer function, learning rate, and some training hyperparameters:

@@ -406,6 +406,8 @@ Call [`fit`](https://keras.io/api/models/model_training_apis/#fit-method) to fin

```py
>>> model.fit(x=tf_train_set, validation_data=tf_test_set, epochs=3)
```
</tf>
</frameworkcontent>
<Tip>
@@ -176,8 +176,10 @@ tokenized_swag = swag.map(preprocess_function, batched=True)
</tf>
</frameworkcontent>

## Train

<frameworkcontent>
<pt>
Load BERT with [`AutoModelForMultipleChoice`]:

```py
@@ -220,18 +222,9 @@ At this point, only three steps remain:
>>> trainer.train()
```
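As a sketch of the loading step (the BERT checkpoint name is illustrative):

```py
>>> from transformers import AutoModelForMultipleChoice

>>> model = AutoModelForMultipleChoice.from_pretrained("bert-base-uncased")
```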
</pt>
<tf>
To fine-tune a model in TensorFlow, start by converting your datasets to the `tf.data.Dataset` format with [`to_tf_dataset`](https://huggingface.co/docs/datasets/package_reference/main_classes.html#datasets.Dataset.to_tf_dataset). Specify inputs in `columns`, targets in `label_cols`, whether to shuffle the dataset order, batch size, and the data collator:
```py
>>> data_collator = DataCollatorForMultipleChoice(tokenizer=tokenizer)
@@ -252,6 +245,12 @@ Convert your datasets to the `tf.data.Dataset` format with [`to_tf_dataset`](htt
... )
```
<Tip>
If you aren't familiar with fine-tuning a model with Keras, take a look at the basic tutorial [here](training#finetune-with-keras)!
</Tip>
Set up an optimizer function, learning rate schedule, and some training hyperparameters:

@@ -284,4 +283,6 @@ Call [`fit`](https://keras.io/api/models/model_training_apis/#fit-method) to fin

```py
>>> model.fit(x=tf_train_set, validation_data=tf_validation_set, epochs=2)
```
</tf>
</frameworkcontent>
@@ -151,8 +151,10 @@ Use [`DefaultDataCollator`] to create a batch of examples. Unlike other data col
</tf>
</frameworkcontent>

## Train

<frameworkcontent>
<pt>
Load DistilBERT with [`AutoModelForQuestionAnswering`]:

```py
@@ -195,18 +197,9 @@ At this point, only three steps remain:
>>> trainer.train()
```
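As a sketch of the loading step (the DistilBERT checkpoint name is illustrative):

```py
>>> from transformers import AutoModelForQuestionAnswering

>>> model = AutoModelForQuestionAnswering.from_pretrained("distilbert-base-uncased")
```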
</pt>
<tf>
To fine-tune a model in TensorFlow, start by converting your datasets to the `tf.data.Dataset` format with [`to_tf_dataset`](https://huggingface.co/docs/datasets/package_reference/main_classes.html#datasets.Dataset.to_tf_dataset). Specify inputs and the start and end positions of an answer in `columns`, whether to shuffle the dataset order, batch size, and the data collator:
```py
>>> tf_train_set = tokenized_squad["train"].to_tf_dataset(
@@ -226,6 +219,12 @@ Convert your datasets to the `tf.data.Dataset` format with [`to_tf_dataset`](htt
... )
```
<Tip>
If you aren't familiar with fine-tuning a model with Keras, take a look at the basic tutorial [here](training#finetune-with-keras)!
</Tip>
Set up an optimizer function, learning rate schedule, and some training hyperparameters:

@@ -262,6 +261,8 @@ Call [`fit`](https://keras.io/api/models/model_training_apis/#fit-method) to fin

```py
>>> model.fit(x=tf_train_set, validation_data=tf_validation_set, epochs=3)
```
</tf>
</frameworkcontent>
<Tip>
@@ -91,8 +91,10 @@ Use [`DataCollatorWithPadding`] to create a batch of examples. It will also *dyn
</tf>
</frameworkcontent>

## Train

<frameworkcontent>
<pt>
Load DistilBERT with [`AutoModelForSequenceClassification`] along with the number of expected labels:

@@ -140,18 +142,9 @@ At this point, only three steps remain:

<Tip>

[`Trainer`] will apply dynamic padding by default when you pass `tokenizer` to it. In this case, you don't need to specify a data collator explicitly.

</Tip>
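The loading snippet is collapsed above; a hedged sketch of what the paragraph describes, assuming the two-label sentiment setup used in this guide:

```py
>>> from transformers import AutoModelForSequenceClassification

>>> model = AutoModelForSequenceClassification.from_pretrained("distilbert-base-uncased", num_labels=2)
```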
</pt>
<tf>
To fine-tune a model in TensorFlow, start by converting your datasets to the `tf.data.Dataset` format with [`to_tf_dataset`](https://huggingface.co/docs/datasets/package_reference/main_classes.html#datasets.Dataset.to_tf_dataset). Specify inputs and labels in `columns`, whether to shuffle the dataset order, batch size, and the data collator:
```py
>>> tf_train_set = tokenized_imdb["train"].to_tf_dataset(
@@ -169,6 +162,12 @@ Convert your datasets to the `tf.data.Dataset` format with [`to_tf_dataset`](htt
... )
```
<Tip>
If you aren't familiar with fine-tuning a model with Keras, take a look at the basic tutorial [here](training#finetune-with-keras)!
</Tip>
Set up an optimizer function, learning rate schedule, and some training hyperparameters:

@@ -203,6 +202,8 @@ Call [`fit`](https://keras.io/api/models/model_training_apis/#fit-method) to fin

```py
>>> model.fit(x=tf_train_set, validation_data=tf_validation_set, epochs=3)
```
</tf>
</frameworkcontent>
<Tip>
@@ -110,8 +110,10 @@ Use [`DataCollatorForSeq2Seq`] to create a batch of examples. It will also *dyna
</tf>
</frameworkcontent>

## Train

<frameworkcontent>
<pt>
Load T5 with [`AutoModelForSeq2SeqLM`]:

```py
@@ -156,18 +158,9 @@ At this point, only three steps remain:
>>> trainer.train()
```
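As a sketch of the loading step (the T5 checkpoint name is illustrative):

```py
>>> from transformers import AutoModelForSeq2SeqLM

>>> model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")
```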
</pt>
<tf>
To fine-tune a model in TensorFlow, start by converting your datasets to the `tf.data.Dataset` format with [`to_tf_dataset`](https://huggingface.co/docs/datasets/package_reference/main_classes.html#datasets.Dataset.to_tf_dataset). Specify inputs and labels in `columns`, whether to shuffle the dataset order, batch size, and the data collator:
```py
>>> tf_train_set = tokenized_billsum["train"].to_tf_dataset(
@@ -185,6 +178,12 @@ Convert your datasets to the `tf.data.Dataset` format with [`to_tf_dataset`](htt
... )
```
<Tip>
If you aren't familiar with fine-tuning a model with Keras, take a look at the basic tutorial [here](training#finetune-with-keras)!
</Tip>
Set up an optimizer function, learning rate schedule, and some training hyperparameters:

@@ -212,6 +211,8 @@ Call [`fit`](https://keras.io/api/models/model_training_apis/#fit-method) to fin

```py
>>> model.fit(x=tf_train_set, validation_data=tf_test_set, epochs=3)
```
</tf>
</frameworkcontent>
<Tip>
@@ -151,8 +151,10 @@ Use [`DataCollatorForTokenClassification`] to create a batch of examples. It wil
</tf>
</frameworkcontent>

## Train

<frameworkcontent>
<pt>
Load DistilBERT with [`AutoModelForTokenClassification`] along with the number of expected labels:

```py
@@ -195,18 +197,9 @@ At this point, only three steps remain:
>>> trainer.train()
```
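A rough sketch of the loading step, where `label_list` stands for the label set built during preprocessing and the checkpoint name is illustrative:

```py
>>> from transformers import AutoModelForTokenClassification

>>> model = AutoModelForTokenClassification.from_pretrained("distilbert-base-uncased", num_labels=len(label_list))
```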
</pt>
<tf>
To fine-tune a model in TensorFlow, start by converting your datasets to the `tf.data.Dataset` format with [`to_tf_dataset`](https://huggingface.co/docs/datasets/package_reference/main_classes.html#datasets.Dataset.to_tf_dataset). Specify inputs and labels in `columns`, whether to shuffle the dataset order, batch size, and the data collator:
```py
>>> tf_train_set = tokenized_wnut["train"].to_tf_dataset(
@@ -224,6 +217,12 @@ Convert your datasets to the `tf.data.Dataset` format with [`to_tf_dataset`](htt
... )
```
<Tip>
If you aren't familiar with fine-tuning a model with Keras, take a look at the basic tutorial [here](training#finetune-with-keras)!
</Tip>
Set up an optimizer function, learning rate schedule, and some training hyperparameters:

@@ -261,6 +260,8 @@ Call [`fit`](https://keras.io/api/models/model_training_apis/#fit-method) to fin

```py
>>> model.fit(x=tf_train_set, validation_data=tf_validation_set, epochs=3)
```
</tf>
</frameworkcontent>
<Tip>
@@ -112,8 +112,10 @@ Use [`DataCollatorForSeq2Seq`] to create a batch of examples. It will also *dyna
</tf>
</frameworkcontent>

## Train

<frameworkcontent>
<pt>
Load T5 with [`AutoModelForSeq2SeqLM`]:

```py
@@ -158,18 +160,9 @@ At this point, only three steps remain:
>>> trainer.train()
```
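As in the summarization section above, the collapsed loading step reduces to something like (checkpoint name illustrative):

```py
>>> from transformers import AutoModelForSeq2SeqLM

>>> model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")
```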
</pt>
<tf>
To fine-tune a model in TensorFlow, start by converting your datasets to the `tf.data.Dataset` format with [`to_tf_dataset`](https://huggingface.co/docs/datasets/package_reference/main_classes.html#datasets.Dataset.to_tf_dataset). Specify inputs and labels in `columns`, whether to shuffle the dataset order, batch size, and the data collator:
```py
>>> tf_train_set = tokenized_books["train"].to_tf_dataset(
@@ -187,6 +180,12 @@ Convert your datasets to the `tf.data.Dataset` format with [`to_tf_dataset`](htt
... )
```
<Tip>
If you aren't familiar with fine-tuning a model with Keras, take a look at the basic tutorial [here](training#finetune-with-keras)!
</Tip>
Set up an optimizer function, learning rate schedule, and some training hyperparameters:

@@ -214,6 +213,8 @@ Call [`fit`](https://keras.io/api/models/model_training_apis/#fit-method) to fin

```py
>>> model.fit(x=tf_train_set, validation_data=tf_test_set, epochs=3)
```
</tf>
</frameworkcontent>
<Tip>
@@ -63,8 +63,10 @@ If you like, you can create a smaller subset of the full dataset to fine-tune on

<a id='trainer'></a>

## Train

<frameworkcontent>
<pt>
<Youtube id="nvBXf7s7vTI"/>

🤗 Transformers provides a [`Trainer`] class optimized for training 🤗 Transformers models, making it easier to start training without manually writing your own training loop. The [`Trainer`] API supports a wide range of training options and features such as logging, gradient accumulation, and mixed precision.

@@ -143,14 +145,13 @@ Then fine-tune your model by calling [`~transformers.Trainer.train`]:

```py
>>> trainer.train()
```
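The [`TrainingArguments`]/[`Trainer`] setup lives in the collapsed lines; a hedged sketch of the overall pattern (the checkpoint, label count, output directory, and dataset variables are assumptions based on this tutorial's context), followed by the `trainer.train()` call shown above:

```py
>>> from transformers import AutoModelForSequenceClassification, Trainer, TrainingArguments

>>> model = AutoModelForSequenceClassification.from_pretrained("bert-base-cased", num_labels=5)
>>> training_args = TrainingArguments(output_dir="test_trainer")
>>> trainer = Trainer(
...     model=model,
...     args=training_args,
...     train_dataset=small_train_dataset,
...     eval_dataset=small_eval_dataset,
... )
```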
</pt>
<tf>
<a id='keras'></a>

<Youtube id="rnTGBy2ax1c"/>

🤗 Transformers models also support training in TensorFlow with the Keras API.
### Convert dataset to TensorFlow format
@@ -210,11 +211,15 @@ Then compile and fine-tune your model with [`fit`](https://keras.io/api/models/m

```py
>>> model.fit(tf_train_dataset, validation_data=tf_validation_dataset, epochs=3)
```
</tf>
</frameworkcontent>
<a id='pytorch_native'></a>

## Train in native PyTorch

<frameworkcontent>
<pt>
<Youtube id="Dh9CL8fyG80"/>

[`Trainer`] takes care of the training loop and allows you to fine-tune a model in a single line of code. For users who prefer to write their own training loop, you can also fine-tune a 🤗 Transformers model in native PyTorch.

@@ -354,6 +359,8 @@ Just like how you need to add an evaluation function to [`Trainer`], you need to

```py
>>> metric.compute()
```
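The loop itself is collapsed in this hunk; a rough sketch of the manual training pattern the paragraph refers to (the `model`, `train_dataloader`, and `device` objects are assumed to be set up as earlier in the tutorial):

```py
>>> from torch.optim import AdamW

>>> optimizer = AdamW(model.parameters(), lr=5e-5)

>>> model.train()
>>> for epoch in range(3):
...     for batch in train_dataloader:
...         batch = {k: v.to(device) for k, v in batch.items()}
...         outputs = model(**batch)
...         loss = outputs.loss
...         loss.backward()
...         optimizer.step()
...         optimizer.zero_grad()
```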
</pt>
</frameworkcontent>
<a id='additional-resources'></a>