add info on TRL docs (#27024)

* add info on TRL docs
* add TRL link
* tweak text
* tweak text

b18e3140 · Leandro von Werra · GitHub · cb0c6806 · b18e3140
Commit b18e3140, authored Oct 24, 2023 by Leandro von Werra; committed by GitHub Oct 24, 2023

Showing with 6 additions and 0 deletions

docs/source/en/main_classes/trainer.md +6 -0

--- a/docs/source/en/main_classes/trainer.md
+++ b/docs/source/en/main_classes/trainer.md
@@ -18,6 +18,12 @@ rendered properly in your Markdown viewer.

 The [`Trainer`] class provides an API for feature-complete training in PyTorch for most standard use cases. It's used in most of the [example scripts](https://github.com/huggingface/transformers/tree/main/examples).

+<Tip>
+
+If you're looking to fine-tune a language model like Llama-2 or Mistral on a text dataset using autoregressive techniques, consider using [`trl`](https://github.com/huggingface/trl)'s [`~trl.SFTTrainer`]. The [`~trl.SFTTrainer`] wraps the [`Trainer`] and is specially optimized for this particular task and supports sequence packing, LoRA, quantization, and DeepSpeed for efficient scaling to any model size. On the other hand, the [`Trainer`] is a more versatile option, suitable for a broader spectrum of tasks.
+
+</Tip>
+
 Before instantiating your [`Trainer`], create a [`TrainingArguments`] to access all the points of customization during training.

 The API supports distributed training on multiple GPUs/TPUs, mixed precision through [NVIDIA Apex](https://github.com/NVIDIA/apex) and Native AMP for PyTorch.
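
The added tip lists sequence packing among the features of the `SFTTrainer`. For readers unfamiliar with the term, here is a minimal pure-Python sketch of the general idea: tokenized examples are concatenated with an end-of-sequence separator and cut into fixed-size blocks, so that no positions are spent on padding. This is not TRL's implementation; the function name `pack_sequences`, its parameters, and the toy token ids are assumptions made for illustration.

```python
from typing import Iterable, List


def pack_sequences(
    tokenized: Iterable[List[int]], block_size: int, eos_id: int
) -> List[List[int]]:
    """Concatenate token-id lists and split them into full blocks.

    Hypothetical helper for illustration only; not part of transformers or trl.
    """
    buffer: List[int] = []
    for ids in tokenized:
        buffer.extend(ids)
        buffer.append(eos_id)  # mark the boundary between examples

    # Keep only complete blocks; the trailing partial block is dropped.
    n_full = len(buffer) // block_size
    return [buffer[i * block_size:(i + 1) * block_size] for i in range(n_full)]


# Toy token ids; 0 stands in for the end-of-sequence id.
examples = [[1, 2, 3], [4, 5], [6, 7, 8, 9]]
packed = pack_sequences(examples, block_size=4, eos_id=0)
print(packed)
```

Every block has the same length, so a batch built from `packed` needs no padding tokens. The trade-off in this sketch is that one block can contain the end of one example and the start of the next.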