Unverified Commit 946400fb authored by Sylvain Gugger, committed by GitHub

Expand a bit the presentation of examples (#10799)



* Expand a bit the presentation of examples

* Apply suggestions from code review
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

* Address review comments
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
parent fd1d9f1a
@@ -15,8 +15,13 @@ limitations under the License.
# Examples
-This folder contains actively maintained examples of use of 🤗 Transformers organized along NLP tasks. If you are looking for an example that used to
-be in this folder, it may have moved to our [research projects](https://github.com/huggingface/transformers/tree/master/examples/research_projects) subfolder (which contains frozen snapshots of research projects).
+This folder contains actively maintained examples of how to use 🤗 Transformers, organized along NLP tasks. If you are looking for an example that used to be in this folder, it may have moved to our [research projects](https://github.com/huggingface/transformers/tree/master/examples/research_projects) subfolder (which contains frozen snapshots of research projects) or to the [legacy](https://github.com/huggingface/transformers/tree/master/examples/legacy) subfolder.
+While we strive to present as many use cases as possible, the scripts in this folder are just examples. It is expected that they won't work out of the box on your specific problem and that you will be required to change a few lines of code to adapt them to your needs. To help you with that, all the PyTorch versions of the examples fully expose the preprocessing of the data. This way, you can easily tweak them.
+Similarly, if you want the scripts to report a metric other than the one they currently use, look at the `compute_metrics` function inside the script. It takes the full arrays of predictions and labels and has to return a dictionary of string keys and float values. Just change it to add (or replace) your own metric among the ones already reported.
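For illustration, here is a rough sketch of what a drop-in `compute_metrics` for a classification-style script could look like. It assumes the argument exposes `predictions` (logits) and `label_ids` arrays, as the Trainer-based PyTorch examples pass in, and simply reports accuracy; adapt the shapes and the metric to your task.

```python
import numpy as np

# Illustrative sketch only: a replacement compute_metrics for a
# classification-style example script.
def compute_metrics(eval_pred):
    # The scripts hand over the full arrays of predictions and labels.
    logits, labels = eval_pred.predictions, eval_pred.label_ids
    # Logits are typically (num_examples, num_labels); argmax gives the class.
    predictions = np.argmax(logits, axis=-1)
    accuracy = float((predictions == labels).mean())
    # Return a dict of string keys to float values, as the scripts expect.
    return {"accuracy": accuracy}
```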
+Please discuss on the [forum](https://discuss.huggingface.co/) or in an [issue](https://github.com/huggingface/transformers/issues) a feature you would like to implement in an example before submitting a PR: we welcome bug fixes, but since we want to keep the examples as simple as possible, it's unlikely we will merge a pull request adding more functionality at the cost of readability.
## Important note
......
@@ -27,7 +27,7 @@ need extra processing on your datasets.
**Note:** The old script `run_language_modeling.py` is still available [here](https://github.com/huggingface/transformers/blob/master/examples/legacy/run_language_modeling.py).
-The following examples, will run on a datasets hosted on our [hub](https://huggingface.co/datasets) or with your own
+The following examples will run on datasets hosted on our [hub](https://huggingface.co/datasets) or with your own
text files for training and validation. We give examples of both below.
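In both cases, the scripts ultimately go through a `datasets.load_dataset` call. The sketch below shows roughly what that amounts to; the `wikitext` dataset and the local file paths are only placeholders for whatever you pass via `--dataset_name` or `--train_file`/`--validation_file`.

```python
from datasets import load_dataset

# Option 1: a dataset hosted on the hub (what --dataset_name resolves to).
hub_datasets = load_dataset("wikitext", "wikitext-2-raw-v1")

# Option 2: your own plain-text files (what --train_file/--validation_file
# resolve to); one example per line of text.
local_datasets = load_dataset(
    "text",
    data_files={"train": "path/to/train.txt", "validation": "path/to/valid.txt"},
)
```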
### GPT-2/GPT and causal language modeling
......
@@ -18,7 +18,9 @@ limitations under the License.
Based on the script [`run_swag.py`]().
-#### Fine-tuning on SWAG
+## PyTorch script: fine-tuning on SWAG
+`run_swag` allows you to fine-tune any model from our [hub](https://huggingface.co/models) (as long as its architecture has a `ForMultipleChoice` version in the library) on the SWAG dataset or your own csv/jsonlines files, as long as they are structured the same way. To make it work on another dataset, you will need to tweak the `preprocess_function` inside the script (a rough sketch of that function follows the command below).
```bash
python examples/multiple-choice/run_swag.py \
  ...
```
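To give an idea of what tweaking `preprocess_function` involves, here is a simplified, illustrative sketch of a SWAG-style version. The `sent1`/`sent2`/`ending0`-`ending3` column names are the SWAG ones, and the `tokenizer` argument is passed explicitly here to keep the sketch self-contained; the actual function in the script handles a few more details (padding, label handling), so treat this as a template rather than a copy of it.

```python
# Simplified sketch of multiple-choice preprocessing for a SWAG-like dataset:
# one context column, one question header, and four candidate endings.
def preprocess_function(examples, tokenizer, num_choices=4):
    # Repeat the context once per candidate ending.
    first_sentences = [[context] * num_choices for context in examples["sent1"]]
    # Pair the question header with each candidate ending.
    second_sentences = [
        [f"{header} {examples[f'ending{i}'][idx]}" for i in range(num_choices)]
        for idx, header in enumerate(examples["sent2"])
    ]

    # Flatten, tokenize every (context, ending) pair, then group back so each
    # example keeps its num_choices candidates together.
    flat_firsts = sum(first_sentences, [])
    flat_seconds = sum(second_sentences, [])
    tokenized = tokenizer(flat_firsts, flat_seconds, truncation=True)

    return {
        key: [values[i : i + num_choices] for i in range(0, len(values), num_choices)]
        for key, values in tokenized.items()
    }
```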
@@ -24,6 +24,11 @@ uses special features of those tokenizers. You can check if your favorite model
of the script.
The old version of this script can be found [here](https://github.com/huggingface/transformers/tree/master/examples/legacy/question-answering).
+`run_qa.py` allows you to fine-tune any model from our [hub](https://huggingface.co/models) (as long as its architecture has a `ForQuestionAnswering` version in the library) on the SQuAD dataset, on another question-answering dataset of the `datasets` library, or on your own csv/jsonlines files, as long as they are structured the same way as SQuAD. You might need to tweak the data processing inside the script if your data is structured differently.
+Note that if your dataset contains samples with no possible answers (like SQuAD version 2), you need to pass along the flag `--version_2_with_negative`.
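For reference, a single record "structured the same way as SQuAD" looks roughly like the dictionary below (one such object, JSON-encoded, per line of a jsonlines file). The field names follow the SQuAD convention; double-check them against the preprocessing in the script, and note that `answer_start` is the character offset of the answer inside `context`.

```python
# One SQuAD-style question-answering record (illustrative values).
example = {
    "id": "0001",  # any unique identifier
    "title": "Super Bowl 50",
    "context": "Super Bowl 50 was an American football game played on February 7, 2016.",
    "question": "When was Super Bowl 50 played?",
    "answers": {"text": ["February 7, 2016"], "answer_start": [54]},
}
```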
#### Fine-tuning BERT on SQuAD1.0
This example code fine-tunes BERT on the SQuAD1.0 dataset. It runs in 24 min (with BERT-base) or 68 min (with BERT-large)
......
@@ -114,7 +114,7 @@ and you wanted to select only `text` and `summary`, then you'd pass these additional arguments:
```bash
--text_column text \
--summary_column summary \
```
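In that situation, a record in your file might look something like the following; the extra columns here are purely hypothetical and are simply ignored once you point the two flags above at the columns you care about.

```python
# A hypothetical record with more columns than the script needs; the
# --text_column/--summary_column flags select which two are actually used.
example = {
    "document_id": "0001",   # hypothetical extra column, ignored
    "date": "2021-03-18",    # hypothetical extra column, ignored
    "text": "The full article body goes here ...",
    "summary": "A one-sentence summary of the article.",
}
```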
-#### Custom JSONFILES Files
+#### Custom JSONLINES Files
The second supported format is jsonlines. Here is an example of a jsonlines custom data file.
......
@@ -21,7 +21,7 @@ tagging (POS). The main script `run_ner.py` leverages the 🤗 Datasets library and you can easily
customize it to your needs if you need extra processing on your datasets.
It will either run on datasets hosted on our [hub](https://huggingface.co/datasets) or with your own text files for
-training and validation.
+training and validation; you might just need to add some tweaks in the data preprocessing.
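Concretely, a custom training file for token classification usually carries one pre-split sentence per record, roughly like the dictionary below (one JSON object per jsonlines line). The `tokens`/`ner_tags` column names mirror the CoNLL-2003 layout and string labels are shown for readability; adjust the names, or the preprocessing in the script, to match your data.

```python
# One token-classification record: words already split into tokens, with one
# label per token (CoNLL-2003-style NER tags shown here).
example = {
    "tokens": ["EU", "rejects", "German", "call", "to", "boycott", "British", "lamb", "."],
    "ner_tags": ["B-ORG", "O", "B-MISC", "O", "O", "O", "B-MISC", "O", "O"],
}
```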
The following example fine-tunes BERT on CoNLL-2003:
......