"git@developer.sourcefind.cn:wangsen/paddle_dbnet.git" did not exist on "0a461d8fc57f4647a156bcf824682d0fd29578e3"
Commit d0673c7d authored by thomwolf

fix links

parent 68b937aa
@@ -152,10 +152,10 @@ Here is a detailed documentation of the classes in the package and how to use them:
| Sub-section | Description |
|-|-|
-| [Loading Google AI's pre-trained weigths](#Loading-Google-AI's-pre-trained-weigths-and-PyTorch-dump) | How to load Google AI's pre-trained weight or a PyTorch saved instance |
+| [Loading Google AI's pre-trained weigths](#Loading-Google-AIs-pre-trained-weigths-and-PyTorch-dump) | How to load Google AI's pre-trained weight or a PyTorch saved instance |
| [PyTorch models](#PyTorch-models) | API of the six PyTorch model classes: `BertModel`, `BertForMaskedLM`, `BertForNextSentencePrediction`, `BertForPreTraining`, `BertForSequenceClassification` or `BertForQuestionAnswering` |
-| [Tokenizer: `BertTokenizer`](#Tokenizer:-BertTokenizer) | API of the `BertTokenizer` class|
+| [Tokenizer: `BertTokenizer`](#Tokenizer-BertTokenizer) | API of the `BertTokenizer` class|
-| [Optimizer: `BERTAdam`](#Optimizer:-BERTAdam) | API of the `BERTAdam` class |
+| [Optimizer: `BERTAdam`](#Optimizer-BERTAdam) | API of the `BERTAdam` class |

### Loading Google AI's pre-trained weigths and PyTorch dump
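
As a quick illustration of what this sub-section documents (a minimal sketch, assuming the `pytorch_pretrained_bert` package name and the `bert-base-uncased` shortcut model name, neither of which appears in this hunk), loading a pre-trained tokenizer and model looks roughly like this:

```python
import torch
from pytorch_pretrained_bert import BertTokenizer, BertModel

# Download (and cache) the vocabulary and the pre-trained weights released by Google AI.
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
model = BertModel.from_pretrained('bert-base-uncased')
model.eval()

# Tokenize a sentence, map the tokens to vocabulary indices and encode them.
tokens = tokenizer.tokenize("[CLS] Who was Jim Henson ? [SEP]")
token_ids = torch.tensor([tokenizer.convert_tokens_to_ids(tokens)])
with torch.no_grad():
    encoded_layers, pooled_output = model(token_ids)  # per-layer hidden states and pooled [CLS] output
```

The same `from_pretrained()` entry point also accepts a local path in place of the shortcut name, which is presumably how the "PyTorch saved instance" in the table's first row is reloaded.
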
@@ -316,6 +316,12 @@ The optimizer accepts the following arguments:

## Examples
+| Sub-section | Description |
+|-|-|
+| [Training large models: introduction, tools and examples](#Training-large-models-introduction,-tools-and-examples) | How to use gradient-accumulation, multi-gpu training, distributed training, optimize on CPU and 16-bits training to train Bert models |
+| [Fine-tuning with BERT: running the examples](#Fine-tuning-with-BERT-running-the-examples) | Running the examples in [`./examples`](./examples/): `extract_classif.py`, `run_classifier.py` and `run_squad.py` |
+| [Fine-tuning BERT-large on GPUs](#Fine-tuning-BERT-large-on-GPUs) | How to fine tune `BERT large`|
### Training large models: introduction, tools and examples

BERT-base and BERT-large are respectively 110M and 340M parameters models and it can be difficult to fine-tune them on a single GPU with the recommended batch size for good performance (in most case a batch size of 32).

...
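
The table added in this hunk lists gradient-accumulation, multi-GPU training, distributed training, CPU optimization and 16-bit training as ways to work around the single-GPU memory limit described above. As a rough, generic PyTorch sketch of the gradient-accumulation idea only (the toy model, optimizer and data below are made up for illustration and are not taken from this repository's example scripts):

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

# Toy stand-ins: any model, optimizer and dataloader would do.
model = nn.Linear(10, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loader = DataLoader(TensorDataset(torch.randn(64, 10), torch.randint(0, 2, (64,))), batch_size=8)
loss_fn = nn.CrossEntropyLoss()

gradient_accumulation_steps = 4  # effective batch size = 8 * 4 = 32, only 8 examples in memory at once

optimizer.zero_grad()
for step, (inputs, labels) in enumerate(loader):
    loss = loss_fn(model(inputs), labels)
    # Scale the loss so the accumulated gradient matches the average over the larger effective batch.
    (loss / gradient_accumulation_steps).backward()
    if (step + 1) % gradient_accumulation_steps == 0:
        optimizer.step()       # apply the accumulated gradients
        optimizer.zero_grad()  # reset before the next accumulation window
```

This is only the general pattern; multi-GPU, distributed and 16-bit training wrap the same loop with their own machinery.
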