"git@developer.sourcefind.cn:wangsen/paddle_dbnet.git" did not exist on "0a461d8fc57f4647a156bcf824682d0fd29578e3"
Commit d0673c7d authored by thomwolf

fix links

parent 68b937aa
@@ -152,10 +152,10 @@ Here is a detailed documentation of the classes in the package and how to use them:
| Sub-section | Description |
|-|-|
-| [Loading Google AI's pre-trained weigths](#Loading-Google-AI's-pre-trained-weigths-and-PyTorch-dump) | How to load Google AI's pre-trained weight or a PyTorch saved instance |
+| [Loading Google AI's pre-trained weigths](#Loading-Google-AIs-pre-trained-weigths-and-PyTorch-dump) | How to load Google AI's pre-trained weight or a PyTorch saved instance |
| [PyTorch models](#PyTorch-models) | API of the six PyTorch model classes: `BertModel`, `BertForMaskedLM`, `BertForNextSentencePrediction`, `BertForPreTraining`, `BertForSequenceClassification` or `BertForQuestionAnswering` |
-| [Tokenizer: `BertTokenizer`](#Tokenizer:-BertTokenizer) | API of the `BertTokenizer` class|
+| [Tokenizer: `BertTokenizer`](#Tokenizer-BertTokenizer) | API of the `BertTokenizer` class|
-| [Optimizer: `BERTAdam`](#Optimizer:-BERTAdam) | API of the `BERTAdam` class |
+| [Optimizer: `BERTAdam`](#Optimizer-BERTAdam) | API of the `BERTAdam` class |

### Loading Google AI's pre-trained weigths and PyTorch dump
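
As a quick illustration of what this sub-section documents (a minimal sketch, assuming the `pytorch_pretrained_bert` package name and the `bert-base-uncased` shortcut model name, neither of which appears in this hunk), loading a pre-trained tokenizer and model looks roughly like this:

```python
import torch
from pytorch_pretrained_bert import BertTokenizer, BertModel

# Download (and cache) the vocabulary and the pre-trained weights released by Google AI.
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
model = BertModel.from_pretrained('bert-base-uncased')
model.eval()

# Tokenize a sentence, map the tokens to vocabulary indices and encode them.
tokens = tokenizer.tokenize("[CLS] Who was Jim Henson ? [SEP]")
token_ids = torch.tensor([tokenizer.convert_tokens_to_ids(tokens)])
with torch.no_grad():
    encoded_layers, pooled_output = model(token_ids)  # per-layer hidden states and pooled [CLS] output
```

The same `from_pretrained()` entry point also accepts a local path in place of the shortcut name, which is presumably how the "PyTorch saved instance" in the table's first row is reloaded.
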
@@ -316,6 +316,12 @@ The optimizer accepts the following arguments:

## Examples
+| Sub-section | Description |
+|-|-|
+| [Training large models: introduction, tools and examples](#Training-large-models-introduction,-tools-and-examples) | How to use gradient-accumulation, multi-gpu training, distributed training, optimize on CPU and 16-bits training to train Bert models |
+| [Fine-tuning with BERT: running the examples](#Fine-tuning-with-BERT-running-the-examples) | Running the examples in [`./examples`](./examples/): `extract_classif.py`, `run_classifier.py` and `run_squad.py` |
+| [Fine-tuning BERT-large on GPUs](#Fine-tuning-BERT-large-on-GPUs) | How to fine tune `BERT large`|
### Training large models: introduction, tools and examples

BERT-base and BERT-large are respectively 110M and 340M parameters models and it can be difficult to fine-tune them on a single GPU with the recommended batch size for good performance (in most case a batch size of 32).

...
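
The table added in this hunk lists gradient-accumulation, multi-GPU training, distributed training, CPU optimization and 16-bit training as ways to work around the single-GPU memory limit described above. As a rough, generic PyTorch sketch of the gradient-accumulation idea only (the toy model, optimizer and data below are made up for illustration and are not taken from this repository's example scripts):

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

# Toy stand-ins: any model, optimizer and dataloader would do.
model = nn.Linear(10, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loader = DataLoader(TensorDataset(torch.randn(64, 10), torch.randint(0, 2, (64,))), batch_size=8)
loss_fn = nn.CrossEntropyLoss()

gradient_accumulation_steps = 4  # effective batch size = 8 * 4 = 32, only 8 examples in memory at once

optimizer.zero_grad()
for step, (inputs, labels) in enumerate(loader):
    loss = loss_fn(model(inputs), labels)
    # Scale the loss so the accumulated gradient matches the average over the larger effective batch.
    (loss / gradient_accumulation_steps).backward()
    if (step + 1) % gradient_accumulation_steps == 0:
        optimizer.step()       # apply the accumulated gradients
        optimizer.zero_grad()  # reset before the next accumulation window
```

This is only the general pattern; multi-GPU, distributed and 16-bit training wrap the same loop with their own machinery.
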