chenpangpang / transformers · Commits

Commit f47f9a58 authored Sep 06, 2019 by LysandreJik

Updated outdated examples

parent e52737d5
Showing 1 changed file with 31 additions and 19 deletions

examples/README.md (+31, -19)
@@ -12,7 +12,7 @@ similar API between the different models.
 
 ## Language model fine-tuning
 
-Based on the script `run_lm_finetuning.py`.
+Based on the script [`run_lm_finetuning.py`](https://github.com/huggingface/pytorch-transformers/blob/master/examples/run_lm_finetuning.py).
 
 Fine-tuning the library models for language modeling on a text dataset for GPT, GPT-2, BERT and RoBERTa (DistilBERT
 to be added soon). GPT and GPT-2 are fine-tuned using a causal language modeling (CLM) loss while BERT and RoBERTa
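As context for the CLM/MLM distinction this hunk mentions, here is a minimal sketch of the two losses in plain PyTorch. The function and tensor names are hypothetical, not code from `run_lm_finetuning.py`:

```python
# Illustrative sketch only. `logits` has shape (batch, seq_len, vocab_size);
# `input_ids` and `labels` have shape (batch, seq_len).
import torch
import torch.nn.functional as F

def clm_loss(logits, input_ids):
    # Causal LM (GPT/GPT-2 style): predict token t+1 from tokens <= t,
    # so shift logits left and labels right by one position.
    shift_logits = logits[:, :-1, :].contiguous()
    shift_labels = input_ids[:, 1:].contiguous()
    return F.cross_entropy(shift_logits.view(-1, shift_logits.size(-1)),
                           shift_labels.view(-1))

def mlm_loss(logits, labels):
    # Masked LM (BERT/RoBERTa style): labels hold the original token ids at
    # masked positions and -100 everywhere else; -100 is ignored by the loss.
    return F.cross_entropy(logits.view(-1, logits.size(-1)),
                           labels.view(-1), ignore_index=-100)
```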
@@ -52,8 +52,8 @@ The following example fine-tunes RoBERTa on WikiText-2. Here too, we're using th
 as BERT/RoBERTa have a bidirectional mechanism; we're therefore using the same loss that was used during their
 pre-training: masked language modeling.
 
-In accordance to the RoBERTa paper, we use dynamic masking rather than static masking. The model may therefore converge
-slower, but over-fitting would take more epochs.
+In accordance to the RoBERTa paper, we use dynamic masking rather than static masking. The model may, therefore, converge
+slightly slower (over-fitting takes more epochs).
 
 We use the `--mlm` flag so that the script may change its loss function.
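To make the static/dynamic distinction in this hunk concrete, a minimal sketch of dynamic masking follows. Names are hypothetical, and it is simplified: the real BERT/RoBERTa recipe also keeps or randomizes a fraction of the selected tokens rather than always inserting the mask token:

```python
# With static masking the mask pattern is fixed once at preprocessing time;
# with dynamic masking it is resampled every time a batch is drawn, so each
# epoch trains on different masked positions.
import torch

MASK_TOKEN_ID = 103      # assumption: BERT's [MASK] id; differs per tokenizer
MLM_PROBABILITY = 0.15   # standard BERT/RoBERTa masking rate

def dynamically_mask(input_ids):
    """Resample a fresh mask on every call."""
    labels = input_ids.clone()
    mask = torch.rand(input_ids.shape) < MLM_PROBABILITY
    labels[~mask] = -100                 # only masked positions count in the loss
    masked_inputs = input_ids.clone()
    masked_inputs[mask] = MASK_TOKEN_ID  # replace chosen tokens with [MASK]
    return masked_inputs, labels
```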
@@ -74,6 +74,8 @@ python run_lm_finetuning.py \
 
 ## Language generation
 
+Based on the script [`run_generation.py`](https://github.com/huggingface/pytorch-transformers/blob/master/examples/run_generation.py).
+
 Conditional text generation using the auto-regressive models of the library: GPT, GPT-2, Transformer-XL and XLNet.
 A similar script is used for our official demo [Write With Transfomer](https://transformer.huggingface.co), where you
 can try out the different models available in the library.
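All four model families named in this hunk decode the same way; as a rough sketch (not the actual `run_generation.py` code, and assuming the model returns logits as the first element of its output tuple, as the library's models of this era do), auto-regressive generation samples one token at a time and feeds it back in:

```python
# Illustrative decoding loop; `model` and `input_ids` are placeholders.
import torch

def generate(model, input_ids, steps=20, temperature=1.0):
    for _ in range(steps):
        with torch.no_grad():
            logits = model(input_ids)[0]          # (batch, seq_len, vocab)
        next_logits = logits[:, -1, :] / temperature
        probs = torch.softmax(next_logits, dim=-1)
        next_token = torch.multinomial(probs, num_samples=1)
        input_ids = torch.cat([input_ids, next_token], dim=1)  # feed back in
    return input_ids
```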
@@ -88,6 +90,8 @@ python run_generation.py \
 
 ## GLUE
 
+Based on the script [`run_glue.py`](https://github.com/huggingface/pytorch-transformers/blob/master/examples/run_glue.py).
+
 Fine-tuning the library models for sequence classification on the GLUE benchmark: [General Language Understanding
 Evaluation](https://gluebenchmark.com/). This script can fine-tune the following models: BERT, XLM, XLNet and RoBERTa.
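For orientation, sequence classification here amounts to putting a small head on top of the encoder's pooled sequence representation. A minimal sketch of that head; the real ones live inside the library's `*ForSequenceClassification` classes:

```python
import torch.nn as nn

class ClassificationHead(nn.Module):
    """Dropout plus a linear projection from the pooled output to label logits."""
    def __init__(self, hidden_size, num_labels, dropout=0.1):
        super().__init__()
        self.dropout = nn.Dropout(dropout)
        self.classifier = nn.Linear(hidden_size, num_labels)

    def forward(self, pooled_output):
        # pooled_output: (batch, hidden_size) summary of the whole sequence
        return self.classifier(self.dropout(pooled_output))
```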
@@ -120,13 +124,14 @@ and unpack it to some directory `$GLUE_DIR`.
 export GLUE_DIR=/path/to/glue
 export TASK_NAME=MRPC
 
-python run_bert_classifier.py \
+python run_glue.py \
+  --model_type bert \
+  --model_name_or_path bert-base-cased \
   --task_name $TASK_NAME \
   --do_train \
   --do_eval \
   --do_lower_case \
   --data_dir $GLUE_DIR/$TASK_NAME \
-  --bert_model bert-base-uncased \
   --max_seq_length 128 \
   --train_batch_size 32 \
   --learning_rate 2e-5 \
@@ -160,13 +165,14 @@ and unpack it to some directory `$GLUE_DIR`.
 ```bash
 export GLUE_DIR=/path/to/glue
 
-python run_bert_classifier.py \
+python run_glue.py \
+  --model_type bert \
+  --model_name_or_path bert-base-cased \
   --task_name MRPC \
   --do_train \
   --do_eval \
   --do_lower_case \
   --data_dir $GLUE_DIR/MRPC/ \
-  --bert_model bert-base-uncased \
   --max_seq_length 128 \
   --train_batch_size 32 \
   --learning_rate 2e-5 \
@@ -186,13 +192,14 @@ Using Apex and 16 bit precision, the fine-tuning on MRPC only takes 27 seconds.
 ```bash
 export GLUE_DIR=/path/to/glue
 
-python run_bert_classifier.py \
+python run_glue.py \
+  --model_type bert \
+  --model_name_or_path bert-base-cased \
   --task_name MRPC \
   --do_train \
   --do_eval \
   --do_lower_case \
   --data_dir $GLUE_DIR/MRPC/ \
-  --bert_model bert-base-uncased \
   --max_seq_length 128 \
   --train_batch_size 32 \
   --learning_rate 2e-5 \
@@ -210,8 +217,9 @@ reaches F1 > 92 on MRPC.
 export GLUE_DIR=/path/to/glue
 
 python -m torch.distributed.launch \
-    --nproc_per_node 8 run_bert_classifier.py \
-    --bert_model bert-large-uncased-whole-word-masking \
+    --nproc_per_node 8 run_glue.py \
+    --model_type bert \
+    --model_name_or_path bert-base-cased \
     --task_name MRPC \
     --do_train \
     --do_eval \
@@ -243,8 +251,9 @@ The following example uses the BERT-large, uncased, whole-word-masking model and
 export GLUE_DIR=/path/to/glue
 
 python -m torch.distributed.launch \
-    --nproc_per_node 8 run_bert_classifier.py \
-    --bert_model bert-large-uncased-whole-word-masking \
+    --nproc_per_node 8 run_glue.py \
+    --model_type bert \
+    --model_name_or_path bert-base-cased \
     --task_name mnli \
     --do_train \
     --do_eval \
@@ -275,6 +284,8 @@ The results are the following:
 
 ## SQuAD
 
+Based on the script [`run_squad.py`](https://github.com/huggingface/pytorch-transformers/blob/master/examples/run_squad.py).
+
 #### Fine-tuning on SQuAD
 
 This example code fine-tunes BERT on the SQuAD dataset. It runs in 24 min (with BERT-base) or 68 min (with BERT-large)
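For context on what SQuAD fine-tuning trains: extractive QA predicts an answer span via per-token start and end logits over the encoder's output. A minimal sketch of that head; the library implements it in its `*ForQuestionAnswering` classes:

```python
import torch.nn as nn

class QAHead(nn.Module):
    """Project each token's hidden state to two scores: span start and span end."""
    def __init__(self, hidden_size):
        super().__init__()
        self.qa_outputs = nn.Linear(hidden_size, 2)

    def forward(self, sequence_output):
        # sequence_output: (batch, seq_len, hidden_size)
        logits = self.qa_outputs(sequence_output)
        start_logits, end_logits = logits.split(1, dim=-1)
        return start_logits.squeeze(-1), end_logits.squeeze(-1)
```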
@@ -288,8 +299,9 @@ $SQUAD_DIR directory.
 ```bash
 export SQUAD_DIR=/path/to/SQUAD
 
-python run_bert_squad.py \
-  --bert_model bert-base-uncased \
+python run_squad.py \
+  --model_type bert \
+  --model_name_or_path bert-base-cased \
   --do_train \
   --do_predict \
   --do_lower_case \
@@ -316,9 +328,9 @@ exact_match = 81.22
 Here is an example using distributed training on 8 V100 GPUs and Bert Whole Word Masking uncased model to reach a F1 > 93 on SQuAD:
 
 ```bash
-python -m torch.distributed.launch --nproc_per_node=8 \
- run_bert_squad.py \
- --bert_model bert-large-uncased-whole-word-masking \
+python -m torch.distributed.launch --nproc_per_node=8 run_squad.py \
+    --model_type bert \
+    --model_name_or_path bert-base-cased \
     --do_train \
     --do_predict \
     --do_lower_case \