Commit f4399ec5 (unverified): "Update README.md"
Authored Jul 14, 2021 by Patrick von Platen; committed via GitHub on Jul 14, 2021
Parent: d94773e6

1 changed file with 31 additions and 31 deletions:
examples/flax/language-modeling/README.md (+31, -31)
@@ -251,32 +251,32 @@ Training statistics can be accessed on [tfhub.de](https://tensorboard.dev/experi

 In the following, we demonstrate how to train a T5 model using the span-masked language model
 objective as proposed in the [Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer](https://arxiv.org/abs/1910.10683).
 More specifically, we demonstrate how JAX/Flax can be leveraged
-to pre-train [**`t5-small`**](https://huggingface.co/t5-small)
+to pre-train [**`google/t5-v1_1-base`**](https://huggingface.co/google/t5-v1_1-base)
 in Norwegian on a single TPUv3-8 pod.

 The example script uses the 🤗 Datasets library. You can easily customize it to your needs if you need extra processing on your datasets.

 Let's start by creating a model repository to save the trained model and logs.
-Here we call the model `"norwegian-t5-small"`, but you can change the model name as you like.
+Here we call the model `"norwegian-t5-base"`, but you can change the model name as you like.

 You can do this either directly on [huggingface.co](https://huggingface.co/new) (assuming that
 you are logged in) or via the command line:

 ```
-huggingface-cli repo create norwegian-t5-small
+huggingface-cli repo create norwegian-t5-base
 ```

 Next we clone the model repository to add the tokenizer and model files.

 ```
-git clone https://huggingface.co/<your-username>/norwegian-t5-small
+git clone https://huggingface.co/<your-username>/norwegian-t5-base
 ```

 To ensure that all tensorboard traces will be uploaded correctly, we need to
 track them. You can run the following command inside your model repo to do so.

 ```
-cd norwegian-t5-small
+cd norwegian-t5-base
 git lfs track "*tfevents*"
 ```
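As an aside, the same repository setup can also be scripted from Python rather than the CLI shown above. The sketch below is illustrative only (it is not in the README) and assumes a reasonably recent `huggingface_hub` release where `create_repo` accepts the repository name directly, plus prior authentication via `huggingface-cli login`:

```python
from huggingface_hub import create_repo

# Programmatic equivalent of `huggingface-cli repo create norwegian-t5-base`.
# Assumes a token is already cached by `huggingface-cli login`.
repo_url = create_repo("norwegian-t5-base", exist_ok=True)
print(repo_url)  # e.g. https://huggingface.co/<your-username>/norwegian-t5-base
```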
@@ -286,7 +286,7 @@ push the training logs and model weights to the repo.

 Next, let's add a symbolic link to the `run_t5_mlm_flax.py` and `t5_tokenizer_model` scripts.

 ```bash
-export MODEL_DIR="./norwegian-t5-small"
+export MODEL_DIR="./norwegian-t5-base"
 ln -s ~/transformers/examples/flax/language-modeling/run_t5_mlm_flax.py run_t5_mlm_flax.py
 ln -s ~/transformers/examples/flax/language-modeling/t5_tokenizer_model.py t5_tokenizer_model.py
 ```
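For readers following along from a notebook, the same linking step can be expressed in Python; this is purely illustrative, and the `~/transformers` checkout location is the same assumption the bash lines above make:

```python
import os
from pathlib import Path

# Assumes the current working directory is the cloned norwegian-t5-base repo,
# mirroring the `ln -s` commands in the hunk above.
examples = Path.home() / "transformers" / "examples" / "flax" / "language-modeling"

for script in ("run_t5_mlm_flax.py", "t5_tokenizer_model.py"):
    if not Path(script).exists():
        os.symlink(examples / script, script)
```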
@@ -310,7 +310,7 @@ from t5_tokenizer_model import SentencePieceUnigramTokenizer

 vocab_size = 32_000
 input_sentence_size = None
-model_dir = "./norwegian-t5-small"  # ${MODEL_DIR}
+model_dir = "./norwegian-t5-base"  # ${MODEL_DIR}

 # Initialize a dataset
 dataset = datasets.load_dataset("oscar", name="unshuffled_deduplicated_no", split="train")
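The diff collapses the rest of the tokenizer script. For orientation, a rough sketch of how such a snippet typically continues is shown below; it assumes the `SentencePieceUnigramTokenizer` helper from `t5_tokenizer_model.py` follows the 🤗 Tokenizers API (a `train_from_iterator` method, plus the `tokenizer.save(...)` call visible at the top of the next hunk), so treat it as an outline rather than the exact code in the README:

```python
import datasets
from t5_tokenizer_model import SentencePieceUnigramTokenizer  # helper next to run_t5_mlm_flax.py

vocab_size = 32_000
model_dir = "./norwegian-t5-base"  # ${MODEL_DIR}

# Norwegian split of OSCAR, as in the hunk above
dataset = datasets.load_dataset("oscar", name="unshuffled_deduplicated_no", split="train")

tokenizer = SentencePieceUnigramTokenizer()  # assumed default special tokens

def batch_iterator(batch_size=1_000):
    # Stream the raw text column in chunks so the corpus never sits in memory at once.
    for i in range(0, len(dataset), batch_size):
        yield dataset[i : i + batch_size]["text"]

# Train a Unigram vocabulary on the streamed text (assumed Tokenizers-style API).
tokenizer.train_from_iterator(batch_iterator(), vocab_size=vocab_size)

# Matches the save call shown at the top of the next hunk.
tokenizer.save(f"{model_dir}/tokenizer.json")
```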
@@ -341,15 +341,15 @@ tokenizer.save(f"{model_dir}/tokenizer.json")

 ### Create configuration

 Next, we create the model's configuration file. This is as simple
-as loading and storing [`**t5-small**`](https://huggingface.co/t5-small)
+as loading and storing [`**google/t5-v1_1-base**`](https://huggingface.co/google/t5-v1_1-base)
 in the local model folder:

 ```python
 from transformers import T5Config

-model_dir = "./norwegian-t5-small"  # ${MODEL_DIR}
+model_dir = "./norwegian-t5-base"  # ${MODEL_DIR}

-config = T5Config.from_pretrained("t5-small")
+config = T5Config.from_pretrained("google/t5-v1_1-base")
 config.save_pretrained(model_dir)
 ```
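As a small illustrative check (not taken from the README), the configuration written by `save_pretrained` can be loaded back from the local folder with the same API, which is what the `--config_name` argument of the training command relies on later:

```python
from transformers import T5Config

model_dir = "./norwegian-t5-base"  # ${MODEL_DIR}

# Round-trip: from_pretrained on a local directory reads the config.json written above.
config = T5Config.from_pretrained(model_dir)
print(config.model_type, config.d_model, config.num_layers)
```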
@@ -359,30 +359,30 @@ Next we can run the example script to pretrain the model:

 ```bash
 ./run_t5_mlm_flax.py \
-    --output_dir="${MODEL_DIR}" \
-    --model_type="t5" \
-    --config_name="${MODEL_DIR}" \
-    --tokenizer_name="${MODEL_DIR}" \
-    --dataset_name="oscar" \
-    --dataset_config_name="unshuffled_deduplicated_no" \
-    --max_seq_length="512" \
-    --per_device_train_batch_size="16" \
-    --per_device_eval_batch_size="16" \
-    --learning_rate="1e-3" \
-    --weight_decay="0.001" \
-    --warmup_steps="5000" \
-    --overwrite_output_dir \
-    --num_train_epochs="10" \
-    --logging_steps="500" \
-    --save_steps="2500" \
-    --eval_steps="2500" \
-    --push_to_hub
+    --output_dir="./" \
+    --model_type="t5" \
+    --config_name="./" \
+    --tokenizer_name="./" \
+    --dataset_name="oscar" \
+    --dataset_config_name="unshuffled_deduplicated_no" \
+    --max_seq_length="512" \
+    --per_device_train_batch_size="32" \
+    --per_device_eval_batch_size="32" \
+    --adafactor \
+    --learning_rate="0.005" \
+    --weight_decay="0.001" \
+    --warmup_steps="2000" \
+    --overwrite_output_dir \
+    --logging_steps="100" \
+    --save_steps="1000" \
+    --eval_steps="1000" \
+    --push_to_hub
 ```

 Training should converge at a loss and accuracy
-of XXX and XXX respectively after 10 epochs on a single TPUv3-8.
-This should take less than 18 hours.
-Training statistics can be accessed directly on the 🤗 [hub (TODO)]()
+of 2.2 and 58.0 respectively after 2 epochs on a single TPUv3-8.
+This should take around 24 hours.
+Training statistics can be accessed directly on the 🤗 [hub](https://huggingface.co/patrickvonplaten/t5-base-norwegian/tensorboard).

 ## Runtime evaluation
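As an aside on the updated command above: the per-device batch size of 32 translates into a fairly large global batch on a TPUv3-8, since JAX sees the pod slice as 8 devices. A back-of-the-envelope sketch (the token count is an upper bound, because padding and span corruption reduce the effective number):

```python
import jax

# On a TPUv3-8 this reports 8 devices; hard-code the assumption if running elsewhere.
n_devices = jax.device_count()

per_device_train_batch_size = 32  # from the updated command
max_seq_length = 512              # from the updated command

global_batch_size = n_devices * per_device_train_batch_size  # 8 * 32 = 256 sequences per step
tokens_per_step = global_batch_size * max_seq_length         # 256 * 512 = 131,072 tokens (upper bound)
print(global_batch_size, tokens_per_step)
```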