Commit 4c6e23c8 authored by Chen Chen, committed by A. Unique TensorFlower

Add train.md to describe how to run official/nlp/train.py in the open-source world.

PiperOrigin-RevId: 346486072
parent 5ee7a462
@@ -20,8 +20,11 @@ to experiment new research ideas.
We provide a modeling library to allow users to train custom models for new
research ideas. Detailed instructions can be found in the READMEs in each folder.

* [modeling/](modeling): modeling library that provides building blocks
  (e.g., Layers, Networks, and Models) that can be assembled into
  transformer-based architectures.
* [data/](data): binaries and utils for input preprocessing, tokenization,
  etc.

### State-of-the-Art models and examples
@@ -29,9 +32,23 @@
We provide SoTA model implementations, pre-trained models, training and
evaluation examples, and command lines. Detailed instructions can be found in
the READMEs for specific papers.

1. [BERT](bert): [BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding](https://arxiv.org/abs/1810.04805) by Devlin et al., 2018
2. [ALBERT](albert): [A Lite BERT for Self-supervised Learning of Language Representations](https://arxiv.org/abs/1909.11942) by Lan et al., 2019
3. [XLNet](xlnet): [XLNet: Generalized Autoregressive Pretraining for Language Understanding](https://arxiv.org/abs/1906.08237) by Yang et al., 2019
4. [Transformer for translation](transformer): [Attention Is All You Need](https://arxiv.org/abs/1706.03762) by Vaswani et al., 2017
5. [NHNet](nhnet): [Generating Representative Headlines for News Stories](https://arxiv.org/abs/2001.09386) by Gu et al., 2020
### Common Training Driver

We provide a single common driver, [train.py](train.py), to train the above
SoTA models on popular tasks. Please see [train.md](train.md) for more details.

# Model Garden NLP Common Training Driver
[train.py](train.py) is the common training driver that supports multiple
NLP tasks (e.g., pre-training, GLUE and SQuAD fine-tuning, etc.) and multiple
models (e.g., BERT, ALBERT, MobileBERT, etc.).
## Experiment Configuration
[train.py](train.py) is driven by configs defined by
[ExperimentConfig](../core/config_definitions.py), which includes the
configurations for `task`, `trainer`, and `runtime`. The pre-defined NLP-related
[ExperimentConfig](../core/config_definitions.py) instances can be found in
[configs/experiment_configs.py](configs/experiment_configs.py).
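As a rough illustration, each of these top-level sections can be addressed with
a dotted path when overriding values (the `--params_override` FLAG used here is
described below; the specific field names are only examples and should be
checked against [config_definitions.py](../core/config_definitions.py)):
```shell
# Sketch only: dotted paths select fields under the three top-level sections
# of an ExperimentConfig (task, trainer, runtime).
--params_override=task.train_data.global_batch_size=32,trainer.train_steps=10000,runtime.distribution_strategy=tpu
```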
## Experiment Registry
We use an [experiment registry](../core/exp_factory.py) to build a mapping from
experiment type to experiment configuration instance. For example,
[configs/finetuning_experiments.py](configs/finetuning_experiments.py)
registers the `bert/sentence_prediction` and `bert/squad` experiments. Users can
use the `--experiment` FLAG to invoke a registered experiment configuration,
e.g., `--experiment=bert/sentence_prediction`, as sketched below.
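A minimal invocation that only selects a registered experiment might look like
the following (the `--mode` and `--model_dir` FLAGS are shown in the Cloud TPU
example later in this doc; the model directory here is a placeholder):
```shell
# Sketch: pick the registered `bert/sentence_prediction` experiment. Data and
# runtime settings still need to be supplied via the overrides described below.
python3 train.py \
  --experiment=bert/sentence_prediction \
  --mode=train_and_eval \
  --model_dir=/tmp/my_model_dir
```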
## Overriding Configuration via YAML and FLAGS
The registered experiment configuration can be overridden by one or more YAML
files provided via the `--config_file` FLAG. For example:
```shell
--config_file=configs/experiments/glue_mnli_matched.yaml \
--config_file=configs/models/bert_en_uncased_base.yaml
```
In addition, the experiment configuration can be further overridden by the
`--params_override` FLAG. For example:
```shell
--params_override=task.train_data.input_path=/some/path,task.hub_module_url=/some/tfhub
```
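The dotted paths used in `--params_override` mirror the nesting of the YAML
config files. As a rough sketch (the file name and values below are
placeholders), the same override could be written as a small YAML file and
passed via `--config_file`:
```shell
# Hypothetical override file; the nesting follows the dotted paths above.
cat > my_overrides.yaml <<EOF
task:
  train_data:
    input_path: /some/path
  hub_module_url: /some/tfhub
EOF
# Then pass it to the driver with: --config_file=my_overrides.yaml
```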
## Run on Cloud TPUs
Next, we describe how to run [train.py](train.py) on Cloud TPUs.
### Setup
First, you need to create a `tf-nightly` TPU with the
[ctpu tool](https://github.com/tensorflow/tpu/tree/master/tools/ctpu):
```shell
export TPU_NAME=YOUR_TPU_NAME
ctpu up -name $TPU_NAME --tf-version=nightly --tpu-size=YOUR_TPU_SIZE --project=YOUR_PROJECT
```
and then install the Model Garden code and its required dependencies:
```shell
git clone https://github.com/tensorflow/models.git
export PYTHONPATH=$PYTHONPATH:/path/to/models
pip3 install --user -r official/requirements.txt
```
### Fine-tuning Sentence Classification with BERT from TF-Hub
This example fine-tunes BERT-base from TF-Hub on the Multi-Genre Natural
Language Inference (MultiNLI) corpus using TPUs.

First, prepare the fine-tuning data using the
[`data/create_finetuning_data.py`](data/create_finetuning_data.py) script. The
resulting training and evaluation datasets in `tf_record` format will later be
passed to [train.py](train.py).
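The exact flags are documented in the [data/](data) README; a rough sketch of
the data-preparation command (the GLUE and vocab paths below are placeholders)
is:
```shell
# Sketch: convert the downloaded MNLI data into tf_record files whose names
# match the training command below. GLUE_DIR and BERT_DIR are placeholder paths
# to the GLUE download and to a BERT vocab file; check
# data/create_finetuning_data.py for the authoritative flag list.
export GLUE_DIR=/path/to/glue
export BERT_DIR=/path/to/bert_checkpoint_and_vocab
export INPUT_DATA_DIR=gs://some_bucket/datasets

python3 data/create_finetuning_data.py \
  --input_data_dir=${GLUE_DIR}/MNLI/ \
  --vocab_file=${BERT_DIR}/vocab.txt \
  --train_data_output_path=${INPUT_DATA_DIR}/mnli_train.tf_record \
  --eval_data_output_path=${INPUT_DATA_DIR}/mnli_eval.tf_record \
  --meta_data_file_path=${INPUT_DATA_DIR}/mnli_meta_data \
  --fine_tuning_task_type=classification \
  --classification_task_name=MNLI
```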
Then you can execute the following commands to start the training and evaluation
job.
```shell
export INPUT_DATA_DIR=gs://some_bucket/datasets
export OUTPUT_DIR=gs://some_bucket/my_output_dir
# See tfhub BERT collection for more tfhub models:
# https://tfhub.dev/google/collections/bert/1
export BERT_HUB_URL=https://tfhub.dev/tensorflow/bert_en_uncased_L-12_H-768_A-12/3
# Override the configurations by FLAGS. Alternatively, you can directly edit
# `configs/experiments/glue_mnli_matched.yaml` to specify corresponding fields.
export PARAMS=task.train_data.input_path=$INPUT_DATA_DIR/mnli_train.tf_record
export PARAMS=$PARAMS,task.validation_data.input_path=$INPUT_DATA_DIR/mnli_eval.tf_record
export PARAMS=$PARAMS,task.hub_module_url=$BERT_HUB_URL
export PARAMS=$PARAMS,runtime.distribution_strategy=tpu
python3 train.py \
  --experiment=bert/sentence_prediction \
  --mode=train_and_eval \
  --model_dir=$OUTPUT_DIR \
  --config_file=configs/experiments/glue_mnli_matched.yaml \
  --tfhub_cache_dir=$OUTPUT_DIR/hub_cache \
  --tpu=${TPU_NAME} \
  --params_override=$PARAMS
```
You can monitor the training progress in the console and find the output
models in `$OUTPUT_DIR`.
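You can also point TensorBoard at the output directory to inspect any training
summaries written there:
```shell
# Optional: inspect training curves with TensorBoard.
tensorboard --logdir=$OUTPUT_DIR
```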
Note: More examples about pre-training and fine-tuning will come soon.