Commit a2ed0385, authored by Hongkun Yu, committed by A. Unique TensorFlower

Update BERT read me to show how to make predictions on the test set.

PiperOrigin-RevId: 312939899
parent ca2e6ae0
@@ -113,10 +113,6 @@ Second, you need to install TF 2 `tf-nightly` on your VM:
pip install tf-nightly
```
Warning: More detailed TPU-specific setup instructions and tutorials should come
along with the official TF 2.x release for TPU. Note that this repo is not
officially supported by the Google Cloud TPU team until TF 2.1 is released.
## Process Datasets
### Pre-training
@@ -251,6 +247,23 @@ Alternatively, instead of specifying `init_checkpoint`, you can specify
`hub_module_url` to employ a pretrained BERT hub module, e.g.,
` --hub_module_url=https://tfhub.dev/tensorflow/bert_en_uncased_L-24_H-1024_A-16/1`.
After training a model, to get predictions from the classifier, set
`--mode=predict` and pass the test set TFRecords to `--eval_data_path`.
The output is written to a file called `test_results.tsv` in the output folder.
Each line contains the output for one sample; the columns are the class
probabilities.
```shell
python run_classifier.py \
--mode='predict' \
--input_meta_data_path=${GLUE_DIR}/${TASK}_meta_data \
--eval_data_path=${GLUE_DIR}/${TASK}_eval.tf_record \
--bert_config_file=${BERT_DIR}/bert_config.json \
--eval_batch_size=4 \
--model_dir=${MODEL_DIR} \
--distribution_strategy=mirrored
```
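As a rough illustration of consuming that output, the sketch below reads a tab-separated file of per-class probabilities and returns the index of the most probable class for each sample. It assumes one sample per line with no header row. The helper name `load_predicted_classes` is hypothetical and is not part of `run_classifier.py`.

```python
import csv


def load_predicted_classes(tsv_path):
    """Return the argmax class index for each row of a probabilities TSV.

    Assumption: one sample per line, tab-separated float columns, no header.
    """
    predicted = []
    with open(tsv_path, newline="") as f:
        for row in csv.reader(f, delimiter="\t"):
            if not row:  # skip blank lines
                continue
            probs = [float(value) for value in row]
            # Index of the largest probability in this row.
            predicted.append(max(range(len(probs)), key=probs.__getitem__))
    return predicted
```

Calling `load_predicted_classes` on the path to `test_results.tsv` would then give a list with one class index per test sample.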
To use TPU, you only need to switch distribution strategy type to `tpu` with TPU
information and use remote storage for model checkpoints.
@@ -316,6 +329,10 @@ python run_squad.py \
Similarly, you can replace the `init_checkpoint` flag with `hub_module_url` to
specify a hub module path.
`run_squad.py` writes the prediction for `--predict_file` by default. If you set
`--mode=predict` and provide the SQuAD test data, the script will generate
the prediction JSON file.
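To inspect the result, a minimal sketch like the following could load the prediction file. It assumes the file is a JSON object mapping question IDs to predicted answer strings. That layout, the path you pass in, and the helper name `load_squad_predictions` are assumptions that this README does not confirm.

```python
import json


def load_squad_predictions(json_path):
    """Load a prediction file, assumed to map question ID -> answer text."""
    with open(json_path, encoding="utf-8") as f:
        predictions = json.load(f)
    # Fail loudly if the file does not have the assumed layout.
    if not isinstance(predictions, dict):
        raise ValueError("expected a JSON object keyed by question ID")
    return predictions
```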
To use TPU, you need to switch distribution strategy type to `tpu` with TPU
information.