"vscode:/vscode.git/clone" did not exist on "bc73e1a9a233070d30c1d00a952d72a54a500b1c"
Unverified Commit 3a14837d authored by Hongkun Yu, committed by GitHub

Merged commit includes the following changes: (#7429)

262962783  by hongkuny<hongkuny@google.com>:

    Internal change

262460803  by hongkuny<hongkuny@google.com>:

    Add a public method to extract shareable layers with decoder.

--
262315011  by A. Unique TensorFlower<gardener@tensorflow.org>:

    Refactor tpu initialization logic to common module.

--
262299019  by akuegel<akuegel@google.com>:

    Internal change

262178259  by hongkuny<hongkuny@google.com>:

    Pass training=True in the CTL train step.

--
262081759  by akuegel<akuegel@google.com>:

    Internal change

262021128  by isaprykin<isaprykin@google.com>:

    Internal change

262004398  by taylorrobie<taylorrobie@google.com>:

    Internal change

261786323  by yanhuasun<yanhuasun@google.com>:

    Replace set and dict with ObjectIdentityDict/ObjectIdentitySet to prepare for the __eq__ implementation.
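
    A minimal sketch of why identity-based containers matter here (the
    FakeVariable/ObjectIdentitySet names are illustrative, not this
    commit's code): once a class defines an elementwise __eq__, plain
    sets and dicts stop working for membership tracking.

        class FakeVariable:
          """Stand-in for a variable whose __eq__ becomes elementwise."""

          def __eq__(self, other):
            return "elementwise-result"  # a Tensor-like value, not a bool

        v = FakeVariable()
        try:
          seen = {v}  # defining __eq__ without __hash__ makes it unhashable
        except TypeError as e:
          print("plain set fails:", e)

        class ObjectIdentitySet:
          """Tracks objects by id(), sidestepping __eq__ entirely."""

          def __init__(self):
            self._storage = {}

          def add(self, obj):
            self._storage[id(obj)] = obj

          def __contains__(self, obj):
            return id(obj) in self._storage

        ids = ObjectIdentitySet()
        ids.add(v)
        print(v in ids)  # True, no __eq__ call involved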

--
261393597  by hongkuny<hongkuny@google.com>:

    Add an encoder mode for BertModel which returns all layers.

--
261218818  by A. Unique TensorFlower<gardener@tensorflow.org>:

    Internal change

261202754  by hongkuny<hongkuny@google.com>:

    Use the enable_xla flag for classifier and SQuAD, so the XLA option is exposed to users.
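
    A sketch of the wiring this implies (the flag name is from the
    commit; the configure helper is hypothetical):

        import tensorflow as tf
        from absl import flags

        flags.DEFINE_bool("enable_xla", False,
                          "Whether to enable XLA JIT compilation.")

        def configure_xla(enable_xla):
          # XLA auto-clustering is off by default; turn it on when asked.
          if enable_xla:
            tf.config.optimizer.set_jit(True)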

--
261171038  by gjn<gjn@google.com>:

    Remove weight_decay_rate 0 early exit check

    Removing this code path is safe because it never did what it was
    meant to do: weight_decay_rate is a tensor, so the equality check
    compared the object's identity against the Python int 0 and could
    never be true. Evaluating the tensor is also not something we want
    to do at this point in the code, so the check can simply be removed.
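
    A minimal sketch of the dead guard (plain Python stands in for the
    old graph-Tensor semantics, where __eq__ fell back to object
    identity rather than elementwise comparison):

        class GraphTensor:
          """Default __eq__ compares object identity, like TF1 graph Tensors."""

        weight_decay_rate = GraphTensor()  # holds 0.0 at runtime

        # The early-exit guard compared a Tensor object to the int 0, so
        # it was always False regardless of the tensor's runtime value:
        if weight_decay_rate == 0:
          print("skip weight decay")  # unreachable
        else:
          print("guard never fires")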

--
261169862  by haoyuzhang<haoyuzhang@google.com>:

    Internal change

261153520  by haoyuzhang<haoyuzhang@google.com>:

    Internal change

261140302  by hongkuny<hongkuny@google.com>:

    Clean up

--
260862396  by A. Unique TensorFlower<gardener@tensorflow.org>:

    Fix BERT pretraining input pipeline to shuffle and shard dataset properly for multi-worker training.
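
    A sketch of the usual shard-then-shuffle pattern this implies (names
    and parameters are illustrative, not this commit's code):

        import tensorflow as tf

        def create_pretrain_dataset(file_pattern, batch_size,
                                    num_workers=1, worker_index=0):
          dataset = tf.data.Dataset.list_files(file_pattern, shuffle=False)
          # Shard first so each worker reads a disjoint slice of the files,
          dataset = dataset.shard(num_workers, worker_index)
          dataset = dataset.interleave(tf.data.TFRecordDataset, cycle_length=8)
          # then shuffle within the shard so batches differ across epochs.
          dataset = dataset.shuffle(buffer_size=1000).repeat()
          return dataset.batch(batch_size, drop_remainder=True)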

--
260601376  by hongkuny<hongkuny@google.com>:

    Reorder Q, K to make TPU execution faster.

--
260580119  by hongkuny<hongkuny@google.com>:

    Adds expect_partial()
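
    A sketch of what expect_partial() is for (paths illustrative): it
    silences warnings when a checkpoint is deliberately restored only in
    part, e.g. model weights without optimizer slots.

        import tensorflow as tf

        model = tf.keras.Sequential([tf.keras.layers.Dense(4, input_shape=(4,))])
        optimizer = tf.keras.optimizers.Adam()
        full_ckpt = tf.train.Checkpoint(model=model, optimizer=optimizer)
        path = full_ckpt.save("/tmp/sketch/ckpt")

        # Restoring through an object that only tracks the model leaves
        # any optimizer state in the file unmatched; expect_partial()
        # marks that as intentional.
        eval_ckpt = tf.train.Checkpoint(model=model)
        eval_ckpt.restore(path).expect_partial()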

--
260228553  by priyag<priyag@google.com>:

    Enable transformer and NCF official model tests. Also fix some minor issues so that all tests pass with TF 1 + enable_v2_behavior.

--
260060237  by zongweiz<zongweiz@google.com>:

    [BERT SQuAD] Enable mixed precision training

    Add mixed precision training support for BERT SQuAD model. Using the experimental Keras mixed precision API. For numeric stability, use fp32 for layer normalization, dense layers with GELU activation, etc.
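
    A sketch of the approach (written against the current stable
    mixed-precision API; this commit used its experimental predecessor):

        import tensorflow as tf

        # Compute in float16 globally, with float32 variables underneath.
        tf.keras.mixed_precision.set_global_policy("mixed_float16")

        inputs = tf.keras.Input(shape=(768,))
        x = tf.keras.layers.Dense(3072, activation="gelu")(inputs)  # fp16 compute
        # Numerically sensitive layers are pinned to float32:
        x = tf.keras.layers.LayerNormalization(dtype="float32")(x)
        outputs = tf.keras.layers.Dense(2, dtype="float32")(x)  # fp32 logits
        model = tf.keras.Model(inputs, outputs)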

--
260052674  by hongkuny<hongkuny@google.com>:

    Add expect_partial()

--
259889221  by hongkuny<hongkuny@google.com>:

    Add no-distribution-strategy / XLA / eager PerfZero tests.

--
259790197  by hongkuny<hongkuny@google.com>:

    Update pretraining model to match TF1 variable names.

--
259656389  by hongkuny<hongkuny@google.com>:

    Internal change

259649972  by hongkuny<hongkuny@google.com>:

    Update docs.

--
259470074  by hongkuny<hongkuny@google.com>:

    Adds a dedup phase for trainable variables.
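
    A minimal sketch of such a dedup pass (function name illustrative):
    shared layers can report the same tf.Variable more than once, and
    deduplicating by object identity avoids applying a gradient twice.

        def dedup_trainable_variables(variables):
          seen = set()
          unique = []
          for v in variables:
            if id(v) not in seen:
              seen.add(id(v))
              unique.append(v)
          return unique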

--
259442882  by hongkuny<hongkuny@google.com>:

    Internal

--
259341546  by mrry<mrry@google.com>:

    Remove DEBUG-level logging from the BERT benchmark.

    This triggers graph serialization and other verbose logging in the TensorFlow runtime, which inflates the execution time.

--
259253185  by hongkuny<hongkuny@google.com>:

    Writes a separated checkpoint for the core model in pretraining.
    Clean up export utils to just take a model as argument.
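
    A sketch of the separated-checkpoint idea (model shapes and paths
    illustrative): track the core encoder in its own tf.train.Checkpoint
    so downstream tasks can restore it without the pretraining heads.

        import tensorflow as tf

        core_model = tf.keras.Sequential(
            [tf.keras.layers.Dense(8, input_shape=(8,))], name="core")
        pretrain_model = tf.keras.Sequential(
            [core_model, tf.keras.layers.Dense(2)], name="pretrain")

        # One checkpoint for the full pretraining model, one for the core.
        tf.train.Checkpoint(model=pretrain_model).save("/tmp/sketch/pretrain")
        tf.train.Checkpoint(model=core_model).save("/tmp/sketch/core")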

--
258893811  by hongkuny<hongkuny@google.com>:

    Adds summaries for metrics, allowing metrics inside a Keras model.

--
258881002  by hongkuny<hongkuny@google.com>:

    Fix lint.

--
258871624  by hongkuny<hongkuny@google.com>:

    Internal change

258597234  by rxsang<rxsang@google.com>:

    Update all the TPUStrategy examples to use the new v2 APIs, i.e.
    make_dataset_iterator -> experimental_distribute_dataset,
    make_input_fn_iterator -> experimental_distribute_datasets_from_function,
    unwrap -> experimental_local_results,
    experimental_run -> experimental_run_v2
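
    A sketch of the renames in use (MirroredStrategy stands in for
    TPUStrategy so the snippet runs anywhere; these are the TF 2.0-era
    names, and later releases shortened some of them again, e.g.
    experimental_run_v2 -> run):

        import tensorflow as tf

        strategy = tf.distribute.MirroredStrategy()
        dataset = tf.data.Dataset.from_tensor_slices(tf.ones([8, 4])).batch(4)

        # old: iterator = strategy.make_dataset_iterator(dataset)
        dist_dataset = strategy.experimental_distribute_dataset(dataset)

        @tf.function
        def step(x):
          return tf.reduce_sum(x)

        for batch in dist_dataset:
          # old: per_replica = strategy.experimental_run(step, iterator)
          per_replica = strategy.experimental_run_v2(step, args=(batch,))
          # old: results = strategy.unwrap(per_replica)
          results = strategy.experimental_local_results(per_replica)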

--
258581998  by taylorrobie<taylorrobie@google.com>:

    Update keras v2 optimizers to reuse coefficients which are shared across all updates, which reduces the total number of ops created by between 5% (for simple optimizers such as SGD and Adagrad) and 25% (for complicated optimizers such as Adam and NAdam). Separate copies are made for each device and dtype.

    The effect of this change on run time is fairly minimal since Grappler is expected to consolidate most of these ops; however it does improve graph construction time.
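
    A sketch of the caching idea (structure illustrative, not the actual
    Keras optimizer internals): build the tensors an update rule shares,
    such as the learning rate and beta powers, once per (device, dtype)
    rather than once per variable update.

        import tensorflow as tf

        class CoefficientCache:
          def __init__(self):
            self._cache = {}

          def get(self, var, make_coefficients):
            key = (var.device, var.dtype.base_dtype)
            if key not in self._cache:
              self._cache[key] = make_coefficients(var.dtype.base_dtype)
            return self._cache[key]

        cache = CoefficientCache()
        v = tf.Variable([1.0, 2.0])
        coeffs = cache.get(v, lambda dtype: {"lr": tf.constant(0.01, dtype)})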

--
258208153  by hongkuny<hongkuny@google.com>:

    Adds a run_eagerly option for BERT.

--
257883986  by hongkuny<hongkuny@google.com>:

    Adds tf.summary for BERT training.
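
    A minimal sketch of what this adds (path illustrative): scalar
    summaries written from the training loop with the TF2 summary API.

        import tensorflow as tf

        summary_writer = tf.summary.create_file_writer("/tmp/sketch/summaries")
        with summary_writer.as_default():
          for step in range(3):
            tf.summary.scalar("train_loss", 0.5 / (step + 1), step=step)
        summary_writer.flush()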

--
257285772  by haoyuzhang<haoyuzhang@google.com>:

    Internal change

256242827  by yuefengz<yuefengz@google.com>:

    Internal change

256204636  by hongkuny<hongkuny@google.com>:

    Internal

--
256079834  by hongkuny<hongkuny@google.com>:

    Clean up: move common flags together for further refactoring.
    Enable the steps_per_loop option for all applications.

--
255493073  by hongkuny<hongkuny@google.com>:

    BERT initial OSS readme update.

--
255470372  by dmchen<dmchen@google.com>:

    Slightly expand expected range for F1 score in BERT SQuAD accuracy test

--
255109240  by hongkuny<hongkuny@google.com>:

    Update eval/predict batch sizes.

--
255010016  by hongkuny<hongkuny@google.com>:

    Internal

--
254874613  by hongkuny<hongkuny@google.com>:

    Update glue tasks enum to match directory name

--
254866171  by taylorrobie<taylorrobie@google.com>:

    Internal change

254785517  by zongweiz<zongweiz@google.com>:

    Use train_single_step for BERT GPU models to temporarily work around some performance bugs in GPU runs

--
254497647  by hongkuny<hongkuny@google.com>:

    Fix device placement for TPU export model.

--
254293763  by haoyuzhang<haoyuzhang@google.com>:

    Internal change

254134531  by yuefengz<yuefengz@google.com>:

    Fix a typo in bert_benchmark.py

--
254069984  by hongkuny<hongkuny@google.com>:
    Automated rollback of changelist 254060732.

254061429  by hongkuny<hongkuny@google.com>:

    Use host while loop for training steps.
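
    A sketch of the host-loop pattern (model and data illustrative): run
    steps_per_loop steps inside one tf.function call so the host
    dispatches a single program per loop instead of one per step, which
    is what makes steps_per_loop pay off on TPU.

        import tensorflow as tf

        model = tf.keras.Sequential([tf.keras.layers.Dense(1, input_shape=(4,))])
        optimizer = tf.keras.optimizers.SGD(0.1)

        def train_one_step(x, y):
          with tf.GradientTape() as tape:
            loss = tf.reduce_mean(tf.square(model(x) - y))
          grads = tape.gradient(loss, model.trainable_variables)
          optimizer.apply_gradients(zip(grads, model.trainable_variables))

        @tf.function
        def train_loop(iterator, steps_per_loop):
          # AutoGraph turns this into a single while_loop on the device.
          for _ in tf.range(steps_per_loop):
            x, y = next(iterator)
            train_one_step(x, y)

        dataset = tf.data.Dataset.from_tensor_slices(
            (tf.random.normal([32, 4]), tf.random.normal([32, 1])))
        train_loop(iter(dataset.batch(8).repeat()), tf.constant(10))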

--
254060732  by yifeif<yifeif@google.com>:
    Automated rollback of changelist 254027750.

254027750  by hongkuny<hongkuny@google.com>:

    Internal change

253850824  by hongkuny<hongkuny@google.com>:

    Improve bert training utils.

--
253818191  by hongkuny<hongkuny@google.com>:

    Update SavedModel export to use the new model.save() API.

--
253636854  by dmchen<dmchen@google.com>:

    Run only training in BERT SQuAD performance test

--
253118910  by hongkuny<hongkuny@google.com>:

    Internal change

253113801  by zongweiz<zongweiz@google.com>:

    Internal change

252697519  by dmchen<dmchen@google.com>:

    BERT SQuAD accuracy test

--
252663512  by A. Unique TensorFlower<gardener@tensorflow.org>:

    Internal change

--
252647871  by A. Unique TensorFlower<gardener@tensorflow.org>:

    Enable multi worker TPU training for BERT pretraining.

--
252550871  by hongkuny<hongkuny@google.com>:

    Internal change

252522861  by hongkuny<hongkuny@google.com>:

    Remove export using the trained model due to an implementation error.

--
252156812  by yuefengz<yuefengz@google.com>:

    Fix the callback method name in BERT: replace on_batch_start with on_batch_begin. Without the fix, it won't work with Keras callbacks.
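
    A minimal sketch of the corrected hook name (callback class
    illustrative): Keras callbacks define on_batch_begin/on_batch_end,
    so a custom loop must invoke those exact names.

        import tensorflow as tf

        class TimingCallback(tf.keras.callbacks.Callback):
          def on_batch_begin(self, batch, logs=None):
            print("starting batch", batch)

        callback = TimingCallback()
        # A custom training loop invokes the real hook name per batch:
        callback.on_batch_begin(0)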

--
251782065  by dmchen<dmchen@google.com>:

    Internal change

251681245  by hongkuny<hongkuny@google.com>:

    Update BERT to use the new tf.distribute APIs.

--
251575972  by A. Unique TensorFlower<gardener@tensorflow.org>:

    Remove `steps_per_run` when instantiating TPUStrategy.

--
251325964  by hongkuny<hongkuny@google.com>:

    Improve flags

--
251303452  by haoyuzhang<haoyuzhang@google.com>:

    Internal change

250942274  by tobyboyd<tobyboyd@google.com>:

    Internal change

250779087  by A. Unique TensorFlower<gardener@tensorflow.org>:

    Reduce BERT Perfzero benchmark test training steps.

--
250713045  by hongkuny<hongkuny@google.com>:

    TPU util

--
250606180  by A. Unique TensorFlower<gardener@tensorflow.org>:

    Fix BERT benchmark test errors.

--
250589623  by A. Unique TensorFlower<gardener@tensorflow.org>:

    Change BERT benchmark test pretrained checkpoint url.

--
250587892  by A. Unique TensorFlower<gardener@tensorflow.org>:

    Fix error in BERT custom training loop checkpoint restoration.

--
250577163  by A. Unique TensorFlower<gardener@tensorflow.org>:

    Add logic to inject callback that measures performance in BERT custom training
    loop.

--
250529526  by hongkuny<hongkuny@google.com>:

    Internal clean up

--
250428976  by hongkuny<hongkuny@google.com>:

    Internal change

250415383  by A. Unique TensorFlower<gardener@tensorflow.org>:

    Add min/max value to BERT classifier benchmark test.

--
250376246  by A. Unique TensorFlower<gardener@tensorflow.org>:

    Add benchmark performance test to run BERT on multiple numbers of GPUs.

--
250347237  by A. Unique TensorFlower<gardener@tensorflow.org>:

    Fix linting errors in BERT benchmark test.

--
250326131  by A. Unique TensorFlower<gardener@tensorflow.org>:

    Internal change

250315593  by A. Unique TensorFlower<gardener@tensorflow.org>:

    Internal change

250303528  by haoyuzhang<haoyuzhang@google.com>:

    Add method docstring to fix lint error.

--
250009207  by A. Unique TensorFlower<gardener@tensorflow.org>:

    Add feature in BERT to write training metrics to a summary file.

--
249896208  by hongkuny<hongkuny@google.com>:

    Adds __init__.py

--
249883771  by hongkuny<hongkuny@google.com>:

    Creates a benchmark dir

--
249580533  by A. Unique TensorFlower<gardener@tensorflow.org>:

    Internal change

249566870  by A. Unique TensorFlower<gardener@tensorflow.org>:

    Set up BERT benchmark test.

--
249500988  by hongkuny<hongkuny@google.com>:

    Lints

--
249377254  by hongkuny<hongkuny@google.com>:

    Internal change

249373328  by hongkuny<hongkuny@google.com>:

    Clean up tf import

--
249333938  by hongkuny<hongkuny@google.com>:

    Fix tf1 import

--
249325089  by hongkuny<hongkuny@google.com>:

    BERT 2.0

--
249195008  by tianlin<tianlin@google.com>:

    Internal change

249173564  by hongkuny<hongkuny@google.com>:

    Internal change

246677582  by haoyuzhang<haoyuzhang@google.com>:

    Internal change

245821839  by shiningsun<shiningsun@google.com>:

    Internal change

245353681  by gjn<gjn@google.com>:

    Internal change

245340898  by haoyuzhang<haoyuzhang@google.com>:

    Internal change

245155641  by haoyuzhang<haoyuzhang@google.com>:

    Internal change

244019160  by haoyuzhang<haoyuzhang@google.com>:

    Internal change

242930998  by shiningsun<shiningsun@google.com>:

    Internal change

242049350  by haoyuzhang<haoyuzhang@google.com>:

    Internal change

241663771  by haoyuzhang<haoyuzhang@google.com>:

    Internal change

241054800  by haoyuzhang<haoyuzhang@google.com>:

    Internal change

241028555  by yuefengz<yuefengz@google.com>:

    Internal change

239316550  by haoyuzhang<haoyuzhang@google.com>:

    Internal change

238251867  by haoyuzhang<haoyuzhang@google.com>:

    Internal change

237876559  by taylorrobie<taylorrobie@google.com>:

    Internal change

236346619  by haoyuzhang<haoyuzhang@google.com>:

    Internal change

236182665  by tayo<tayo@google.com>:

    Internal change

234652747  by wangtz<wangtz@google.com>:

    Internal change

233837502  by shiningsun<shiningsun@google.com>:

    Internal change

232033015  by shiningsun<shiningsun@google.com>:

    Internal change

228564809  by taylorrobie<taylorrobie@google.com>:

    Internal change

227052580  by shiningsun<shiningsun@google.com>:

    Internal change

225436264  by shiningsun<shiningsun@google.com>:

    Internal change

222283824  by taylorrobie<taylorrobie@google.com>:

    Internal change

219241224  by taylorrobie<taylorrobie@google.com>:

    Internal change

218774474  by A. Unique TensorFlower<gardener@tensorflow.org>:

    Internal change

218610966  by taylorrobie<taylorrobie@google.com>:

    Internal change

218576353  by taylorrobie<taylorrobie@google.com>:

    Internal change

217776707  by A. Unique TensorFlower<gardener@tensorflow.org>:

    Internal change

217749789  by A. Unique TensorFlower<gardener@tensorflow.org>:

    Internal change

214516790  by A. Unique TensorFlower<gardener@tensorflow.org>:

    Internal change

212339556  by A. Unique TensorFlower<gardener@tensorflow.org>:

    Internal change

210658133  by A. Unique TensorFlower<gardener@tensorflow.org>:

    Internal change

206866123  by taylorrobie<taylorrobie@google.com>:

    Internal change

205252141  by A. Unique TensorFlower<gardener@tensorflow.org>:

    Internal change

202519641  by scottzhu<scottzhu@google.com>:

    Internal change

201299684  by kathywu<kathywu@google.com>:

    Internal change

199655516  by karmel<karmel@google.com>:

    Internal change

199209802  by karmel<karmel@google.com>:

    Internal change

198089630  by karmel<karmel@google.com>:

    Internal change

198060863  by karmel<karmel@google.com>:
    Automated rollback of changelist 197920496.

197920496  by kathywu<kathywu@google.com>:

    Internal change

197841416  by A. Unique TensorFlower<gardener@tensorflow.org>:

    Internal change

195867348  by A. Unique TensorFlower<gardener@tensorflow.org>:

    Internal change

195725348  by taylorrobie<taylorrobie@google.com>:

    Internal change

195283704  by A. Unique TensorFlower<gardener@tensorflow.org>:

    Internal change

194662698  by A. Unique TensorFlower<gardener@tensorflow.org>:

    Internal change

194103064  by A. Unique TensorFlower<gardener@tensorflow.org>:

    Internal change

193581866  by A. Unique TensorFlower<gardener@tensorflow.org>:

    Internal change

192783651  by scottzhu<scottzhu@google.com>:
    Automated rollback of changelist 192714881.

192714881  by scottzhu<scottzhu@google.com>:
    Automated rollback of changelist 192710755.

192710755  by A. Unique TensorFlower<gardener@tensorflow.org>:

    Internal change

192374551  by A. Unique TensorFlower<gardener@tensorflow.org>:

    Internal change

192346754  by A. Unique TensorFlower<gardener@tensorflow.org>:

    Internal change

192298443  by karmel<karmel@google.com>:

    Internal change

192220576  by A. Unique TensorFlower<gardener@tensorflow.org>:

    Internal change

191514106  by scottzhu<scottzhu@google.com>:

    Internal change

191327699  by A. Unique TensorFlower<gardener@tensorflow.org>:

    Internal change

190938103  by karmel<karmel@google.com>:

    Internal change

190804388  by A. Unique TensorFlower<gardener@tensorflow.org>:

    Internal change

190479716  by karmel<karmel@google.com>:

    Internal change

189844661  by scottzhu<scottzhu@google.com>:
    Automated rollback of changelist 189816818.

189816818  by A. Unique TensorFlower<gardener@tensorflow.org>:

    Internal change

189639056  by A. Unique TensorFlower<gardener@tensorflow.org>:

    Internal change

189628781  by karmel<karmel@google.com>:

    Internal change

189267175  by karmel<karmel@google.com>:

    Internal change

189096159  by karmel<karmel@google.com>:

    Internal change

189085341  by karmel<karmel@google.com>:

    Internal change

188949700  by karmel<karmel@google.com>:

    Internal change

PiperOrigin-RevId: 262962783
parent 62184a96
@@ -231,6 +231,19 @@ class BertSquadBenchmarkReal(BertSquadBenchmarkBase):
     self._run_and_report_benchmark()

+  def benchmark_1_gpu_xla_fp16(self):
+    """Tests BERT SQuAD model performance with 1 GPU with XLA and FP16."""
+    self._setup()
+    self.num_gpus = 1
+    FLAGS.model_dir = self._get_model_dir('benchmark_1_gpu_xla_squad_fp16')
+    FLAGS.train_batch_size = 4
+    FLAGS.enable_xla = True
+    FLAGS.dtype = 'fp16'
+    FLAGS.loss_scale = 'dynamic'
+    self._run_and_report_benchmark()
+
   def benchmark_2_gpu_fp16(self):
     """Tests BERT SQuAD model performance with 2 GPUs and FP16."""
...
@@ -276,6 +276,7 @@ class EmbeddingPostprocessor(tf.keras.layers.Layer):
                max_position_embeddings=512,
                dropout_prob=0.0,
                initializer_range=0.02,
+               initializer=None,
                **kwargs):
     super(EmbeddingPostprocessor, self).__init__(**kwargs)
     self.use_type_embeddings = use_type_embeddings
@@ -285,6 +286,11 @@ class EmbeddingPostprocessor(tf.keras.layers.Layer):
     self.dropout_prob = dropout_prob
     self.initializer_range = initializer_range
+    if not initializer:
+      self.initializer = get_initializer(self.initializer_range)
+    else:
+      self.initializer = initializer
+
     if self.use_type_embeddings and not self.token_type_vocab_size:
       raise ValueError("If `use_type_embeddings` is True, then "
                        "`token_type_vocab_size` must be specified.")
@@ -723,6 +729,15 @@ class TransformerBlock(tf.keras.layers.Layer):
         name="output_layer_norm", axis=-1, epsilon=1e-12)
     super(TransformerBlock, self).build(unused_input_shapes)

+  def common_layers(self):
+    """Explicitly gets all layer objects inside a Transformer encoder block."""
+    return [
+        self.attention_layer, self.attention_output_dense,
+        self.attention_dropout, self.attention_layer_norm,
+        self.intermediate_dense, self.output_dense, self.output_dropout,
+        self.output_layer_norm
+    ]
+
   def __call__(self, input_tensor, attention_mask=None):
     inputs = pack_inputs([input_tensor, attention_mask])
     return super(TransformerBlock, self).__call__(inputs)
...
@@ -35,8 +35,8 @@ from official.bert import model_saving_utils
 from official.bert import model_training_utils
 from official.bert import modeling
 from official.bert import optimization
-from official.bert import tpu_lib
 from official.utils.misc import keras_utils
+from official.utils.misc import tpu_lib

 flags.DEFINE_enum(
     'mode', 'train_and_eval', ['train_and_eval', 'export_only'],
...
@@ -33,7 +33,7 @@ from official.bert import model_saving_utils
 from official.bert import model_training_utils
 from official.bert import modeling
 from official.bert import optimization
-from official.bert import tpu_lib
+from official.utils.misc import tpu_lib

 flags.DEFINE_string('input_files', None,
                     'File path to retrieve training data for pre-training.')
...
@@ -36,8 +36,8 @@ from official.bert import modeling
 from official.bert import optimization
 from official.bert import squad_lib
 from official.bert import tokenization
-from official.bert import tpu_lib
 from official.utils.misc import keras_utils
+from official.utils.misc import tpu_lib

 flags.DEFINE_bool('do_train', False, 'Whether to run training.')
 flags.DEFINE_bool('do_predict', False, 'Whether to run eval on the dev set.')
...