1. 30 Jul, 2019 5 commits
  2. 29 Jul, 2019 2 commits
    • Merged commit includes the following changes: (#7323) · d65af7d8
      Hongkun Yu authored
      260580119  by hongkuny<hongkuny@google.com>:
      
          Adds expect_partial()
      
      --
      
      PiperOrigin-RevId: 260580119
      d65af7d8
    • Merged commit includes the following changes: (#7322) · 803f833c
      Hongjun Choi authored
      260228553  by priyag<priyag@google.com>:
      
          Enable transformer and NCF official model tests. Also fix some minor issues so that all tests pass with TF 1 + enable_v2_behavior.
      
      --
      260043210  by A. Unique TensorFlower<gardener@tensorflow.org>:
      
          Add logic to train NCF model using offline generated data.
      
      --
      259778607  by priyag<priyag@google.com>:
      
          Internal change
      
      259656389  by hongkuny<hongkuny@google.com>:
      
          Internal change
      
      PiperOrigin-RevId: 260228553
      803f833c
  3. 26 Jul, 2019 2 commits
    • Merged commit includes the following changes: (#7309) · 8c7a0e75
      Hongkun Yu authored
      260060237  by zongweiz<zongweiz@google.com>:
      
          [BERT SQuAD] Enable mixed precision training
      
          Add mixed precision training support for the BERT SQuAD model, using the experimental Keras mixed precision API. For numeric stability, use fp32 for layer normalization, dense layers with GELU activation, etc.
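
The note about numeric stability can be illustrated without TensorFlow. The sketch below is not taken from the BERT code; it only uses Python's `struct` module to round values to IEEE 754 half precision. The epsilon value and the helper name `to_fp16` are assumptions chosen for illustration.

```python
import struct


def to_fp16(x: float) -> float:
    """Round a Python float to IEEE 754 half precision and convert it back."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


# A tiny epsilon of the kind added to a variance before a division
# (the exact value is an assumption for this sketch).
EPSILON = 1e-12

print(to_fp16(EPSILON))  # the epsilon underflows to zero in fp16
print(to_fp16(2049.0))   # integers above 2048 are no longer exact in fp16

# Values beyond the fp16 range cannot be packed at all.
try:
    struct.pack("<e", 70000.0)
except (OverflowError, struct.error) as err:
    print("cannot represent 70000.0 in fp16:", err)
```

Operations that depend on small constants or accumulate large sums are the ones most exposed to these effects, which is consistent with keeping them in fp32.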
      
      --
      
      PiperOrigin-RevId: 260060237
      8c7a0e75
    • Merged commit includes the following changes: (#7307) · 745a06a9
      Hongkun Yu authored
      260052674  by hongkuny<hongkuny@google.com>:
      
          Add expect_partial()
      
      --
      
      PiperOrigin-RevId: 260052674
      745a06a9
  4. 25 Jul, 2019 5 commits
  5. 24 Jul, 2019 10 commits
  6. 23 Jul, 2019 5 commits
    • Single execution path tests for ResNet50, ResNet56, NCF, and Shakespeare LSTM. (#7276) · 9d8c9aa4
      Toby Boyd authored
      * Add force_run_distributed tests.
      
      * Added enable_eager
      
      * r/force_run_distributed/force_v2_in_keras_compile
      
      * Adding force_v2 tests and FLAGs.
      
      * Rename method to avoid conflict.
      
      * Add cpu force_v2 tests.
      
      * fix lint, wrap line.
      
      * change to force_v2_in_keras_compile
      
      * Update method name.
      
      * Lower mlperf target to 0.736.
      9d8c9aa4
    • Toby Boyd's avatar
      8390b362
    • Merged commit includes the following changes: (#7281) · 64d6c094
      Hongjun Choi authored
      * Merged commit includes the following changes:
      259442882  by hongkuny<hongkuny@google.com>:
      
          Internal
      
      --
      259377621  by A. Unique TensorFlower<gardener@tensorflow.org>:
      
          Fix NCF serialization/de-serialization logic in NCF input pipeline to use tf.FixedLenFeature instead of raw string/binary decoding.
      
      --
      259373183  by A. Unique TensorFlower<gardener@tensorflow.org>:
      
          Create binary to generate NCF training/evaluation dataset offline.
      
      --
      259026454  by isaprykin<isaprykin@google.com>:
      
          Internal change
      
      258871624  by hongkuny<hongkuny@google.com>:
      
          Internal change
      
      257285772  by haoyuzhang<haoyuzhang@google.com>:
      
          Internal change
      
      256202287  by A. Unique TensorFlower<gardener@tensorflow.org>:
      
          Internal change.
      
      --
      254069984  by hongkuny<hongkuny@google.com>:
          Automated rollback of changelist 254060732.
      
      254060732  by yifeif<yifeif@google.com>:
          Automated rollback of changelist 254027750.
      
      254027750  by hongkuny<hongkuny@google.com>:
      
          Internal change
      
      253118910  by hongkuny<hongkuny@google.com>:
      
          Internal change
      
      251906769  by hongkuny<hongkuny@google.com>:
      
          Internal change
      
      251303452  by haoyuzhang<haoyuzhang@google.com>:
      
          Internal change
      
      PiperOrigin-RevId: 259442882
      
      * Update ncf_keras_main.py
      64d6c094
    • Update lint presubmit to be consistent with tensorflow (#7278) · 609260cd
      Hongkun Yu authored
      Only care about errors and output into an error file.
      609260cd
    • Merged commit includes the following changes: (#7277) · 1fc839bc
      Hongkun Yu authored
      259442882  by hongkuny<hongkuny@google.com>:
      
          Internal
      
      --
      259341546  by mrry<mrry@google.com>:
      
          Remove DEBUG-level logging from the BERT benchmark.
      
          This triggers graph serialization and other verbose logging in the TensorFlow runtime, which inflates the execution time.
      
      --
      259253185  by hongkuny<hongkuny@google.com>:
      
          Writes a separate checkpoint for the core model in pretraining.
          Clean up export utils to just take a model as argument.
      
      --
      258893811  by hongkuny<hongkuny@google.com>:
      
          Adds summaries for metrics, allowing metrics inside keras.model.
      
      --
      258881002  by hongkuny<hongkuny@google.com>:
      
          Fix lint.
      
      --
      258597234  by rxsang<rxsang@google.com>:
      
          Update all the TPUStrategy examples to use the new v2 APIs, i.e.
          make_dataset_iterator -> experimental_distribute_dataset,
          make_input_fn_iterator -> experimental_distribute_datasets_from_function,
          unwrap -> experimental_local_results,
          experimental_run -> experimental_run_v2
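
The rename table above can be captured as data. The following is a minimal sketch, assuming a purely textual rewrite of call sites; the helper name `migrate` is hypothetical, and the new methods are not guaranteed to be drop-in replacements, so migrated code may still need manual review.

```python
import re

# Old -> new method names, copied from the commit message above.
RENAMES = {
    "make_dataset_iterator": "experimental_distribute_dataset",
    "make_input_fn_iterator": "experimental_distribute_datasets_from_function",
    "unwrap": "experimental_local_results",
    "experimental_run": "experimental_run_v2",
}

# Word boundaries keep already-migrated names (e.g. experimental_run_v2) intact,
# because "_" counts as a word character.
_PATTERN = re.compile(r"\b(" + "|".join(map(re.escape, RENAMES)) + r")\b")


def migrate(source: str) -> str:
    """Replace each old method name in `source` with its v2 counterpart."""
    return _PATTERN.sub(lambda match: RENAMES[match.group(1)], source)


print(migrate("outputs = strategy.unwrap(strategy.experimental_run(step_fn, it))"))
```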
      
      --
      258581998  by taylorrobie<taylorrobie@google.com>:
      
          Update keras v2 optimizers to reuse coefficients which are shared across all updates, which reduces the total number of ops created by between 5% (for simple optimizers such as SGD and Adagrad) and 25% (for complicated optimizers such as Adam and NAdam). Separate copies are made for each device and dtype.
      
          The effect of this change on run time is fairly minimal since Grappler is expected to consolidate most of these ops; however it does improve graph construction time.
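
The idea of reusing shared coefficients can be sketched in plain Python. This is not the Keras optimizer code; `CoefficientCache` and its fields are hypothetical names, and the sketch only shows coefficients being computed once per (device, dtype) pair instead of once per variable update.

```python
class CoefficientCache:
    """Builds derived optimizer coefficients once per (device, dtype) pair."""

    def __init__(self, learning_rate, beta_1):
        self._learning_rate = learning_rate
        self._beta_1 = beta_1
        self._cache = {}
        self.builds = 0  # counts how often coefficients were actually computed

    def get(self, device, dtype):
        key = (device, dtype)
        if key not in self._cache:
            self.builds += 1
            self._cache[key] = {
                "lr": self._learning_rate,
                "one_minus_beta_1": 1.0 - self._beta_1,
            }
        return self._cache[key]


cache = CoefficientCache(learning_rate=0.001, beta_1=0.9)
for _ in range(100):            # 100 variable updates on the same device and dtype
    cache.get("GPU:0", "float32")
cache.get("GPU:0", "float16")   # a different dtype gets its own copy
print(cache.builds)
```

In a graph setting, each avoided rebuild corresponds to ops that never need to be created, which is where the construction-time saving described above would come from.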
      
      --
      258208153  by hongkuny<hongkuny@google.com>:
      
          Adds run_eagerly option for bert.
      
      --
      257883986  by hongkuny<hongkuny@google.com>:
      
          Adds tf.summary for bert training
      
      --
      256204636  by hongkuny<hongkuny@google.com>:
      
          Internal
      
      --
      256079834  by hongkuny<hongkuny@google.com>:
      
          Clean up: move common flags together for further refactoring
          Enable steps_per_loop option for all applications.
      
      --
      255493073  by hongkuny<hongkuny@google.com>:
      
          BERT initial OSS readme update.
      
      --
      255470372  by dmchen<dmchen@google.com>:
      
          Slightly expand expected range for F1 score in BERT SQuAD accuracy test
      
      --
      255109240  by hongkuny<hongkuny@google.com>:
      
          Update eval/predict batch sizes.
      
      --
      255010016  by hongkuny<hongkuny@google.com>:
      
          Internal
      
      --
      254874613  by hongkuny<hongkuny@google.com>:
      
          Update glue tasks enum to match directory name
      
      --
      254866171  by taylorrobie<taylorrobie@google.com>:
      
          Internal change
      
      254785517  by zongweiz<zongweiz@google.com>:
      
          Use train_single_step for BERT GPU models to temporarily work around some performance bugs in GPU runs
      
      --
      254497647  by hongkuny<hongkuny@google.com>:
      
          Fix device placement for TPU export model.
      
      --
      254134531  by yuefengz<yuefengz@google.com>:
      
          Fix a typo in bert_benchmark.py
      
      --
      254069984  by hongkuny<hongkuny@google.com>:
          Automated rollback of changelist 254060732.
      
      254061429  by hongkuny<hongkuny@google.com>:
      
          Use host while loop for training steps.
      
      --
      254060732  by yifeif<yifeif@google.com>:
          Automated rollback of changelist 254027750.
      
      254027750  by hongkuny<hongkuny@google.com>:
      
          Internal change
      
      253850824  by hongkuny<hongkuny@google.com>:
      
          Improve bert training utils.
      
      --
      253818191  by hongkuny<hongkuny@google.com>:
      
          Update savedmodel export to use new model.save() api.
      
      --
      253636854  by dmchen<dmchen@google.com>:
      
          Run only training in BERT SQuAD performance test
      
      --
      253118910  by hongkuny<hongkuny@google.com>:
      
          Internal change
      
      253113801  by zongweiz<zongweiz@google.com>:
      
          Internal change
      
      252697519  by dmchen<dmchen@google.com>:
      
          BERT SQuAD accuracy test
      
      --
      252663512  by A. Unique TensorFlower<gardener@tensorflow.org>:
      
          Internal change
      
      --
      252647871  by A. Unique TensorFlower<gardener@tensorflow.org>:
      
          Enable multi worker TPU training for BERT pretraining.
      
      --
      252522861  by hongkuny<hongkuny@google.com>:
      
          Remove export using trained model due to implementation error
      
      --
      252156812  by yuefengz<yuefengz@google.com>:
      
          Fix the callback method name in BERT: replaced on_batch_start with on_batch_begin. Without the fix, it won't work with Keras callbacks.
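
Why the hook name matters can be shown with a stand-in callback class. The sketch below does not use Keras; `Callback`, `BatchRecorder`, and `run_loop` are hypothetical names, and it assumes the failure mode is a training loop invoking a hook that callbacks do not define.

```python
class Callback:
    """Stand-in base class exposing the hook names used in this sketch."""

    def on_batch_begin(self, batch, logs=None):
        pass

    def on_batch_end(self, batch, logs=None):
        pass


class BatchRecorder(Callback):
    """Records the index of every batch whose start it is told about."""

    def __init__(self):
        self.seen = []

    def on_batch_begin(self, batch, logs=None):
        self.seen.append(batch)


def run_loop(callbacks, num_batches, begin_hook="on_batch_begin"):
    """Minimal custom training loop that notifies callbacks around each batch."""
    for batch in range(num_batches):
        for callback in callbacks:
            getattr(callback, begin_hook)(batch)
        for callback in callbacks:
            callback.on_batch_end(batch)


recorder = BatchRecorder()
run_loop([recorder], num_batches=2)
print(recorder.seen)

try:
    run_loop([BatchRecorder()], num_batches=2, begin_hook="on_batch_start")
except AttributeError as err:
    print("wrong hook name:", err)
```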
      
      --
      251782065  by dmchen<dmchen@google.com>:
      
          Internal change
      
      251681245  by hongkuny<hongkuny@google.com>:
      
          Update bert to use the new tf.distribute APIs
      
      --
      251575972  by A. Unique TensorFlower<gardener@tensorflow.org>:
      
          Remove `steps_per_run` when instantiating TPUStrategy.
      
      --
      251325964  by hongkuny<hongkuny@google.com>:
      
          Improve flags
      
      --
      250942274  by tobyboyd<tobyboyd@google.com>:
      
          Internal change
      
      250779087  by A. Unique TensorFlower<gardener@tensorflow.org>:
      
          Reduce BERT Perfzero benchmark test training steps.
      
      --
      250713045  by hongkuny<hongkuny@google.com>:
      
          TPU util
      
      --
      250606180  by A. Unique TensorFlower<gardener@tensorflow.org>:
      
          Fix BERT benchmark test errors.
      
      --
      250589623  by A. Unique TensorFlower<gardener@tensorflow.org>:
      
          Change BERT benchmark test pretrained checkpoint url.
      
      --
      250587892  by A. Unique TensorFlower<gardener@tensorflow.org>:
      
          Fix error in BERT custom training loop checkpoint restoration.
      
      --
      250577163  by A. Unique TensorFlower<gardener@tensorflow.org>:
      
          Add logic to inject callback that measures performance in BERT custom training
          loop.
      
      --
      250529526  by hongkuny<hongkuny@google.com>:
      
          Internal clean up
      
      --
      250428976  by hongkuny<hongkuny@google.com>:
      
          Internal change
      
      250415383  by A. Unique TensorFlower<gardener@tensorflow.org>:
      
          Add min/max value to BERT classifier benchmark test.
      
      --
      250376246  by A. Unique TensorFlower<gardener@tensorflow.org>:
      
          Add benchmark performance test to run BERT on multiple numbers of GPUs.
      
      --
      250347237  by A. Unique TensorFlower<gardener@tensorflow.org>:
      
          Fix linting errors in BERT benchmark test.
      
      --
      250326131  by A. Unique TensorFlower<gardener@tensorflow.org>:
      
          Internal change
      
      250315593  by A. Unique TensorFlower<gardener@tensorflow.org>:
      
          Internal change
      
      250303528  by haoyuzhang<haoyuzhang@google.com>:
      
          Add method docstring to fix lint error.
      
      --
      250009207  by A. Unique TensorFlower<gardener@tensorflow.org>:
      
          Add feature in BERT to write training metrics to a summary file.
      
      --
      249896208  by hongkuny<hongkuny@google.com>:
      
          Adds __init__.py
      
      --
      249883771  by hongkuny<hongkuny@google.com>:
      
          Creates a benchmark dir
      
      --
      249580533  by A. Unique TensorFlower<gardener@tensorflow.org>:
      
          Internal change
      
      249566870  by A. Unique TensorFlower<gardener@tensorflow.org>:
      
          Set up BERT benchmark test.
      
      --
      249500988  by hongkuny<hongkuny@google.com>:
      
          Lints
      
      --
      249377254  by hongkuny<hongkuny@google.com>:
      
          Internal change
      
      249373328  by hongkuny<hongkuny@google.com>:
      
          Clean up tf import
      
      --
      249333938  by hongkuny<hongkuny@google.com>:
      
          Fix tf1 import
      
      --
      249325089  by hongkuny<hongkuny@google.com>:
      
          BERT 2.0
      
      --
      249173564  by hongkuny<hongkuny@google.com>:
      
          Internal change
      
      PiperOrigin-RevId: 259442882
      1fc839bc
  7. 22 Jul, 2019 1 commit
    • Add a new sanity check script that is able to only check incremental changes. (#7265) · 6a6c3616
      Hongkun Yu authored
      * Update pylint.rcfile
      
      * Update pylint.rcfile
      
      * Update pylint.rcfile
      
      * add new sanity check script for lint to replace current lint script.
      
      * Revert "Update pylint.rcfile"
      
      This reverts commit f6036cd7e7c4b9e3eeb47bb56a63927a040a2761.
      
      * Revert "Update pylint.rcfile"
      
      This reverts commit e3af497342e26bbbbecfc8c8f79cb0e24a2ef960.
      
      * Revert "Update pylint.rcfile"
      
      This reverts commit 6136636eee6e90fd191ebbb4ccaa9fb89c0290f4.
      
      * update scripts
      
      * disable trailing-newlines
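
An incremental lint check of the kind described above might look like the sketch below. It is not the script added in this commit: the function names, the `origin/master` base ref, and the rcfile path are assumptions, and it relies only on `git diff --name-only` and pylint's `--rcfile` and `--disable` options.

```python
import subprocess


def select_python_files(paths):
    """Keep only .py paths, de-duplicated and sorted."""
    return sorted({p.strip() for p in paths if p.strip().endswith(".py")})


def changed_python_files(base="origin/master"):
    """List Python files added, copied, modified, or renamed relative to `base`."""
    result = subprocess.run(
        ["git", "diff", "--name-only", "--diff-filter=ACMR", base, "--"],
        capture_output=True, text=True, check=True,
    )
    return select_python_files(result.stdout.splitlines())


def lint(files, rcfile="pylint.rcfile"):
    """Run pylint on the given files only; return its exit code (0 if nothing to check)."""
    if not files:
        return 0
    command = ["pylint", "--rcfile=" + rcfile, "--disable=trailing-newlines", *files]
    return subprocess.run(command).returncode
```

A presubmit could then call `lint(changed_python_files())` and fail when the return value is non-zero, so that untouched files with old lint findings do not block a change.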
      6a6c3616
  8. 21 Jul, 2019 1 commit
  9. 20 Jul, 2019 3 commits
  10. 19 Jul, 2019 6 commits
    • Merged commit includes the following changes: (#7264) · 6f47c378
      Igor authored
      259030078  by isaprykin<isaprykin@google.com>:
      
          Clean up the --clone_model_in_keras_dist_strat from Keras Resnet.
      
          The cloning flag has been removed.  The current rule is that cloning is only done in graph mode.  That resulted in duplicate benchmarks: eager+no-cloning vs eager+cloning.  I removed eager+cloning ones.
      
      --
      259026454  by isaprykin<isaprykin@google.com>:
      
          Internal change
      
      PiperOrigin-RevId: 259030078
      6f47c378
    • Merged commit includes the following changes: (#7263) · c5a4978d
      Jing Li authored
      * Merged commit includes the following changes:
      258867180  by jingli<jingli@google.com>:
      
          Add new folders for upcoming reorg in model garden.
      
      --
      258893811  by hongkuny<hongkuny@google.com>:
      
          Adds summaries for metrics, allowing metrics inside keras.model.
      
      --
      258893048  by isaprykin<isaprykin@google.com>:
      
          Remove the `cloning` argument to `compile()`.
      
          Keras models are distributed by cloning in graph mode and without cloning in eager mode as of the change # 258652546.
      
      --
      258881002  by hongkuny<hongkuny@google.com>:
      
          Fix lint.
      
      --
      258874998  by hongkuny<hongkuny@google.com>:
      
          Internal
      
      --
      258872662  by hongkuny<hongkuny@google.com>:
      
          Fix doc
      
      --
      
      PiperOrigin-RevId: 258867180
      
      * Create __init__.py
      
      * Update __init__.py
      
      * Update __init__.py
      
      * Update __init__.py
      c5a4978d
    • Revert "Change how TF 2 is checked" (#7260) · 2569fa9a
      Toby Boyd authored
      This reverts commit 712f473e.
      2569fa9a
    • Fix lint error · 283de38b
      guptapriya authored
      283de38b
    • Disable ncf tests for 1.x · 8c8779a3
      guptapriya authored
      8c8779a3
    • NCF Keras: Fail early with TF 1.x + dist strat · 41d071ee
      guptapriya authored
      This combination does not yet work. Fail early with an explicit message instead of throwing an error later on.
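
A fail-early guard of this kind can be sketched as follows. This is not the code from the commit: `check_supported`, its parameters, and the use of `"off"` to mean that no distribution strategy is requested are all assumptions made for illustration.

```python
def check_supported(tf_version: str, distribution_strategy: str) -> None:
    """Raise immediately if a TF 1.x runtime is combined with a distribution strategy."""
    major = int(tf_version.split(".")[0])
    if major < 2 and distribution_strategy != "off":
        raise ValueError(
            "NCF Keras with a distribution strategy is not supported on "
            "TensorFlow %s; use TF 2.x or disable the strategy." % tf_version
        )


check_supported("2.0.0", "mirrored")   # supported: returns without error
check_supported("1.14.0", "off")       # supported: no strategy requested

try:
    check_supported("1.14.0", "mirrored")
except ValueError as err:
    print(err)
```

Running the check before any model or dataset is built surfaces the problem with a clear message rather than as an obscure failure partway through training.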
      41d071ee