- 03 Nov, 2018 1 commit
Reed authored
I've noticed that sometimes the async process's pool processes do not die when ncf_main.py ends and kills the async process. This commit fixes that issue.
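The fix itself is not shown here, so as a rough, hypothetical sketch of the kind of cleanup involved (helper names are illustrative, not from ncf_main.py): the async worker can install a SIGTERM handler that tears down its multiprocessing pool, since the default SIGTERM action skips Python's cleanup and can leave pool workers running after the parent kills the process.

```python
import multiprocessing
import signal
import sys


def _work(item):
    # Stand-in for whatever the real async worker computes.
    return item * item


def main():
    pool = multiprocessing.Pool(processes=4)

    def _handle_sigterm(_signum, _frame):
        # When the parent kills this process, explicitly tear down the pool;
        # otherwise the pool's worker processes can outlive it.
        pool.terminate()
        pool.join()
        sys.exit(0)

    signal.signal(signal.SIGTERM, _handle_sigterm)

    print(pool.map(_work, range(8)))
    pool.close()
    pool.join()


if __name__ == "__main__":
    main()
```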
- 01 Nov, 2018 1 commit
Reed authored
- 29 Oct, 2018 1 commit
Reed authored
The option is --nouse_estimator
- 26 Oct, 2018 1 commit
Reed authored
--ml_perf now just changes the model to make it MLPerf compliant. --output_ml_perf_compliance_logging adds the MLPerf compliance logs.
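A minimal sketch of that split as two independent boolean flags (hypothetical wiring using absl.flags; the flag names match the ones above, but the definitions are not taken from the repository):

```python
from absl import flags

# Hypothetical flag definitions mirroring the split described above:
# one flag changes the model, the other only adds logging.
flags.DEFINE_boolean(
    "ml_perf", False,
    "Change the model so that it is MLPerf compliant.")
flags.DEFINE_boolean(
    "output_ml_perf_compliance_logging", False,
    "Add the MLPerf compliance logs.")
```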
- 03 Oct, 2018 1 commit
Taylor Robie authored
* Move evaluation from numpy to tensorflow
  - fix syntax error
  - don't use sigmoid to convert logits; there is too much precision loss
  - WIP: add logit metrics
  - continue refactor of NCF evaluation
  - fix syntax error
  - fix bugs in eval loss calculation
  - fix eval loss reweighting
  - remove numpy based metric calculations
  - fix logging hooks
  - fix sigmoid to softmax bug
  - fix comment
  - catch rare PIPE error and address some PR comments
* Fix metric test and address PR comments
* Delint and fix python2
* Fix test and address PR comments
* Extend eval to TPUs
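The "don't use sigmoid to convert logits" bullet above is a float32 saturation issue; a small illustrative snippet (plain numpy, not code from this change) shows why values pushed through sigmoid lose information that the raw logits keep:

```python
import numpy as np

# In float32, sigmoid saturates to exactly 1.0 for moderately large logits,
# so the ordering between such logits is destroyed and log-based losses on
# the resulting probabilities blow up.
logits = np.array([20.0, 30.0, 40.0], dtype=np.float32)
probs = (1.0 / (1.0 + np.exp(-logits))).astype(np.float32)

print(probs)             # [1. 1. 1.]  -- three different logits, one value
print(np.log1p(-probs))  # [-inf -inf -inf]  -- log(1 - p) is unusable
```

Losses and metrics that accept logits directly (for example tf.nn.sigmoid_cross_entropy_with_logits) avoid this round trip, which appears to be the motivation behind these bullets.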
- 22 Aug, 2018 1 commit
Reed authored
* Fix convergence issues for MLPerf. Thank you to @robieta for helping me find these issues, and for providing an algorithm for the `get_hit_rate_and_ndcg_mlperf` function.

  This change causes every forked process to set a new seed, so that forked processes do not generate the same set of random numbers. This improves evaluation hit rates. Additionally, it adds a flag, --ml_perf, that makes further changes so that the evaluation hit rate can match the MLPerf reference implementation.

  I ran 4 times with --ml_perf and 4 times without. Without --ml_perf, the highest hit rates achieved by each run were 0.6278, 0.6287, 0.6289, and 0.6241. With --ml_perf, the highest hit rates were 0.6353, 0.6356, 0.6367, and 0.6353.
* Fix lint error
* Fix failing test
* Address @robieta's feedback
* Address more feedback
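As a rough, hypothetical illustration of the per-fork seeding described above (not the actual change in this commit), a multiprocessing pool initializer can give each forked worker a distinct seed, so the workers stop drawing identical samples from the RNG state they inherited from the parent:

```python
import multiprocessing
import os

import numpy as np


def _reseed_worker():
    # Derive a distinct seed per forked worker (here from its pid) so the
    # workers no longer share the RNG state inherited from the parent.
    np.random.seed(os.getpid() % (2 ** 31))


def _draw(_):
    return np.random.randint(0, 100, size=3).tolist()


if __name__ == "__main__":
    with multiprocessing.Pool(processes=4, initializer=_reseed_worker) as pool:
        print(pool.map(_draw, range(8)))
```

Without an initializer like this, every worker forked from the parent starts from the same numpy RNG state and can emit the same "random" negatives, which matches the hit-rate symptom this commit addresses.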