- 25 Oct, 2018 3 commits
-
-
Taylor Robie authored
prevent async process from writing alive file until the main process has created the cache root (#5614)
-
Reed authored
The error message was: absl.flags._exceptions.IllegalFlagValueError: flag --ml_perf=None: ('Non-boolean argument to boolean flag', 'None')
-
josh11b authored
-
- 24 Oct, 2018 4 commits
-
-
Taylor Robie authored
* move version check to a function
* delint
* tweak pip check
* delint
-
josh11b authored
To match new terminology in DistributionStrategy.
-
josh11b authored
-
Taylor Robie authored
* first pass at __getattr__ abuse logger
* first pass at adding tags to NCF
* minor formatting updates
* fix tag name
* convert metrics to python floats
* getting closer...
* direct mlperf logs to a file
* small tweaks and add stitching
* update tags
* fix tag and add a sudo call
* tweak format of run.sh
* delint
* use distribution strategies for evaluation
* address PR comments
* delint and fix test
* adjust flag validation for xla
* add prefix to distinguish log stitching
* fix index bug
* fix clear cache for root user
* dockerize cache drop
* TIL some regex magic
-
- 20 Oct, 2018 1 commit
-
-
Reed authored
-
- 19 Oct, 2018 1 commit
-
-
Taylor Robie authored
-
- 18 Oct, 2018 3 commits
-
-
Taylor Robie authored
* intermediate commit finish replacing spillover with resampled padding intermediate commit
* resolve merge conflict
* intermediate commit
* further consolidate the data pipeline
* complete first pass at data pipeline refactor
* remove some leftover code
* fix test
* remove resampling, and move train padding logic into neumf.py
* small tweaks
* fix weight bug
* address PR comments
* fix dict zip. (Reed led me astray)
* delint
* make data test deterministic and delint
* Reed didn't lead me astray. I just can't read.
* more delinting
* even more delinting
* use resampling for last batch padding
* pad last batch with unique data
* Revert "pad last batch with unique data" This reverts commit cbdf46efcd5c7907038a24105b88d38e7f1d6da2.
* move padded batch to the beginning
* delint
* fix step check for synthetic data
-
josh11b authored
Since we plan on deleting this method, it is only used in distribution_utils_test.py.
-
Shawn Wang authored
-
- 17 Oct, 2018 2 commits
-
-
Shawn Wang authored
-
Shawn Wang authored
-
- 14 Oct, 2018 1 commit
-
-
Taylor Robie authored
* move flagfile into the cache_dir
* remove duplicate code
* delint
-
- 13 Oct, 2018 6 commits
-
-
Toby Boyd authored
-
Toby Boyd authored
-
Toby Boyd authored
-
Toby Boyd authored
-
shizhiw authored
* Use data_dir instead of flags.FLAGS.data_dir in data_preprocessing.py.
* Use data_dir instead of flags.FLAGS.data_dir in data_preprocessing.py.
* Replace multiprocess pool with popen_helper.get_pool() in data_preprocessing.
-
Toby Boyd authored
-
- 12 Oct, 2018 3 commits
- 11 Oct, 2018 5 commits
-
-
shizhiw authored
* Use data_dir instead of flags.FLAGS.data_dir in data_preprocessing.py.
* Use data_dir instead of flags.FLAGS.data_dir in data_preprocessing.py.
-
Shawn Wang authored
Add comments, exit the async process after waiting too long for the flagfile, and create the data_dir directory if it does not exist.
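The behavior this commit message describes (give up after waiting too long for a file, and create a directory only if it is missing) can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the code from the commit; the helper names `wait_for_file` and `ensure_dir`, the timeout values, and the `flagfile` path are all hypothetical.

```python
import os
import tempfile
import time


def wait_for_file(path, timeout_s=5.0, poll_s=0.1):
    """Poll until `path` exists; return False if `timeout_s` elapses first.

    Hypothetical helper: a caller such as an async worker could exit
    when this returns False instead of waiting forever.
    """
    deadline = time.monotonic() + timeout_s
    while not os.path.exists(path):
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_s)
    return True


def ensure_dir(path):
    """Create `path` (and parents) if it does not exist; no-op otherwise."""
    os.makedirs(path, exist_ok=True)


# Small demonstration inside a throwaway temporary directory.
with tempfile.TemporaryDirectory() as root:
    data_dir = os.path.join(root, "data")
    ensure_dir(data_dir)
    ensure_dir(data_dir)  # calling it again is harmless

    flagfile = os.path.join(data_dir, "flagfile")  # hypothetical file name
    missing = wait_for_file(flagfile, timeout_s=0.2, poll_s=0.05)

    open(flagfile, "w").close()
    found = wait_for_file(flagfile, timeout_s=0.2, poll_s=0.05)
```

Returning a boolean rather than raising leaves the exit decision to the caller, which is one reasonable design among several.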
-
Shawn Wang authored
-
Shawn Wang authored
-
Shawn Wang authored
-
- 10 Oct, 2018 2 commits
- 09 Oct, 2018 2 commits
-
-
Shawn Wang authored
-
Shawn Wang authored
-
- 06 Oct, 2018 1 commit
-
-
Toby Boyd authored
-
- 05 Oct, 2018 2 commits
-
-
Toby Boyd authored
-
Taylor Robie authored
* improve default handling for eval_batch_size
* return eval_batch_size default to None
* fix syntax error
-
- 04 Oct, 2018 2 commits
-
-
Taylor Robie authored
* Update resnet README with new checkpoints and SavedModels
* add more detail on channels_first vs channels_last
* fix typo
* add disclaimer about checkpoints
-
Taylor Robie authored
* set strip_default_attrs=True for SavedModel exports
* specify dtype in resnet export
* another dtype fix
* fix another dtype issue, and set --image_bytes_as_serving_input to default to False
-
- 03 Oct, 2018 2 commits
-
-
Toby Boyd authored
-
Taylor Robie authored
* move evaluation from numpy to tensorflow
  fix syntax error
  don't use sigmoid to convert logits. there is too much precision loss.
  WIP: add logit metrics
  continue refactor of NCF evaluation
  fix syntax error
  fix bugs in eval loss calculation
  fix eval loss reweighting
  remove numpy based metric calculations
  fix logging hooks
  fix sigmoid to softmax bug
  fix comment
  catch rare PIPE error and address some PR comments
* fix metric test and address PR comments
* delint and fix python2
* fix test and address PR comments
* extend eval to TPUs
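One plausible reading of the "don't use sigmoid to convert logits" note above is that large logits saturate once passed through a sigmoid in single precision, so items that were distinguishable as logits become ties. The sketch below illustrates that effect with the standard library only. It assumes float32 arithmetic, which the commit message does not state, and the helper names `to_float32` and `sigmoid32` are hypothetical.

```python
import math
import struct


def to_float32(x):
    """Round a Python float (float64) to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def sigmoid32(logit):
    """Sigmoid with the result rounded to float32 (assumed precision)."""
    return to_float32(1.0 / (1.0 + math.exp(-logit)))


logits = [20.0, 25.0]
probs = [sigmoid32(v) for v in logits]

# The logits are clearly ordered, but both probabilities round to 1.0
# in float32, so a ranking based on them can no longer separate the items.
logits_distinct = logits[0] < logits[1]
probs_distinct = probs[0] < probs[1]
```

Because sigmoid is monotonic, ranking on the raw logits gives the same order without this rounding step, which is consistent with the "add logit metrics" entry in the same commit.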
-