- 18 Oct, 2018 1 commit
-
-
Shawn Wang authored
-
- 17 Oct, 2018 2 commits
-
-
Shawn Wang authored
-
Shawn Wang authored
-
- 14 Oct, 2018 1 commit
-
-
Taylor Robie authored
* move flagfile into the cache_dir * remove duplicate code * delint
-
- 13 Oct, 2018 1 commit
-
-
shizhiw authored
* Use data_dir instead of flags.FLAGS.data_dir in data_preprocessing.py. * Replace multiprocess pool with popen_helper.get_pool() in data_preprocessing.
-
- 11 Oct, 2018 5 commits
-
-
shizhiw authored
* Use data_dir instead of flags.FLAGS.data_dir in data_preprocessing.py.
-
Shawn Wang authored
Add comments, exit async process after waiting for flagfile for too long and make directory for data_dir in case it does not exist.
-
Shawn Wang authored
-
Shawn Wang authored
-
Shawn Wang authored
-
- 10 Oct, 2018 2 commits
- 09 Oct, 2018 2 commits
-
-
Shawn Wang authored
-
Shawn Wang authored
-
- 05 Oct, 2018 1 commit
-
-
Taylor Robie authored
* improve default handling for eval_batch_size * return eval_batch_size default to None * fix syntax error
-
- 03 Oct, 2018 1 commit
-
-
Taylor Robie authored
* move evaluation from numpy to tensorflow: fix syntax error; don't use sigmoid to convert logits (there is too much precision loss); WIP: add logit metrics; continue refactor of NCF evaluation; fix syntax error; fix bugs in eval loss calculation; fix eval loss reweighting; remove numpy based metric calculations; fix logging hooks; fix sigmoid to softmax bug; fix comment; catch rare PIPE error and address some PR comments * fix metric test and address PR comments * delint and fix python2 * fix test and address PR comments * extend eval to TPUs
-
- 02 Oct, 2018 1 commit
-
-
Reed authored
-
- 20 Sep, 2018 1 commit
-
-
Taylor Robie authored
* bug fixes and add seed * more random corrections * make cleanup more robust * return cleanup fn * delint and address PR comments. * delint and fix tests * delinting is never done * add pipeline hashing * delint
-
- 14 Sep, 2018 1 commit
-
-
Reed authored
Sometimes it takes longer than 15 seconds, and even longer than 1 minute, to spawn and create the alive file.
-
- 11 Sep, 2018 1 commit
-
-
Reed authored
-
- 05 Sep, 2018 2 commits
-
-
Reed authored
* Fix spurious "did not start correctly" error. The error "Generation subprocess did not start correctly" would occur if the async process started up after the main process checked for the subproc_alive file. * Add error message
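The race described above comes from checking for the marker file once, before the other process has had time to create it. A minimal sketch of polling with a deadline instead, assuming a plain file on a local filesystem; `wait_for_file`, `timeout_s`, and `poll_s` are hypothetical names for illustration and are not taken from the commit:

```python
import os
import time


def wait_for_file(path, timeout_s, poll_s=0.1):
    """Poll until `path` exists or `timeout_s` seconds have elapsed.

    Returns True if the file appeared in time, False otherwise.
    Hypothetical helper; not the code from the commit.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if os.path.exists(path):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_s)
```

A caller would raise the "did not start correctly" error only when this returns False, so a slow-starting subprocess is not reported as a failure.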
-
Reed authored
When constructing the evaluation records, data_async_generation.py would copy the records into the final directory. The main process would wait until the eval records existed. However, the main process would sometimes read the eval records before they were fully copied, causing a DataLossError.
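One common way to avoid a reader seeing a partially copied file is to write to a temporary name in the same directory and rename it into place once the write is complete. The sketch below illustrates that general technique only; it is not necessarily how this commit fixed the issue, and `write_atomically` is a hypothetical name:

```python
import os
import tempfile


def write_atomically(final_path, payload):
    """Write `payload` (bytes) to a temp file, then rename it to `final_path`.

    The temp file lives in the destination directory so the rename stays on
    one filesystem. Hypothetical helper for illustration.
    """
    directory = os.path.dirname(os.path.abspath(final_path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(payload)
            f.flush()
            os.fsync(f.fileno())
        # A waiting reader sees either no file or the complete file.
        os.replace(tmp_path, final_path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

With this pattern, a process that waits for `final_path` to exist cannot open it before the contents are fully written.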
-
- 22 Aug, 2018 1 commit
-
-
Reed authored
* Fix convergence issues for MLPerf. Thank you to @robieta for helping me find these issues, and for providing an algorithm for the `get_hit_rate_and_ndcg_mlperf` function. This change causes every forked process to set a new seed, so that forked processes do not generate the same set of random numbers. This improves evaluation hit rates. Additionally, it adds a flag, --ml_perf, that makes further changes so that the evaluation hit rate can match the MLPerf reference implementation. I ran 4 times with --ml_perf and 4 times without. Without --ml_perf, the highest hit rates achieved by each run were 0.6278, 0.6287, 0.6289, and 0.6241. With --ml_perf, the highest hit rates were 0.6353, 0.6356, 0.6367, and 0.6353. * fix lint error * Fix failing test * Address @robieta's feedback * Address more feedback
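A forked worker inherits a copy of the parent's random number generator state, so workers that never reseed can produce identical streams. A minimal sketch of per-process reseeding, assuming the standard library `random` module as a stand-in for whatever generator the workers actually use; `fresh_seed` and `reseed_worker` are hypothetical names, not functions from this change:

```python
import os
import random
import struct


def fresh_seed():
    """Return a 32-bit seed drawn from the operating system's entropy source."""
    return struct.unpack("<I", os.urandom(4))[0]


def reseed_worker():
    """Reseed this process's generator; meant to run once per worker process.

    Returns the seed so it can be logged for reproducibility.
    """
    seed = fresh_seed()
    random.seed(seed)
    return seed
```

A function like this can be passed as the `initializer` argument of `multiprocessing.Pool`, so each worker reseeds itself as it starts.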
-
- 18 Aug, 2018 1 commit
-
-
Reed authored
This is done by using a higher Pickle protocol version, which the Python docs describe as being "slightly more efficient". This reduces the file write time at the beginning from 2 1/2 minutes to 5 seconds.
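The message does not say which protocol versions were involved. As a rough illustration of the difference between an old and a new protocol, the sketch below compares protocol 0 with `pickle.HIGHEST_PROTOCOL` on made-up data; the dictionary is a placeholder, not the dataset from the commit:

```python
import pickle

# Placeholder data; the real payload in the commit is not shown in the log.
data = {"users": list(range(1000)), "items": list(range(1000))}

# Protocol 0 is the original text-based format.
old_bytes = pickle.dumps(data, protocol=0)

# HIGHEST_PROTOCOL selects the newest binary format the interpreter supports.
new_bytes = pickle.dumps(data, protocol=pickle.HIGHEST_PROTOCOL)

# Both encodings round-trip to the same object; the binary one is smaller.
same_content = pickle.loads(old_bytes) == pickle.loads(new_bytes) == data
binary_is_smaller = len(new_bytes) < len(old_bytes)
```

Files written with a newer protocol cannot be read by interpreters that predate it, which is the usual trade-off when raising the version.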
-
- 02 Aug, 2018 2 commits
-
-
Reed authored
-
Reed authored
The data_async_generation.py process would print to stderr, but the main process would redirect its stderr to a pipe. The main process never read from the pipe, so when the pipe was full, data_async_generation.py would stall on a write to stderr. This change makes data_async_generation.py not write to stdout/stderr.
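The commit fixes the stall on the child side. A related way to avoid it from the parent side, shown here only as a sketch and not as what the commit does, is to send the child's output to `subprocess.DEVNULL` rather than to a pipe nobody reads. `CHILD_CODE` and `run_child_discarding_output` are hypothetical names:

```python
import subprocess
import sys

# The child writes 1 MiB to stderr, more than a typical OS pipe buffer holds.
CHILD_CODE = "import sys; sys.stderr.write('x' * (1 << 20))"


def run_child_discarding_output():
    """Run the child with stdout and stderr discarded; return its exit code."""
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD_CODE],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    # With DEVNULL the child's writes never block, so it can run to completion.
    return proc.wait(timeout=60)
```

Had `stderr=subprocess.PIPE` been used without ever reading from `proc.stderr`, the child could block once the pipe buffer filled, which is the stall the message describes.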
-
- 01 Aug, 2018 1 commit
-
-
Reed authored
The output of an embedding layer is already flattened, so the Flatten layers acted as no-ops.
-
- 31 Jul, 2018 8 commits
-
-
Taylor Robie authored
-
Reed authored
* Fix crash when Python interpreter not on PATH. * Fix lint error.
-
Reed authored
-
Reed authored
-
Taylor Robie authored
* add indirection file * remove unused imports * fix import
-
Reed authored
-
Reed authored
-
Reed authored
-
- 30 Jul, 2018 1 commit
-
-
Taylor Robie authored
* intermediate commit * ncf now working * reorder pipeline * allow batched decode for file backed dataset * fix bug * more tweaks * parallelize false negative generation * shared pool hack * workers ignore sigint * intermediate commit * simplify buffer backed dataset creation to fixed length record approach only. (more cleanup needed) * more tweaks * simplify pipeline * fix misplaced cleanup() calls. (validation works!) * more tweaks * sixify memoryview usage * more sixification * fix bug * add future imports * break up training input pipeline * more pipeline tuning * first pass at moving negative generation to async * refactor async pipeline to use files instead of ipc * refactor async pipeline * move expansion and concatenation from reduce worker to generation workers * abandon complete async due to interactions with the tensorflow threadpool * cleanup * remove per...
-
- 12 Jul, 2018 1 commit
-
-
Taylor Robie authored
-
- 25 Jun, 2018 1 commit
-
-
Qianli Scott Zhu authored
-
- 20 Jun, 2018 1 commit
-
-
Taylor Robie authored
* begin branch * finish download script * rename download to dataset * intermediate commit * intermediate commit * misc tweaks * intermediate commit * intermediate commit * intermediate commit * delint and update census test. * add movie tests * delint * fix py2 issue * address PR comments * intermediate commit * intermediate commit * intermediate commit * finish wide deep transition to vanilla movielens * delint * intermediate commit * intermediate commit * intermediate commit * intermediate commit * fix import * add default ncf csv construction * change default on download_if_missing * shard and vectorize example serialization * fix import * update ncf data unittests * delint * delint * more delinting * fix wide-deep movielens serialization * address PR comments * add file_io tests * investigate wide-deep test failure * remove hard coded path and properly use flags. * address file_io test PR comments * missed a hash_bucket_size
-
- 12 Jun, 2018 1 commit
-
-
Katherine Wu authored
* Add DistributionStrategy to transformer model * add num_gpu flag * Calculate per device batch size for transformer * remove reference to flags_core * Add synthetic data option to transformer * fix typo * add import back in * Use hierarchical copy * address PR comments * lint * fix spaces * group train op together to fix single GPU error * Fix translate bug (sorted_keys is a dict, not a list) * Change params to a default dict (translate.py was throwing errors because params didn't have the TPU parameters.) * Address PR comments. Removed multi gpu flag + more * fix lint * fix more lints * add todo for Synthetic dataset * Update docs
-