Commit 0c5c3a77 authored by shizhiw, committed by Taylor Robie

Replace multiprocess pool with popen_helper.get_pool() in data_preprocessing. (#5512)

* Use data_dir instead of flags.FLAGS.data_dir in data_preprocessing.py.

* Use data_dir instead of flags.FLAGS.data_dir in data_preprocessing.py.

* Replace multiprocess pool with popen_helper.get_pool() in data_preprocessing.
parent b88da6ee
@@ -333,8 +333,7 @@ def generate_train_eval_data(df, approx_num_shards, num_items, cache_paths,
   map_args = [(shards[i], i, num_items, cache_paths, process_seeds[i],
                match_mlperf)
               for i in range(approx_num_shards)]
-  with contextlib.closing(
-      multiprocessing.Pool(multiprocessing.cpu_count())) as pool:
+  with popen_helper.get_pool(multiprocessing.cpu_count()) as pool:
     test_shards = pool.map(_train_eval_map_fn, map_args)  # pylint: disable=no-member
   tf.logging.info("Merging test shards...")
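For context, the diff assumes that popen_helper.get_pool(n) returns a worker pool that can be used directly in a with statement, replacing the explicit contextlib.closing(multiprocessing.Pool(...)) wrapper. Below is a minimal sketch of such a helper; the function body and the usage example are illustrative assumptions, not the actual contents of popen_helper.py, which may construct its workers differently.

  # Hypothetical sketch of a pool factory in the spirit of popen_helper.get_pool().
  # Not the real popen_helper implementation: it simply wraps multiprocessing.Pool
  # in contextlib.closing so that a plain `with` statement calls close() on exit
  # (the pool's own __exit__ would call terminate() instead).
  import contextlib
  import multiprocessing


  def get_pool(num_workers):
    """Returns a context-managed pool with `num_workers` worker processes."""
    return contextlib.closing(multiprocessing.Pool(processes=num_workers))


  def _square(x):  # worker functions must be module-level so they can be pickled
    return x * x


  if __name__ == "__main__":
    with get_pool(multiprocessing.cpu_count()) as pool:
      print(pool.map(_square, range(8)))  # [0, 1, 4, 9, 16, 25, 36, 49]

With a wrapper like this, the call site in generate_train_eval_data keeps the same close-on-exit behavior as the contextlib.closing block the diff removes, while hiding the pool-construction details behind a single helper.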